
This week’s pod is episode two of our mini-series on artificial intelligence in international law. We’ve covered how AI performs on the battlefield and are now shifting focus to what comes after: how AI tools are being used in international criminal investigations.
It’s been three years since the International Criminal Court’s Office of the Prosecutor introduced ‘Project Harmony’—an AI-integrated, cloud-based evidence management platform aimed at streamlining the investigation process. Partly funded by the European Union, Project Harmony embeds the American AI-powered legal platform ‘RelativityOne’ in the court. It stores and analyses evidence, and is in turn reliant on Microsoft’s cloud and AI infrastructure. It is yet another example of the ICC’s dependence on US Big Tech, and a pointed one, given last year’s US sanctions on individuals at the court. With the ability to parse through vast amounts of documents and data, AI is being used for facial and object recognition, pattern identification, translation, as well as processing, analysing, and categorising evidence. The ICC’s adoption of Project Harmony is just a continuation of a broader trend in domestic jurisdictions. Ukrainian law enforcement, for instance, is already using AI tools supplied by US company Palantir Technologies to document and investigate potential war crimes committed by Russian troops. It’s tempting to dismiss AI advancements as mere optimisation, or even to praise them uncritically for their efficiency. The OTP has a duty to establish the truth and to investigate both incriminating and exonerating evidence. But what happens to this duty when the evidence is being sifted through by AI tools?
To help us understand the promises and pitfalls of AI-powered investigations, we spoke with Marta Bo and Benjamin Thorne. Marta is a Senior Researcher in International Law, Artificial Intelligence, and Military Technologies at the TMC Asser Institute, co-President of the Antonio Cassese Initiative, and a friend of the pod via her work on ‘Lethal Autonomous Weapons: 10 things we want to know’, which we helped produce. She gave us insight into the types of AI tools we’re really talking about, as well as the current ethical concerns regarding automation bias, errors of omission, and greater questions of transparency and intelligibility. Benjamin Thorne is an Associate Professor in Criminal Law at the University of Reading and a critical voice on the use of technology in international criminal justice. Benjamin introduced what he terms the ‘Digital Accountability Ecosystem,’ drawing out the different investigators relevant in this space, and providing an explainer on systems such as RelativityOne.
The overarching conclusion? As with anything AI-related, concerns mount when human oversight and verification are removed from the equation. For further reading on AI in investigations, check out Marta’s paper on Artificial Intelligence in the Prosecution of International Crimes and Benjamin’s blog post on ‘Artificial Sanctions: Potential Implications of US Sanctions on the ICC’s use of AI and Digital Evidence.’ As for recommendations, while concerned about the OTP’s reliance on AI, Marta also cares about our listeners; she recommends reading Will Knight’s recent article in WIRED on how ‘Using AI for Just 10 Minutes Might Make You Lazy and Dumb.’ For “lighter” reading, check out Benjamin’s recommendation: The AI Mirror: How to Reclaim Our Humanity in an Age of Machine Thinking, by Professor Shannon Vallor.


read a transcript of this episode
Disclaimer: Asymmetrical Haircuts is produced as a podcast, meaning it is meant to be listened to and not read. Because of this, we recommend that you listen to the episode while reading, because the written word does not do justice to the emotion or tone used by our speakers. However, because we recognise there might be bandwidth issues or you might be using a hearing aid, we have provided written transcripts for all our available episodes.
[INTRO TUNE]
Steph 01:20 Hi, Janet.
Janet 01:21 Hi, Steph. So, do you spend any time reading those really glossy annual reports that come out from the International Criminal Court? You know, the ones from the Office of the Prosecutor that get launched at the Assembly of States Parties every year?
Steph 01:38 Boy, do I ever. I do read them, and I try to parse them for every little bit of information that I can possibly glean. And it’s not very much.
Janet 01:48 Yeah, well, good luck with that. Yeah, they’re so glossy that they just gloss over everything, in my view.
But do you remember that there was—I think it was about two years ago—there was this line in one of them that said that the Office of the Prosecutor (OTP) was going to become a “leader in technology,” and they were making a big investment in a particular tech called ‘Project Harmony,’ as far as I remember. And that this was a way they were going to improve the management of evidence. Did that-?
Steph 02:18 Absolutely. I actually did a deep dive into it because I wanted to do a story about how they were going to use AI in parsing through evidence. But what I remember from it was that it was very general, and that they didn’t want to tell me very much more in detail about what they were doing. And so, I couldn’t, in a way, sell it to my editors to do a story about it. So, I tried and failed.
Janet 02:41 Yeah, well, here’s a chance for redemption. How little we know; we will put on show. But yeah, welcome to our lives as journalists, asking questions and getting only minimal answers.
So, that in itself has been at the back of my mind for a little while. And then somebody, about six months or so ago, mentioned a thing called ‘Relativity’ to me. That’s a system that’s apparently being used by some organisations. It belongs to Microsoft, and we’ll put a link to that in the show notes. And that’s all about how AI (Artificial Intelligence) is being used in connection with legal data. So, that started me really thinking about what are the systems that everybody is using? What are the ways that AI is being used? What do we need to know about? And how is machine learning coming about?
Steph 03:27 One of the things I’m really interested in is how do you tag these databases? How much does this cost? How effective is it? They made a big deal of kind of overhauling and modernising the OTP evidence presentation also during trials. But of course, we also know that now there are issues with Microsoft and the ICC, and they’re moving away from it. So, what are they going to do with these systems is another kind of more current point that I’m sure we will get to.
Janet 03:54 Yeah. We’re actually recording this podcast as the second part of our tiny little toe-in-the-water series about Artificial Intelligence and International Criminal Justice.
Steph 04:04 Yeah, I really enjoyed our previous episode where we talked with Jessica Dorsey and Elke Schwarz about AI targeting.
Janet 04:13 So, this time, we’re going to talk specifically about investigations and AI.
Steph 04:18 To help us out, we have a friend of the pod and former podcast producer herself, Marta Bo of the Asser Institute. Hi, Marta.
Marta Bo 04:47 Hi, Janet. Hi, Steph. Thank you for having me.
Janet 04:30 And we have a new person on the podcast, but very journalist-friendly, Benjamin Thorne, who’s from Reading. Hi, Benjamin.
Benjamin Thorne 04:37 Hi, both of you. Thank you for the very kind invitation to participate in this conversation.
Steph 04:43 The way to kick this off, probably, is to start the same way we did our previous podcast, to ask you, Marta, to define what we mean by AI? What is AI in this context?
Marta Bo 04:56 That’s a great starting point, Steph, because AI is very much an umbrella term. If we don’t unpack it, we risk talking past each other.
So, on the one hand, we have rule-based systems. They operate on the basis of an ‘if-then’ logic. So, if we imagine an AI system that is programmed, for example, to identify words like ‘attack’ or ‘orders’ or ‘weapons’ in documents, and then, based on this identification, it would flag a certain document as relevant for a war crime investigator.
So, this is very different from machine learning systems. A machine learning system, in this context, would be trained instead on the basis of documents that are labelled as relevant for war crimes investigators or not relevant for war crimes investigators. So, it would learn from patterns in these documents. So, instead of only identifying keywords, it would also learn that, for example, ordering to carry out an operation and not only an attack would basically be the same thing. So, it would identify such a document as relevant. Machine learning systems are very good at pattern recognition, and they’re much more efficient than rule-based systems. At the same time, they’re more opaque, and I’m sure this is something that will come up more as a problem of these systems in this podcast.
Now, on top of machine learning systems, we also have LLMs, Large Language Models. So, the technology that is behind ChatGPT, for example. And these models, they analyse language, and they provide analysis of legal language, for example. Based on LLMs, we also have an increasing role of agentic AI, so AI assistants that can help in a variety of tasks. So, they not only react to prompts, but they can actually perform planned steps, and this could potentially be very useful for investigations as well.
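The contrast Marta draws between rule-based and machine learning systems can be sketched in a few lines of Python. This is only a toy illustration, not a description of any tool used by the ICC or anyone else: the keyword list comes from her example, while the training documents, the function names, and the simple word-count model (a naive Bayes classifier with priors ignored) are all assumptions made for the sketch.

```python
import math
import re
from collections import Counter

# Keywords taken from the example in the discussion.
KEYWORDS = {"attack", "orders", "weapons"}

def tokens(text):
    """Lower-case a document and split it into words."""
    return re.findall(r"[a-z]+", text.lower())

def rule_based_flag(text):
    """If-then rule: flag the document if any keyword appears in it."""
    return any(word in KEYWORDS for word in tokens(text))

def train(labelled_docs):
    """Count how often each word appears in relevant / not relevant documents."""
    counts = {True: Counter(), False: Counter()}
    for text, relevant in labelled_docs:
        counts[relevant].update(tokens(text))
    return counts

def learned_flag(counts, text):
    """Toy naive Bayes (priors ignored): flag if the words lean 'relevant'."""
    vocab = set(counts[True]) | set(counts[False])
    total_rel = sum(counts[True].values())
    total_irr = sum(counts[False].values())
    score = 0.0
    for word in tokens(text):
        if word not in vocab:
            continue  # words never seen in training carry no signal
        p_rel = (counts[True][word] + 1) / (total_rel + len(vocab))
        p_irr = (counts[False][word] + 1) / (total_irr + len(vocab))
        score += math.log(p_rel / p_irr)
    return score > 0

# Hypothetical labelled documents, invented purely for illustration.
TRAINING_DOCS = [
    ("commander gave orders to attack the village", True),
    ("unit told to carry out the operation against the village", True),
    ("weapons moved before the operation", True),
    ("canteen menu for the week", False),
    ("leave request for the week approved", False),
    ("vehicle maintenance schedule for the week", False),
]

model = train(TRAINING_DOCS)
doc = "battalion instructed to carry out an operation at dawn"
print(rule_based_flag(doc))      # no keyword present, so the rule does not fire
print(learned_flag(model, doc))  # learned word patterns still mark it as relevant
```

The sample document contains none of the three keywords, so the rule never fires, while the trained model flags it because "carry out" and "operation" appeared in documents labelled relevant. The rule can be read and audited at a glance, whereas the learned score depends on whatever happened to be in the training data, which is a small-scale version of the opacity concern raised above.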
Janet 07:17 Okay. So, there’s a lot of different details in there. What’s the most important thing for us to understand as to how this relates to international criminal justice investigations?
Marta Bo 07:30 Yeah, I think what is important is which tasks in criminal proceedings these different types of AI, which can often be combined, are applied to.
So, international criminal proceedings, as you know, are very complex proceedings. They involve hundreds of victims, witnesses, hundreds of pages of witness testimonies, video footage, intercepted communications. And investigators do not need to only prove that a single murder has been committed by an individual soldier. On top of the underlying crimes, they have to prove contextual elements, as you know, namely that crimes were part of widespread or systematic attacks, or that they were carried out in the context of an armed conflict.
So, these are very much linking exercises. You need to link several crimes, victims of crimes in different geographic parts of certain countries. So, you have to connect the patterns of crimes and you have to connect individual perpetrators also to high-level perpetrators. In these linking exercises, AI can be potentially very useful. AI is very good at pattern recognition in these linking exercises and in processing, analysing, and categorising huge amounts of evidence. These are not just hypothetical applications, but this is how AI is being used right now.
So, in Ukraine, the Office of the Prosecutor General has been using a platform provided by Palantir to integrate different sources of evidence, satellite imagery, and documents. So, this is what this platform is very good at, integrating data. And in the context of international criminal prosecution, this can be very helpful, for example, to map command structures or to map the patterns of crimes.
Steph 09:40 So, Marta mentioned some of the investigators and some of the offices that are working with these models and with this AI-assisted kind of searching through the evidence. Benjamin, are there any other investigators worth mentioning that you would like to lift out?
Benjamin Thorne 09:58 Sure. So, I think obviously the ICC, like many things to do with international criminal justice, takes up a lot of space and perhaps oxygen, and perhaps rightfully or wrongfully. But I think in these conversations, when you’re asking about sort of different types of investigators, we are talking about core international crimes and international criminal justice, but I think this also overlaps with the connected field of transitional justice as well, because there are a number of organisations who are using different types of AI. And some of that might be in terms of criminal prosecutions, but some of it might also be to document human rights violations that also might relate to core crimes, such as crimes against humanity. And I refer to this in some of my current work as this ‘digital accountability ecosystem.’ And I think when we’re talking about investigators, there are a number of different actors with sometimes shared purposes, perhaps also sometimes competing agendas as well.
So, I think when we’re talking about this ecosystem of investigators and those who are interested in AI technologies and digital evidence, I think it’s not necessarily a homogeneous whole, but I think it’s perhaps more useful to think about it as these different component parts, one being investigators within the OTP, but they’re not all moving in the same direction. And this ecosystem is more fragmented, perhaps certainly diverse, and at times contested. So, in terms of more narrow criminal investigators, we have people like I just mentioned at the ICC, but also Eurojust, also UN-relevant units such as the IIIM and the IIMM, and up until 2024, UNITAD as well, which is where Karim Khan came from before he took up his post as Chief Prosecutor at the ICC.
But we also have domestic war crimes units in different European countries, that to varying degrees, may be using some sort of AI tools. But then we also have, as we all well know, OSINT organisations and broadly related human rights organisations that have their own investigative teams, definitely some overlaps. Just to give some examples, OSINT for Ukraine, Bellingcat’s ‘Just Accountability’ unit, although that closed, I think it was late 2024 or 2025.
There’s ‘Witness.’ ‘Airwars’ is an interesting one, particularly in this context, as they gather a lot of geolocation data, for want of a better word, but they don’t gather it for a specific purpose in terms of a trial. My understanding is they gather it more to gather it, and then it could be for a variety of processes. But they quite publicly talk about how they’ve made a choice not to use AI. And I was at a talk where someone from Airwars was, I think it was last autumn, and they were saying that because these potential violations are very much a human process, the process of analysing that data should also be a human process, partly linking that to dignity. So, I think it’s quite interesting how there’s a lot of AI going on in these kinds of investigation processes by a variety of institutions and organisations.
However, there are a few, and I think Airwars is one of those that’s kind of resisting, in some sense, the use of AI. We also have organisations like ‘eyeWitness to Atrocities,’ which refer to themselves as a closed-source investigation organisation. Another one that I’ve become aware of relatively recently is called Earshot. So, we often think of digital investigations and the use of AI around the visual, kind of what we see, whether it’s through geolocation data or a social media post, and around kind of visual verification of that. But also audio is a really big part of that. It’s an organisation called Earshot that does ballistics, for example. So, I think that’s kind of an interesting component of that, and they use AI within some of their work.
I think another part of this that just kind of often sits at the margin is, in quotations, consultants. So particularly human rights organisations who are involved in a lot of this work directly, they will sometimes, for want of a better word, outsource some of the analysis to external consultants.
So, whether we think of those consultants, perhaps or perhaps not as part of who we consider investigators. Then we have organisations such as what I refer to in some of my writing as ‘activist archives.’ You may well know the group ‘Mnemonic,’ which is kind of an umbrella for archives such as the Syrian Archive, Yemen Archive, etc.
And interestingly, we think of OSINT organisations as kind of this big thing that’s happening in terms of documenting and contributing to evidence for possible trials. But organisations such as the Syrian Archive kind of predated a lot of the work that a lot of the OSINT groups are doing. But groups such as the Syrian Archive have been—for more than a decade—documenting and now using AI within that documentation process to prepare that material for a justice yet to come.
And I don’t know if we can even include, perhaps, journalists as investigators. So, I think you might agree or disagree with some of that. But I think it is useful to try and nuance what we mean by investigators and also the kind of work that they’re doing.
Steph 15:31 I think for this podcast, we’re probably going to focus mainly on the kind of courts that we follow in direct investigation of war crimes, just because we need to narrow it down. I wanted to ask Marta specifically because you referred to that the Ukraine Office of the Prosecutor General was using AI, and specifically Palantir for some of their war crimes investigations, which is remarkable to me because Palantir, we talked about it also in our other podcast about AI targeting, it’s been criticised for its kind of deep involvement in the military. So, I wonder if there are concerns around some organisations more than others with AI and what you should use as a kind of diligent war crimes investigator if we’re talking about that group of people.
Marta Bo 16:20 I think Palantir is a very interesting example because it has been involved in the defence context and the law enforcement context, and has been heavily criticised, as we all know, for its involvement in many of the ICE operations. I think what this example also illustrates is how this technology can move from a military context to a law enforcement and criminal justice context, because the way it started to enter Ukraine was primarily for military purposes. So, for military targeting, and then it transitioned also to other uses.
But this, of course, in terms of legal safeguards, presents some issues, because the guardrails in terms of human rights, etc, that you would have in a law enforcement context would be different compared to a defence context. But if I can go back a bit to what Benjamin also highlighted, which I thought was extremely interesting, is that because of their inherent characteristics, international crimes involve so many institutions, because they involve so many communities, because of their gravity and impact. So ideally, it would be desirable to have more coordination, also in terms of how AI is implemented within this context, because you would then have standardised procedures for accessibility, reliability checks, etc, etc.
Instead, the tendency is very much the opposite. Every organisation, every national authority, is developing bespoke AI systems that work within their environment. This is also to protect confidentiality, of course, because in criminal law context, this is a paramount concern.
So, we have this tendency to develop AI systems working only in Italy, for example, and within the Italian judiciary. And of course, when evidence is processed through one institution’s AI pipeline, where it is categorised, translated, summarised, it may arrive in another institution in a format that is incompatible, or partially processed, or maybe with metadata stripped. So, this just creates an enormous amount of compatibility problems and admissibility problems in proceedings for which, by definition, institutions should be cooperating and exchanging evidence as much as possible.
Janet 19:19 Steph picked up Palantir. I wanted to pick up the thing that I mentioned at the top of the pod, which was this thing called ‘Relativity,’ because I’m wondering whether that’s becoming some kind of standard, understanding that it’s been developed very specifically for judicial purposes. And the context in which it was spoken about to me was me asking: “Where is all the UNITAD evidence?” All the stuff out of Iraq that was gathered, which you know, some of it is going for various trials across Europe. And somebody said to me: “Oh, don’t worry, it’s in New York, it’s been entered into Relativity.” The problem is that there’s no person to push the buttons to extract it from Relativity and to send it around in its different formats to the different spaces where it’s needed. You need a budget to be able to run Relativity, but the evidence exists there.
So, I was wondering, Benjamin, since Marta did Palantir, do you want to do Relativity for us, or any other systems that you think we need to know about?
Benjamin Thorne 20:22 So there’s Relativity, there’s also Accenture as well. So, if you think about Project Harmony, it’s this kind of pyramid, if you like, of main organisations that partnered with the ICC. So, we have Microsoft, we have Accenture, and then ‘Avanade.’ Avanade is kind of consumed within both of those. It was partly founded and funded by Accenture. But then within those kind of three partners of Project Harmony, as you rightly say, a big part of that is Relativity.
Maybe it’s just helpful, very briefly, to say that Project Harmony has three components. It has the OTP link, which is this kind of portal, comes under kind of Article 15 communications, where in principle, any stakeholders or anyone can upload information. That’s increasingly being kind of potential evidence, if you want to call it that.
So that’s why, as we were saying when we talked about investigations a few moments ago, OSINT and human rights organisations are quite important, because they are doing that. So, they are uploading information to the OTP link. In terms of Relativity, as I understand it, and again, as Janet alluded to at the very top of the podcast, it’s difficult to understand what’s happening, because it’s very secretive, is also my experience.
In addition to OTP link, there’s two parts. There’s the E-Vault, which is more of the Microsoft part, the storage that people started panicking about a lot when there were the sanctions, and kind of access. And it was reported that within the OTP office, there were apparently some people printing off dossiers because of fear they might not be able to access that. That’s the Microsoft part.
Janet 22:06 The sanctions that you’re referring to, Benjamin, are the US sanctions against certain individuals at the ICC, with Stephanie’s amazing reporting out of Reuters to suggest that there is actually a sort of sword of Damocles hanging over the whole institution, and who knows. So, I assume that’s what you’re referring to.
Benjamin Thorne 22:23 Exactly that. And what I think is increasingly coming, perhaps maybe kind of a web of sanctions that people are fearful to trip, and so they’re kind of over compliant. Yes.
Interestingly, that’s kind of died down a little bit—those conversations. It’s not saying they’re not going to come back, but it has kind of settled down a little bit, maybe since the very end of last year, in terms of actual sanctions coming out of the US administration. That certainly is a big issue.
And then you have eDiscovery, which is where some of the AI tools are in terms of analysis, and that’s, as I understand it, where Relativity comes in. Relativity, as kind of alluded to by Marta, some of this stuff comes from a domestic criminal justice setting. So this, for example, Relativity tools are used within the criminal justice system in the UK, for example, relating to police body cameras, also in-court video technology as well.
So, I think, again, going back to one of your comments earlier, Janet, when you were saying there’s this big claim that the ICC will become a global leader in accountability technologies, which was part of its strategic goals of 23-25. I mean, that’s just the usual ICC chest-beating, for want of a better word, perhaps. But I think it does also have real-world implications, a bit like sometimes it might over-claim to do things around victim participation.
I think, yes, you can just say, well, it’s just the ICC being the ICC, but it does make these big claims and people do interpret that. And maybe that is kind of, again, over-promising to go back to old issues with the ICC. But going back to Relativity, there are these connections between the domestic criminal justice system and that being kind of used. So, it’s not to say the ICC in the future won’t perhaps develop its own technologies or adapt those. But at the moment, a lot of what’s happening, from my understanding, is kind of being taken from the domestic criminal justice and being adapted or adopted in a variety of ways, perhaps slightly more bespoke ways going forward.
And in terms of Marta mentioning military connections, my understanding is that Accenture is a beast of a technology company. If you have a look at it, it’s always different. It has, metaphorically, lots of fingers in lots of pies, and they have a lot of contracts with the US military. I think my understanding is also NATO. So they’re involved in this. It also kind of sees that connection between military application and justice application as well.
A few conversations I’ve had around Relativity with people inside the OTP, and again, these are informal conversations rather than formal research interviews, suggest that the systems they use are a bit more fragmented. So if we’re thinking around storage, analysis and disclosure, there are a few different systems, and they’re not all compatible with things such as Relativity.
Steph 25:11 I understand what you’re saying, but I still have my very basic question: what does Relativity do? What kind of programme is it? Should I envision it as a kind of- Is it supposed to be like an operating system or like this kind of one-stop shop, what Microsoft does with Office? Is it the idea with Relativity that they offer some kind of suite of programmes that you as a legal officer or an investigator could use, and it includes, I don’t know, a word processor or a filing system and all these things, or is it AI specifically? Because I still don’t know what Relativity does. I understand the court management system. If you have a court case, I know there’s all those e-court things and you share documents and all those, but what is the product that Relativity is in a very basic sense?
Benjamin Thorne 26:03 It is a kind of e-discovery, so I think that has a variety of tools. How it’s used at the ICC, again, I’m not certain from my conversations, it’s used to do with things such as facial recognition and object recognition, so if it sees a particular object in an image, then it can compare that to a similar object in a different image as well. As I understand it, it also has some text-based analysis. So again, with transcripts, for example, it can then map certain words as well. So those, as I understand it, in the ICC, that’s what it is. But yeah, to do with visual facial recognition, identifying words within text and transcript as well.
So the ICC use this, for want of a better word, acute language of evidence management platform, but even I think in that process, before we get to what we might think of as the conventional analysis, such as facial recognition, my understanding is that there is also AI involved in the sorting and cataloguing, and then that might potentially create issues or have issues relating to bias. Or also just things like, if a document’s been titled with an odd title by whoever did it externally, are there issues there as well? So I think AI, yes, in the analysis, using tools provided by Relativity, but also the platform that also has AI, and I think we need to think about that as part of the kind of AI terrain as well.
Janet 27:39 Marta, quite a lot of what we’re discussing sounds to me very mundane, to be perfectly honest. I mean, I think we need to put something on the front page of our podcast to say we use AI, for example, for transcription. Sometimes even when I’m transcribing something in Ukrainian, it does an automatic translation for me, which I don’t necessarily think the translation’s perfect, but thank you. That’s good, isn’t it? So at least I know I’ve got the basics of what somebody has to say.
So I kind of think of this as like, yeah, that’s normal, and it’s just like an assistance tool in that sense. But I understand that like the last podcast that we did on AI, we specifically discussed the ethical issues. So, are there a lot of ethical perils, let’s say? I understood that one of the judges at a meeting that you were at described them as “ethical perils”; Iulia Motoc used that term. So, what do you think? What are the problems with this?
Marta Bo 28:40 I think there are both ethical and legal problems. Because what we have been discussing so far, for example, are still uses of AI that require some sort of human verification, right? So, we’re thinking about facial recognition systems that help skim through thousands of photos. And basically, at the end of this process, you would in principle have also someone verifying the whole process.
So for me, the first problem with the idea that we would still use AI with human verification is that it is often quite difficult to maintain this human judgement and this human capability to understand what these systems are producing and recommending to us. Because there are studies showing that there is cognitive bias when we use these systems, which might lead to over-reliance on AI output. Now, this means that the errors that AI itself can produce, such as, for example, missing key information, missing exonerating evidence or exonerating information that the prosecutor has the duty to investigate, might be missed.
So, there are inherent errors in what AI can do and can produce, and if we couple this issue with the cognitive problems and the automation bias issues, this might lead to critical errors in investigations and decision-making. So, on the one hand, AI can help process massive amounts of evidence and a large volume of data, but my core argument is that these tools also introduce a serious risk in terms of missed identifications, misrepresentation of data, failures to detect crucial evidence, including exonerating evidence. Under Article 54 of the Rome Statute, the prosecutor has the duty to establish the truth and to investigate both incriminating and exonerating evidence equally. So, if an AI system leads investigators to overlook evidence or materials that might be exonerating, this obligation might not be fully met. Now, this is not a fully fledged violation of disclosures but could still have serious consequences for the investigations.
Now, if we think about this and if we consider the transparency problem around what type of AI is being used, for which task, for which purposes, for which uses, basically, AI has been implemented, and if we consider that AI is often difficult to understand, unexplainable, it might be extremely difficult for the defence to understand and detect, for example, if something has gone wrong, if something has gone missing, and if there was a violation of the duty to investigate exonerating evidence.
So, the main takeaway for me is that we should not be rejecting the use of AI, but its use needs transparency around what has been used, and only after that we can start thinking about safeguards, guardrails, disclosure protocols, etcetera, etcetera, in order to safeguard the rights of the defence.
Steph 32:35 My question is, can we ever expect transparency from AI systems as a kind of general rule? Because they are commercial systems, that kind of secret sauce is hidden and not public. It’s run by commercial companies. It’s not an open-source situation. So, Benjamin, take it away.
Benjamin Thorne 32:54 I think these are still human decisions, whether there is transparency or not. Whether it’s in the source or in the tools and how those tools are being used, it’s still always, these are just systems and algorithms, if we’re going to boil it down to its essence. So, I think it’s always a human decision, choices made or not made, when it comes to transparency and other ethical-related issues. And I think a lot of this conversation around transparency comes to legitimacy as well. And if we don’t have transparency, it really potentially impacts on legitimacy. Legitimacy in two interconnected ways is the main way I see it. Legitimacy of the process of how AI is used, but then also legitimacy of the outputs that that process produces.
So, if we don’t have transparency of what exactly those tools are, how they’re being used, and also the algorithms embedded within those tools and systems, I think it really raises legitimacy concerns. And so, I think, yes, we can say, oh, these are not open-source systems, but I do think, to some extent, there is a responsibility of institutions such as the ICC. Sadly, they’re not the only ones. But, I mean, there is a responsibility for them to engage more with transparency. These are kind of ethical-related questions, as I see them. And transparency is one of those ethical-related questions. And I think they do have a responsibility. I don’t think it’s good enough just to say: “Oh, we can’t discuss some of this because it could affect investigations.” Maybe there are instances where they can’t talk exactly about what they’re using or the algorithms behind it. There are times when I think they should be talking about that. And I think they hide behind this secrecy.
I would just add to that they are not the only organisation that is quite secretive around how they use AI tools. And I think this goes back to, like I said, near the beginning of the podcast, that there is some kind of competition between different institutional organisations that are, in some ways, directly involved with documenting human rights violations that relate to core international crimes. So, that secrecy and lack of transparency go beyond the ICC.
In terms of conversations I’ve had with a couple of defence counsel working at the ICC, one of their concerns is that they don’t know how the OTP is using AI. So, again, it goes back to transparency. So, that is a concern for them. But they feel very much out of the loop as well. And that does raise questions to do with procedure and human rights. Also, things we talked about, that content itself that is AI-generated or modified, and then how do institutions’ own AI tools, such as the ICC’s, recognise AI-generated content. So, again, this relates partly back to something like the OTPLink platform where external organisations or stakeholders can submit potential evidence, some of which may be AI-modified or generated. Then how accurately can the tools that the OTP is using identify that? I mean, that’s one issue. Having said that, Marta very accurately talked about how we have these different organisations, institutions, this ecosystem, as I put it. And she was talking about how the technologies they use are not always compatible with other organisations and institutions. So, that is a big concern.
I think another related concern to that, where evidence moves between different organisations and institutions, is what I refer to as the layers of AI in terms of transparency. Let’s say Eurojust, who are using their relatively new system, analysed some potential evidence and have used some AI tools within that, and then it gets submitted to the OTP, for example. There is no audit of any kind showing the different types of AI that were used, say, at Eurojust and at the ICC.
So, I think we have these layerings or layers of AI, but very little transparency about that in the specific context of how evidence moves between and across different related criminal justice, international criminal justice institutions and organisations. And I think a part of that, a big ongoing issue, is this lack of a more joined-up approach. There’s also, just in a very practical sense, duplication of efforts and resources as well, that we might have two or more organisations or institutions doing the same sort of analysis, which I think is problematic as well, particularly in terms of the resources and efforts that these things take to do anyway.
I just want to pick up something that Marta said around human verification, which I completely agree with. Just to add to that a little, I think there’s one conversation around, there always has to be human verification. We hear this language of “human in the loop.” There has to be space for verification, human intervention, etc. But my understanding is that in some of these institutions, keeping the ICC as an example, there is sometimes a first layer of initial analysis that gets done with AI tools. Then the human investigator looks at that, but the human investigator isn’t necessarily looking at all of that digital information. They might look at all of the digital information that AI has identified as potentially relevant, but there’s sometimes a first layer where a human hasn’t looked at all the potential evidence. So, I think when we’re talking about verification, I think, for me, AI tools are useful when a human is involved throughout the whole process. Yes, this means it’s a long process, but then I think there are risks relating to biases that come within the AI tools, but also biases that potentially become embedded within the patterns of the AI analysis itself.
When investigators are looking at digital evidence using AI tools, but they need to see all the evidence, which some might say is an unrealistic call, but I think otherwise we’re having an initial layer of analysis where the human might not see a piece of potential evidence, for example, a piece of evidence that AI has disregarded because it doesn’t meet the algorithm or patterns it has. But that might be useful, whether for the prosecutor or perhaps for disclosure reasons. And I think in terms of transparency, I’ve asked, and I had a few conversations around, are there any intentions to have an OTP policy on the use of AI and the ethical use of AI within its work, which I think is much needed.
And my understanding is that there is not. And I think that continues to be, for me, deeply problematic. There’s not a willingness to have a policy on the ethical use of AI. If the OTP is going to continue to use AI, increasingly so, I think that is something that should be addressed. All of this, I think, is to do with transparency, lack of transparency, and to also not assume that AI always needs to be used in these processes. I think sometimes they have a very useful role to play. But I think also it’s important for investigators to always be asking, why use AI for this particular task?
Steph 40:42 We’ve talked about a lot already, and we have touched on a lot of different aspects, but the kind of journalist in me is always looking for the more newsy bits. So I’m really-
Janet 40:54 Shame on you. Shame on you, Steph.
Steph 40:56 And I’ve been hammering away at these-
Janet 40:57 This is fascinating material, and I’m listening to it with great interest, but you go ahead and be a journalist. Go on.
Steph 41:03 I want to ask, though, very specifically about these sanctions, because we’re still, we’re talking very specifically about this, much of this AI, if not all, is US-based. There is some China-based, but that has its own problems. With these ICC sanctions, how realistic is it that the ICC and other kinds of European justice institutions can keep using this Relativity software or these Microsoft-based systems and store potential evidence on US, you know, US servers of commercial companies that are, in part, beholden to what the US government orders them to do?
Janet 41:43 And I’d say it’s not only relevant to the ICC, it’s also relevant to all of the other actors that Benjamin has run through, because they’re all having to think about whether they use US-based stuff. So, who wants to take this as a last question?
Benjamin Thorne 41:57 I think some of this is that there is like an appetite, it appears in Europe, to try and move away from some of this. I think it does seem perhaps that, and again, perhaps talking within, but also slightly outside of the criminal justice context, but I think these things relate in terms of how we might be seeing organisations like the ICC moving away from these things. But there is an appetite to take more risks. I think some of the systems, and I guess here I’m talking a bit more about cloud systems rather than the tools, but those things are sometimes interconnected anyway.
Some organisations and also domestic and regional governments are looking to take risks with untested, at-scale products because of what’s happening. I think also, as Janet’s alluded to, there are a number of organisations that have to consider this, but I think sometimes, particularly some OSINT organisations who are working specifically on gathering evidence for criminal trials relating to atrocity crime, they’re well aware of the issues with engaging with US and perhaps Chinese tech. However, their priority is to gather and analyse this information. That is their primary objective. And sometimes this comes down to really practical things like resources, and maybe they haven’t got the resources to look at and invest in alternatives at this moment, which I think is ideal to say everyone should be moving away from US and Chinese tech, perhaps. But I think if you’re a small-scale OSINT organisation, it’s a very different kind of calculation compared to the ICC.
Just on the ICC and Relativity, a lot of Project Harmony, a big part of it was funded by the European Commission, I think it’s around seven million euros, and also some voluntary donations by state parties. So there is a question there: If the ICC is going to move away from Relativity, how are they going to pay for it? I mean, maybe they’ll find a magic money tree or something similar, as they sometimes do, but that is a consideration, I think.
So even for big organisations like the ICC, the question is how they would fund the move away. From my understanding, the contract with Relativity ran from ’22, and it’s due to expire in ’27. I’m not sure if that’s been renewed. And again, maybe one of you knows more about that. And also my understanding is that the money that came from the European Commission and voluntary donations from states was not enough for Project Harmony. So, they had to kind of dip into the kind of the regular budget a bit for that as well. I think they are both at the small scale, there’s different kinds of considerations whether to move away from US tech. But also for the ICC, they’re also, I think, perhaps slightly different, but also financial considerations. But we don’t know, I mean, I think is the bottom line here, when it comes to ICC, at least, because it is so closed. We talk about transparency in terms of the tool, but just transparency, the ICC, I think, is maybe something we would maybe all to some extent agree upon the lack of transparency. And that, I think, becomes a big issue when we’re talking about things around tools which have direct impact on the admissibility of evidence that could affect the prosecution.
Janet 45:19 Marta, why don’t you take us a bit inside the black box of the ICC, or wherever, and tell us more about what we need to know.
Marta Bo 45:26 I think a bit at the more higher level, what is important to think when institutions start incorporating these technologies is that they don’t really buy a product, a thing, but they buy a service. And this creates long-term dependencies, in terms of also, if we think about the updates, reconfigurations that are necessary, and the expertise that must be continuously supplied by AI suppliers to the end consumer. So over time, this really creates structural reliance on private companies. So this is also, you know, like blending what a criminal justice system is, and who has control over core tasks of criminal justice and investigations. So, of course, one solution could be European technology, technological autonomy. And I think we are hearing this a lot in the defence context. Europe has to rearm to develop. AI is now at the core of rearmament efforts within Europe. And you could make the same argument for the judiciary, but this would require an enormous amount of expertise, a massive amount of investment.
So I think to think a bit more on the shorter term, what is needed is internal knowledge, internal capacity within institutions, such as the ICC, to have internal expertise, in-house expertise, to understand the limitations of this system, and to understand at least the basic elements of how they work. Okay, explainability is a core problem of these systems. But a basic understanding of what the inputs are, what data has been fed into the systems, how outputs are produced, so that users have a meaningful understanding of it, I think it’s necessary also to kind of decrease reliance on industry, especially since we are going more and more towards co-development of these systems, which then requires a relationship of trust with industry and training of the end users, which means ICC staff and investigators on how these tools should be used.
Janet 47:56 We’re definitely going to have to bring you both back in because we had another series of questions that we’re not going to get to. But we’re just going to ask our final question. We’re not going to ask all of them. The last one always is, is there anything that you have been reading, or listening to, or watching in the last wee while, either within the field or something completely different to get away from it that you would like to share with the audience that you think it would be interesting for them to know about? Marta, why don’t you kick us off?
Marta Bo 48:26 So my final point, since we’ve been discussing many applications and very limitedly AI assistants, for me it’s just a final comment on how cautious we should be, investigators should be, about the use of chatbots and AI systems, because all the risks that we have been discussing so far, especially around over-reliance, are more and more present with this agentic AI new trend, and judiciaries around Europe have started using chatbots and AI systems for quite trivial tasks, like drafting part of the judgement, but there should be much more awareness and training around the use of these systems.
I think my final point is about a risk that is often not discussed enough, but it’s de-skilling, essentially, and it is highly concerning. To me, there are already many studies around how the use of AI assistants basically de-skills us, and slowly we lose our brain capacity and brain capability to solve even extremely easy tasks, so I think the problem of de-skilling should be put out there.
There is a WIRED article from today, actually, that is very much about how actually it would be much better if AI assistants would tell us how to do things instead of giving us fully cooked replies, because this has an effect on cognitive functions, and if we think about the older generation of investigators and judges, yes, they can still perform tasks that they’ve been learning from here, but if we think about the young generation of legal officers at the ICC that have just joined an institution and they start using these tools without having first built their own knowledge on how to do investigation, how to do the mapping of cases, how to do these things, I think this is very dangerous.
Janet 50:41 What about you, Benjamin? Do you want to add in?
Benjamin Thorne 50:44 Just to say, I think Marta’s point about de-skilling is one of the most important things, I think, going forward. I’ll just quickly respond to what I’ve been reading in this context.
In terms of AI, and maybe it’s come across in my responses to some of your questions, a lot of the literature I engage with is the kind of more critical literature on AI, both in the context of international justice but also outside, and then thinking about some of those arguments as made more broadly around ethics and AI and seeing how they might apply to what we are talking about, international criminal law and justice.
One book I’ve read recently and would recommend to anyone who’s interested in AI in this context or outside is a book by Shannon Vallor, who’s an ethics and AI philosopher based at the University of Edinburgh, and her book, The AI Mirror. I don’t know if any of you have read it or not, but it’s The AI Mirror: How to Reclaim Our Humanity in an Age of Machine Thinking. She talks a lot about choices when it comes to how governments and institutions are using AI, kind of at the design stage and how AI is designed, and that’s about choices for it to act and be certain ways, and there’s other choices that could be made.
So in terms of thinking perhaps critically about some of these conversations we’re having today, that’s a book I’ve read and I would recommend.
Marta Bo 52:02 Which is another way of saying, let’s be concerned about bias. And I think we have spoken about it, but not so explicitly during the podcast, but a lot can be traced back to the bias problem. So what Benjamin, you discussed at the beginning about the bunch of investigators and people involved in investigation; we have also to think about how all this preliminary material that has been documented, collected by NGOs, then is also used as training data for these systems, and how then the bias can be essentially fed into the systems and be replicated over the course of the proceedings.
Janet 52:52 We’ll just end it by saying thank you so much, and sorry for having to cut off the conversation. I mean, there’s so many aspects to this, but really Marta and Benjamin, thank you so much for coming on and giving up your time and explaining some of the issues to us.
Benjamin Thorne 53:02 Thank you both very much.
Marta Bo 53:03 It was a pleasure.
[OUTRO MUSIC]
This was Asymmetrical Haircuts, your international justice podcast, created and presented by Janet Anderson and Stephanie van den Berg, in partnership with the Hague Humanity Hub. Music is by Audionautix.com. You can find show notes and everything about the podcast on asymmetricalhaircuts.com. This show is available on every major podcast service, so please subscribe, give us a rating, and spread the word. Stay safe and enjoy your day.
Disclaimer: This transcript was generated using online transcribing software, and checked and supplemented by the Asymmetrical Haircuts team. Because of this we cannot guarantee it is completely error free. Please check the corresponding audio for any errors before quoting.
