Predictive coding: how technology could help to streamline cases

Categories: Competition Act 1998 and cartels

Predictive coding has been trialled on some of our recent investigations to help us spot relevant evidence quickly. Complex cases involve reviewing vast numbers of electronic documents and this is where predictive coding can prove to be a useful tool.

How does predictive coding work?

Put simply, predictive coding is a computer-assisted process, programmed to search a company's electronic documents to find those that are relevant to the case.

The predictive coding system works by labelling or "coding" documents as potentially relevant or not; this way our case team is able to prioritise reviewing the "relevant" documents. But for the system to be able to carry out this process, we first need to teach it to do so.

First, we have to upload all of the documents onto our system, so that we can manually review a sample set of files for specific words, phrases and structures that may show whether or not the law has been broken.

We will then label those sample documents as relevant or irrelevant, to teach the predictive coding system to do the same for the rest of the documents. The system then identifies and labels each of the remaining files as relevant or not.
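The idea of learning from a labelled sample and then coding the remaining files can be sketched in a few lines of Python. This is an illustration only, not the system used on our cases: it assumes a simple naive Bayes word-count model, and the function names and example documents are all invented for the sketch.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train(labelled_docs):
    """Build per-label word counts from (text, is_relevant) pairs.

    Assumes the sample contains at least one relevant and one
    irrelevant document.
    """
    word_counts = {True: Counter(), False: Counter()}
    doc_counts = Counter()
    for text, is_relevant in labelled_docs:
        doc_counts[is_relevant] += 1
        word_counts[is_relevant].update(tokenize(text))
    vocabulary = set(word_counts[True]) | set(word_counts[False])
    return word_counts, doc_counts, vocabulary


def relevance_score(text, model):
    """Return the log-odds that a document is relevant (above 0 = relevant)."""
    word_counts, doc_counts, vocabulary = model
    score = math.log(doc_counts[True] / doc_counts[False])
    totals = {label: sum(word_counts[label].values()) for label in (True, False)}
    for word in tokenize(text):
        if word not in vocabulary:
            continue  # words never seen in the sample carry no evidence
        # Add-one (Laplace) smoothing avoids zero probabilities.
        p_rel = (word_counts[True][word] + 1) / (totals[True] + len(vocabulary))
        p_irr = (word_counts[False][word] + 1) / (totals[False] + len(vocabulary))
        score += math.log(p_rel / p_irr)
    return score


def code_documents(unreviewed, model):
    """Label each unreviewed document and sort the most relevant first."""
    scored = [(relevance_score(text, model), text) for text in unreviewed]
    scored.sort(reverse=True)
    return [(text, score > 0) for score, text in scored]


# Hypothetical sample set, labelled by hand as relevant (True) or not (False).
sample = [
    ("we agreed not to discount below the minimum price", True),
    ("please keep the agreed resale price online", True),
    ("lunch menu for the office party on friday", False),
    ("the printer on the second floor needs new toner", False),
]

# Hypothetical remaining files that nobody has reviewed yet.
unreviewed = [
    "reminder: office party menu attached",
    "dealers must not advertise below the minimum price",
]

model = train(sample)
coded = code_documents(unreviewed, model)
for text, is_relevant in coded:
    print("relevant  " if is_relevant else "irrelevant", "|", text)
```

In this toy example the message about minimum prices is coded as relevant and sorted to the top of the review queue, while the office party reminder is coded as irrelevant.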

By reviewing the success of the programme after every repetition, the process can be refined, allowing the system to improve its ability to accurately label the documents. This process allows us to prioritise the documents that the system has labelled "relevant" for early review.
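One way to picture "reviewing the success" of each repetition is to compare the system's labels with a reviewer's labels on a small check sample. The sketch below assumes precision and recall as the measures, which is a common choice but not necessarily the one used on our cases; the function name and the labels in the two rounds are invented for illustration.

```python
def review_round(system_labels, reviewer_labels):
    """Compare the system's labels with human labels for the same documents.

    Returns (precision, recall):
      precision = share of documents coded "relevant" that really were relevant
      recall    = share of truly relevant documents that the system found
    """
    pairs = list(zip(system_labels, reviewer_labels))
    true_pos = sum(1 for s, r in pairs if s and r)
    false_pos = sum(1 for s, r in pairs if s and not r)
    false_neg = sum(1 for s, r in pairs if not s and r)
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall


# Hypothetical check sample of six documents, labelled by a human reviewer.
reviewer = [True, False, True, False, False, False]

# Hypothetical system output before and after one more round of teaching.
round_1 = [True, True, False, False, True, False]
round_2 = [True, False, True, False, False, False]

for name, labels in (("round 1", round_1), ("round 2", round_2)):
    precision, recall = review_round(labels, reviewer)
    print(f"{name}: precision={precision:.2f} recall={recall:.2f}")
```

If the figures improve from one round to the next, the extra labelled examples are helping; if they stall, the sample set may need different documents.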

Building our capability

Competition Act cases are complex and take a long time to resolve, partly because thousands of documents need to be reviewed to assess the evidence. As technology improves, firms are generating more and more digital information by sending a greater number of electronic communications every day.

This was certainly the case in several of our recent musical instruments investigations where we collected over 10 million items of digital evidence. Even after filtering, this left us with over half a million documents to review. So, the team looked for innovative ways to enhance the review of the digital evidence.

As a result, working with in-house Digital Forensics and Data Technology and Analytics colleagues, we trialled predictive coding on some of the cases for the first time. In one case, predictive coding helped us to identify relevant evidence within a file, which reduced the number of items that needed to be reviewed manually and sped up the review process.

We are now looking at how we can use predictive coding more in the future, to potentially help streamline the review process of other cases. This could allow us to complete them more quickly and move on to new investigations.

Sharing and comments

1 comment

  1. Comment by Martin Solis posted on

    This is very informative. Thank you for sharing this article. Hope to read more like this.

