Insights / January 17th, 2024

Navigating AI Discovery Best Practices

The document review process in litigation is ripe for automation through the use of AI. Courts have considered and approved predictive coding approaches proposed by litigants in place of traditional manual document review.

The OECD defines an Artificial Intelligence system as a machine-based system that can, for a given set of human-defined objectives, make predictions, recommendations, or decisions influencing real or virtual environments.

Third party providers and lawyers have sought to lighten the burden of document analysis in advance of making discovery by automating tasks such as document storage, numbering and indexing, identification of duplicates, and identification of documents by word search. These tasks, however, do not touch upon the decision that must be made as to whether a particular document is relevant for the purposes of discovery.

The rise of predictive coding

Software that takes this further step has been developed more recently and is referred to by several names, including “predictive coding” and “technology assisted review”. The AI feature of these applications is that the user (an experienced litigation lawyer) trains the software to make decisions about the relevance of documents, and over time the software learns to make those decisions quickly and efficiently.

In broad terms, as discussed in the English case of Pyrrho Investments Ltd v MWB Property Ltd & Ors [2016] EWHC 256 (Ch), the process operates as follows:

  • The parties agree on a protocol, dealing with matters such as which electronic records need to be reviewed

  • A sample of documents (perhaps hundreds or thousands) within the body of records is reviewed by a lawyer involved in the litigation for relevance and that lawyer’s decisions as to the relevance of each document are recorded within the system

  • The software then considers, based on the training, whether each document within the whole of the body of records is relevant

  • A lawyer then reviews a random sample of the documents identified by the software as either relevant or not relevant, indicating for each document whether relevance is established

  • The lawyer’s indications both affirming and overturning decisions made on relevance are used to train the software further, the hope being that the number of “false positives” reduces over time

  • Once a pre-determined level of validity has been reached the documents identified as relevant within the body of records stand as the discovery within that class of records
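
The steps above can be sketched in code. What follows is a minimal illustration only, not the method described in Pyrrho or used by any commercial product: the word-counting classifier (a simple naive Bayes scorer) stands in for the real software, and the names predictive_coding, lawyer_decides, seed_size, sample_size and target_agreement are all hypothetical. The toy documents and the 90% agreement level are assumptions chosen so that the example runs.

```python
import math
import random
from collections import Counter


def tokens(text):
    return text.lower().split()


class TinyRelevanceModel:
    """Naive Bayes-style word scorer; a stand-in for real review software."""

    def __init__(self, labelled):
        self.words = {True: Counter(), False: Counter()}
        self.docs = {True: 0, False: 0}
        for text, relevant in labelled:
            self.words[relevant].update(tokens(text))
            self.docs[relevant] += 1
        self.vocab = len(set(self.words[True]) | set(self.words[False]))

    def is_relevant(self, text):
        scores = {}
        for label in (True, False):
            total = sum(self.words[label].values())
            # Smoothed log prior, then smoothed log likelihood of each word.
            score = math.log((self.docs[label] + 1) / (sum(self.docs.values()) + 2))
            for word in tokens(text):
                score += math.log((self.words[label][word] + 1) / (total + self.vocab + 1))
            scores[label] = score
        return scores[True] > scores[False]


def predictive_coding(corpus, lawyer_decides, seed_size, sample_size,
                      target_agreement, max_rounds=10, seed=0):
    rng = random.Random(seed)
    ids = range(len(corpus))
    # A lawyer reviews a seed sample and the decisions are recorded.
    labels = {i: lawyer_decides(corpus[i]) for i in rng.sample(ids, seed_size)}
    for round_no in range(1, max_rounds + 1):
        # The software is trained on every decision recorded so far,
        # then assesses the whole body of records.
        model = TinyRelevanceModel([(corpus[i], d) for i, d in labels.items()])
        predicted = [model.is_relevant(doc) for doc in corpus]
        # A lawyer checks a random sample of the software's decisions.
        sample = rng.sample(ids, sample_size)
        checked = {i: lawyer_decides(corpus[i]) for i in sample}
        agreement = sum(checked[i] == predicted[i] for i in sample) / sample_size
        # Affirmed and overturned decisions both feed the next training round.
        labels.update(checked)
        # Stop once the agreed validation level is reached.
        if agreement >= target_agreement:
            return [i for i in ids if predicted[i]], round_no
    raise RuntimeError("agreed validation level not reached")


# Toy corpus: the first 20 documents concern the dispute, the rest do not.
corpus = [f"contract breach notice payment ref{i}" for i in range(20)] + \
         [f"lunch menu friday social ref{i}" for i in range(20, 40)]
relevant, rounds = predictive_coding(
    corpus, lambda doc: "contract" in doc,
    seed_size=10, sample_size=10, target_agreement=0.9)
```

In this sketch the lawyer is simulated by a function, whereas in practice each decision is a human judgment. The loop structure is the point: the reviewer's time is spent on samples rather than on every document.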

The attraction of such a process, which has been shown in certain circumstances to be cheaper than conventional manual review, is obvious to anyone paying for legal services. However, a litigant or lawyer cannot certify in the conventional way that discovery made on this basis is correct and complete, because neither will have reviewed every candidate document.

The Pragmatic Approach of Courts

Courts, however, are taking a pragmatic approach to this question. To an extent, this reflects a willingness to let parties determine such questions between themselves. Increasingly, however, Courts may also be called upon to impose predictive coding obligations over the objection of a party where the size of a dispute demands it. An example of the pragmatic approach is where the Court permits discovery on the basis that particular thresholds for relevance and false positives have been achieved, rather than requiring parties and lawyers to certify comprehensiveness.
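
A threshold-based check of this kind can be illustrated with a short sketch. It assumes a lawyer has reviewed a validation sample and recorded, for each document, both the software's decision and their own. The function name validation_report, the two measures and the figures of 20% and 10% are hypothetical; the actual measures and levels would be set by the parties' protocol or the Court's orders.

```python
def validation_report(sample, max_false_positive_share, max_missed_share):
    """sample is a list of (software_says_relevant, lawyer_says_relevant) pairs."""
    flagged = [lawyer for software, lawyer in sample if software]
    unflagged = [lawyer for software, lawyer in sample if not software]
    # Share of documents the software flagged that the lawyer found irrelevant.
    false_positive_share = flagged.count(False) / len(flagged) if flagged else 0.0
    # Share of documents the software set aside that the lawyer found relevant.
    missed_share = unflagged.count(True) / len(unflagged) if unflagged else 0.0
    return {
        "false_positive_share": false_positive_share,
        "missed_share": missed_share,
        "thresholds_met": (false_positive_share <= max_false_positive_share
                           and missed_share <= max_missed_share),
    }


# Hypothetical sample of 20 documents: 8 flagged by the software, 12 set aside.
sample = ([(True, True)] * 7 + [(True, False)] * 1
          + [(False, False)] * 11 + [(False, True)] * 1)
report = validation_report(sample, max_false_positive_share=0.2, max_missed_share=0.1)
```

Here one of the eight flagged documents is a false positive and one of the twelve set aside was in fact relevant, so both hypothetical thresholds are met. A real validation sample would need to be large enough for these shares to be statistically meaningful.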

That said, the few published decisions considering predictive coding in discovery to date demonstrate that the software is at best a useful tool and not a foolproof solution. By way of example, reported cases have dealt with parties who initially followed a collaborative model and then ceased to do so,[1] and with evidence as to the difficulty of training software to determine the relevance of documents that considered complicated scientific questions.[2]

The promise of predictive coding is that software may be taught to determine the relevance of every document consistently, and to the standard of an experienced lawyer aware of the issues in the litigation, at a much lower overall cost than having that person review every document.

To date, however, this promise has only partially been realised. The technology is nevertheless useful in reducing the burden on litigants in appropriate cases, and it points the way towards a new and more efficient way of doing discovery in larger cases.

What next?

Further developments in large language models and other AI advances may also be “just around the corner”, allowing for a larger step-change in discovery processes.

For further information regarding this topic please get in touch with Peter Leech of our Dispute Resolution team.


[1] McConnell Dowell Constructors (Aust) Pty Ltd v Santam Ltd & Ors (No 2) [2017] VSC 640.

[2] ViiV Healthcare Company v Gilead Sciences Pty Limited (No 2) [2020] FCA 1455, [142]-[143].


This publication has been prepared for general guidance on matters of interest only and does not constitute professional legal advice. You should not act upon the information contained in this publication without obtaining specific professional legal advice. No representation or warranty (express or implied) is given as to the accuracy or completeness of the information contained in this publication and to the extent permitted by law, Cowell Clarke does not accept or assume any liability, responsibility or duty of care for any consequences of you or anyone else acting or refraining to act in relation on the information contained in this publication or for any decision based on it.
