A Different Take on TAR
February 23, 2015
Continuous active learning (CAL) is an e-discovery method that lets technology-assisted review (TAR) systems keep learning and improving their results throughout a review. The process starts with a seed set selected through keyword search. The seed set is reviewed and coded, then used to train a learning algorithm that scores documents by their likely relevance. The highest-scoring documents that have not yet been coded are then reviewed, coded, and added to the training set.
The algorithm is retrained on the updated training set, and the highest-scoring uncoded documents are again reviewed, coded, and fed back into the system. The system uses that added input to refine its rankings. This cycle repeats until the review is finished.
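The loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the author's system: the function names (`train`, `score`, `cal_review`), the toy log-odds term model, the simulated `reviewer` callback, and the stop-when-everything-is-coded rule are all hypothetical choices made for the example.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def train(coded):
    """Fit a toy log-odds term model on coded docs: {doc_id: (text, is_relevant)}."""
    rel, non = Counter(), Counter()
    for text, is_relevant in coded.values():
        (rel if is_relevant else non).update(set(tokenize(text)))
    n_rel = sum(1 for _, r in coded.values() if r)
    n_non = len(coded) - n_rel
    # Add-one smoothing keeps weights finite for terms seen in only one class.
    return {
        t: math.log((rel[t] + 1) / (n_rel + 2)) - math.log((non[t] + 1) / (n_non + 2))
        for t in set(rel) | set(non)
    }


def score(weights, text):
    """Sum the weights of the distinct terms in a document (unknown terms count as 0)."""
    return sum(weights.get(t, 0.0) for t in set(tokenize(text)))


def cal_review(docs, seed_keyword, reviewer, batch_size=1, max_rounds=50):
    """Run a simplified CAL loop and return (review order, coded documents)."""
    # Seed set: documents hit by a keyword search, coded by the reviewer.
    coded = {d: (t, reviewer(d)) for d, t in docs.items() if seed_keyword in tokenize(t)}
    order = list(coded)
    for _ in range(max_rounds):
        uncoded = [d for d in docs if d not in coded]
        if not uncoded:  # assumed stopping rule: nothing left to review
            break
        weights = train(coded)  # retrain on everything coded so far
        ranked = sorted(uncoded, key=lambda d: score(weights, docs[d]), reverse=True)
        for d in ranked[:batch_size]:  # review the highest-scoring uncoded documents
            coded[d] = (docs[d], reviewer(d))
            order.append(d)
    return order, coded


# Hypothetical six-document collection; `truth` stands in for the human reviewer.
docs = {
    "d1": "merger agreement draft",
    "d2": "parking agreement renewal",
    "d3": "merger negotiation terms",
    "d4": "parking lot closed friday",
    "d5": "negotiation terms final",
    "d6": "friday lunch menu",
}
truth = {"d1": True, "d2": False, "d3": True, "d4": False, "d5": True, "d6": False}

order, coded = cal_review(docs, "agreement", reviewer=lambda d: truth[d])
print(order)
```

In this toy run the relevant documents surface before the remaining non-relevant ones, which is the efficiency effect the ranking is meant to produce. A real review would normally stop before every document is coded, using a stopping criterion the summary does not specify.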
Thus, there are two aspects to CAL. The first is that the process is continuous. Training doesn’t stop until the review finishes. The second is that the training is active. That means the computer feeds documents to the review team with the goal of making the review as efficient as possible and thereby minimizing the total cost of review.
The author describes a method of reinforcement learning that does not require training by a high-level attorney. It uses judgmental seeds and relevance feedback to learn and rank continuously throughout the review, and it is said to avoid problems of bias or incomplete coverage through contextual diversity sampling. This approach lets the review start sooner while accommodating rolling document uploads, which are the norm in document review.