Batch Set Review of Text Duplicate Documents

IN THIS ARTICLE:

This use case document explains how to use MA's text duplicate identification output to group and batch duplicate documents, based on a comparison of extracted text, to help expedite review and bolster the consistency of document review assessments.

WHO CAN PERFORM:

You must have the following MA permissions:

Create and manage sets: lets you create and edit analytics sets. When disabled, the Overlay and Edit buttons are hidden.

Overlay results: lets you overlay results into Relativity. When disabled, the auto-overlay toggle and the manual overlay buttons are disabled.

In addition to the above permissions, Relativity Object and other Relativity permissions must be enabled for full functionality.

What is Text Duplicate Identification?

"Text duplicates" are documents that have been identified and grouped because their text matches, even if the native documents themselves would not hash identically due to differences in native file formatting or white space. This makes text duplicates more reliable for coding propagation and for isolating pivots to review the full amount of unique information a text duplicate group has to offer.

Each document belongs to only one text duplicate set, and each text duplicate set contains at least one document. Unlike the settings for the Near-duplicate Identification algorithm, the settings for the Text Duplicate algorithm do not require any decisions about similarity or file comparison.

One primary feature of text duplicate identification is important for batching documents so that text duplicate groups stay together: Text Duplicate Group (TDGroup). A Text Duplicate Group contains the full set of documents that have been identified as containing matching text.
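
To make the idea of grouping on text rather than on native file hashes concrete, here is a minimal Python sketch. It is not MA's actual algorithm, which this article does not describe. It assumes that "matching text" can be approximated by collapsing whitespace and hashing the result. The function names, the control numbers, and the sample text are all hypothetical.

```python
import hashlib
import re
from collections import defaultdict
from typing import Dict, List


def normalize_text(text: str) -> str:
    """Collapse whitespace runs and trim the ends (an assumed normalization)."""
    return re.sub(r"\s+", " ", text).strip()


def group_text_duplicates(docs: Dict[str, str]) -> Dict[str, List[str]]:
    """Map a hypothetical group key to the control numbers whose text matches."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for control_number, text in docs.items():
        # Hash the normalized text, not the native file, so that
        # whitespace-only differences do not split a group.
        key = hashlib.sha256(normalize_text(text).encode("utf-8")).hexdigest()
        groups[key].append(control_number)
    return dict(groups)


# Hypothetical extracted text for three documents.
docs = {
    "DOC-0001": "Quarterly report\n\nRevenue rose.",
    "DOC-0002": "Quarterly report Revenue   rose.",
    "DOC-0003": "Meeting notes",
}
groups = group_text_duplicates(docs)
```

In this sketch every document lands in exactly one group, and a document with no match forms a group of one, which mirrors the membership rule described above.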

Batching rules for grouping text duplicate content

These are our recommended settings for the source saved search of a text duplicate batch set intended to keep text duplicate documents grouped during review. All other settings may be configured according to the needs of the project.

Saved Search Criteria:

  • TDGroup is set

Relativity Search Criteria for Text Duplicate Documents

Sort Criteria:

  • TDGroup (asc)
  • Control Number (asc)

Relativity Sort Criteria for Grouping Text Duplicate Documents

NOTE: Consider sorting by "Last Modified Date" instead of Control Number to put the documents in each text duplicate group in chronological, or "version," order.
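
The effect of the two sort options can be illustrated outside Relativity with a short Python sketch. The dictionary keys (`td_group`, `control_number`, `last_modified`) and the sample values are hypothetical stand-ins for the workspace fields. The sort itself is configured in the saved search, not in code.

```python
from datetime import date

# Hypothetical documents; field names are illustrative, not Relativity's API.
documents = [
    {"control_number": "DOC-0004", "td_group": "TD-002", "last_modified": date(2021, 1, 5)},
    {"control_number": "DOC-0001", "td_group": "TD-001", "last_modified": date(2021, 2, 10)},
    {"control_number": "DOC-0002", "td_group": "TD-001", "last_modified": date(2021, 1, 15)},
    {"control_number": "DOC-0003", "td_group": "TD-002", "last_modified": date(2021, 3, 1)},
]

# Recommended sort: TDGroup (asc), then Control Number (asc).
by_control_number = sorted(
    documents, key=lambda d: (d["td_group"], d["control_number"])
)

# Alternative from the note: TDGroup (asc), then Last Modified Date (asc),
# which orders each group chronologically.
by_version = sorted(
    documents, key=lambda d: (d["td_group"], d["last_modified"])
)

order_by_control_number = [d["control_number"] for d in by_control_number]
order_by_version = [d["control_number"] for d in by_version]
```

Either way, the primary sort on TDGroup keeps group members adjacent. Only the order within each group changes.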

Batch Set Grouping to keep the text duplicate documents together:

  • Family Field: TDGroup
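
As a rough illustration of what grouping on TDGroup achieves, the Python sketch below splits a sorted document list into batches without ever dividing a text duplicate group. It rests on assumptions: the policy of starting a new batch when a group would not fit, and of letting an oversized group exceed the target size, is chosen for illustration and may not match Relativity's exact batching behavior. The function name `batch_by_group` and the field name `td_group` are hypothetical.

```python
from itertools import groupby
from typing import Dict, List


def batch_by_group(sorted_docs: List[Dict[str, str]], batch_size: int) -> List[List[Dict[str, str]]]:
    """Split documents (already sorted by td_group) into batches, keeping groups whole."""
    batches: List[List[Dict[str, str]]] = []
    current: List[Dict[str, str]] = []
    for _, members in groupby(sorted_docs, key=lambda d: d["td_group"]):
        group = list(members)
        # Assumed policy: close the current batch if this group would not fit.
        if current and len(current) + len(group) > batch_size:
            batches.append(current)
            current = []
        # A group larger than batch_size still goes into a single batch.
        current.extend(group)
    if current:
        batches.append(current)
    return batches


# Hypothetical input: group sizes 2, 3, and 1, already sorted by td_group.
sorted_docs = [
    {"control_number": "DOC-0001", "td_group": "TD-001"},
    {"control_number": "DOC-0002", "td_group": "TD-001"},
    {"control_number": "DOC-0003", "td_group": "TD-002"},
    {"control_number": "DOC-0004", "td_group": "TD-002"},
    {"control_number": "DOC-0005", "td_group": "TD-002"},
    {"control_number": "DOC-0006", "td_group": "TD-003"},
]
batches = batch_by_group(sorted_docs, batch_size=3)
batch_sizes = [len(batch) for batch in batches]
```

The sketch depends on the input being sorted by TDGroup first, which is why the sort criteria above list TDGroup as the primary sort field.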