Near Duplicate and Text Duplicate Fields

Near Duplicate (ND) fields

Below is a list of the fields created during a Near Duplicate algorithm run that can be overlaid automatically or manually into a workspace. Fields with double colons (::) are associative (reflexive) fields: they are associated with a document and show the data held in a related object. Fields without double colons are at the document level.

Near Duplicate Fields

_ND

A "Single Object" field that contains the reference to the rest of the results. The similarly named :: fields below are those results drawn from the results table.

_ND::Count

Number of documents in a Near Duplicate group

_ND::Is Pivot

The first document encountered in that group

_ND Group

The ID of the ND group in the results table

_NDGroup

This document-level field is the relational field that binds a Near Duplicate group together: every document in the group contains the same value in this field. Use it to propagate coding decisions or to group documents together for review.
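As an illustration of propagating a coding decision across a group, the following is a minimal sketch rather than the product's own mechanism. It assumes the documents have been exported as plain records; only _NDGroup comes from the field list above, while DocID, Responsive, and the propagate_coding helper are hypothetical names chosen for the example.

```python
from collections import defaultdict

# Hypothetical export: one dict per document. "_NDGroup" is the field described
# above; "DocID" and "Responsive" are made-up names used only for illustration.
documents = [
    {"DocID": "DOC-001", "_NDGroup": "G1", "Responsive": "Yes"},
    {"DocID": "DOC-002", "_NDGroup": "G1", "Responsive": None},
    {"DocID": "DOC-003", "_NDGroup": "G2", "Responsive": None},
    {"DocID": "DOC-004", "_NDGroup": "G1", "Responsive": None},
]


def propagate_coding(docs, group_field="_NDGroup", coding_field="Responsive"):
    """Copy a coding value to uncoded documents that share a group value.

    Groups with no coded document, or with conflicting decisions, are skipped.
    Returns the DocIDs that were updated.
    """
    groups = defaultdict(list)
    for doc in docs:
        if doc.get(group_field):
            groups[doc[group_field]].append(doc)

    updated = []
    for members in groups.values():
        decisions = {d[coding_field] for d in members if d[coding_field] is not None}
        if len(decisions) != 1:
            continue  # nothing coded yet, or reviewers disagreed
        decision = decisions.pop()
        for d in members:
            if d[coding_field] is None:
                d[coding_field] = decision
                updated.append(d["DocID"])
    return updated


updated_ids = propagate_coding(documents)
print(updated_ids)
```

Skipping groups with conflicting decisions is a deliberate choice in this sketch: it leaves the disagreement visible for a reviewer instead of silently picking one value.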

Text Duplicate (TD) fields

Below is a list of the fields created during a Text Duplicate algorithm run that can be overlaid automatically or manually into a workspace.

Text Duplicate Fields

_TD

A "Single Object" field that contains the reference to the rest of the results. The similarly named :: fields below are those results drawn from the results table.

_TD::Count

Number of documents in a Text Duplicate group

_TD::Is Pivot

The first document encountered in that group

_TD Group

The ID of the TD group in the results table

_TDGroup

This document-level field is the relational field that binds a Text Duplicate group together: every document in the group contains the same value in this field. Use it to propagate coding decisions or to group documents together for review.
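Grouping documents for review can be sketched in a similar way. The example below is an assumption-laden illustration, not a built-in feature: it takes hypothetical (DocID, _TDGroup) pairs and packs them into review batches so that no Text Duplicate group is split across batches. The build_batches helper and the max_size parameter are invented for the example.

```python
# Hypothetical export rows: (DocID, _TDGroup value). None means the document
# does not belong to any Text Duplicate group.
rows = [
    ("DOC-010", "TD-2"),
    ("DOC-011", "TD-1"),
    ("DOC-012", None),
    ("DOC-013", "TD-2"),
    ("DOC-014", "TD-1"),
]


def build_batches(rows, max_size=3):
    """Pack documents into batches without splitting a TD group.

    A group larger than max_size ends up in a batch of its own.
    """
    groups = {}
    for doc_id, td_group in rows:
        # Ungrouped documents are treated as singleton groups.
        key = td_group if td_group is not None else ("single", doc_id)
        groups.setdefault(key, []).append(doc_id)

    batches, current = [], []
    for members in groups.values():
        # Start a new batch if adding this group would exceed the limit.
        if current and len(current) + len(members) > max_size:
            batches.append(current)
            current = []
        current = current + members
    if current:
        batches.append(current)
    return batches


batches = build_batches(rows)
for batch in batches:
    print(batch)
```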
