Near Duplicate and Text Duplicate Fields
Near Duplicate (ND) fields
Below is a list of the fields created during a Near Duplicate algorithm run that can be overlaid automatically or manually into a workspace. Fields with double colons (::) are associative (or reflexive) fields: they are associated with a document and show data from a related object. Fields without double colons exist at the document level.
Near Duplicate Fields
_ND | A "Single Object" field that holds the reference to the rest of the results. The similarly named :: fields below show those results, drawn from the results table. |
_ND::Count | Number of documents in a Near Duplicate group |
_ND::Is Pivot | The first document encountered in that group |
_ND Group | The ID of the ND group in the results table |
_NDGroup | A document-level relational field that binds the entire Near Duplicate group together: every document in the group contains the same value in this field. Use this field to propagate coding decisions or to group documents together for review. |
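As an illustration of using _NDGroup to propagate coding decisions, the following is a minimal sketch rather than a product feature. It assumes the documents have been exported as a list of dictionaries, one per document. The "Control Number" and "Responsive" columns, the group values "G1" and "G2", and the function name propagate_coding are hypothetical and not defined by this page.

```python
from collections import defaultdict


def propagate_coding(docs, group_field="_NDGroup", coding_field="Responsive"):
    """Copy a coding value to the uncoded documents of each group.

    Returns (updated_docs, conflicts). A group is listed in conflicts,
    and left unchanged, when its coded documents disagree.
    """
    updated = [dict(doc) for doc in docs]  # copies, so the input is not mutated

    # Documents sharing the same group value belong to the same group.
    groups = defaultdict(list)
    for doc in updated:
        group_id = doc.get(group_field)
        if group_id:  # documents with no group value are skipped
            groups[group_id].append(doc)

    conflicts = []
    for group_id, members in groups.items():
        values = {d[coding_field] for d in members if d.get(coding_field)}
        if len(values) == 1:
            value = next(iter(values))
            for d in members:
                d[coding_field] = value
        elif len(values) > 1:
            conflicts.append(group_id)
    return updated, conflicts


# Hypothetical export: column names and values are assumptions.
sample_docs = [
    {"Control Number": "DOC001", "_NDGroup": "G1", "Responsive": "Yes"},
    {"Control Number": "DOC002", "_NDGroup": "G1", "Responsive": ""},
    {"Control Number": "DOC003", "_NDGroup": "G2", "Responsive": "Yes"},
    {"Control Number": "DOC004", "_NDGroup": "G2", "Responsive": "No"},
    {"Control Number": "DOC005", "_NDGroup": "", "Responsive": ""},
]

updated_docs, conflict_groups = propagate_coding(sample_docs)
```

Groups whose coded documents disagree are reported instead of overwritten, so a reviewer can resolve them by hand before anything is overlaid back into the workspace.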
Text Duplicate (TD) fields
Below is a list of the fields created during a Text Duplicate algorithm run that can be overlaid automatically or manually into a workspace.
Text Duplicate Fields
_TD | A "Single Object" field that holds the reference to the rest of the results. The similarly named :: fields below show those results, drawn from the results table. |
_TD::Count | Number of documents in a Text Duplicate group |
_TD::Is Pivot | The first document encountered in that group |
_TD Group | The ID of the TD group in the results table |
_TDGroup | A document-level relational field that binds the entire Text Duplicate group together: every document in the group contains the same value in this field. Use this field to propagate coding decisions or to group documents together for review. |
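When documents are grouped for review by _TDGroup, it can help to confirm that a review set contains every member of each group. The sketch below compares the number of documents present per _TDGroup value with the value of _TD::Count. It assumes both fields were exported as columns on each document record, which may not match how a given workspace exposes them. The "Control Number" column, the group values "T1" and "T2", and the function name find_incomplete_groups are hypothetical.

```python
from collections import Counter


def find_incomplete_groups(docs, group_field="_TDGroup", count_field="_TD::Count"):
    """Return {group_id: (present, expected)} for groups with missing members."""
    # How many documents of each group are actually in this set.
    present = Counter(doc[group_field] for doc in docs if doc.get(group_field))

    # The group size reported by the count field, where it is populated.
    expected = {}
    for doc in docs:
        group_id = doc.get(group_field)
        if group_id and doc.get(count_field) not in (None, ""):
            expected[group_id] = int(doc[count_field])

    return {
        group_id: (present[group_id], expected[group_id])
        for group_id in present
        if group_id in expected and present[group_id] != expected[group_id]
    }


# Hypothetical review set: column names and values are assumptions.
td_docs = [
    {"Control Number": "DOC010", "_TDGroup": "T1", "_TD::Count": "2"},
    {"Control Number": "DOC011", "_TDGroup": "T1", "_TD::Count": "2"},
    {"Control Number": "DOC012", "_TDGroup": "T2", "_TD::Count": "3"},
]

incomplete = find_incomplete_groups(td_docs)
```

In this sample, group "T2" reports three documents but only one is present, so it is the only group returned.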