Near Duplicate Document Viewer
Near duplicates are documents that have duplicate content, but are not necessarily exact duplicates of each other.
The near-duplicate analysis is run for a case in CloudNine™ Explore. You can then view and compare near-duplicate documents on the EXPLORE tab using the Near Duplicate Document Viewer in CloudNine™ Explore Web. The analysis identifies documents with redundant text to help ensure you are only processing documents essential to the case. The near-duplicate analysis includes analyzing documents without text. For email messages, the analysis compares only the message body content; it does not compare email headers.
For more information, see Near-Duplicate & Email Thread Analysis.
Start from the EXPLORE page for a case, using either the existing result set or any configured search. In the document details pane on the right, display a document that has near-duplicates (indicated by a number on the Near Dupes tab). Then click the View near duplicates icon in the toolbar at the top of the document details pane. The Near Duplicate Document Viewer displays.
The top left pane displays a list of near-duplicate documents associated with the file currently selected on the EXPLORE page. When comparing, the master document is the file that was selected on the EXPLORE page and is displayed at the top of the list. There are four columns in the list:
Near-Duplicate List Columns

| Column | Description |
|---|---|
| Name | File name of the file in the near-duplicate family. |
| Similarity | Percentage of similarity between the near-duplicate document and the master document. |
| Tag | If a tag is applied to a document in the near-duplicate list, a blue tag is displayed. If no tag is applied, a gray tag is displayed. Click the tag icon to open the Tagging dialog and modify the file's associated tags. You can also apply tags to all files in the list by clicking the tag icon in the list column header to open the Bulk Tagging dialog. |
| Compare | Click the Compare icon to open the comparison pane and compare a document with the document currently selected in the near-duplicate list pane. See the Comparison Pane section below for more details. When you click the near duplicate icon for a document in the list, the Compare icon is displayed next to the near duplicate icon to indicate which file is currently being compared with the selected document. |
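To make the Similarity column more concrete, the following is a minimal sketch of how a percentage of similarity between a master document and a near-duplicate could be computed. It is an illustration only: the function name `similarity_percent` is hypothetical, and the word-level ratio from Python's standard `difflib` module is an assumption standing in for whatever measure CloudNine™ Explore actually uses, which this page does not describe.

```python
from difflib import SequenceMatcher


def similarity_percent(master_text: str, candidate_text: str) -> float:
    """Return a 0-100 similarity score between two document texts.

    Assumption: a word-level SequenceMatcher ratio is used purely for
    illustration; it is not the product's documented algorithm.
    """
    master_words = master_text.split()
    candidate_words = candidate_text.split()
    matcher = SequenceMatcher(None, master_words, candidate_words, autojunk=False)
    # ratio() is 2 * (matching words) / (total words in both texts).
    return round(matcher.ratio() * 100, 1)


master = "The contract is due on Friday at noon"
near_dupe = "The contract is due on Monday at noon"

print(similarity_percent(master, master))     # 100.0 (identical text)
print(similarity_percent(master, near_dupe))  # 87.5 (7 of 8 words match)
```

In this sketch the master document always scores 100 against itself, which is consistent with it appearing at the top of the list as the reference point for every other row.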
The bottom left pane displays document metadata. This metadata includes a Tags field that displays all the tags currently applied to the document.
The right pane displays file contents for the currently selected document from the near-duplicate list. You can use the Download document icon at the top of the pane to download the corresponding native file.
You can compare a near-duplicate document with the master document by clicking the Compare icon next to the near-duplicate document. When you click the Compare icon, the document details pane is replaced by the comparison pane. The comparison pane displays the contents of the document you selected and highlights the differences between the selected document and the master document.
In the comparison pane:
• Underlined blue text indicates added text.
• Red text with a strikethrough indicates deleted text.
• Plain black text indicates unchanged text.
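The three highlight categories above can be sketched as a word-level diff. This is a hedged illustration, not the product's implementation: the function name `classify_differences` is hypothetical, Python's standard `difflib` is used as a stand-in comparison engine, and the sketch assumes that "added" means text present in the selected document but not in the master, and "deleted" means the reverse.

```python
from difflib import SequenceMatcher


def classify_differences(master_text, selected_text):
    """Return (kind, text) segments; kind is 'unchanged', 'deleted' or 'added'."""
    master_words = master_text.split()
    selected_words = selected_text.split()
    matcher = SequenceMatcher(None, master_words, selected_words, autojunk=False)

    segments = []
    for op, m1, m2, s1, s2 in matcher.get_opcodes():
        if op == "equal":
            # Shown as plain black text in the comparison pane.
            segments.append(("unchanged", " ".join(master_words[m1:m2])))
            continue
        if op in ("delete", "replace"):
            # Assumption: text only in the master is "deleted" (red strikethrough).
            segments.append(("deleted", " ".join(master_words[m1:m2])))
        if op in ("insert", "replace"):
            # Assumption: text only in the selected document is "added" (blue underline).
            segments.append(("added", " ".join(selected_words[s1:s2])))
    return segments


for kind, text in classify_differences(
    "Payment is due Friday", "Payment is now due Monday"
):
    print(f"{kind:>9}: {text}")
```

A replaced word is reported here as a deletion followed by an addition, which is one common way for a redline view to show a substitution.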
A caution icon (exclamation mark) is displayed in the upper right corner of the comparison pane if one or more of the documents being compared is truncated. For truncated documents, the comparison covers only the text up to the point where the truncation occurs.
The toolbar at the top of the comparison pane is used to navigate between near-duplicate documents and between the differences within them. The toolbar also contains the comparison key for understanding the near-duplicate differences.
Comparison pane toolbar features

| Name | Description |
|---|---|
| Previous Document | Navigates to the previous document in the near-duplicate document list and displays that document in the comparison pane. |
| Next Document | Navigates to the next document in the near-duplicate document list and displays that document in the comparison pane. |
| First Difference | Navigates to the first difference. |
| Previous Difference | Navigates to the previous difference. |
| Current and Total Differences | Indicates the current difference number and the total number of differences. |
| Next Difference | Navigates to the next difference. |
| Last Difference | Navigates to the last difference. |
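The difference navigation controls can be thought of as a simple cursor over a fixed number of differences. The sketch below is illustrative only: the class name `DifferenceNavigator`, the "N of M" label format, and the choice to stop at the first and last difference rather than wrap around are all assumptions, not documented behavior of CloudNine™ Explore Web.

```python
class DifferenceNavigator:
    """Tracks the current difference (1-based) out of a fixed total."""

    def __init__(self, total):
        self.total = total
        # With no differences there is nothing to point at, so use 0.
        self.current = 1 if total > 0 else 0

    def first(self):
        self.current = 1 if self.total > 0 else 0
        return self.current

    def previous(self):
        # Assumption: stay on the first difference instead of wrapping.
        if self.current > 1:
            self.current -= 1
        return self.current

    def next(self):
        # Assumption: stay on the last difference instead of wrapping.
        if self.current < self.total:
            self.current += 1
        return self.current

    def last(self):
        self.current = self.total
        return self.current

    def label(self):
        # Hypothetical rendering of the "Current and Total Differences" indicator.
        return f"{self.current} of {self.total}"


nav = DifferenceNavigator(total=3)
nav.next()
print(nav.label())  # 2 of 3
nav.last()
print(nav.label())  # 3 of 3
```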