Near Duplicate Document Viewer

Near duplicates are documents with largely overlapping content that are not necessarily exact duplicates of each other.

The near-duplicate analysis is run for a case in CloudNine™ Explore. It identifies documents with redundant text to help ensure you are only processing documents essential to the case. You can then view and compare near-duplicate documents on the EXPLORE tab using the Near Duplicate Document Viewer in CloudNine™ Explore Web. The near-duplicate analysis includes documents without text. For email messages, the analysis compares only the message body content between email messages. It does not compare email headers.
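To illustrate what comparing only the message body means, here is a minimal Python sketch. It is not the CloudNine™ Explore implementation, whose algorithm is not described here. The helper names body_text and body_similarity are hypothetical, and difflib.SequenceMatcher from the Python standard library stands in for the actual similarity measure.

```python
from difflib import SequenceMatcher
from email import message_from_string


def body_text(raw_message: str) -> str:
    """Return the body of a simple, non-multipart message, without headers."""
    message = message_from_string(raw_message)
    return message.get_payload().strip()


def body_similarity(raw_a: str, raw_b: str) -> float:
    """Similarity ratio (0.0 to 1.0) computed on the message bodies only."""
    return SequenceMatcher(None, body_text(raw_a), body_text(raw_b)).ratio()


# Two messages with different headers but the same body (made-up sample data).
first = "From: a@example.com\nSubject: Draft\n\nPlease review the attached contract."
second = "From: b@example.com\nSubject: Fwd: Draft\n\nPlease review the attached contract."

print(body_similarity(first, second))  # 1.0
```

Because the headers are dropped before the comparison, the two sample messages are treated as identical even though their sender and subject differ.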

For more information, see Near-Duplicate & Email Thread Analysis.


Near Duplicate Document Viewer

Start from the EXPLORE page for a case. Using the existing result set or any configured search, select a document that has near-duplicates so that it is displayed in the document details pane on the right. A document has near-duplicates when a number is displayed on its Near Dupes tab. Then click the View near duplicates icon in the toolbar at the top of the document details pane. The Near Duplicate Document Viewer is displayed.


Near-Duplicate List

The top left pane displays a list of near-duplicate documents associated with the file currently selected on the EXPLORE page.  When comparing, the master document is the file that was selected on the EXPLORE page and is displayed at the top of the list.  There are four columns in the list:

Near-Duplicate List Columns




File name of the file in the near-duplicate family.


Percentage of similarity between the near-duplicate document and the master document.


If a tag is applied to a document in the near-duplicate list, a blue tag is displayed.  If no tag is applied to a document in the near-duplicate list, a gray tag is displayed. Click the tag icon to view the Tagging dialog and modify the file's associated tags.  You can also apply tags to all files in the list by clicking the tag icon in the list column header to access the Bulk Tagging dialog.


Click the Compare icon to bring up the associated document comparison pane and compare near-duplicate documents.  See the Comparison Pane section below for more details.

Click the near duplicate icon to compare a document with the document currently selected in the near-duplicate list pane. When you click the near duplicate icon for a document in the list, the Compare icon is displayed next to the near duplicate icon to indicate which file is currently being compared with the selected document.
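As a rough illustration of how a list with a master document and similarity percentages could be built, the following Python sketch scores each file against the master. The function names, the sample file names, and the word-level difflib measure are all assumptions made for this example. The product's actual similarity calculation may differ.

```python
from difflib import SequenceMatcher


def similarity_percent(master: str, candidate: str) -> int:
    """Word-level similarity to the master, rounded to a whole percentage."""
    ratio = SequenceMatcher(None, master.split(), candidate.split()).ratio()
    return round(ratio * 100)


def near_duplicate_rows(master_name: str, documents: dict) -> list:
    """Build (file name, percent) rows with the master document listed first."""
    master_text = documents[master_name]
    rows = [(master_name, 100)]
    for name, text in documents.items():
        if name != master_name:
            rows.append((name, similarity_percent(master_text, text)))
    return rows


# Made-up sample documents keyed by file name.
documents = {
    "contract_v1.txt": "the quick brown fox jumps",
    "contract_v2.txt": "the quick brown fox leaps",
    "notes.txt": "meeting moved to friday",
}

for file_name, percent in near_duplicate_rows("contract_v1.txt", documents):
    print(f"{file_name}: {percent}%")
```

In this sample, contract_v2.txt shares four of five words with the master and scores 80%, while notes.txt shares none and scores 0%.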


Document Metadata

The bottom left pane displays document metadata. This metadata includes a Tags field that displays all the tags currently applied to the document.


Document Details

The right pane displays file contents for the currently selected document from the near-duplicate list. You can use the Download document icon at the top of the pane to download the corresponding native file.


Comparison Pane

To compare a near-duplicate document with the master document, click the Compare icon next to the near-duplicate document. The document details pane is replaced by the comparison pane, which displays the contents of the document you selected and highlights the differences between the selected document and the master document.

In the comparison pane:

Underlined blue text indicates added text. (Example)

Red text with a strikethrough indicates deleted text. (Example)

Plain black text indicates unchanged text. (Example)

A caution icon (exclamation mark) is displayed in the upper right corner of the comparison pane if you are comparing one or more truncated documents. Comparisons for truncated documents are only displayed for the text up to the point where the truncation occurs.
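The added, deleted, and unchanged categories described above can be sketched with a word-level diff. This is only an illustration built on Python's difflib. The function name compare_words is hypothetical, and the direction of the comparison (deleted means present only in the master, added means present only in the selected document) is an assumption rather than documented behavior.

```python
from difflib import SequenceMatcher


def compare_words(master: str, selected: str) -> list:
    """Label runs of words as 'unchanged', 'deleted', or 'added'.

    Assumption: 'deleted' words appear only in the master document and
    'added' words appear only in the selected near-duplicate.
    """
    old, new = master.split(), selected.split()
    segments = []
    for op, i1, i2, j1, j2 in SequenceMatcher(None, old, new).get_opcodes():
        if op == "equal":
            segments.append(("unchanged", " ".join(old[i1:i2])))
            continue
        if i2 > i1:  # 'delete' and 'replace' both drop words from the master
            segments.append(("deleted", " ".join(old[i1:i2])))
        if j2 > j1:  # 'insert' and 'replace' both introduce new words
            segments.append(("added", " ".join(new[j1:j2])))
    return segments


for kind, text in compare_words(
    "payment is due in 30 days", "payment is due within 30 days"
):
    print(f"{kind}: {text}")
# unchanged: payment is due
# deleted: in
# added: within
# unchanged: 30 days
```

A viewer could then style each segment by its label, for example underlined blue for added and red strikethrough for deleted.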


Comparison Pane Toolbar

The toolbar at the top of the comparison pane lets you move between near-duplicate documents and step through the differences between them. It also contains the comparison key for understanding the near-duplicate differences.

Comparison pane toolbar features

Previous Document: Navigates to the previous document in the near-duplicate document list and displays that document in the comparison pane.

Next Document: Navigates to the next document in the near-duplicate document list and displays that document in the comparison pane.

First Difference: Navigates to the first difference.

Previous Difference: Navigates to the previous difference.

Current and Total Differences: Indicates the current difference number and the total number of differences.

Next Difference: Navigates to the next difference.

Last Difference: Navigates to the last difference.
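The difference navigation buttons can be modeled as a position counter over the list of differences. The sketch below is a hypothetical Python model, not the product's code. The class name DifferenceNavigator, the "N of M" label format, and the choice to stop at the first and last difference instead of wrapping around are all assumptions.

```python
class DifferenceNavigator:
    """Track the current difference as a 1-based position out of a total."""

    def __init__(self, total: int):
        self.total = total
        self.current = 1 if total else 0  # 0 means there are no differences

    def _go(self, position: int) -> int:
        # Assumption: clamp at the ends so Previous/Next do not wrap around.
        if self.total:
            self.current = max(1, min(self.total, position))
        return self.current

    def first(self) -> int:
        return self._go(1)

    def previous(self) -> int:
        return self._go(self.current - 1)

    def next(self) -> int:
        return self._go(self.current + 1)

    def last(self) -> int:
        return self._go(self.total)

    def label(self) -> str:
        """Text for the current and total differences indicator."""
        return f"{self.current} of {self.total}"


nav = DifferenceNavigator(total=4)
nav.next()
print(nav.label())  # 2 of 4
nav.last()
print(nav.label())  # 4 of 4
```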