Near-Duplicate & Email Thread Analysis
Near-Duplicate/Email Thread Analysis scans the extracted or OCR text of individual records within a Case Database and flags any Near-Duplicates and/or Email Threads. The content (text) of each record is put through a hashing process, which yields numerical (hash) values; these values are compared between records and the result is measured against a specified Threshold of similarity. Records whose content is found to match at or above the specified Threshold are flagged as either Near-Duplicates or Email Threads within the case records.
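To make the hash-and-threshold idea concrete, the following is a minimal sketch in Python. It is not CloudNine LAW's actual algorithm, which this page does not describe. It assumes one common approach: hashing overlapping three-word sequences ("shingles") and scoring two records by the share of hashes they have in common (Jaccard similarity). The function names, record IDs, sample text, and the 0.7 threshold are all hypothetical.

```python
import hashlib


def shingle_hashes(text, k=3):
    """Hash every run of k consecutive words in the text (assumed technique)."""
    words = text.lower().split()
    if not words:
        return set()
    if len(words) < k:
        shingles = [" ".join(words)]
    else:
        shingles = [" ".join(words[i:i + k]) for i in range(len(words) - k + 1)]
    # Keep the first 8 bytes of each SHA-1 digest as a numerical hash value.
    return {
        int.from_bytes(hashlib.sha1(s.encode("utf-8")).digest()[:8], "big")
        for s in shingles
    }


def similarity(hashes_a, hashes_b):
    """Jaccard similarity: shared hashes divided by all distinct hashes."""
    union = hashes_a | hashes_b
    if not union:
        return 0.0  # two empty records are not treated as near-duplicates
    return len(hashes_a & hashes_b) / len(union)


def flag_near_duplicates(records, threshold):
    """Return (id_a, id_b, score) for each pair scoring at or above threshold."""
    hashes = {record_id: shingle_hashes(text) for record_id, text in records.items()}
    ids = sorted(hashes)
    flagged = []
    for i, left in enumerate(ids):
        for right in ids[i + 1:]:
            score = similarity(hashes[left], hashes[right])
            if score >= threshold:
                flagged.append((left, right, score))
    return flagged


# Hypothetical records: two nearly identical emails and one unrelated note.
records = {
    "DOC001": "Please review the attached contract before the meeting on Friday",
    "DOC002": "Please review the attached contract before the meeting on Monday",
    "DOC003": "Lunch order for the team is due by noon",
}

for left, right, score in flag_near_duplicates(records, threshold=0.7):
    print(f"{left} and {right} flagged, similarity {score:.2f}")
```

In this sketch, raising the threshold narrows the results to closer matches: DOC001 and DOC002 share 7 of 9 distinct shingle hashes, so they are flagged at 0.7 but not at 0.9.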
To analyze the content of all records within a single Case File, use the Near-Duplicate & Email Thread Analysis Utility.