Use the Deduplication Utility to identify and flag duplicate items after they have been imported into a case. Flagging duplicates can significantly reduce the time it takes to analyze and process documents.
Use this utility if deduplication was not performed by ED Loader or Turbo Import during the import process, or if the case was not deduplicated with the Inter-Case Deduplication utility. See the Deduplication Information topic for more information.
Note the following important considerations before you deduplicate records:
•If you use the internal Deduplication Utility after the case has been deduplicated against other cases using the Inter-Case Deduplication utility, a combination of internal and external duplicates could result. Combining internal and external duplicates can cause problems when purging, filtering, or reviewing duplicate records.
•Proceeding with the internal deduplication after clicking the Yes I understand and wish to continue button places the external deduplication database in Rebuild/Flush mode. At this point, the current case should be removed from the external database.
•Before you run the internal deduplication, it is recommended that you reset the deduplication flags to prevent a mixture of internal and external duplicates. See the To reset deduplication log status section below. For more information, see Inter-Case Deduplication.
1. In the main window, on the Tools menu, click Deduplication Utility. The Info tab of the Deduplication Utility dialog box opens.
2. If the case has not already been deduplicated using the Inter-Case Deduplication utility, click the Load button.
Deduplication statistics for the LAW case are displayed, including the number of duplicates at the global or custodian level and the number of root duplicate records.
•If deduplication has not yet been performed on the records, the value for each displayed item is zero.
•If records have been deduplicated externally, two additional rows show the number of records deduplicated externally and the name and location of the external deduplication database.
1. In the main window, on the Tools menu, click Deduplication Utility.
2. Click the Tools tab.
3. In the Deduplication Status Reset area, click Run.
The deduplication log and deduplication-related fields for all the case records are reset. The case returns to the state it would be in if deduplication had never been performed.
1. In the main window, on the Tools menu, click Deduplication Utility.
2. Click the Tools tab.
3. In the Verify Duplication Log area, click Run.
The CloudNine™ LAW deduplication fields that were updated as a result of inter-case deduplication are reset. However, items from external deduplication databases are not reset. See Inter-Case Deduplication for more information.
As mentioned above, if a case has already been deduplicated using the Inter-Case Deduplication utility, it is recommended that you run this command on the case before running the deduplication process with the Deduplication Utility.
The Verify Deduplication Log tool verifies that all entries in the log exist in the LAW case. This tool is included for troubleshooting purposes and does not check external deduplication databases.
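The kind of check described for the Verify Deduplication Log tool (every log entry must correspond to a record in the case) can be pictured as a set comparison. The following is only a conceptual sketch in Python, not LAW's implementation; the function name and the use of plain integer record IDs are assumptions made for illustration.

```python
def find_orphaned_log_entries(log_ids, case_ids):
    """Return the IDs that appear in the deduplication log but not in the case.

    Both arguments are hypothetical collections of record IDs; LAW's actual
    log and case storage are not modeled here.
    """
    return sorted(set(log_ids) - set(case_ids))


# Illustrative data: log entry 5 has no matching record in the case.
orphans = find_orphaned_log_entries(log_ids=[1, 2, 5], case_ids=[1, 2, 3])
print(orphans)
```

An empty result would mean every log entry was found in the case.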
The Apply Duplicate Relationships command populates the following fields for the original files that have duplicate files in a case:
•DupCustNames
•DupCustPaths
•DupParentName
•DupParentPath
The fields indicate the custodian name and location of the duplicate files and the parents of the duplicate files. For more information about these fields, see Field Descriptions.
1. In the main window, on the Tools menu, click Deduplication Utility.
2. Click the Tools tab.
3. In the Apply Duplicate Relationships area, click Run.
Clicking Run starts the command. When the process is complete, the Duplicate Relationship Update Status dialog box opens. It indicates whether duplicate custodian and path relationships were successfully updated, the total number of document families updated, the total number of files updated, and how long the process took to complete.
4. Click OK to close the Duplicate Relationship Update Status dialog box.
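To make the idea of duplicate relationships concrete, the sketch below shows one way an original record could be annotated with the custodians and paths of its duplicates. It is a conceptual Python illustration only, not how LAW performs the update. The record layout (`id`, `custodian`, `path`, `root_id`), the semicolon delimiter, and the function name are all assumptions; the DupParentName and DupParentPath fields are left out because their exact contents are not described here.

```python
from collections import defaultdict


def apply_duplicate_relationships(records, delimiter=";"):
    """Annotate each original record with details of its duplicates.

    records: list of dicts with hypothetical keys 'id', 'custodian', 'path',
    and 'root_id' (None for an original, otherwise the original's id).
    Returns the number of original records that were updated.
    """
    # Group duplicate records under the original they point to.
    dups_by_root = defaultdict(list)
    for rec in records:
        if rec["root_id"] is not None:
            dups_by_root[rec["root_id"]].append(rec)

    updated = 0
    for rec in records:
        dups = dups_by_root.get(rec["id"])
        if rec["root_id"] is None and dups:
            # Delimiter choice is an assumption for this sketch.
            rec["DupCustNames"] = delimiter.join(d["custodian"] for d in dups)
            rec["DupCustPaths"] = delimiter.join(d["path"] for d in dups)
            updated += 1
    return updated


# Illustrative data: records 2 and 3 are duplicates of record 1.
records = [
    {"id": 1, "custodian": "Smith", "path": "Smith/Inbox/report.doc", "root_id": None},
    {"id": 2, "custodian": "Jones", "path": "Jones/Docs/report.doc", "root_id": 1},
    {"id": 3, "custodian": "Lee", "path": "Lee/Archive/report.doc", "root_id": 1},
    {"id": 4, "custodian": "Jones", "path": "Jones/Docs/notes.txt", "root_id": None},
]
updated_count = apply_duplicate_relationships(records)
print(updated_count, records[0]["DupCustNames"], records[0]["DupCustPaths"])
```

Only the original record with duplicates (record 1) receives the new fields; record 4 has no duplicates and is left unchanged.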
The Settings tab contains processing and processing range options.
•Working digest - This setting selects the hash key used to determine duplicates. The hash values are obtained through metadata fields (e-mail) or by hashing the entire file (e-docs). CloudNine™ LAW provides two hash keys to choose from: MD5 (128-bit output) and SHA-1 (160-bit output).
•Test for duplicate against (Scope) - This setting determines the scope in which duplicates are tested. Deduplication can be performed at one of two levels: Case Level (globally deduplicates against all records in the database) or Custodian Level (deduplicates against records with the same custodian value).
•Only test untested records - When enabled, this option forces LAW to process only records that have not been tested previously in the deduplication process. This feature may be useful when a case has been deduplicated previously and new records are then added (and deduplication was not enabled during the import). If the "Only test records with selected custodians" option is also enabled, only untested records with the specified custodian values are tested.
•Only test records with selected custodians - When enabled, this option lets you specify one or more custodians and forces LAW to process only records with those custodians during deduplication.
  •Click the Select button to launch the Custom Value Selection [Custodian] dialog.
  •Check the boxes beside the custodians to include them in the deduplication process.
•The Reset button can be used to reset any options that were modified in the current session.
Click Start to begin the deduplication process.
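The interaction between the working digest and the scope setting can be illustrated with a short Python sketch. This is a conceptual example under stated assumptions, not LAW's implementation: the record layout (`id`, `custodian`, `content`), the function names, and the choice to hash raw bytes for every record are hypothetical (for e-mail, the hash is described above as coming from metadata fields instead). The first record seen with a given key is treated as the original, and later matches are flagged as its duplicates.

```python
import hashlib


def digest(data: bytes, algorithm: str = "md5") -> str:
    """Return the hex digest of data using 'md5' (128-bit) or 'sha1' (160-bit)."""
    if algorithm not in ("md5", "sha1"):
        raise ValueError("algorithm must be 'md5' or 'sha1'")
    return hashlib.new(algorithm, data).hexdigest()


def flag_duplicates(records, algorithm="md5", scope="case"):
    """Map each record id to the id of its original, or None if it is an original.

    scope='case' compares every record against all others; scope='custodian'
    compares only records that share the same custodian value.
    """
    first_seen = {}
    result = {}
    for rec in records:
        h = digest(rec["content"], algorithm)
        # At custodian level the custodian becomes part of the comparison key.
        key = h if scope == "case" else (rec["custodian"], h)
        if key in first_seen:
            result[rec["id"]] = first_seen[key]
        else:
            first_seen[key] = rec["id"]
            result[rec["id"]] = None
    return result


# Illustrative data: records 1, 2, and 3 have identical content.
records = [
    {"id": 1, "custodian": "Smith", "content": b"quarterly report"},
    {"id": 2, "custodian": "Jones", "content": b"quarterly report"},
    {"id": 3, "custodian": "Smith", "content": b"quarterly report"},
    {"id": 4, "custodian": "Jones", "content": b"meeting notes"},
]
case_result = flag_duplicates(records, scope="case")
custodian_result = flag_duplicates(records, scope="custodian")
print(case_result)       # {1: None, 2: 1, 3: 1, 4: None}
print(custodian_result)  # {1: None, 2: None, 3: 1, 4: None}
```

In this example, record 2 is a duplicate at case level but not at custodian level, because its only matches belong to a different custodian.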