Near-Duplicate & Email Thread Fields
Once Near-Duplicate analysis has been performed, LAW updates the following Metadata Fields for all records in the Case Directory:
• ND_ClusterID - Used to identify near-duplicate clusters. Each document belonging to the same cluster will display a matching ID.
• ND_FamilyID - Used to identify near-duplicate families. Documents belonging to the same family will display a matching ID, which is based on the master document ID (padded to fill 8 digits).
• ND_IsMaster - Flags master documents of near-duplicate families with a Y. Other documents belonging to the same family will display an N. All other documents not belonging to any near-duplicate family will also display a Y.
• ND_Similarity - Displays the percentage of similarity between this document and its family master document.
• ND_ResultSet - Used for internal tracking purposes by the Near-Duplicate & Email Thread Analysis Utility. Indicates the near-duplicate index revision for the current record.
• ND_ContentHash - Content hash values are stored here. Documents with identical values contain identical text, but may still have different metadata or file formats.
• ND_Sort - Displays a sorting ID for each document. Documents are assigned this ID based on their similarity to each other.
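As an illustration of how these fields might be used together outside LAW, the following Python sketch groups exported records by ND_FamilyID and lists each family with its master first. It is a minimal sketch under stated assumptions, not part of LAW itself: the records are assumed to have been exported to a list of dictionaries keyed by field name, DocID is a hypothetical identifier column, the sample values are invented, and zero-padding is an assumption (the field description only says the ID is padded to fill 8 digits).

```python
from collections import defaultdict


def format_family_id(master_doc_id: int) -> str:
    """Pad a master document ID to 8 digits.

    Assumption: the padding character is "0"; the field description
    only states that the ID is padded to fill 8 digits.
    """
    return f"{master_doc_id:08d}"


def group_families(records):
    """Group records by ND_FamilyID.

    Within each family the master (ND_IsMaster == "Y") comes first,
    followed by the other members in descending ND_Similarity order.
    """
    families = defaultdict(list)
    for rec in records:
        families[rec["ND_FamilyID"]].append(rec)
    for members in families.values():
        members.sort(
            key=lambda r: (r["ND_IsMaster"] != "Y", -float(r["ND_Similarity"] or 0))
        )
    return dict(families)


# Hypothetical sample rows; DocID and all values are invented for illustration.
records = [
    {"DocID": 31, "ND_FamilyID": "00000012", "ND_IsMaster": "N", "ND_Similarity": "82"},
    {"DocID": 12, "ND_FamilyID": "00000012", "ND_IsMaster": "Y", "ND_Similarity": ""},
    {"DocID": 40, "ND_FamilyID": "00000012", "ND_IsMaster": "N", "ND_Similarity": "95"},
]

for family_id, members in group_families(records).items():
    print(family_id, [m["DocID"] for m in members])
```

Sorting on the master flag before similarity keeps the master at the top of each family regardless of what value, if any, its own ND_Similarity field holds.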
Once Email Thread analysis has been performed, LAW updates the following Metadata Fields for all records in the Case Directory:
• ET_IsMessage - Flags email messages with a Y. All other documents display an N.
• ET_Conversants - Displays the names of all senders and recipients found within email messages. Names can be located within email headers (From, To, CC, BCC), previous quoted messages, or the main body of the message.
• ET_MessageID - Displays a unique ID assigned to each message. Messages with matching IDs are recognized as separate copies of the same message.
• ET_ParentID - Displays the ID of the root message being responded to or forwarded by this message.
• ET_Inclusive - Flags messages containing the entire conversation of an email thread with a Y. This is typically the last message in a thread. Attachments are not flagged. All other messages in an email thread display an N.
• ET_InclusiveReason - Indicates the reason for an email being flagged as Y within the ET_Inclusive field:
   o Message - This email contains body text not found in other emails of the thread.
   o Attachment - This email contains attachments not found in other emails of the thread.
   o Message, Attachment - This email contains both body text and attachments not found in other emails of the thread.
• ET_MetaUpdate - Flags messages whose metadata was populated from analyzed text via the Near-Duplicate & Email Thread Analysis Utility with a Y. All other messages display an N.
• ET_ThreadModified - Displays the date/time of the most recent email thread analysis performed on this document.
• ET_ThreadID - Used to identify email threads. Each message belonging to the same email thread will display a matching ID.
• ET_ThreadSize - Indicates the number of unique messages within an email thread.
• ET_ThreadIndex - Identifies individual messages and their attachments within an email thread using the following format: "[ET_ThreadID].[message #].A.[attachment #]". The ".A.[attachment #]" portion only appears for messages with attachments, and the root message within an email thread will only display the ET_ThreadID portion.
• ET_ThreadSort - Displays a sorting ID for each message in an email thread, indicating a position in the overall chain of conversation (including any branches).
• ET_Indent - Displays an incremental number for each message of an email thread, starting with 0 for the root message, and increasing by 1 for each reply in the chain.
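The ET_ThreadIndex format described above can be split into its parts with a short Python sketch like the one below. This is an illustration under assumptions rather than a LAW feature: it assumes the ET_ThreadID portion contains no periods and that the message and attachment numbers are plain integers. The function names, the dictionary-based record layout, and sample values such as "T1" are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# Pattern for "[ET_ThreadID].[message #].A.[attachment #]".
# Assumption: the thread ID itself contains no "." characters.
_THREAD_INDEX = re.compile(
    r"(?P<thread>[^.]+)(?:\.(?P<msg>\d+))?(?:\.A\.(?P<att>\d+))?"
)


class ThreadIndex(NamedTuple):
    thread_id: str
    message_no: Optional[int]     # None when only the thread ID is present (root message)
    attachment_no: Optional[int]  # None when there is no ".A.[attachment #]" portion


def parse_thread_index(value: str) -> ThreadIndex:
    """Split an ET_ThreadIndex value into thread ID, message number, attachment number."""
    match = _THREAD_INDEX.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"Unrecognized ET_ThreadIndex value: {value!r}")
    msg, att = match.group("msg"), match.group("att")
    return ThreadIndex(
        match.group("thread"),
        int(msg) if msg is not None else None,
        int(att) if att is not None else None,
    )


def inclusive_messages(records):
    """Return only the records flagged Y in ET_Inclusive."""
    return [rec for rec in records if rec.get("ET_Inclusive") == "Y"]


# Hypothetical sample values for illustration.
for sample in ("T1", "T1.3", "T1.3.A.2"):
    print(sample, parse_thread_index(sample))
```

Returning None for a missing portion, rather than 0, keeps a root message distinguishable from a numbered reply when the parsed values are later sorted or filtered.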