Configuring Turbo Import
1.Start from the Menu of the Main User Interface by selecting File > Import > Turbo Import.
2.The Main User Interface will close, and the Turbo Import utility will open, with the Import Settings overlay on top.
The Import Settings overlay window automatically opens the first time you launch Turbo Import for a specific case. You can return to the Import Settings overlay by clicking the Settings button located in the top-right corner of the Turbo Import utility. Most Turbo Import Settings are locked and cannot be changed during or after the initial ingestion of data.
3.From here, you can configure your Turbo Import Settings for this case based on the options shown below for each tab.
Only Passwords (in the Content tab) can be changed later.
Tabs:
4.Once you have the desired settings configured, click on OK at the bottom right of the overlay to close it.
5.You are now ready to Start a Turbo Import Session.
There are seven settings tabs available within the Import Settings overlay of the Turbo Import utility, as shown and described below.
The Turbo Import Password Bank is also used during Batch Process - Turbo Imager.
•Language Analysis:
oIdentify language content during analysis - All languages used within the files will be identified and analyzed. The first 200 MB of each file is analyzed during this process, or the entire file should the overall size be less than 200 MB. For more information on the languages please visit this site.
oRestrict language identification to common languages - Limits analysis to only the most recognizable languages to help improve accuracy. For more information on the restricted languages please visit the bottom of this site.
•Time Zone - Select your desired time zone from the drop-down menu here. Default is Coordinated Universal Time (UTC). All files and folders imported into the Case File will use the selected time zone for their records.
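The 200 MB sampling rule described under Language Analysis above amounts to taking a minimum. The following is a minimal sketch, not LAW code: the function name is hypothetical, and it assumes 1 MB means 1,048,576 bytes, since the documentation does not say whether decimal or binary megabytes are meant.

```python
# Illustrative only; not part of CloudNine LAW.
# Assumption: 1 MB = 1024 * 1024 bytes (the documentation does not specify).
LANGUAGE_SAMPLE_LIMIT = 200 * 1024 * 1024


def bytes_analyzed(file_size: int) -> int:
    """Return how many bytes of a file would be read for language identification."""
    if file_size < 0:
        raise ValueError("file_size must be non-negative")
    # Files under the limit are read in full; larger files are capped at the limit.
    return min(file_size, LANGUAGE_SAMPLE_LIMIT)
```

Under these assumptions, a 50 MB file would be analyzed in full, while only the first 200 MB of a 1 GB file would be read.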
•Microsoft Word/RTF
•Microsoft Excel
•Microsoft PowerPoint
•Adobe Acrobat PDF
•SnapShot
•Microsoft Visio
•Microsoft Outlook.FileAttach (Word-authored e-mail with inline attachments, generally stored in RTF)
•Microsoft Project
•Package*
*A Package is a general type of embed; it can be a text file or a zip file, for example. Any of the above types may also be embedded as a Package, depending on the software installed when a user embeds the file. For example, if a user embeds an Excel spreadsheet into a Word document and Excel is not installed, the spreadsheet is embedded as a Package.
Supported container file types and embedded files
The following table lists common embedded file types that LAW supports for extraction:
| Description | Detection | Extraction |
|---|---|---|
| Non-Microsoft Office Formats | | |
| Adobe Acrobat (pdf) | Y | Y |
| Rich text format (rtf) *Converted to Word format for extraction. Original file is preserved. | Y | Y |
| Excel Spreadsheet (OpenXml, xml) & Excel *Compound documents not supported in xml format | Y | Y |
| MS Office Data File (OpenXml) | Y | Y |
| PowerPoint Presentation (OpenXml) and PowerPoint *Compound documents not fully supported in OpenXml format | Y* | Y* |
| Word (OpenXml, xml) and Word | Y | Y |
| OneNote | N | N |
| Project (xml) and Project *Compound documents not fully supported in xml format | Y | Y |
| Publisher | **Y | Y |
| Visio (xml) and Visio *Currently not recognized by file engine | *Y | N |
**Detection of embeds in these types is limited to the types of files supported for extraction (see above list).
In addition to deduplicating prior to the import process, LAW also allows you to deduplicate at these other times in a post-discovery workflow:
•After the import against other records in the case by using the Deduplication Utility.
•After the import against other records in the case and other LAW cases by using Inter-Case Deduplication.
•NIST (NSRL) - The National Institute of Standards and Technology (NIST) maintains and publishes a database of known computer file profiles referred to as a Reference Data Set (RDS), which is compiled by the National Software Reference Library (NSRL). The NIST uses this RDS to compare files against known sets of software applications. NIST filtering is used to remove file types that are unlikely to have useful data. Examples of such file types include system files, executable files, and application logic files.
oEnable NIST (NSRL) detection - Requires a NIST database to be provided through the LAW Configuration Utility.
oIf hashes match, then - Select either Include or Exclude from the drop-down menu to determine what happens with NIST items detected during import.
Changes in the NIST list are global and will apply to new imports in other cases, for both Turbo Import and Electronic Discovery cases.
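The NIST option described above is a hash lookup followed by the action chosen in "If hashes match, then". The sketch below is illustrative only and is not LAW's implementation. It assumes SHA-1 hashes (NSRL data sets publish SHA-1 and MD5 values, but this page does not say which one LAW compares). The function names and the "No match" result are hypothetical.

```python
import hashlib
from pathlib import Path


def sha1_hex(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so large files are not loaded into memory at once."""
    digest = hashlib.sha1()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest().upper()


def nist_action(file_hash: str, known_hashes: set, on_match: str) -> str:
    """Return the configured action for a file whose hash is in the known set.

    known_hashes is assumed to hold uppercase hex strings.
    on_match mirrors the drop-down: "Include" or "Exclude".
    Files with no match return "No match" (a hypothetical label) and are
    left to whatever other filters apply.
    """
    if on_match not in ("Include", "Exclude"):
        raise ValueError("on_match must be 'Include' or 'Exclude'")
    if file_hash.upper() in known_hashes:
        return on_match
    return "No match"
```

For example, with "Exclude" selected, a file whose hash appears in the reference set would be dropped, while every other file would continue on to the remaining filters.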
•File Type - This section is for manual filtering of files based on specified file types. Filtering is targeted to top-level (parent) files within a Case Database, thus applying automatically to any embedded (child) files contained within. LAW supports the import of all file types (recommended).
oEnable file type filtering - Turns on manual file type filtering based on settings established within the File Type Manager.
oFile Type Manager - This button opens a separate window dedicated to specifying file types for filtering. Changes made here apply globally to all cases using manual file type filtering. Certain file types may be Included, Excluded, both (Exclude takes precedence), or neither (determined below). You can also assign default applications for opening each file type within LAW.
oTreat file types not specifically included or excluded as - Select either Include or Exclude from the drop-down menu to determine how to handle file types not specified within the File Type Manager.
At present, CloudNine LAW is unable to conduct NIST and file type filtering when processing UFDR files.
1.Select Enable file type filtering.
2.Select File Type Manager.
3.The Manage File Types window opens.
4.Configure file inclusion and exclusion lists, and other options:
oInc. selected - all documents and database records with Inc. will be written to the LAW database.
oExc. selected - the file, its metadata, and its associated text are not written to the LAW database.
oBoth Inc. and Exc. selected - Exclude takes precedence over Include, and the file, its metadata, and its associated text are not written to the LAW database.
oNeither Inc. nor Exc. selected - the status is determined by the setting selected in Treat file types not specifically included or excluded as.
oAssign default source applications for each file type.
Changes in the File Type Manager are global and will apply to new imports in other cases, for both Turbo Import and Electronic Discovery cases.
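The four Inc./Exc. combinations described above reduce to a short precedence rule. The sketch below only illustrates that rule. The function and parameter names are hypothetical and do not correspond to any LAW API.

```python
def file_type_decision(inc: bool, exc: bool, default: str) -> str:
    """Resolve the filtering outcome for one file type.

    inc / exc mirror the Inc. and Exc. checkboxes in the File Type Manager.
    default mirrors "Treat file types not specifically included or
    excluded as" and must be "Include" or "Exclude".
    """
    if default not in ("Include", "Exclude"):
        raise ValueError("default must be 'Include' or 'Exclude'")
    if exc:
        # Exclude takes precedence, even when Inc. is also selected.
        return "Exclude"
    if inc:
        return "Include"
    # Neither box selected: fall back to the case-level default.
    return default
```

Because filtering targets top-level (parent) files, the outcome for a parent would carry over to its embedded (child) files.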
•Date Range Filtering - This section allows for filtering based on specified date ranges for files. This filtering is overly inclusive, so entire families of files will be included if even a single embedded file falls within the specified range. Add date ranges by clicking the Add(+) button to the right of the first range, and remove them by clicking the Remove(-) button to the right of the unwanted range.
oFrom - Select a start date for each range by clicking on the appropriate calendar button located in this column.
oTo - Select an end date for each range by clicking on the appropriate calendar button located in this column.
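The family-level behavior of date range filtering can be sketched as follows. This is an illustration under stated assumptions, not LAW's implementation: the From and To dates are treated as inclusive, the date compared for each file is left abstract because this page does not say which date field is used, and all names are hypothetical.

```python
from datetime import date


def in_any_range(value: date, ranges: list) -> bool:
    """True if value falls inside at least one (start, end) range, inclusive."""
    return any(start <= value <= end for start, end in ranges)


def family_included(member_dates: list, ranges: list) -> bool:
    """A whole family is kept if any single member's date is in range."""
    return any(in_any_range(d, ranges) for d in member_dates)


# Usage: an e-mail dated outside the range with one attachment dated inside it.
ranges = [(date(2020, 1, 1), date(2020, 12, 31))]
family = [date(2019, 6, 1), date(2020, 3, 15)]
keep = family_included(family, ranges)  # True: the attachment is in range
```

This is why the filter can retain parent files whose own dates fall outside every configured range.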
CloudNine™ LAW supports import of all file types. Even if a file type is not supported for printing or conversion, metadata and text may still be extracted. A full list of the file types recognized by both CloudNine™ LAW and CloudNine™ Explore during import can be found here: Supported File Types
The metadata extracted will be populated in Extended Property fields in LAW. The extended property field names will start with EP followed by the name of the field as it exists in the source document.
•Example: if a Word document is imported that contains a custom metadata field called Typist, LAW creates a metadata field during the import called EPTypist.
oExtract EXIF metadata properties for Image documents - Exchangeable Image File Format (EXIF) is a standard that specifies the format for image, sound, and ancillary tags used by systems that handle the metadata for those files. For example, many image files have EXIF tags for geolocation embedded within them. When enabled, these properties will also be extracted.
The metadata extracted will be populated in Extended Property fields. The extended property field names will start with EP followed by the name of the field as it exists in the source document.
•Example: if a PNG document is imported that contains a custom metadata field called Colors, LAW creates a metadata field during the import called EPColors.
oAuto-assign suspect extensions - If the file extension for a source file does not match the file type detected by LAW, then selecting this option will place the detected extension in the DocExt field and the source file extension in the OrigExt field during import.
oIdentify hidden text - Detects specific forms of text hidden within Word, Excel, and PowerPoint documents. If found, the hidden text will be bracketed between <<<Start Hidden Content>>> and <<<End Hidden Content>>> within the extracted text. Associated records will also have the HiddenText field set to Y. These types of hidden text can be extracted:
▪Text hidden inside shape controls, such as text boxes.
▪Text specifically formatted as hidden.
▪Hidden spreadsheets, columns, and cells.
▪Hidden slides.
Custom and EXIF metadata extraction, as well as the detection of hidden text, can also be performed after records are ingested through Batch Process – Document Processing / Analysis.
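Two of the conventions above are easy to work with once text and metadata have been exported from a case: the EP prefix on extended property field names, and the hidden-content markers in extracted text. The helper names below are hypothetical. The sketch also assumes that a source field name is appended to EP unchanged, which this page only demonstrates for single-word names such as Typist and Colors.

```python
import re

# Marker strings as given in the "Identify hidden text" description.
HIDDEN_BLOCK = re.compile(
    r"<<<Start Hidden Content>>>(.*?)<<<End Hidden Content>>>",
    re.DOTALL,  # hidden content may span multiple lines
)


def ep_field_name(source_field: str) -> str:
    """Build the extended property field name for a source metadata field."""
    return "EP" + source_field


def hidden_spans(extracted_text: str) -> list:
    """Return each hidden-text block found in extracted text, without markers."""
    return [block.strip() for block in HIDDEN_BLOCK.findall(extracted_text)]
```

A reviewer could, for instance, run `hidden_spans` over exported text files to list hidden content for records whose HiddenText field is Y.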
For folder structure examples, see LAW Folder Structure Examples.
•Email and Communication Sort Order - Use the drop-down to choose how emails and communication records are organized, either Oldest to Newest or Newest to Oldest.
•Preview - Displays the resulting folder structure to be expected in the Case Directory based on the levels established above.
The number of Ingestion Agents available is based on the number of Turbo Import licenses purchased. Each turbo agent uses one turbo license and one machine core.
•Enable Turbo OCR on Ingestion - This setting turns on the Turbo OCR application, which enables PDFs and image-based documents to go through OCR automatically.
The Turbo OCR application uses the ABBYY FineReader engine. Each Turbo workstation in the Turbo pool must have the ABBYY engine installed and have an active license.
oMaximum Turbo OCR Agents on Ingestion - The OCR engine is multi-threaded and will use all the cores on a workstation. If you set a maximum, it is best to choose the number at the workstation level.
•Maximum Turbo Agents used in the Native Extraction and Populating the LAW case stages - Controls the number of agents used in the native extraction and populating stages.
This stage will consume one available core, but is not tied to any licenses. Each environment is different and should be configured accordingly.
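The licensing note earlier in this tab (each Turbo agent uses one Turbo license and one machine core) implies an upper bound on the number of ingestion agents that can run at once. The sketch below is a rough illustration. It assumes the bound is simply the smaller of the two counts, the function name is hypothetical, and LAW's actual scheduling may differ.

```python
def max_ingestion_agents(turbo_licenses: int, machine_cores: int) -> int:
    """Estimate the most ingestion agents that can run concurrently.

    Assumption: every agent needs exactly one license and one core, so the
    scarcer resource sets the limit.
    """
    if turbo_licenses < 0 or machine_cores < 0:
        raise ValueError("counts must be non-negative")
    return min(turbo_licenses, machine_cores)
```

Under this assumption, 8 licenses on a 4-core workstation would support at most 4 ingestion agents on that machine.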