Managing Data Files

Files can be received from a client, data processing vendor, or third party. Before loading any data, always review every file on the disk to confirm that it is in a proper format for Concordance.

Load files, or delimited text files, are one of the file types used to construct your Concordance database. These files typically have a .dat, .csv, or .txt extension. Each file contains record metadata, and some also include body text. We recommend having your OCR separated into individual text files and importing them separately using a CPL script.

Administrators should make a practice of opening and reviewing delimited text files as soon as they receive them, because the files are not always prepared perfectly and may need to be modified.

When reviewing your data load files, always check for the following:

Field names – each line of metadata is one record; check each header column to verify the data beneath it

Delimiters – unique characters that appear in the delimited text file and do not exist in your actual data.  Concordance delimiters are:

o Comma - Field break indicator. Default is □ (ASCII 20); customizable. Avoid this character in your data.

o Quote - Keeps text together. Default is þ (ASCII 254); only required around fields that contain text and spaces; customizable. Avoid this character in your data.

o New Line - Marks manual line breaks and text wraps within a field. Default is ® (ASCII 174); customizable. Avoid this character in your data.

o New Record - Starts a new record; a final carriage return loads the last record. Cannot be changed (industry standard).

Date format – date fields are an 8-character maximum with slashes. If dates include slashes, you can import any format. If slashes are not used, you must use either the universal date format YYYYMMDD or the MM-DD-YYYY format with dashes.

Carriage return – a final carriage return ensures that the last record will load into the database.
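The review checklist above can be partly automated. The following Python sketch is an illustration only, not a Concordance tool: it assumes the load file has already been read into a string, that the default delimiters are in use, and that the first line holds the field names. The function names (`split_record`, `date_format_ok`, `review_load_file`) are hypothetical, and the date check is a literal reading of the date format rule above.

```python
import re

# Concordance default delimiters as listed above (file assumed decoded to str).
COMMA = chr(20)     # field break indicator
QUOTE = chr(254)    # þ, keeps field text together
NEWLINE = chr(174)  # ®, line break inside a field


def unquote(field):
    """Remove one pair of surrounding quote delimiters, if present."""
    if len(field) >= 2 and field.startswith(QUOTE) and field.endswith(QUOTE):
        return field[1:-1]
    return field


def split_record(line):
    """Split one record line into unquoted field values."""
    return [unquote(f) for f in line.split(COMMA)]


def date_format_ok(value):
    """Accept any value with slashes; otherwise require YYYYMMDD or MM-DD-YYYY."""
    if "/" in value:
        return True
    return bool(re.fullmatch(r"\d{8}|\d{2}-\d{2}-\d{4}", value))


def review_load_file(text):
    """Return a list of problems found in the load file text."""
    problems = []
    # A final carriage return is needed for the last record to load.
    if not text.endswith(("\r\n", "\n", "\r")):
        problems.append("missing final carriage return")
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    lines = [ln for ln in normalized.split("\n") if ln]
    if not lines:
        problems.append("file contains no records")
        return problems
    header = split_record(lines[0])
    # Every record should have as many fields as the header row.
    for number, line in enumerate(lines[1:], start=1):
        fields = split_record(line)
        if len(fields) != len(header):
            problems.append(
                f"record {number}: expected {len(header)} fields, "
                f"found {len(fields)}"
            )
    return problems
```

To try this on a real file, read it with `open(path, encoding=..., newline="")` so that the original line endings are preserved; the correct encoding depends on how the vendor exported the file and is not something the sketch can detect.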



Delimiters are customizable for an organization's internal database design, but many organizations ask vendors to use the Concordance default delimiters. If your case records contain the registered trademark symbol, consider changing the ® newline delimiter to another symbol in the load file.
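Before choosing a replacement delimiter, it helps to confirm that the candidate never appears in the data. The sketch below is one way to check this in Python; the function name `pick_unused_delimiter` and the candidate characters are illustrative assumptions, not Concordance recommendations.

```python
def pick_unused_delimiter(text, candidates="|~^"):
    """Return the first candidate character that never occurs in text, or None."""
    for ch in candidates:
        if ch not in text:
            return ch
    return None


def count_symbol(text, symbol=chr(174)):
    """Count occurrences of a symbol (default ®) to gauge possible conflicts."""
    return text.count(symbol)
```

Note that a simple count cannot tell a ® used as the newline delimiter from a ® that is genuine record text, so any conflicts still need to be reviewed by hand.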