Data Fields

Before you build a database, it's important to understand the types of data you will receive and how that data maps to Concordance field types.

There are two basic field categories in Concordance: fixed fields of a defined length and paragraph fields of variable length. Full-text paragraph fields are the most flexible and most commonly used type in a Concordance database.

Fixed fields are always a pre-defined length and contain numbers, dates, or short amounts of text.  Documents in a Concordance database can be sorted by the contents of both fixed fields and paragraph fields, but fixed length fields sort much faster.  Data in fixed fields is typically searched using relational comparison search operators such as =, >, and <.  Fixed fields can be fully indexed and searched using full-text techniques as well.

Fixed length fields have unique applications when compared to full-text paragraphs. While paragraph fields are flexible and variable in size, they do not sort quickly or search easily by comparison.

Fixed-length fields sort the fastest.

Fixed fields display faster in Table view than paragraph fields. This is useful in databases that contain long documents.

Relational searches for dates and numbers work best when they are stored in Date and Numeric fields.

Numeric and date math in the Report Writer or in Concordance's programming language works best with Date and Numeric fields.
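The advantage of typed fields for relational searches can be illustrated outside Concordance. The following is a minimal Python sketch, not Concordance code: the sample values and the `parse_mdy` helper are hypothetical, and it only shows why dates compared as text give a different order than dates compared as dates.

```python
from datetime import date, datetime

# Hypothetical sample values in MM/DD/YYYY form.
raw_dates = ["12/31/2019", "01/15/2021", "06/01/2020"]

# Sorting the raw strings compares character by character, so the month
# decides the order and the year is effectively ignored.
text_order = sorted(raw_dates)


def parse_mdy(value: str) -> date:
    """Convert an MM/DD/YYYY string into a real date value."""
    return datetime.strptime(value, "%m/%d/%Y").date()


# Sorting on parsed dates gives chronological order.
date_order = sorted(raw_dates, key=parse_mdy)

# A relational-style comparison (later than 01/01/2020) is only reliable
# on the typed values.
cutoff = date(2020, 1, 1)
after_cutoff = [d for d in raw_dates if parse_mdy(d) > cutoff]
```

Here `text_order` starts with the 2021 date because "01" sorts before "06" and "12", while `date_order` starts with the 2019 date.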

Concordance Data Field Types

| Data Type | Capacity | Type | Notes |
| --- | --- | --- | --- |
| Date | 8 bytes | Fixed | Dates only; keyed by default |
| Numeric | 1-20 digits | Fixed | Currency, zero-filled, comma; keyed by default |
| Text | 1-60 characters | Fixed | Alphanumeric; keyed by default |
| Paragraph | 12 million characters | Variable | Alphanumeric; indexed by default; allows rich-text format |

Important

If you want a field to be full-text searchable, make it a paragraph type.  Reviewers in Browse view will see that fields with an = sign are for relational searches and fields with a : sign are full-text searchable.

 

Understanding Field Structure and Applying Properties

Concordance does not have a predefined field structure and there are no required fields.  This means that planning how you construct fields and apply properties to each field is essential before you build the database.  The general Concordance field profile includes:

No predefined structure, no required fields

Maximum of 250 fields per database

Field names can be up to 12 characters long and can contain letters, numbers (in the middle or at the end only), and underscores (in the middle only)

Field names are stored in capital letters

The field name is saved when you select the data type

Warning

Media (image) key field names with Unicode characters are not supported.
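The naming rules above can be checked before you define fields. The following Python sketch is illustrative only and is not part of Concordance; the function names are hypothetical. It assumes that "numbers at middle or end" and "underscores in the middle only" together mean a name must start with a letter and end with a letter or digit, and as a simplification it accepts ASCII letters only.

```python
import re

MAX_FIELD_NAME_LENGTH = 12

# Assumed pattern: a letter first, then letters, digits, or underscores,
# ending in a letter or digit (so no leading digit, no edge underscore).
_FIELD_NAME = re.compile(r"[A-Z](?:[A-Z0-9_]*[A-Z0-9])?")


def normalize_field_name(name: str) -> str:
    """Field names are stored in capital letters."""
    return name.strip().upper()


def is_valid_field_name(name: str) -> bool:
    """Check length and character placement against the rules above."""
    candidate = normalize_field_name(name)
    if len(candidate) > MAX_FIELD_NAME_LENGTH:
        return False
    return _FIELD_NAME.fullmatch(candidate) is not None
```

For example, `is_valid_field_name("prod_beg1")` is accepted, while `"1BEGNO"`, `"_BEGNO"`, and any name longer than 12 characters are rejected.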

 

Concordance Field Properties

Image
(Concordance Image Only)

Indicates which field contains the image key or alias. Select this property for only one field per database, usually the BEGNO field.

Media
(Concordance Native Viewer Only)

Indicates which field contains the media key or alias. Select this property for only one field per database, usually the BEGNO field.

Key

Speeds sorting and relational searching by adding values to the database .key file. Keying every field dilutes the benefit.

Accession

Copies the UUID from the system table into a visible field. Useful for sorting by load order and for tracking gaps.

System

Hides the field from users. Concordance creates these fields to hold replication and synchronization information.

Indexed

Enables full-text searching by adding values to the .ivt and .dct files.

 

Guidelines for Creating Fields

Bates Number

A unique serial number used to identify a record. When setting up the numbering system, think of the longest number you might need given your case record load. Designate a length that is large enough that data is not truncated. Choose Text as the field's data type in case you need an alphanumeric numbering system.

Media field
(Image field)

Select the Media (Image) Key setting to indicate the media key or alias field.  You can click the View image (camera) button to launch and link the imagebase files with the corresponding Concordance database records. Make this selection only once per database, for a field with unique values. This is generally the BEGNO field.

Keyed fields

Select the Key setting to improve sorting and relational searching speed.

Indexing

Indexing puts data into the dictionary and index. You can index any data type. Paragraph fields are indexed by default. Full-text searching only works on indexed fields. To optimize full-text searching speed, avoid indexing unique values, serial numbers, Bates numbers, and dates. Use relational searching for non-indexed fields.

OCR2 field

This field is created as an overflow field for OCR1, just in case the 12 million character limit for OCR1 is exceeded. This field name must have the same alpha prefix as the primary field and the numeric suffix must be a consistent width and start at 1. Fields must be entered in order by suffix. Use the ReadOCR_v[version #].cpl if you need this overflow field.

Punctuation

Your customizable Punctuation list designates what punctuation you can use for full-text searches, as long as the characters are embedded between alphanumeric characters and are within quotes. You only need to set punctuation once for each database.

Dictionaries/Indexes

Big dictionary and index files slow search processing. Do not bloat these files with unnecessary entries; build stopword lists to exclude them.

Field Validation

Some fields require additional attributes for tracking purposes, like the EDITTRAIL and CREATIONDATE fields. These attributes are set in the Data Entry Attributes and must be added before importing load files in order to capture the information.

Table view

Field column width in Table view is determined by the Types setting in the Modify dialog box. The default column width is the field length identified in the Data Types table and set in the field properties. By default, paragraph fields aren't included in Table view, which keeps display speed optimal. You can customize Table view to include paragraph fields at any time.
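The Bates Number guideline above asks you to size the field for the longest number you expect. A rough Python sketch of that arithmetic follows. It is not Concordance code: the function names are hypothetical, and zero-padding the number is an assumption added here because equal-width, zero-padded text values sort in numeric order.

```python
TEXT_FIELD_MAX = 60  # Text field capacity from the data types table


def bates_field_length(prefix: str, highest_number: int) -> int:
    """Length needed to hold the prefix plus the largest expected number."""
    if highest_number < 1:
        raise ValueError("highest_number must be at least 1")
    length = len(prefix) + len(str(highest_number))
    if length > TEXT_FIELD_MAX:
        raise ValueError(f"{length} characters exceeds a Text field")
    return length


def format_bates(prefix: str, number: int, length: int) -> str:
    """Zero-pad the number so every value fills the same field length."""
    digits = length - len(prefix)
    return f"{prefix}{number:0{digits}d}"


# Hypothetical case: prefix "ABC" and at most 1,250,000 pages.
length = bates_field_length("ABC", 1_250_000)
first = format_bates("ABC", 1, length)
```

In this hypothetical case the field needs 10 characters, and the first value is `ABC0000001`. Choosing a longer field than the minimum leaves room if the record load grows.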

Common Database Fields

| Field | Type | Length and Option Settings |
| --- | --- | --- |
| BEGNO | Text | 30, Key, Image |
| ENDNO | Text | 30 |
| DOCDATE | Date | Key, MM/DD/YYYY |
| DOCTYPE | Paragraph | Indexed |
| DOCTITLE | Paragraph | Indexed |
| AUTHOR | Paragraph | Indexed |
| RECIPIENT | Paragraph | Indexed |
| PAGES | Numeric | Length 20, Places 0, Plain Format, Key |
| TEXT01 | Paragraph | Indexed |
| TEXT02 | Paragraph | Indexed |
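When planning a structure, it can help to write the field list down and check it against the limits stated earlier (250 fields per database, Text fields of 1-60 characters, Numeric fields of 1-20 digits, one image key field). The Python sketch below is a planning aid only, not a Concordance API: `FieldDef`, `check_structure`, and the lowercase type labels are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

MAX_FIELDS = 250
LENGTH_LIMITS = {"text": (1, 60), "numeric": (1, 20)}


@dataclass(frozen=True)
class FieldDef:
    name: str
    kind: str  # "text", "numeric", "date", or "paragraph"
    length: Optional[int] = None
    key: bool = False
    indexed: bool = False
    image: bool = False


def check_structure(fields: List[FieldDef]) -> List[str]:
    """Return a list of problems; an empty list means no limit is broken."""
    problems = []
    if len(fields) > MAX_FIELDS:
        problems.append(f"{len(fields)} fields exceeds {MAX_FIELDS}")
    if sum(f.image for f in fields) > 1:
        problems.append("more than one image key field")
    for f in fields:
        limits = LENGTH_LIMITS.get(f.kind)
        if limits and (f.length is None or not limits[0] <= f.length <= limits[1]):
            problems.append(f"{f.name}: length must be {limits[0]}-{limits[1]}")
    return problems


# The common database fields from the table above.
COMMON_FIELDS = [
    FieldDef("BEGNO", "text", 30, key=True, image=True),
    FieldDef("ENDNO", "text", 30),
    FieldDef("DOCDATE", "date", key=True),
    FieldDef("DOCTYPE", "paragraph", indexed=True),
    FieldDef("DOCTITLE", "paragraph", indexed=True),
    FieldDef("AUTHOR", "paragraph", indexed=True),
    FieldDef("RECIPIENT", "paragraph", indexed=True),
    FieldDef("PAGES", "numeric", 20, key=True),
    FieldDef("TEXT01", "paragraph", indexed=True),
    FieldDef("TEXT02", "paragraph", indexed=True),
]
```

Calling `check_structure(COMMON_FIELDS)` returns an empty list, while a Text field defined with a length of 61 would be reported.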

Common Administrative Fields

| Field | Type | Settings |
| --- | --- | --- |
| ACCESSID | Numeric | 20, 0, Accession. Generates a record load order number. Use it to track deleted documents. |
| CREATEDATE | Date | Set properties using the Validation command on the Edit menu. Normally not keyed or indexed. |
| EDITTRAIL | Paragraph | Set properties using the Validation command on the Edit menu. Includes YYYYMMDD, time, time zone, and [session # -- accession #] per user. |
| PRODBEG1 | Text | Captures beginning Bates numbers for first production. Using number suffixes avoids having to rename fields later for additional productions. |
| PRODEND1 | Text | Captures ending Bates numbers for first production. Using number suffixes avoids having to rename fields later for additional productions. |
| PRODNOTES1, PRODNOTES2 | Paragraph | Enter pertinent information about the person who ordered the production. Add it to every record in a production set. |
| PRODDATE1 | Date | Enter the production date here with global edit or AppendTextToField_v<version>.cpl. |
| PRODTAGS1 | Paragraph | Used to capture tags as they existed at the time of production using the Tag To Field command, Tools > Manage Tags/Issues. |
| TAGS | Paragraph | Used to hold the names of checked tags generated by the Tag To Field command, Tools > Manage Tags/Issues. |
| TAGINFO | Paragraph | Used to hold activity generated by the Tag History&Store It_v<version>.cpl. |
| CUSTODIAN | Paragraph | Original owner of the data. |
| REVIEWSTATUS | Paragraph | Name of person who reviewed the file. |
| ATTYNOTES | Paragraph | Used to hold attorney notes. |
| ADMIN1, ADMIN2, ADMIN3, ADMIN4 | Paragraph | Extra fields to hold data generated later for any reason. Create several of them in advance: adding a new field later requires a full index, so spare fields save valuable time. Make some Paragraph fields and some Date fields. |

 

Additional Administrative Fields and Naming Conventions

| Field | Type | Settings |
| --- | --- | --- |
| Serial Numbers | Text | Can be used to link to image files if BEGNO is not used. A database may have many serial numbers during its life. Usually matched with the .tif file name as the first field in the database. |
| BEGBATES, BEGDOC, STARTPAGE | Text | Alternative field names for BEGNO. |
| ENDBATES, ENDDOC, ENDPAGE | Text | Alternative field names for ENDNO. |
| DOCNO | Text | Document number. An alternative to serial numbers keyed to the page; more common for e-documents that are not .tif files. |
| BEGATTACH | Text | Used to denote attachment range. |
| ENDATTACH | Text | Used to denote attachment range. |
| INCLUDES | Text | Holds Bates or control numbers of all pages inside the document. Facilitates searching for middle-page .tif files. |
| PARAGRAPH | Paragraph | Alternative name for DOCTYPE field. |
| Document metadata fields | Paragraph | Also called bibliographic metadata fields. Typically include author, recipient, custodian, dates (sent, received, filed, etc.), subject, and title. Generally this information is visible on the face of the document. |
| System metadata fields | Paragraph | Typically include information captured automatically by the computer, like last access date, modification date, and print date. This information may not appear on the face of the document. |
| TEXT1, TEXT2 | Paragraph | Alternative naming convention for body text fields like OCR1, OCR2, OCR3. Used by ReadOCR_v<version>.cpl to hold imported OCR text. If these fields are named with the same alpha prefix (TEXT, for example) and have numeric suffixes that start with 1, are the same numeric length, and are in ascending order by suffix, the CPL script overflows additional text into subsequent fields as needed. |
| BODY, TEXT or MESSAGE | Paragraph | Naming conventions for e-documents that are not OCR scanned. Field names usually have numeric suffixes. |
| ATTACHMENT or FILEPATH | Paragraph | Customarily used to hold a clickable hyperlink to the native document in electronic format. If this field contains a file path or web address, running CreateHyperlinks_v<version>.cpl converts it to a hyperlink. The field can also hold the Bates numbers of Concordance records that are attachments to the current record. When it is populated with Bates numbers, FindAttachments_v<version>.cpl locates the attachments. |
| DOCCOND | Paragraph | Holds notes about the document condition, like Marginalia. |
| PRIVCALL | Paragraph | Holds the reason a document or redacted sections are marked as privileged, like Attorney-Client, Priest-Penitent, Medical, etc. |
| MENTIONS | Paragraph | Keywords included in the document. |
| LOADDATE | Date | Alternative name for CREATEDATE field. Records the date the record was created. |
| AUDITTRAIL | Paragraph | Alternative name for EDITTRAIL field. Records the date, time zone, computer session ID, and user name any time a record is changed in Edit mode. |
| DISCSOURCE or SOURCE | Text | Used for entering the name of the disk the data is loaded from or the physical media it is delivered on. Includes disk number, case number, client number, name of person who loaded it, etc. |
| Administrative fields with different data types | Paragraph or Date | Consider creating ADMIN1PARA, ADMIN1TEXT, ADMIN1DATE, etc. to hold specific types of data. |
| Template for administrative fields | (not applicable) | Consider creating this template. Each time you make a new database, you can insert fields above the administrative fields to accommodate data. To create a template: on the Documents menu, point to Export, and then click Structure. Locate and save the existing database structure in the Templates folder, which is in the same folder as the Concordance .EXE file. |
| REVIEWEDBY | Paragraph | Alternative name for REVIEWSTATUS. Name of the attorney who reviewed the file. |
| TAGSATPROD | Paragraph | Alternative name for PRODTAGS1. Tags as they existed for production. |
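The overflow naming convention described for OCR1/OCR2 and TEXT1/TEXT2 (same alpha prefix, suffixes starting at 1, consistent suffix width, ascending order) can be checked with a short script before loading text. The Python sketch below is illustrative and is not the ReadOCR CPL script. The function names are hypothetical, `split_for_overflow` only models the idea of spilling text past the 12 million character paragraph limit into the next field, and consecutive suffixes are not required because the text above does not state that they must be.

```python
import re
from typing import Dict, List

PARAGRAPH_LIMIT = 12_000_000  # paragraph capacity from the data types table

_SERIES_NAME = re.compile(r"([A-Z]+)(\d+)")


def is_overflow_series(names: List[str]) -> bool:
    """Check prefix, suffix width, starting value, and ascending order."""
    parsed = []
    for name in names:
        match = _SERIES_NAME.fullmatch(name.upper())
        if match is None:
            return False
        parsed.append((match.group(1), match.group(2)))
    if not parsed:
        return False
    same_prefix = len({prefix for prefix, _ in parsed}) == 1
    same_width = len({len(suffix) for _, suffix in parsed}) == 1
    numbers = [int(suffix) for _, suffix in parsed]
    ascending = all(a < b for a, b in zip(numbers, numbers[1:]))
    return same_prefix and same_width and numbers[0] == 1 and ascending


def split_for_overflow(text: str, names: List[str],
                       limit: int = PARAGRAPH_LIMIT) -> Dict[str, str]:
    """Assign consecutive chunks of text to the fields, in order."""
    chunks = [text[i:i + limit] for i in range(0, len(text), limit)]
    if len(chunks) > len(names):
        raise ValueError("not enough overflow fields for this text")
    return dict(zip(names, chunks))
```

With a small `limit` for demonstration, `split_for_overflow("abcdefgh", ["OCR1", "OCR2"], limit=5)` places `abcde` in OCR1 and `fgh` in OCR2.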

 

Handling Empty Data Fields

Empty fields are those that contain no values.  You can configure a database so empty fields are not visible in the field listing. Hiding unused fields can help to improve the readability of database records for end users.  Showing empty fields is generally a preference for Concordance administrators, who typically want to have all database fields visible whether they contain data or not.

By default, empty fields are not visible. This means that after you create a new database but before you import data, no fields are visible, because none of them contain data. After you import data, any field that does not contain data remains hidden.

To show or hide empty database fields:

1. Open the Concordance database you want to view.

2. On the Tools menu, click Empties.  A check mark next to the Empties command indicates that empty fields are visible. No check mark indicates that empty fields are hidden.

Note

Even when empty fields are hidden, you may still see fields with no data on individual records. This situation occurs if data exists in that field for any other record in the database.
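The visibility behavior described in this section can be modeled in a few lines. The Python sketch below is a model for illustration, not Concordance's implementation: the function names and sample records are hypothetical, and it assumes that when empty fields are hidden, a field stays visible as long as at least one record in the database holds data in it.

```python
from typing import Dict, List, Set


def fields_with_data(records: List[Dict[str, str]]) -> Set[str]:
    """Names of fields that hold a value in at least one record."""
    return {
        name
        for record in records
        for name, value in record.items()
        if value is not None and str(value).strip() != ""
    }


def visible_fields(all_fields: List[str], records: List[Dict[str, str]],
                   show_empties: bool) -> List[str]:
    """Fields shown in the field listing for the given Empties setting."""
    if show_empties:
        return list(all_fields)
    populated = fields_with_data(records)
    return [name for name in all_fields if name in populated]


# Hypothetical two-record database.
fields = ["BEGNO", "AUTHOR", "ATTYNOTES"]
records = [
    {"BEGNO": "A1", "AUTHOR": "", "ATTYNOTES": ""},
    {"BEGNO": "A2", "AUTHOR": "Smith", "ATTYNOTES": ""},
]
hidden_view = visible_fields(fields, records, show_empties=False)
```

In this model AUTHOR is listed even on the first record, where it is blank, because the second record holds a value. ATTYNOTES is hidden because no record holds data in it, and a database with no imported records shows no fields at all.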