Data Fields
Before you build a database, it’s important to understand the types of data you will receive and how that data maps to Concordance field types.
There are two basic field categories in the Concordance system: fixed-length fields and variable-length paragraph fields. Full-text paragraph fields are the most flexible and most commonly used type in a Concordance database.
Fixed fields are always a pre-defined length and contain numbers, dates, or short amounts of text. Documents in a Concordance database can be sorted by the contents of both fixed fields and paragraph fields, but fixed length fields sort much faster. Data in fixed fields is typically searched using relational comparison search operators such as =, >, and <. Fixed fields can be fully indexed and searched using full-text techniques as well.
Fixed-length fields have uses that full-text paragraph fields do not serve well. While paragraph fields are flexible and variable in size, they do not sort quickly or search easily by comparison. Fixed fields offer these advantages:
•Fixed-length fields sort the fastest.
•Fixed fields display faster in Table view than paragraph fields. This is useful in databases that contain long documents.
•Relational searches for dates and numbers work best when they are stored in Date and Numeric fields.
•Numeric and date math using the Report Writer or Concordance’s programming language works best with Date and Numeric fields.
Data Type | Capacity | Type | Notes |
---|---|---|---|
Date | 8 bytes | Fixed | Just for dates, keyed by default |
Numeric | 1-20 digits | Fixed | Currency, zero-filled, comma, keyed by default |
Text | 1-60 characters | Fixed | Alphanumeric, keyed by default |
Paragraph | 12 million characters | Variable | Alphanumeric, indexed by default, allows rich-text format |
If you want a field to be full-text searchable, make it a Paragraph type. Reviewers in Browse view will see that fields with an = sign are for relational searches and fields with a : sign are full-text searchable.
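As a planning aid, the capacities in the Data Type table can be turned into a rough check that suggests a type for a column of incoming values. This is a minimal sketch that runs outside Concordance: `suggest_field_type` and its helpers are hypothetical names, the MM/DD/YYYY date format is an assumption, and the suggestion is based on capacity alone. If the field must be full-text searchable, choose Paragraph regardless of the result.

```python
from datetime import datetime

# Capacities taken from the Data Type table above.
MAX_NUMERIC_DIGITS = 20
MAX_TEXT_CHARS = 60


def _is_date(value):
    # Assumption: incoming dates are formatted MM/DD/YYYY.
    try:
        datetime.strptime(value, "%m/%d/%Y")
        return True
    except ValueError:
        return False


def _is_number(value):
    # Accepts unsigned values with optional commas and one decimal point.
    digits = value.replace(",", "").replace(".", "", 1)
    return digits.isascii() and digits.isdigit() and len(digits) <= MAX_NUMERIC_DIGITS


def suggest_field_type(values):
    """Suggest a field type for a list of sample string values.

    Hypothetical planning helper; it does not call Concordance.
    """
    samples = [v.strip() for v in values if v and v.strip()]
    if not samples:
        return "Paragraph"  # nothing to inspect, so pick the most flexible type
    if all(_is_date(v) for v in samples):
        return "Date"
    if all(_is_number(v) for v in samples):
        return "Numeric"
    if all(len(v) <= MAX_TEXT_CHARS for v in samples):
        return "Text"
    return "Paragraph"


if __name__ == "__main__":
    print(suggest_field_type(["01/15/2020", "12/31/1999"]))
    print(suggest_field_type(["ABC000123", "ABC000124"]))
```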
Concordance does not have a predefined field structure and there are no required fields. This means it is essential to plan how you construct fields and which properties you apply to each field before you begin. The general Concordance field profile includes:
•No predefined structure, no required fields
•Maximum of 250 fields per database
•Field names have a maximum of 12 characters and can include letters, numbers (in the middle or at the end), and underscores (in the middle only)
•Field names are stored in capital letters
•Field name saves when the data type is selected
Media (image) key field names with Unicode characters are not supported.
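The naming rules in the list above can be checked before you create fields. The sketch below is an illustration, not part of Concordance: `normalize_field_name` is a hypothetical helper, and it assumes "letters" means the ASCII letters A to Z.

```python
import re

# First character: a letter. Middle: letters, digits, or underscores.
# Last character: a letter or digit. Total length: 1 to 12 characters.
_FIELD_NAME = re.compile(r"[A-Z](?:[A-Z0-9_]{0,10}[A-Z0-9])?")


def normalize_field_name(name):
    """Return the name in capital letters, or raise ValueError if it breaks a rule."""
    upper = name.upper()  # field names are stored in capital letters
    if not _FIELD_NAME.fullmatch(upper):
        raise ValueError(f"invalid field name: {name!r}")
    return upper


if __name__ == "__main__":
    for candidate in ["begno", "PROD_BEG1", "1FIELD", "FIELD_", "ABCDEFGHIJKLM"]:
        try:
            print(candidate, "->", normalize_field_name(candidate))
        except ValueError as err:
            print(err)
```

Running the check over a planned field list before building the database catches names that start with a number, end with an underscore, or exceed 12 characters.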
Concordance Field Properties | |
---|---|
Image | Indicates the field that contains the image key or alias. Select it only once per database; it is usually the BEGNO field. |
Media | Indicates the field that contains the media key or alias. Select it only once per database; it is usually the BEGNO field. |
Key | Speeds sorting and relational searching (adds values to the database .key file). Keying every field dilutes the benefit. |
Accession | Copies the UUID in the system table into a visible field. Good for sorting by load order, great for tracking gaps. |
System | Field cannot be seen by users. Concordance creates these fields for replication/synchronization information. |
Indexed | Enables full-text searching (adds values to the .ivt and .dct files). |
Guidelines for Creating Fields | |
---|---|
Bates Number | A unique serial number used to identify a record. When setting up the numbering system, think of the longest number you might need given your case record load. Designate a length that is large enough so data does not truncate. Choose Text as the field’s data type in case you need to implement an alphanumeric system. |
Media field | Select the Media (Image) Key setting to indicate the media key or alias field. You can click the View image (camera) button to launch and link the imagebase files with the corresponding Concordance database records. Make this selection only once per database, for a unique field; this is generally the BEGNO field. |
Keyed fields | Select the Key setting to improve sorting and relational searching speed. |
Indexing | Indexing puts data into the dictionary and index. You can index any data type. Paragraph fields are indexed by default. Full-text searching only works on indexed fields. To optimize full-text searching speed, avoid indexing unique values, serial numbers, Bates numbers, and dates. Use relational searching for non-indexed fields. |
OCR2 field | This field is created as an overflow field for OCR1, in case the 12 million character limit for OCR1 is exceeded. The overflow field name must have the same alpha prefix as the primary field, and the numeric suffix must be a consistent width and start at 1. Fields must be entered in order by suffix. Use the ReadOCR_v[version #].cpl if you need this overflow field. |
Punctuation | Your customizable Punctuation list designates which punctuation characters you can use in full-text searches, as long as the characters are embedded between alphanumeric characters and are within quotes. You only need to set punctuation once for each database. |
Dictionaries/Indexes | Big dictionary and index files slow search processing. Do not bloat these files with unnecessary entries; build stopword lists to exclude them. |
Field Validation | Some fields require additional attributes for tracking purposes, like EDITTRAIL and CREATIONDATE date fields. These attributes are set in the Data Entry Attributes and must be added before importing load files in order to capture the information. |
Table view | Field column width in Table view is determined in the Types setting in the Modify dialog box. The default column width is the field length identified in the Data Types table, and as set in field properties. By default, paragraph fields aren’t included in Table view to keep display speed optimal. You can customize Table view to include paragraph fields at any time. |
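The Bates Number guideline above comes down to simple arithmetic: the field must hold the prefix plus the digits of the largest number you expect. The sketch below illustrates that calculation. The helper names, the one spare digit of headroom, and the zero-padded numbering style are assumptions made for the example, not Concordance requirements.

```python
MAX_TEXT_CHARS = 60  # Text field capacity from the Data Type table


def bates_field_length(prefix, highest_number, spare_digits=1):
    """Return a Text field length that fits the longest expected Bates number."""
    length = len(prefix) + len(str(highest_number)) + spare_digits
    if length > MAX_TEXT_CHARS:
        raise ValueError(f"{length} characters exceeds the Text field capacity")
    return length


def format_bates(prefix, number, field_length):
    """Zero-pad the numeric part so every value fills the field exactly."""
    digits = field_length - len(prefix)
    value = f"{prefix}{number:0{digits}d}"
    if len(value) > field_length:
        raise ValueError(f"{value} would be truncated")
    return value


if __name__ == "__main__":
    length = bates_field_length("ABC", 250000)
    print(length, format_bates("ABC", 123, length))
```

A fixed width with zero padding also keeps alphanumeric Bates values in numeric order when the Text field is sorted.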
Field | Type | Length and Option Settings |
---|---|---|
BEGNO | Text | 30, Key, Image |
ENDNO | Text | 30 |
DOCDATE | Date | Key, MM/DD/YYYY |
DOCTYPE | Paragraph | Indexed |
DOCTITLE | Paragraph | Indexed |
AUTHOR | Paragraph | Indexed |
RECIPIENT | Paragraph | Indexed |
PAGES | Numeric | Length 20, Places 0, Plain Format, Key |
TEXT01 | Paragraph | Indexed |
TEXT02 | Paragraph | Indexed |

Field | Type | Settings |
---|---|---|
ACCESSID | Numeric | 20, 0, Accession. Use to track deleted documents. Generates a record load order number. |
CREATEDATE | Date | Set properties using the Validation command on the Edit menu; normally not keyed or indexed. |
EDITTRAIL | Paragraph | Set properties using the Validation command on the Edit menu; includes YYYYMMDD, time, time zone, [session # -- accession #] per user. |
PRODBEG1 | Text | Captures beginning Bates numbers for first production. Using number suffixes avoids having to rename fields later for additional productions. |
PRODEND1 | Text | Captures ending Bates numbers for first production. Using number suffixes avoids having to rename fields later for additional productions. |
PRODNOTES1, PRODNOTES2 | Paragraph | Enter pertinent information here about the person who ordered the production; add this on every single record in a production set. |
PRODDATE1 | Date | Enter the production date here with global edit or AppendTextToField_v<version>.cpl. |
PRODTAGS1 | Paragraph | Used to capture tags as they existed at the time of production using the Tag To Field command, Tools > Manage Tags/Issues. |
TAGS | Paragraph | Used to hold the names of checked tags generated by the Tag To Field command, Tools > Manage Tags/Issues. |
TAGINFO | Paragraph | Used to hold activity generated by the Tag History&Store It_v<version>.cpl. |
CUSTODIAN | Paragraph | Original owner of the data. |
REVIEWSTATUS | Paragraph | Name of person who reviewed the file. |
ATTYNOTES | Paragraph | Used to hold attorney notes. |
ADMIN1, ADMIN2, ADMIN3, ADMIN4 | Paragraph | Extra fields to hold data generated later for any reason; make several of them. Adding a new field later requires a full index, so creating these fields in advance saves valuable time. Make some Paragraph fields and some Date fields. |

Field | Type | Settings |
---|---|---|
Serial Numbers | Text | Can be used to link to image files if BEGNO is not used. A database may have many serial numbers during its life; they are usually matched with the .tif file name as the first field in the database. |
BEGBATES, BEGDOC, STARTPAGE | Text | Alternative field names for BEGNO. |
ENDBATES, ENDDOC, ENDPAGE | Text | Alternative field names for ENDNO. |
DOCNO | Text | Document number, an alternative to using serial numbers keyed to the page. More common in e-documents that are not .tif files. |
BEGATTACH | Text | Used to denote attachment range. |
ENDATTACH | Text | Used to denote attachment range. |
INCLUDES | Text | Holds Bates or control numbers of all pages inside the document; facilitates searching for middle page .tif files. |
PARAGRAPH | Paragraph | Alternative name for DOCTYPE field. |
Document metadata fields | Paragraph | Also called bibliographic metadata fields. Typically includes author, recipient, custodian, dates (sent, received, filed, etc.), subject, title, etc. Generally this information is visible on the face of the document. |
System metadata fields | Paragraph | Typically includes information captured automatically by the computer, like last access date, modification date, and print date. This information may not appear on the face of the document. |
TEXT1, TEXT2 | Paragraph | Alternative naming convention for body text fields like OCR1, OCR2, OCR3. Used by ReadOCR_v<version>.cpl to hold imported OCR text. If these fields are named with the same alpha prefix (TEXT, for example) and have numeric suffixes that start with 1, are the same numeric length, and are in ascending order by suffix, the CPL script overflows additional text into subsequent fields, as needed. |
BODY, TEXT or MESSAGE | Paragraph | Naming conventions for e-documents that are not OCR scanned. Field names usually have numeric suffixes. |
ATTACHMENT or FILEPATH | Paragraph | Customarily used to hold a clickable hyperlink to the native document in electronic format. If this field contains a file path or web address, running the CreateHyperlinks_v<version>.cpl converts it to a hyperlink. Another use is for this field to hold the Bates numbers of records in Concordance that are attachments to the current record. When populated with Bates numbers, the FindAttachments_v<version>.cpl locates the attachments. Or, this field may contain Bates numbers of any records that are attached to this document. |
DOCCOND | Paragraph | Holds notes about the document condition, like Marginalia. |
PRIVCALL | Paragraph | Holds the reason a document or redacted sections are marked as privileged, like Attorney-Client, Priest-Penitent, Medical, etc. |
MENTIONS | Paragraph | Keywords included in the document. |
LOADDATE | Date | Alternative name for CREATEDATE field. Used to record the date the record was created. |
AUDITTRAIL | Paragraph | Alternative name for EDITTRAIL field. Records the date, time zone, computer session ID, and user name any time a record is changed in Edit mode. |
DISCSOURCE or SOURCE | Text | Used for entering the name of the disk the data is loaded from or the physical media it is delivered on. Includes disk number, case number, client number, name of person who loaded it, etc. |
Administrative fields with different data types | Paragraph or Date | Consider creating ADMIN1PARA, ADMIN1TEXT, ADMIN1DATE, etc. to hold specific types of data. |
Template for administrative fields | (not applicable) | Consider creating this template. Each time you make a new database, you can insert fields above them to accommodate data. To create a template: on the Documents menu, point to Export, and then click Structure. Locate and save the existing database structure in the Templates folder that is located in the same folder as the Concordance .EXE file. |
REVIEWEDBY | Paragraph | Alternative name for REVIEWSTATUS. Name of the attorney who reviewed the file. |
TAGSATPROD | Paragraph | Alternative name for PRODTAGS1. Tags as they existed for production. |
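The naming rule for overflow fields described above for OCR1/OCR2 and TEXT1/TEXT2 (same alpha prefix, suffixes of a consistent width that start at 1, entered in ascending order) can be verified on a planned field list. This is a minimal sketch run outside Concordance; `overflow_problems` is a hypothetical helper and does not reproduce how the ReadOCR script itself checks the fields.

```python
import re

# An alpha prefix followed by a numeric suffix, for example OCR1 or TEXT01.
_SUFFIXED = re.compile(r"([A-Z]+)(\d+)")


def overflow_problems(field_names):
    """Return a list of rule violations for a series of overflow fields.

    The names must be given in the order the fields appear in the database.
    An empty list means the series follows the naming rule.
    """
    matches = [_SUFFIXED.fullmatch(name.upper()) for name in field_names]
    if not matches or not all(matches):
        return ["every field needs an alpha prefix and a numeric suffix"]

    problems = []
    prefixes = {m.group(1) for m in matches}
    widths = {len(m.group(2)) for m in matches}
    numbers = [int(m.group(2)) for m in matches]

    if len(prefixes) > 1:
        problems.append("prefixes differ")
    if len(widths) > 1:
        problems.append("suffix widths differ")
    if numbers[0] != 1:
        problems.append("first suffix is not 1")
    if any(later <= earlier for earlier, later in zip(numbers, numbers[1:])):
        problems.append("suffixes are not in ascending order")
    return problems


if __name__ == "__main__":
    print(overflow_problems(["TEXT01", "TEXT02"]))
    print(overflow_problems(["OCR1", "OCR02"]))
```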
Empty fields are those that contain no values. You can configure a database so empty fields are not visible in the field listing. Hiding unused fields can help to improve the readability of database records for end users. Showing empty fields is generally a preference for Concordance administrators, who typically want to have all database fields visible whether they contain data or not.
By default, empty fields are not visible. This means that when you create a new database but before you import data, no fields are visible since none of the fields contain any data. After you import data, any field that does not contain data will not be visible.
To show or hide empty fields:
1.Open the Concordance database you want to view.
2.On the Tools menu, click Empties. A check mark next to the Empties command indicates that empty fields are visible; a missing check mark indicates that empty fields are not visible.
If empty fields are hidden, you may still see fields with no data on individual records. This situation occurs if data exists in that field for any other record in the case.
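The note above implies that emptiness is judged across the whole database rather than record by record. The sketch below models that rule under stated assumptions: records are plain dictionaries of string values, and `visible_fields` and `show_empties` are hypothetical names that stand in for the Empties setting. It illustrates the described behavior and is not Concordance's implementation.

```python
def visible_fields(field_names, records, show_empties=False):
    """Return the fields that would be listed for the database.

    A field counts as empty only if no record holds a value for it.
    """
    if show_empties:
        return list(field_names)
    return [
        name
        for name in field_names
        if any((record.get(name) or "").strip() for record in records)
    ]


if __name__ == "__main__":
    records = [
        {"BEGNO": "ABC0001", "AUTHOR": ""},
        {"BEGNO": "ABC0002", "AUTHOR": "Smith"},
    ]
    # AUTHOR stays visible on the first record because the second record has a value.
    print(visible_fields(["BEGNO", "AUTHOR", "ADMIN1"], records))
```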