Running full-text searches

A full-text search is designed for text and words, and it is the most flexible search feature in Concordance Desktop. It uses the database’s index to quickly sift through every word entry, from every record, in the database index dictionary. The design is similar to that of web search engines such as Google, which are also based on indexing. The Concordance Desktop index is generated after your record collection is entered into the database. The dictionary is created during the indexing process and includes every word in the indexed fields except stopwords, which are omitted to improve search speed.
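The following Python sketch illustrates the general idea of an index dictionary. It is not Concordance Desktop code: the stopword list, the record contents, and the function name build_index are assumptions made for illustration only.

```python
import re
from collections import defaultdict

# Hypothetical stopword list; the real list is configured in the application.
STOPWORDS = {"a", "an", "and", "for", "of", "the", "to"}

def build_index(records):
    """Map every non-stopword word to the set of record IDs that contain it."""
    index = defaultdict(set)
    for record_id, text in records.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            if word not in STOPWORDS:
                index[word].add(record_id)
    return index

# Hypothetical records: record ID -> text of the indexed fields.
records = {
    1: "Milk programs for the county",
    2: "Coffee and milk prices",
    3: "The coffee report",
}

index = build_index(records)
print(sorted(index["milk"]))    # [1, 2]
print(sorted(index["coffee"]))  # [2, 3]
print("the" in index)           # False, stopwords are not indexed
```

Looking up a word in the dictionary is a single lookup rather than a scan of every record, which is why an indexed search is fast.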

Terms entered in the search are highlighted in the record (red is the default color setting) and are referred to as hits. You can quickly navigate to every hit in a given record by using the Previous Hit and Next Hit buttons on the Dynamic toolbar.

There are three sets of operators and one wildcard you can use for full-text searches:

Boolean (document level)

Boolean operators are based on the binary logic used in computers today, producing strict true or false results. In Concordance Desktop, Boolean operators search at the document level. The most common Boolean operators are: AND, OR, and NOT. The search is not case sensitive, so you do not have to enter all caps for the Boolean operators.

| Operator | Query | Results |
| --- | --- | --- |
| AND (contains both words) | milk AND coffee | All documents with both the words: milk and coffee |
| OR (contains either word) | milk OR coffee | All documents with either milk or coffee, or both |
| NOT (contains first word, but not second) | milk NOT coffee | All documents with milk, but not coffee |
| XOR | milk NOT/OR coffee | All documents with milk or coffee, but not both words |

Boolean search operators are used in the Simple Search panel, but the operators are hard-coded for you.

Typing spaces between words is important to separate the word you are searching from the operator word (AND, OR, NOT). For example, type spaces where you see the underscore: milk_AND_coffee.

Using XOR saves time by combining two separate word searches into one. It returns all documents that contain either milk or coffee, but not documents that contain both words in the same record.
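The document-level logic of the four Boolean operators can be modeled with set operations. The Python sketch below shows only that logic, not how Concordance Desktop evaluates queries; the record IDs, their words, and the helper name having are hypothetical.

```python
# Hypothetical records: record ID -> set of indexed words in that record.
docs = {
    1: {"milk", "coffee"},
    2: {"milk"},
    3: {"coffee"},
    4: {"tea"},
}

def having(word):
    """Return the IDs of all records that contain the word."""
    return {doc_id for doc_id, words in docs.items() if word in words}

milk, coffee = having("milk"), having("coffee")

results = {
    "AND": milk & coffee,  # both words
    "OR": milk | coffee,   # either word, or both
    "NOT": milk - coffee,  # first word, but not the second
    "XOR": milk ^ coffee,  # either word, but not both
}

for operator, ids in results.items():
    print(operator, sorted(ids))
# AND [1]
# OR [1, 2, 3]
# NOT [2]
# XOR [2, 3]
```

Record 1 contains both words, so it is returned by AND and OR but dropped by NOT and XOR.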

Context (field level)

Context operators are great for locating authors and recipients of records, if these fields are entered in your database. Context operators search at the field level. If you are searching the AUTHOR and RECIPIENT fields for a particular name, the query results will be a list of documents to and from that person. The search is not case sensitive, so you do not have to enter all caps for the context operators.

The context operators are SAME and NOTSAME, together with field limiters that include or exclude fields. SAME and NOTSAME are not commonly used, but field limiters are used frequently to narrow a search to a particular field or to omit fields, which saves search processing time.

| Operator | Query | Results |
| --- | --- | --- |
| SAME | DOCTITLE SAME bank | Both terms exist in the same field |
| NOTSAME | DOCTITLE NOTSAME bank | Both terms never exist in the same field |
| Field limiter .FIELD. | cowco.DOCTITLE. | Looks in the named fields |
| | correspondence.doctype. | |
| Field limiter ..FIELD. | cowco..ocr1. | Looks in all other fields except for the named field |
| | memorandum..doctype. | |

Example: cowco.ocr1.

This query searches for cowco in the ocr1 field.

Example: milk SAME coffee

This query searches for documents where milk and coffee are both found in the same field. If they occur in the same document, but not in the same field, the document is ignored.

Typing spaces between words is important to separate the word you are searching for from the operator word (SAME, NOTSAME). Field limiters are different: type the word, the periods, and the field name together with no spaces, because the periods are the syntax that specifies whether a field is included or omitted (for example, milk..ocr1.). If you add a space between milk and ..ocr1., the application will not read your search query.

A search of the form word..fieldname. locates the word while excluding the named field. The word can still occur in that field; those occurrences are simply not part of the search. For example, a search for milk..ocr1. locates milk in all fields except ocr1. This search is helpful for locating keywords in metadata fields without searching the record’s body content.
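The field-level rules described in this section can be modeled as follows. This Python sketch illustrates the semantics only, under the assumption that a record is a simple mapping of field names to text; the record contents and the function names in_field, outside_field, and same are hypothetical.

```python
# Hypothetical record: field name -> field text.
record = {
    "DOCTITLE": "Cowco bank statement",
    "DOCTYPE": "correspondence",
    "OCR1": "Please send the milk invoice to the bank",
}

def words(text):
    """Split field text into a set of lowercase words."""
    return set(text.lower().split())

def in_field(record, word, field):
    """Model of word.FIELD. : the word occurs in the named field."""
    return word.lower() in words(record.get(field.upper(), ""))

def outside_field(record, word, field):
    """Model of word..FIELD. : the word occurs in a field other than the named one."""
    return any(
        word.lower() in words(text)
        for name, text in record.items()
        if name != field.upper()
    )

def same(record, first, second):
    """Model of first SAME second: both words occur together in one field."""
    pair = {first.lower(), second.lower()}
    return any(pair <= words(text) for text in record.values())

print(in_field(record, "cowco", "doctitle"))  # True
print(outside_field(record, "milk", "ocr1"))  # False, milk occurs only in OCR1
print(outside_field(record, "bank", "ocr1"))  # True, bank also occurs in DOCTITLE
print(same(record, "milk", "bank"))           # True, both occur in OCR1
print(same(record, "cowco", "milk"))          # False, they occur in different fields
```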

Proximity (word level)

Proximity operators search at the word level. They are useful when you are looking for terms that appear in records either immediately next to each other in a specified order, or close to each other within a specified range. To set the range, add a number from 0 to 255 to the operator. This number is the maximum number of intervening indexed words; stopwords are not counted.

| Operator | Query | Results |
| --- | --- | --- |
| ADJ (default operator) | milk ADJ programs | Both terms are immediately next to each other and in the order specified. Produces the same results as typing milk programs |
| ADJ25 | milk ADJ25 programs | Both terms appear within 25 words of each other. The ADJ# operator has a 255 word limit |
| NEAR | milk NEAR programs | Both terms are within close range of each other regardless of order, for example, programs for milk |
| NEAR255 | milk NEAR255 programs | Both terms appear within 255 words of each other. The NEAR# operator has a 255 word limit |

The default Proximity operator is ADJ. So a search for file note is actually searching file ADJ note. This selection can be customized by your Concordance Desktop administrator for your personal searches, as needed.

Typing spaces between words is important to separate the word you are searching from the operator word (ADJ, NEAR). For example, type spaces where you see the underscore: file_ADJ_note.

Proximity operators are hard-coded into the Form Search tool.
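The proximity rules in this section can be illustrated with a short Python sketch. It is a simplified model, not the application's implementation. It assumes a hypothetical stopword list, assumes that ADJ and NEAR without a number allow zero intervening indexed words, and uses the hypothetical function names indexed_words, adj, and near.

```python
import re

# Hypothetical stopword list; stopwords are not indexed and are not counted.
STOPWORDS = {"a", "an", "and", "for", "of", "the", "to"}

def indexed_words(text):
    """Return the words of the text in order, with stopwords removed."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS]

def adj(text, first, second, max_between=0):
    """Model of ADJ#: first precedes second with at most max_between words between."""
    tokens = indexed_words(text)
    firsts = [i for i, w in enumerate(tokens) if w == first.lower()]
    seconds = [i for i, w in enumerate(tokens) if w == second.lower()]
    return any(0 < j - i <= max_between + 1 for i in firsts for j in seconds)

def near(text, first, second, max_between=0):
    """Model of NEAR#: same distance rule as ADJ#, but in either order."""
    return adj(text, first, second, max_between) or adj(text, second, first, max_between)

print(adj("milk programs", "milk", "programs"))        # True
print(adj("programs for milk", "milk", "programs"))    # False, wrong order
print(near("programs for milk", "milk", "programs"))   # True, "for" is not counted
print(adj("milk subsidy and lunch programs", "milk", "programs", 1))  # False
print(adj("milk subsidy and lunch programs", "milk", "programs", 2))  # True
```

In the last two calls, the indexed words between milk and programs are subsidy and lunch, so a range of 1 is too small and a range of 2 is enough.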

Wildcard (field and character masking)

Searching for people’s names can be complicated. Names can be misspelled or include variations of punctuation. You’ll want to investigate the method used at your organization for entering names into your Concordance Desktop databases. This helps you determine how to search to ensure that you do not exclude any possible variations of someone’s name.

If you are searching for a name in an email database, you will notice that there are many email address formats depending on the Internet service provider’s account type. You may miss a critical record in an email address search if you are not careful about how you query it. Using wildcard characters helps ensure that the recipient name is located, regardless of the email format.

Examples:

jsmith@organization.com

J_Smith@organization.com

john.smith@organization.com

jsmith@gmail.com

Names in documents are also easily misspelled and could be overlooked if apostrophes and initials are combined in records. Using a wildcard symbol as a substitute for a character or series of characters masks individual characters in words, creating a broader search with stronger results.

| Character | Query | Results |
| --- | --- | --- |
| Asterisk (*) replaces a character at the beginning or end of a search string | *count* | Finds account, accountant, country, countries, and discount |
| | milk.ocr*. | Searches ocr1, ocr2, ocr3, etc. |
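To see how an asterisk broadens a search, the following Python sketch expands a wildcard pattern against a word list. It uses Python's standard fnmatch module as a stand-in for the application's own matching, which may differ in detail (fnmatch also gives special meaning to ? and [], which are not described here). The word list, field list, and function name expand are hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical index dictionary entries and field names.
dictionary = ["account", "accountant", "coffee", "countries", "country", "discount", "milk"]
fields = ["doctitle", "ocr1", "ocr2", "ocr3"]

def expand(pattern, candidates):
    """Return the candidates matched by the pattern, ignoring case."""
    return [c for c in candidates if fnmatchcase(c.lower(), pattern.lower())]

print(expand("*count*", dictionary))
# ['account', 'accountant', 'countries', 'country', 'discount']
print(expand("ocr*", fields))
# ['ocr1', 'ocr2', 'ocr3']
```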

When searching for a person, think about all the fields in which this name might be printed or referenced, and also become familiar with the field titles used in your databases, as these can vary. For instance, when searching for Weller in one database, it may be necessary to specify the To and From fields. In another database, you may need to search in the Author and Recipient fields.

Search strings that do not include white space are limited to 64 consecutive characters.
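If you build long search strings outside the application, a quick check such as the Python sketch below can flag terms that exceed this limit before you paste them in. It assumes that a run of exactly 64 characters is still allowed; the function name overlong_terms is hypothetical.

```python
MAX_RUN = 64  # limit for consecutive characters without white space

def overlong_terms(query, limit=MAX_RUN):
    """Return the whitespace-separated parts of the query longer than the limit."""
    return [term for term in query.split() if len(term) > limit]

print(overlong_terms("milk AND coffee"))         # []
print(len(overlong_terms("x" * 65 + " AND milk")))  # 1
```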