Optical Character Recognition

Optical Character Recognition (OCR) is the process of converting handwritten or printed text from case documents into searchable full-text records (TXT files). This process typically takes place during import via Turbo Import or ED Loader, but it may also be needed for documents whose text was not extracted properly or that were imported via scanner.

OCR accuracy can be reduced by uneven or skewed prints, damaged (folded, torn, etc.) pages, faded or distorted text, non-standard fonts, or illegible handwriting. Because OCR is very CPU-intensive and the quality of the original documents can greatly affect performance, it is generally recommended to clean up these documents as much as possible before beginning the process.

For OCR configuration settings, see the OCR Options topic.

Metadata Fields:  OCR

LAW uses the following Metadata Fields to keep track of OCR status at both the document and page levels:

OcrAccuracy - Indicates the OCR accuracy as a percentage (1-100) at the document level.

OcrFlag - Indicates one of the following OCR statuses at the page level:

    Y - This page is flagged as ready for OCR.

    N - This page will not have OCR performed.

    C - OCR has been completed.

    E - An error occurred during the OCR process.

OcrStatus - Indicates one of the following OCR statuses at the document level:

    Y - One or more pages from this document are flagged as ready for OCR.

    N - No pages from this document are flagged for OCR.

    C - OCR has been completed.

    I - OCR was canceled.

    E - An error occurred during the OCR process.

    P - This document is currently being processed for OCR.
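
The status codes above can be sketched as simple lookup tables, which may help when reviewing field values exported from a case. This is a minimal illustration, not part of LAW itself: it assumes the records have already been exported to a list of dictionaries, and the DocId key and the needs_attention triage rule are hypothetical names chosen for this example.

```python
# Page-level OcrFlag codes, as described in the list above.
OCR_FLAG = {
    "Y": "Flagged as ready for OCR",
    "N": "OCR will not be performed",
    "C": "OCR completed",
    "E": "Error during OCR",
}

# Document-level OcrStatus codes, as described in the list above.
OCR_STATUS = {
    "Y": "One or more pages flagged as ready for OCR",
    "N": "No pages flagged for OCR",
    "C": "OCR completed",
    "I": "OCR canceled",
    "E": "Error during OCR",
    "P": "Currently being processed for OCR",
}


def describe_status(code):
    """Return the description for a document-level OcrStatus code."""
    return OCR_STATUS.get(code.strip().upper(), "Unknown status")


def needs_attention(records):
    """Return the IDs of records whose OcrStatus is Y, I, or E.

    Assumption: each record is a dict with a hypothetical "DocId" key
    and an "OcrStatus" key holding one of the codes above.
    """
    pending = {"Y", "I", "E"}
    return [
        r["DocId"]
        for r in records
        if r.get("OcrStatus", "").strip().upper() in pending
    ]


if __name__ == "__main__":
    sample = [
        {"DocId": "0001", "OcrStatus": "C"},
        {"DocId": "0002", "OcrStatus": "E"},
        {"DocId": "0003", "OcrStatus": "Y"},
    ]
    for rec in sample:
        print(rec["DocId"], "-", describe_status(rec["OcrStatus"]))
    print("Needs attention:", needs_attention(sample))
```

Treating Y, I, and E as the statuses that need follow-up is only one possible convention; adjust the set to match how your team handles canceled or errored documents.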

 

OCR Status Icons

The following icons provide a quick visual reference for the OCR status of documents in the Text tab of the Document Viewer pane:

OCRiconText - OCR has been completed for this document, and text is available.

OCRiconGreen - One or more pages from this document are flagged as ready for OCR.

OCRiconYellow - OCR was canceled for this document.

OCRiconRed - An error occurred during the OCR process for this document.