Optical Character Recognition


Metadata Fields: OCR

LAW uses the following Metadata Fields to keep track of OCR status at both the document and page levels:

OcrAccuracy - Used to indicate a percentage of OCR accuracy (1-100) at the document-level.

 

OcrFlag - Indicates one of the following OCR statuses at the page-level:

    Y - This page is flagged as ready for OCR.

    N - This page will not have OCR performed.

    C - OCR has been completed.

    E - An error occurred during the OCR process.

OcrStatus - Indicates one of the following OCR statuses at the document-level:

    Y - One or more pages from this document are flagged as ready for OCR.

    N - No pages from this document are flagged for OCR.

    C - OCR has been completed.

    I - OCR was canceled.

    E - An error occurred during the OCR process.

    P - This document is currently being processed for OCR.
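
As a quick reference, the field values listed above can be modeled as two enumerations. This is only an illustrative sketch: the class names, member names, and helper functions are hypothetical and are not part of LAW. Only the single-letter codes and the 1-100 range come from the descriptions above.

```python
from enum import Enum


class OcrFlag(Enum):
    """Page-level OcrFlag codes (member names are illustrative)."""
    READY = "Y"        # page is flagged as ready for OCR
    SKIP = "N"         # page will not have OCR performed
    COMPLETED = "C"    # OCR has been completed
    ERROR = "E"        # an error occurred during OCR


class OcrStatus(Enum):
    """Document-level OcrStatus codes (member names are illustrative)."""
    READY = "Y"        # one or more pages are flagged as ready for OCR
    NOT_FLAGGED = "N"  # no pages are flagged for OCR
    COMPLETED = "C"    # OCR has been completed
    CANCELED = "I"     # OCR was canceled
    ERROR = "E"        # an error occurred during OCR
    PROCESSING = "P"   # document is currently being processed


def describe_status(code: str) -> str:
    """Return a readable label for a document-level OcrStatus code."""
    return OcrStatus(code).name.replace("_", " ").title()


def is_valid_accuracy(value: int) -> bool:
    """OcrAccuracy is described as a percentage from 1 to 100."""
    return 1 <= value <= 100
```

Looking up a code that is not listed above, such as `OcrStatus("X")`, raises a `ValueError`, which can help catch unexpected values when reviewing exported field data.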

In many cases, LAW assigns values to OCR fields automatically. For example, if you import electronic discovery with text extraction enabled, records with extracted text receive an OcrStatus of N and records with no extracted text receive an OcrStatus of Y. LAW also assigns Y to the OcrStatus field when a document's text extraction fails, because you may still want to obtain that document's text through OCR.
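
The import rule described above can be sketched as a small function. The function name and the way a failed or missing extraction is represented (`None`, or text containing only whitespace) are assumptions made for illustration; this is not LAW's actual import logic.

```python
from typing import Optional


def initial_ocr_status(extracted_text: Optional[str]) -> str:
    """Return the OcrStatus a record would receive at import.

    Assumption: a failed or missing extraction is represented as None
    or as text containing only whitespace.
    """
    if extracted_text is None or not extracted_text.strip():
        return "Y"  # no usable text, so the document is flagged for OCR
    return "N"      # text was extracted, so OCR is not needed
```

For example, `initial_ocr_status("Dear Sir,")` returns `"N"`, while `initial_ocr_status(None)` returns `"Y"`.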

LAW uses the processing flag (P) to mark a document that is currently being OCR'd. This allows multiple workstations to OCR the same set of documents simultaneously without two stations processing the same document. Sharing the OCR process across multiple stations also provides redundancy: if one machine locks up or crashes during the OCR process, the other stations can continue to OCR that set of documents.
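
To illustrate how a processing flag can keep workstations from overlapping, here is a minimal sketch using SQLite. Everything in it is an assumption made for illustration: the `documents` table, its `id` column, the `claim_next_document` function, and the conditional-update approach are hypothetical, and the source does not describe how LAW implements the flag internally.

```python
import sqlite3
from typing import Optional


def claim_next_document(conn: sqlite3.Connection) -> Optional[int]:
    """Claim one document that is ready for OCR by setting its status to P.

    Returns the claimed document id, or None when no document is ready.
    The hypothetical schema is: documents(id INTEGER, OcrStatus TEXT).
    """
    while True:
        row = conn.execute(
            "SELECT id FROM documents WHERE OcrStatus = 'Y' "
            "ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None  # nothing left to OCR
        with conn:  # commit on success, roll back on error
            cursor = conn.execute(
                "UPDATE documents SET OcrStatus = 'P' "
                "WHERE id = ? AND OcrStatus = 'Y'",
                (row[0],),
            )
        if cursor.rowcount == 1:
            return row[0]  # this workstation owns the document
        # Another workstation claimed it first; look for the next one.
```

Because the update succeeds only while the status is still Y, two stations calling this function at the same time cannot both claim the same document, and any station can keep claiming the remaining documents if another one stops.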

 

OCR Status Icons

The following icons are used for quick visual reference on the OCR status of documents within the Text tab of the Document Viewer pane:

Text icon - OCR has been completed for this document, and text is available.

Green icon - One or more pages from this document are flagged as ready for OCR.

Yellow icon - OCR was canceled for this document.

Red icon - An error occurred during the OCR process for this document.

To Flag Items Manually

1. Select the pages to be flagged.

2. Do one of the following:

On the Page menu, click Flag for OCR.

Or

Right-click on a page thumbnail and then click Flag for OCR.

Figure: Clicking Flag for OCR

 

To Flag Multiple Items

You can reset the OCR flags for multiple documents in either of the following ways, depending on whether the documents are currently viewed in a folder or from a query.

If all of the documents are in the same folder:

1. Select the documents in the document list.

2. On the Edit menu, select Reset OCR Flags and then click ON or OFF.

If the documents span multiple folders and can be logically grouped in a query:

In the grid display, on the Tools menu, click Reset OCR Flags and then click ON or OFF.

If you use the single-document OCR process, all pages are included regardless of the OcrFlag field value.
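
The difference between the two modes can be sketched as follows. The function name and the dictionary of page flags are hypothetical, and the assumption that a batch run selects only pages flagged Y is inferred from the flag descriptions rather than stated outright.

```python
from typing import Dict, List


def pages_to_ocr(page_flags: Dict[int, str], single_document: bool) -> List[int]:
    """Return the page numbers that would be OCR'd.

    page_flags maps a page number to its OcrFlag code (Y, N, C, or E).
    """
    if single_document:
        # Single-document OCR includes every page, whatever its flag.
        return sorted(page_flags)
    # Assumption: otherwise only pages flagged Y are picked up.
    return sorted(page for page, flag in page_flags.items() if flag == "Y")
```

With flags `{1: "Y", 2: "N", 3: "C"}`, the single-document mode selects pages 1, 2, and 3, while the flag-based selection returns only page 1.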

 

To Set OCR Flags at Scan Time

On the Scan menu, select Scan Options, and then click OCR All New Pages.

Or

In the status bar at the bottom of the main form, double-click OCR(Y) or OCR(N) to toggle the setting.

Figure: Status bar

 

To Display Flagged Pages

On the Page menu, select Show OCR Flags.

In the thumbnail display, all pages flagged for OCR are highlighted. If the thumbnail display is not active when this function is selected, it automatically becomes active so that the thumbnails are shown. This function only highlights pages that are flagged for OCR. It will not highlight pages that have already been completed.