Overview
Optical Character Recognition (OCR) converts images of text into machine-encoded text. In EMDK 7.5 and higher, the Barcode API can configure the device scanner so that an app can capture various OCR font types as text. This functionality is modeled as decoder types (OCRA, OCRB, MICRE13B and USCurrency) exposed through the Barcode API. Captured OCR data is returned to the application in the scan event and can be retrieved in the onData callback.
Enable OCR
Before an application can capture using OCR, the decoder that corresponds with the OCR font type (OCRA, OCRB, MICRE13B, USCurrency) must be enabled. To do so, get an instance of a scanner object (see the Barcode Scanning API Programmer's Guide for details). Note: For OCR-A and OCR-B, selecting the most appropriate font variant optimizes performance and accuracy.
Once initialized, modify the scanner configuration as below:
ScannerConfig config = scanner.getConfig();
config.decoderParams.ocrA.enabled = true; //enable OCRA decoder
config.decoderParams.ocrA.ocrAVariant = ScannerConfig.OcrAVariant.FULL_ASCII; //select required variant
scanner.setConfig(config);
Configure Parameters
Set the parameters based on specific app requirements.
Default values for OCR parameters:
ScannerConfig config = scanner.getConfig();
config.ocrParams.inverseOcr = ScannerConfig.InverseOcr.REGULAR_ONLY;
config.ocrParams.checkDigitModulus = 1;
config.ocrParams.checkDigitMultiplier = "121212121212";
config.ocrParams.checkDigitValidation = ScannerConfig.OcrCheckDigitValidation.NONE;
config.ocrParams.ocrLines = ScannerConfig.OcrLines.ONE_LINE;
config.ocrParams.maxCharacters = 100;
config.ocrParams.minCharacters = 3;
config.ocrParams.orientation = ScannerConfig.OcrOrientation.DEGREE_0;
config.ocrParams.quietZone = 50;
config.ocrParams.subset = "!\"#$%()*+,-./0123456789<>ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz\\^|";
config.ocrParams.template = "99999999";
scanner.setConfig(config);
OCR Parameters
Inverse OCR
Inverse OCR is white or light text on a black or dark background. This parameter selects which polarity to decode.
Possible values:
- Regular Only - Decode regular OCR (black on white) strings only (default)
- Inverse Only - Decode inverse OCR (white on black) strings only
- Auto-discriminate - Decode regular and inverse OCR strings
OCR Check Digit Modulus
Sets the modulus used in the OCR check digit calculation. The check digit is the right-most character in an OCR string and helps improve the accuracy of the collected data. It is the end product of a calculation made on the incoming data. In a Modulus 10 calculation, for example, alpha and numeric characters are assigned numeric weights, the calculation is applied to those weights, and the resulting check digit is added to the end of the data. If the incoming data does not match the check digit, the data is considered corrupt. The selected modulus does not take effect until OCR Check Digit Validation is set.
Possible values:
- Low - 1 (default)
- High - 99
OCR Check Digit Multiplier
Sets OCR check digit multipliers for the character positions. For check digit validation, each character in scanned data has an equivalent weight used in the check digit calculation.
Possible values:
- Minimum length - 1
- Maximum Length - 100
- Default = 121212121212
OCR Check Digit Validation
Protects against scanning errors by applying a check digit validation scheme.
Possible values:
None - 0 (default)
Product Add Left to Right - Each character in the scanned data is assigned a numeric value. Each digit representing a character in the scanned data is multiplied by its corresponding digit in the multiplier, and the sum of these products is computed. The check digit passes if this sum modulo Check Digit Modulus is zero.
Example: Scanned data numeric value is 132456 (check digit is 6). Check digit multiplier string is 123456
- Digits: 1 3 2 4 5 6
- Multipliers: 1 2 3 4 5 6
- Products: 1 6 6 16 25 36
- Sum of products: 1+6+6+16+25+36 = 90
If the Check Digit Modulus is 10, it passes because 90 is divisible by 10 (the remainder is zero).
Product Add Right to Left - Each character in the scanned data is assigned a numeric value. The check digit multiplier is reversed in order. Each value representing a character in the scanned data is multiplied by its corresponding digit in the reversed multiplier, resulting in a product for each character in the scanned data. The sum of these products is computed. The check digit passes if this sum modulo Check Digit Modulus is zero.
Example: Scanned data numeric value is 132459 (check digit is 9). Check digit multiplier string is 123456.
- Digits: 1 3 2 4 5 9
- Multipliers: 6 5 4 3 2 1
- Products: 6 15 8 12 10 9
- Sum of products: 6+15+8+12+10+9 = 60
If the Check Digit Modulus is 10, it passes because 60 is divisible by 10 (the remainder is 0).
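The two Product Add examples above can be reproduced with a short sketch. This is only an illustration of the arithmetic described in this guide, not the scanner's implementation: the class and method names are hypothetical, and it assumes digit-only data whose length equals the multiplier length, because the guide does not list the numeric weights used for alpha characters.

```java
/** Sketch of the Product Add check for digit-only data (hypothetical helper). */
final class ProductAddCheck {

    /**
     * Returns true when the sum of digit-by-multiplier products is divisible by modulus.
     * Assumption: data and multiplier contain only 0-9 and have equal length.
     */
    static boolean passes(String data, String multiplier, int modulus, boolean rightToLeft) {
        if (data.length() != multiplier.length()) {
            throw new IllegalArgumentException("data and multiplier must have equal length");
        }
        // Right to Left reverses the multiplier string before pairing it with the data.
        String weights = rightToLeft
                ? new StringBuilder(multiplier).reverse().toString()
                : multiplier;
        int sum = 0;
        for (int i = 0; i < data.length(); i++) {
            sum += digit(data.charAt(i)) * digit(weights.charAt(i));
        }
        return sum % modulus == 0;
    }

    private static int digit(char c) {
        if (c < '0' || c > '9') {
            throw new IllegalArgumentException("not a digit: " + c);
        }
        return c - '0';
    }

    public static void main(String[] args) {
        System.out.println(passes("132456", "123456", 10, false)); // Left to Right: sum 90, prints true
        System.out.println(passes("132459", "123456", 10, true));  // Right to Left: sum 60, prints true
    }
}
```

Note that with the default Check Digit Modulus of 1, every sum leaves a remainder of zero, so this check would accept any digit string.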
Digit Add Left to Right - Each character in the scanned data is assigned a numeric value. Each value representing a character in the scanned data is multiplied by its corresponding digit in the multiplier, resulting in a product for each character in the scanned data. The sum of each individual digit in all of the products is then calculated. The check digit passes if this sum modulo Check Digit Modulus is zero.
Example: Scanned data numeric value is 132456 (check digit is 6). Check digit multiplier string is 123456.
- Digits: 1 3 2 4 5 6
- Multipliers: 1 2 3 4 5 6
- Products: 1 6 6 16 25 36
- Sum of product digits: 1+6+6+1+6+2+5+3+6 = 36
If the Check Digit Modulus is 12, it passes because 36 is divisible by 12 (the remainder is 0).
Digit Add Right to Left - Each character in the scanned data is assigned a numeric value. The check digit multiplier is reversed in order. Each value representing a character in the scanned data is multiplied by its corresponding digit in the reversed multiplier, resulting in a product for each character in the scanned data. The sum of each individual digit in all of the products is then calculated. The check digit passes if this sum modulo Check Digit Modulus is zero.
Example: Scanned data numeric value is 132456 (check digit is 6). Check digit multiplier string is 123456.
- Digits: 1 3 2 4 5 6
- Multipliers: 6 5 4 3 2 1
- Products: 6 15 8 12 10 6
- Sum of product digits: 6+1+5+8+1+2+1+0+6 = 30
The Check Digit Modulus is 10. It passes because 30 is divisible by 10 (the remainder is 0).
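The Digit Add variants differ from Product Add only in that the individual digits of each product are summed. The sketch below mirrors the two Digit Add examples above. It is illustrative only: the names are hypothetical, and it assumes digit-only data with the same length as the multiplier.

```java
/** Sketch of the Digit Add check for digit-only data (hypothetical helper). */
final class DigitAddCheck {

    /**
     * Returns true when the sum of the digits of every product is divisible by modulus.
     * Assumption: data and multiplier contain only 0-9 and have equal length.
     */
    static boolean passes(String data, String multiplier, int modulus, boolean rightToLeft) {
        if (data.length() != multiplier.length()) {
            throw new IllegalArgumentException("data and multiplier must have equal length");
        }
        // Right to Left reverses the multiplier string before pairing it with the data.
        String weights = rightToLeft
                ? new StringBuilder(multiplier).reverse().toString()
                : multiplier;
        int sum = 0;
        for (int i = 0; i < data.length(); i++) {
            int product = digit(data.charAt(i)) * digit(weights.charAt(i));
            sum += digitSum(product); // e.g. product 16 contributes 1 + 6
        }
        return sum % modulus == 0;
    }

    private static int digitSum(int value) {
        int total = 0;
        while (value > 0) {
            total += value % 10;
            value /= 10;
        }
        return total;
    }

    private static int digit(char c) {
        if (c < '0' || c > '9') {
            throw new IllegalArgumentException("not a digit: " + c);
        }
        return c - '0';
    }
}
```

For example, `DigitAddCheck.passes("132456", "123456", 12, false)` evaluates the Left to Right example (digit sum 36, modulus 12), and `DigitAddCheck.passes("132456", "123456", 10, true)` evaluates the Right to Left example (digit sum 30, modulus 10).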
Product Add Right to Left Simple Remainder - Each character in the scanned data is assigned a numeric value. The check digit multiplier is reversed in order. Each value representing a character in the scanned data is multiplied by its corresponding digit in the reversed multiplier, resulting in a product for each character in the scanned data. The sum of these products except for the check digit's product is computed. The check digit passes if this sum modulo Check Digit Modulus is equal to the check digit's product.
Example: Scanned data numeric value is 122456 (check digit is 6). Check digit multiplier string is 123456.
- Digits: 1 2 2 4 5 6
- Multipliers: 6 5 4 3 2 1
- Products: 6 10 8 12 10 6
- Sum of products: 6+10+8+12+10 = 46
The Check Digit Modulus is 10. It passes because 46 divided by 10 leaves a remainder of 6.
Digit Add Right to Left Simple Remainder - Each character in the scanned data is assigned a numeric value. The check digit multiplier is reversed in order. Each value representing a character in the scanned data is multiplied by its corresponding digit in the reversed multiplier, resulting in a product for each character in the scanned data. The sum of each individual digit in all of the products except for the check digit's product is then calculated. The check digit passes if this sum modulo Check Digit Modulus is equal to the check digit's product.
Example: Scanned data numeric value is 122459 (check digit is 9). Check digit multiplier string is 123456.
- Digits: 1 2 2 4 5 9
- Multipliers: 6 5 4 3 2 1
- Products: 6 10 8 12 10 9
- Sum of product digits (excluding the check digit's product): 6+1+0+8+1+2+1+0 = 19
The Check Digit Modulus is 10. It passes because 19 divided by 10 leaves a remainder of 9.
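The two Simple Remainder schemes can be sketched in the same way. Here the check digit's own product is left out of the sum and compared with the remainder instead. As before, this is a hedged illustration with hypothetical names, assuming digit-only data whose length equals the multiplier length.

```java
/** Sketch of the Right to Left Simple Remainder checks (hypothetical helper). */
final class SimpleRemainderCheck {

    /**
     * digitAdd = false: Product Add Right to Left Simple Remainder.
     * digitAdd = true:  Digit Add Right to Left Simple Remainder.
     * Assumption: data and multiplier contain only 0-9 and have equal length.
     */
    static boolean passes(String data, String multiplier, int modulus, boolean digitAdd) {
        if (data.length() < 2 || data.length() != multiplier.length()) {
            throw new IllegalArgumentException("data and multiplier must have equal length >= 2");
        }
        String weights = new StringBuilder(multiplier).reverse().toString();
        int last = data.length() - 1; // position of the check digit
        int sum = 0;
        for (int i = 0; i < last; i++) {
            int product = digit(data.charAt(i)) * digit(weights.charAt(i));
            sum += digitAdd ? digitSum(product) : product;
        }
        int checkProduct = digit(data.charAt(last)) * digit(weights.charAt(last));
        return sum % modulus == checkProduct;
    }

    private static int digitSum(int value) {
        int total = 0;
        while (value > 0) {
            total += value % 10;
            value /= 10;
        }
        return total;
    }

    private static int digit(char c) {
        if (c < '0' || c > '9') {
            throw new IllegalArgumentException("not a digit: " + c);
        }
        return c - '0';
    }
}
```

With modulus 10 and multiplier 123456, `passes("122456", "123456", 10, false)` compares 46 mod 10 with the check product 6, and `passes("122459", "123456", 10, true)` compares 19 mod 10 with the check product 9.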
Health Industry - HIBCC43 - The health industry module 43 check digit standard. The check digit is the modulus 43 sum of all the character values in a given message and is printed as the last character in a given message.
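A modulus 43 check can be sketched as follows. The guide does not list the character values, so this sketch assumes the 43-character Code 39 value table (0-9, A-Z, then - . space $ / + %, valued 0 through 42) that modulus 43 schemes conventionally use. The class name and sample message are illustrative only.

```java
/** Sketch of a modulus 43 check character calculation (hypothetical helper). */
final class Hibc43 {

    // Assumed value table: a character's index in this string is its numeric value (0..42).
    private static final String CHARSET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ-. $/+%";

    /** Computes the check character for a message that does not yet include one. */
    static char checkCharacter(String message) {
        int sum = 0;
        for (int i = 0; i < message.length(); i++) {
            int value = CHARSET.indexOf(message.charAt(i));
            if (value < 0) {
                throw new IllegalArgumentException("character not in value table: " + message.charAt(i));
            }
            sum += value;
        }
        return CHARSET.charAt(sum % 43);
    }

    /** Returns true when the last character of scanned matches the computed check character. */
    static boolean passes(String scanned) {
        if (scanned.length() < 2) {
            return false;
        }
        String body = scanned.substring(0, scanned.length() - 1);
        return checkCharacter(body) == scanned.charAt(scanned.length() - 1);
    }
}
```

For the sample message `+A123BJC5D6E71`, the character values sum to 145, and 145 mod 43 is 16, which maps to `G` in the assumed table.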
OCR Lines
Selects the number of OCR lines to decode.
Possible values:
- 1 Line (default)
- 2 Lines
- 3 Lines
OCR Maximum Characters
Select the maximum number of OCR characters (including spaces) per line to decode.
Possible values:
- Low - 3
- High - 100 (default)
OCR Minimum Characters
Select the minimum number of OCR characters (not including spaces) per line to decode.
Possible values:
- Low - 3 (default)
- High - 100
OCR Orientation
Selects the orientation of the OCR string to be read. Setting an incorrect orientation can cause mis-decodes.
Possible values:
- 0 degrees - to the imaging engine (default)
- 270 degrees - clockwise (or 90 degrees counterclockwise) to the imaging engine
- 180 degrees - (upside down) to the imaging engine
- 90 degrees - clockwise to the imaging engine
- Omni-directional
OCR Quiet Zone
Sets the field width of blank space to stop scanning during OCR reading.
Possible values:
- Low - 20
- High - 99
- Default = 50
OCR Subset
Defines a custom group of characters in place of a preset font variant. For example, if scanning only numerals and the letters A, B, and C, create a subset of just these characters to speed decoding. This applies a designated OCR Subset across all enabled OCR fonts.
Possible values:
- Minimum length - 1
- Maximum Length - 100
- Default = !"#$%()*+,-./0123456789<>ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz\^|
OCR Template
Creates a template for precisely matching scanned OCR characters to a desired input format. A carefully constructed OCR template helps prevent scanning errors. The template expression is formed by numbers and letters. The default is 99999999, which accepts any numeric OCR string. If there are fewer than eight '9' characters, the '9' represents only numerical values.
Possible values:
- Minimum length - 3
- Maximum Length - 100
- Default = 99999999
OCR Template Operators
The template operators in the following table can assist in capturing, delimiting and formatting scanned OCR data. OCR template expressions are formed by numbers and letters arranged in a sequence. Refer to the Zebra DS36X8 Reference Guide Chapter 15 for more information.
Name | Description | Template | Valid Data | Invalid Data / Outgoing Data
---|---|---|---|---
Required Digit (9) | Accepts a numeric character only in this position. | 99999 | 12987 | 123AB
Required Alpha (A) | Accepts an alpha character only in this position. | AAA | ABC | 12F
Require and Suppress (0) | Any character in this position is suppressed from the output, including space and reject. | 990AA | 12QAB | 12AB
Optional Alphanumeric (1) | Accepts an alpha-numeric character, if present. Optional characters are not allowed as the first character(s) in a field of like characters. | 99991 | 1234A | 1234<
Optional Alpha (2) | Accepts an alpha character, if present. Optional characters are not allowed as the first character(s) in a field of like characters. | AAAA2 | ABCDE | ABCD6
Alpha or Digit (3) | An alpha-numeric character is required in this position to validate the incoming data. | 33333 | 12ABC | 12AB<
Any Including Space and Reject (4) | Accepts any character in this position, including spaces and rejects. Rejects are represented by an underscore (_) character in the output. This is a good selection for troubleshooting. | 99499 | 12$34, 34_98 |
Any except Space and Reject (5) | Accepts any character in this position except a space or reject. | 55999 | A.123, *Z456 | A BCD
Optional Digit (7) | Accepts a numeric character, if present. Optional characters are not allowed as the first character(s) in a field of like characters. | 99977 | 12345, 789 | 789AB
Digit or Fill (8) | Accepts any numeric or fill character in this position. | 88899 | 12345, >>789, <<789 |
Alpha or Fill (F) | Accepts any alpha or fill character in this position. | AAAFF | ABCXY, LMN>>, ABC<5 |
Optional Space ( ) | Accepts a space, if present. Optional characters are not allowed as the first character(s) in a field of like characters. | 99 99 | 12 34, 1234 | 67891
Optional Small Special (.) | Accepts a special character, if present. Optional characters are not allowed as the first character(s) in a field of like characters. Small special characters are - (dash) and . (dot). | AA.99 | MN.35, XY98 | XYZ12

Other Template Operators. These template operators assist in capturing, delimiting and formatting scanned OCR data.

Name | Description | Template | Valid Data | Invalid Data / Outgoing Data
---|---|---|---|---
Literal String (" and +) | Use either of these delimiting characters surrounding alphanumeric characters to define a literal string within a template that must be present in scanned OCR data. There are two characters used to delimit required literal strings; if one of the delimiter characters is already present in the desired literal string, use the other delimiter. | "35+BC" | 35+BC | AB+22
New Line (E) | To create a template of multiple lines, add an "E" between the template of each single line. | 999EAAAA | 321 BCAD | XYZW 12
String Extract (C) | This operator combined with others defines a string of characters to extract from the scanned data. The string extract is structured as CbPe, where "C" is the string extract operator, "b" is the string begin delimiter, "P" is the category (one or more numeric or alpha characters) describing the string representation, and "e" is the string end delimiter. Values for "b" and "e" can be any character that can be scanned and are included in the output stream. | C>A> | XQ3>ABCDE>, ->ATRU>123 | >ABCDE>, >ATRU>
Ignore to End of Field (D) | This operator causes all characters after a template to be ignored. Use this as the last character in a template expression. | 999D | 123-PED, 357298 | 123, 357
Skip Until (P1) | This operator allows skipping over characters until a specific character type or a literal string is detected. It can be used in two ways. P1ct, where "P1" is the "Skip Until" operator, "c" is the type of character that triggers the start of output, and "t" is one or more template characters. P1"s"t, where "P1" is the "Skip Until" operator, "s" is one or more literal string characters that trigger the start of output, and "t" is one or more template characters. The trigger character or literal string is included in output from a "Skip Until" operator, and the first character in the template should accommodate this trigger. | P1"PN"AA9999 | 123PN9876, X-PN3592 | PN9876, PN3592
Skip Until Not (P0) | This operator allows skipping over characters until a specific character type or a literal string is not matched in the output stream. It can be used in two ways. P0ct, where "P0" is the "Skip Until Not" operator, "c" is the type of character that triggers the start of output, and "t" is one or more template characters. P0"s"t, where "P0" is the "Skip Until Not" operator, "s" is one or more literal string characters that trigger the start of output, and "t" is one or more template characters. The trigger character or literal string is included in output from a "Skip Until Not" operator. | P0A9999 | BPN3456, X-PN3592 | 5341, No output
Repeat Previous (R) | This operator allows a template character to repeat one or more times, allowing the capture of variable-length scanned data. The following examples capture two required alpha characters followed by one or more required digits. | AA9R | AB3, PN12345, 32RM52700 | AB3, PN12345, No output
Scroll Until Match (S) | This operator steps through scanned data one character at a time until the data matches the template. | S99999 | AB3, PN12345, 32RM52700 | No output, 12345, 52700
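To make the position-by-position operators more concrete, the sketch below checks a string against a template built only from the required operators 9, A, 3 and 5. It is a simplified illustration of the behavior described in the table, not the decoder's matching logic. The class name is hypothetical, optional and special operators are deliberately unsupported, and the underscore is treated as the reject character because that is how rejects are shown in output.

```java
/** Sketch of matching data against a template of required operators only (hypothetical helper). */
final class SimpleTemplateMatcher {

    /** Supports only 9 (digit), A (alpha), 3 (alpha or digit) and 5 (any except space and reject). */
    static boolean matches(String template, String data) {
        // Only required operators are handled, so the lengths must agree exactly.
        if (template.length() != data.length()) {
            return false;
        }
        for (int i = 0; i < template.length(); i++) {
            char op = template.charAt(i);
            char c = data.charAt(i);
            boolean ok;
            switch (op) {
                case '9': ok = isDigit(c); break;
                case 'A': ok = isAlpha(c); break;
                case '3': ok = isDigit(c) || isAlpha(c); break;
                case '5': ok = c != ' ' && c != '_'; break; // assumption: '_' marks a reject
                default:
                    throw new IllegalArgumentException("unsupported operator: " + op);
            }
            if (!ok) {
                return false;
            }
        }
        return true;
    }

    private static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    private static boolean isAlpha(char c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
    }
}
```

Using rows from the table, `matches("99999", "12987")` and `matches("55999", "*Z456")` succeed, while `matches("AAA", "12F")` and `matches("55999", "A BCD")` fail.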
Multiple Templates
The multiple templates feature sets up two or more templates for OCR decoding, with a capital letter "X" as the separator between strings in the template. For example, setting the OCR Template as "99999XAAAAA" decodes OCR strings of either "12345" or "ABCDE." Additional sample template strings are shown below with descriptions of data that would be valid for each template.
"M99977" - injects a capital letter M followed by three required numerical characters (numerals) and two optional numerals to be acquired.
"X997777X" - begins with a capital X followed by two required numerals, four optional numerals and another X.
"9959775599" - defines two numerals followed by any character, another required numeral, two optional numerals, any two alpha-numeric characters and two additional numerals.
"A55-999-99" - requires an alpha character followed by any two alpha-numeric characters, a dash, three numerals, a dash, and two more numerals.
"33A.99" - defines two alpha-numeric characters followed by a letter, a "dot" (period) and two required numerals.
"999992991" - defines five numerals followed by an optional alpha character plus two numerals and an optional alpha-numeric character.
"PN98" - is a literal field.
Also See
Zebra DS36X8 Reference Guide (PDF): Chapter 15 covers OCR programming