OCRMaxSettings

External Segmentation, Advanced Classification and Fielding settings for the OCRMax function, which can be adjusted at run-time. This function offers programmatic control of the settings, providing support for adjustments to parameters from a remote device. For more information, see OCRMax.

OCRMaxSettings Inputs

Syntax: OCRMaxSettings(Segmentation.Character Polarity,Segmentation.Character Width Type,Segmentation.Minimum Character Width,Segmentation.Use Maximum Character Width,Segmentation.Maximum Character Width,Segmentation.Minimum Character Height,Segmentation.Use Maximum Character Height,Segmentation.Maximum Character Height,Segmentation.Use Minimum Character Aspect Ratio,Segmentation.Minimum Character Aspect Ratio,Segmentation.Angle Range,Segmentation.Skew Range,Segmentation.Character Fragment Merge Mode,Segmentation.Minimum Character Fragment Overlap,Segmentation.Max Intra-Character Gap,Segmentation.Min Inter-Character Gap,Segmentation.Minimum Character Fragment Size,Segmentation.Minimum Character Size,Advanced Segmentation.Normalization Mode,Advanced Segmentation.Use Stroke Width Filter,Advanced Segmentation.Ignore Border Fragments,Advanced Segmentation.Binarization Threshold,Advanced Segmentation.Character Fragment Contrast Threshold,Advanced Segmentation.Maximum Fragment Distance to Mainline,Advanced Segmentation.Segmentation Analysis Mode,Advanced Segmentation.Minimum Pitch,Advanced Segmentation.Character Pitch Position,Advanced Segmentation.Character Pitch Type,Advanced Classification.Classifier Template Size Width,Advanced Classification.Classifier Template Size Height,Advanced Classification.Maintain Character Aspect Ratio,Advanced Classification.Skip Additional Character Validation,Advanced Classification.Image Preprocessing,Fielding.Ignore Unfielded Characters,Fielding.String Length,Fielding.Maximum String Length,Fielding.Minimum String Length,Fielding.Maximum First Fielded Index,Fielding.Minimum Last Fielded Index)

Parameter Description

Segmentation

Specifies the parameter settings used to perform segmentation.

Character Polarity

Specifies the polarity of the characters in the input image.

Note: To improve the performance of the function, specify the polarity of the text.

1 = Black on White

The polarity of the text is black text on a white background.

2 = White on Black

The polarity of the text is white text on a black background.

4 = Auto (default)

The function will automatically determine the polarity of text and background.

Character Width Type

Specifies how the widths of characters in the font are expected to vary, which determines how character fragments should be merged or split.

Note: The character width is the width of the character region (e.g. the bounding box of the ink), not the ROI (which would typically include padding around the character rectangle).

1 = Auto (default)

The character width is unknown; the font may have either fixed or proportional width. Character fragments that are wider than the specified maximum character width will be split into individual characters that may or may not be of the same width, depending upon where the function determines the best place to split the character fragments.

2 = Fixed

All of the character regions in the font have the same width. Character fragments that are wider than the specified maximum character width will be split into individual characters of the same width.

4 = Variable

The characters in the font may have character regions with different widths. Character fragments that are wider than the specified maximum character width will be split into individual characters that may or may not be of the same width, depending upon where the function determines the best place to split the character fragments.

Minimum Character Width

Specifies the minimum width of a character’s character region, in pixels (1 - 1000; default = 3), that a character must have to be reported. This setting helps to filter background noise or other non-text features.

Use Maximum Character Width

Specifies whether or not the function should account for the maximum allowable width of a character’s character region. This setting may be used to help the function determine if a character fragment should be split or merged. If the character fragment should be split, the Character Width Type parameter will determine the proper technique to be used.

0 = Off (default)

The function will not account for the maximum allowable width of a character's character region.

1 = On

The function will account for the maximum allowable width of a character's character region. When enabled, a character wider than the specified value will be split into pieces that are not too wide.

Maximum Character Width

Specifies the maximum allowable width of a character’s character region, in pixels (1 - 5000; default = 100).

Note: This setting will be unavailable unless the Use Maximum Character Width parameter is enabled.

Minimum Character Height

Specifies the minimum height of a character’s character region, in pixels (1 - 1000; default = 3), that a character must have in order to be reported. This setting helps to filter background noise or other non-text features.

Use Maximum Character Height

Specifies whether or not the function should account for the maximum allowable height of a character’s character region. This value is used in two ways: First, this value is used when finding the line, as a whole, e.g. to reject vertically adjacent noise, and/or other lines of vertically adjacent characters. Second, an individual character whose height exceeds this value will be trimmed to meet this height.

0 = Off (default)

The function will not account for the maximum allowable height of a character's character region.

1 = On

The function will account for the maximum allowable height of a character's character region.

Maximum Character Height

Specifies the maximum allowable height of a character’s character region, in pixels (1 - 5000; default = 100).

Note: This setting will be unavailable unless the Use Maximum Character Height parameter is enabled.

Use Minimum Character Aspect Ratio

Specifies whether or not the function will account for the minimum allowable aspect ratio of a character, where the aspect ratio is defined as the height of the entire line of characters, divided by the width of the character’s character region.

0 = Off

The function will not account for the minimum allowable aspect ratio of a character.

1 = On (default)

The function will account for the minimum allowable aspect ratio of a character. When enabled, a character whose aspect ratio is smaller than this value (i.e. whose width is too large) will be split into pieces that are not too wide.

Minimum Character Aspect Ratio

Specifies the minimum allowable aspect ratio (0 - 500; default = 80) of a character. This setting may be used to indirectly set the maximum width of characters by using the overall height of the line of characters, where the maximum width equals the line height, divided by the Minimum Character Aspect Ratio value. Character fragments wider than allowed by this parameter will not be merged, and will be split to form two or more character fragments that are not too wide (which is controlled by the Character Width Type parameter).

Note: This setting is unavailable unless the Use Minimum Character Aspect Ratio parameter is enabled.
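The relationship between the line height, this parameter, and the implied maximum character width can be sketched in Python. This is an illustrative sketch only, not part of the OCRMax function: the function name is hypothetical, and the sketch assumes that the parameter value is expressed as a percentage (so that 80 means a height-to-width ratio of 0.8), which the 0 - 500 range suggests but this topic does not state.

```python
from typing import Optional


def implied_max_character_width(line_height_px: float,
                                min_aspect_ratio: float) -> Optional[float]:
    """Return the maximum character width implied by the minimum aspect ratio.

    Assumption (not confirmed by the documentation): min_aspect_ratio is a
    percentage, so a value of 80 is treated as a ratio of 0.8.
    """
    if min_aspect_ratio <= 0:
        # A ratio of 0 places no upper bound on the character width.
        return None
    # maximum width = line height / (ratio expressed as a fraction)
    return line_height_px * 100.0 / min_aspect_ratio


# Under the percentage assumption, a 40-pixel-high line with the default
# value of 80 allows characters up to 50 pixels wide.
print(implied_max_character_width(40, 80))
```

Under this reading, raising the parameter value narrows the widest character that is accepted before a fragment is split.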

Angle Range

Specifies the angle search range (0 - 45; default = 0), in degrees.

Note: If the line of text experiences angular rotation or skew, and that rotation or skew is consistent across images, configure the Region to compensate for those variances. Any value above 0 will increase the execution time of the segmentation process.

Skew Range

Specifies the skew search range (0 - 45; default = 0), in degrees.

Note: If the line of text experiences angular rotation or skew, and that rotation or skew is consistent across images, configure the Region to compensate for those variances. Any value above 0 will increase the execution time of the segmentation process.

Character Fragment Merge Mode

Specifies how the function should merge character fragments when forming characters during segmentation. This setting determines whether two character fragments must overlap to be considered part of the same character, or whether character fragments with a horizontal gap between them may also be considered part of the same character.

1 = Require Overlap (default)

Character fragments must overlap horizontally, based on the value in the Minimum Character Fragment Overlap parameter.

2 = Set Min Inter-Character Gap

Character fragments with a horizontal gap between them may be merged to form characters, where any two fragments with a gap less than the value defined in the Min Inter-Character Gap parameter will be merged.

4 = Set Min Inter-Character/Max Intra-Character Gap

Character fragments with a horizontal gap between them may be merged to form characters, with the decision to merge two fragments based on the values defined in the Min Inter-Character Gap and Max Intra-Character Gap parameters.

Minimum Character Fragment Overlap

Specifies the minimum fraction (0 - 100; default = 0) by which two character fragments must overlap each other horizontally, in order for the two character fragments to be considered as part of the same character. The default value (0) will merge any two character fragments that overlap by at least 1 pixel; larger values will require that the character fragments overlap more substantially. For applications where some character pairs are intentionally overlapped horizontally (also called kerning), or cases where the Refine stage cannot completely correct for rotation and/or skew (typically because of inconsistent printing), the default value is necessary.

Max Intra-Character Gap

Specifies the maximum horizontal gap size, in pixels (0 - 1000; default = 5), that can occur within a single character, even for damaged characters.

Note: In the most common application scenario, where the gaps that occur between adjacent characters are always wider than the gaps that may occur within a broken character, set the Min Inter-Character Gap parameter to the Max Intra-Character Gap parameter value, plus 1. For example, if characters never exhibit internal horizontal gaps, the Max Intra-Character Gap parameter would be set to 0, and the Min Inter-Character Gap parameter would be set to 1.

Min Inter-Character Gap

Specifies the minimum horizontal gap size, in pixels (0 - 1000; default = 0), that must occur between two character fragments for them to belong to different characters. The gap is measured from the right edge of the character region of one character, to the left edge of the character region of the next character. If the gap between two fragments is smaller than this value, then the fragments must be considered to be part of the same character, unless the combined character would be too wide (as specified by the Maximum Character Width and/or Minimum Character Aspect Ratio parameters).

Note: In the most common application scenario, where the gaps that occur between adjacent characters are always wider than the gaps that may occur within a broken character, set the Min Inter-Character Gap parameter to the Max Intra-Character Gap parameter value, plus 1. For example, if characters never exhibit internal horizontal gaps, the Max Intra-Character Gap parameter would be set to 0, and the Min Inter-Character Gap parameter would be set to 1.
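The way the two gap parameters interact with the Character Fragment Merge Mode can be sketched in Python. This is a simplified illustration under stated assumptions, not the function's actual algorithm: the names are hypothetical, the width limits (Maximum Character Width and Minimum Character Aspect Ratio) are ignored, and a gap that satisfies both or neither of the two limits is simply reported as "undetermined", because this topic does not say how such cases are resolved.

```python
from enum import IntEnum


class MergeMode(IntEnum):
    """Option codes of the Character Fragment Merge Mode parameter."""
    REQUIRE_OVERLAP = 1
    SET_MIN_INTER_GAP = 2
    SET_MIN_INTER_MAX_INTRA_GAP = 4


def recommended_min_inter_gap(max_intra_gap: int) -> int:
    """Common-case rule from the note above: Max Intra-Character Gap plus 1."""
    return max_intra_gap + 1


def gap_decision(gap_px: int, mode: MergeMode,
                 min_inter_gap: int = 0, max_intra_gap: int = 5) -> str:
    """Classify a non-negative horizontal gap between two fragments.

    Returns "merge", "separate", or "undetermined". Width limits that can
    veto a merge are deliberately left out of this sketch.
    """
    if mode == MergeMode.REQUIRE_OVERLAP:
        # Fragments separated by a gap do not overlap, so they are not merged.
        return "separate"
    below_inter = gap_px < min_inter_gap   # too small to separate characters
    if mode == MergeMode.SET_MIN_INTER_GAP:
        return "merge" if below_inter else "separate"
    above_intra = gap_px > max_intra_gap   # too large to lie inside a character
    if below_inter and not above_intra:
        return "merge"
    if above_intra and not below_inter:
        return "separate"
    return "undetermined"  # assumption: left to the function's own analysis


max_intra = 3
min_inter = recommended_min_inter_gap(max_intra)
mode = MergeMode.SET_MIN_INTER_MAX_INTRA_GAP
print(gap_decision(3, mode, min_inter, max_intra))
print(gap_decision(4, mode, min_inter, max_intra))
```

When the Min Inter-Character Gap is set to the Max Intra-Character Gap plus 1, every gap falls cleanly on one side of the boundary, which is why that pairing is recommended for the common case.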

Minimum Character Fragment Size

Specifies the minimum number of foreground (i.e. text) pixels (0 - 1000; default = 15) that a character fragment must have in order to be considered for possible inclusion in a character. This value must be small enough to keep real pieces of text, while still being large enough to exclude small fragments from background noise. For dot-matrix text and/or text that contains small characters such as periods, a smaller value may be necessary. For solid stroke text, where the characters do not tend to break up into multiple, smaller pieces, a larger value may be necessary.

Minimum Character Size

Specifies the minimum number of foreground (i.e. text) pixels (0 - 5000; default = 30) that a binarized character must have in order to be reported; characters with too few pixels will be discarded. This setting helps to filter background noise or other non-text features.

Advanced Segmentation

Specifies additional parameters to be applied during the segmentation process.

Normalization Mode

Specifies the mode used to normalize the image, which removes background variation to utilize all 256 greyscale values (0 - 255).

Note: This setting should be the same for both training and run-time; changing this setting after training may cause classification issues.

1 = None

No normalization will be performed.

2 = Global

A global normalization is performed, using information in the whole ROI, not local variations. This option is best when the background color and text are consistent across the entire ROI; it is also the fastest mode.

4 = Local (default)

A local normalization is performed, using information about each local character region in the ROI to normalize the image. This option removes background gradients, and should be tried if the Global mode does not work.

8 = Local Advanced

A local normalization is performed, using information about each local character region in the ROI to normalize the image, including adjustments for not only the background, but also the contrast of the foreground. This option removes background gradients and also adjusts for inconsistent text contrast; it also takes the most execution time of the three normalization options.

Use Stroke Width Filter

Specifies whether or not to remove everything from a normalized image that does not have the same stroke width as the rest of the image. This option can remove some types of non-text features, such as thin lines that might otherwise connect nearby characters. This option can also remove some low-contrast background variations that might otherwise exceed the binarization threshold and create false fragments and/or interconnect nearby characters. However, using this option may cause problems if the text does not have a consistent stroke width and/or the text has significant variable contrast (e.g. parts of some strokes are very faint).

Note: This setting should be the same for both training and run-time; changing this setting after training may cause classification issues.

0 = Off

The function will not remove everything in the image that does not have the same stroke width as the rest of the fragments in the image.

1 = On (default)

The function will remove everything in the image that does not have the same stroke width as the rest of the fragments in the image.

Ignore Border Fragments

Specifies whether or not the function will completely ignore any fragments that touch any border of the ROI. Ignoring such fragments can be useful for non-text features, such as the edges of labels, that might be included within the ROI.

Note: If a fragment extends from the border of the character region to the mainline of the text, the fragment will be considered to be a character. The fragment must not extend into the mainline of the text to be excluded when this parameter is enabled.

0 = Off (default)

The function will not ignore border fragments.

1 = On

The function will ignore border fragments.

Binarization Threshold

Specifies a percentage modifier (0 - 100; default = 50) that is used to compute the binarization threshold, in the normalized image, that separates text from background. For example, the default setting of 50% would produce a binarization threshold of roughly 128 (on a 0-255 greyscale range), with 0 equaling Black on White text, and 100 equaling White on Black text. Using a value less than 50% alters the binarization threshold so that there is less text and more background, whereas using a value greater than 50% produces less background and more text. Typically, the default value will correctly binarize the image; however, in cases where there is background texture or other background variations, decreasing this value may help compensate for the background variations.

Note: This setting should be the same for both training and run-time; changing this setting after training may cause classification issues.
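The arithmetic behind the 50% example can be sketched in Python. The linear mapping below is an assumption made for illustration only: this topic states just one data point (50% gives roughly 128), and the function name is hypothetical. The OCRMax function may compute its threshold differently.

```python
def approximate_binarization_threshold(percent: float) -> int:
    """Map a 0 - 100 percentage onto the 0 - 255 greyscale range.

    Assumption: a simple linear mapping, chosen only because it reproduces
    the documented example of 50% giving roughly 128.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(percent / 100.0 * 255)


print(approximate_binarization_threshold(50))
```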

Character Fragment Contrast Threshold

Specifies the minimum amount of contrast [in grayscale levels (0 - 255; default = 30)] of a fragment as a whole, relative to the threshold in the normalized image, in order to be considered for possible inclusion in a character. Each foreground pixel in a fragment is always guaranteed to meet the threshold, but if all of the pixels in a fragment are very close to the threshold, then the fragment is likely just noise, rather than text. A value of 0 will prevent low-contrast fragments from being rejected.

Maximum Fragment Distance to Mainline

Specifies the distance, as a percentage of character height (0 - 1000; default = 0), that a fragment may be vertically removed from the "mainline" running horizontally through the text. Ordinarily, characters are expected to lie along a straight, horizontal line after being refined (the stage of compensating for angular orientation and skew). However, characters may sometimes exhibit "jitter," where there may be vertical drift of the characters' positions. With the default setting, these character fragments would be rejected because they do not lie along the "mainline" of characters. Increasing this parameter setting allows the function to keep fragments that do not lie along the "mainline" of characters.
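A minimal Python sketch of this tolerance check follows. It is an illustration under assumptions rather than the function's implementation: the function name is hypothetical, and the sketch assumes that the vertical offset of a fragment from the mainline and the character height are both already known, in pixels.

```python
def fragment_within_mainline_tolerance(offset_px: float,
                                       character_height_px: float,
                                       max_distance_percent: float = 0) -> bool:
    """Return True if a fragment is close enough to the mainline to be kept.

    offset_px is the vertical distance of the fragment from the mainline
    (assumed to be measured elsewhere); max_distance_percent is the
    Maximum Fragment Distance to Mainline value (0 - 1000).
    """
    allowed_px = character_height_px * max_distance_percent / 100.0
    return abs(offset_px) <= allowed_px


# A fragment drifting 3 pixels off the mainline of 20-pixel-high text is
# rejected at the default of 0, but kept at a setting of 25 (5 pixels).
print(fragment_within_mainline_tolerance(3, 20, 0))
print(fragment_within_mainline_tolerance(3, 20, 25))
```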

Segmentation Analysis Mode

Specifies the type of character analysis mode to perform to determine the optimal character segmentation. This parameter determines whether or not to perform additional analysis beyond the Group stage.

1 = Minimal

Performs basic segmentation analysis; follows the Group stage analysis only.

2 = Standard (default)

Performs additional analysis of the line as a whole, including character spacing, to determine the optimal segmentation. When enabled, the function will be affected by the Minimum Pitch, Character Pitch Type, and Character Pitch Position parameters.

Minimum Pitch

Specifies the minimum pitch, in pixels (0 - 1000; default = 0), that can occur between two characters, where the pitch is computed based on the Character Pitch Position parameter. If the pitch between two fragments is smaller than this, then they must be considered to be part of the same character, unless the combined character would be too wide (as specified by the Maximum Character Width and/or Minimum Character Aspect Ratio parameters).

Character Pitch Position

Specifies how the pitch between two successive characters should be measured. Pitch is defined as the distance between (approximately) corresponding points on adjacent characters, and not the distance from the end of one character to the beginning of the next character (which is called the inter-character gap). For fixed-pitch fonts in which the characters have different widths, typically the pitch will have a consistent value only when using the appropriate pitch metric.

Note: If the Segmentation Analysis Mode parameter is set to Minimal, this parameter will be disabled.

1 = Auto (default)

Specifies that an unknown metric is being used; the appropriate pitch may be any of the other pitch positions, or else there is not a constant pitch position (as may be the case for a proportional-pitch font), and the OCRMax function will determine the appropriate pitch metric.

2 = Left-to-Left

Specifies that pitch is measured as the distance from the left-side of a character's character region to the left-side of the adjacent character's character region.

Note: The terms "left" and "right" are relative to the coordinate axis defined by the ROI, i.e. "right" is equal to the positive X direction.

4 = Center-to-Center

Specifies that pitch is measured as the distance from the center of a character's character region to the center of the adjacent character's character region.

8 = Right-to-Right

Specifies that pitch is measured as the distance from the right-side of a character's character region to the right-side of the adjacent character's character region.

Note: The terms "left" and "right" are relative to the coordinate axis defined by the ROI, i.e. "right" is equal to the positive X direction.
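The three explicit pitch metrics, and the inter-character gap they are contrasted with, can be sketched in Python. The class and function names are hypothetical, and the character regions are reduced to their left and right X coordinates in the ROI's coordinate system; the example data is invented to show a fixed-pitch font with characters of different widths.

```python
from dataclasses import dataclass
from enum import IntEnum


class PitchPosition(IntEnum):
    """Option codes of the Character Pitch Position parameter."""
    AUTO = 1
    LEFT_TO_LEFT = 2
    CENTER_TO_CENTER = 4
    RIGHT_TO_RIGHT = 8


@dataclass(frozen=True)
class CharRegion:
    left: float   # X of the left side of the character region
    right: float  # X of the right side of the character region

    @property
    def center(self) -> float:
        return (self.left + self.right) / 2.0


def pitch(a: CharRegion, b: CharRegion, position: PitchPosition) -> float:
    """Distance between corresponding points of two adjacent characters."""
    if position == PitchPosition.LEFT_TO_LEFT:
        return b.left - a.left
    if position == PitchPosition.CENTER_TO_CENTER:
        return b.center - a.center
    if position == PitchPosition.RIGHT_TO_RIGHT:
        return b.right - a.right
    raise ValueError("Auto has no single formula; the function chooses the metric")


def inter_character_gap(a: CharRegion, b: CharRegion) -> float:
    """End of one character to the start of the next (not a pitch)."""
    return b.left - a.right


# Invented example: "i", "M", "i" centered in 20-pixel-wide cells.
chars = [CharRegion(8, 12), CharRegion(22, 38), CharRegion(48, 52)]
for position in (PitchPosition.LEFT_TO_LEFT,
                 PitchPosition.CENTER_TO_CENTER,
                 PitchPosition.RIGHT_TO_RIGHT):
    values = [pitch(a, b, position) for a, b in zip(chars, chars[1:])]
    print(position.name, values)
```

In this example only the center-to-center metric gives the same value for both character pairs, which illustrates why a fixed-pitch font with varying character widths needs the appropriate pitch metric.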

Character Pitch Type

Specifies whether the spacing between successive characters has a single, fixed value, or varies, depending on the particular characters. Pitch is defined as the distance between (approximately) corresponding points on adjacent characters, and not the distance from the end of one character to the beginning of the next character (which is called the inter-character gap).

Note: If the Segmentation Analysis Mode parameter is set to Minimal, this parameter will be disabled.

1 = Auto (default)

Specifies that pitch type is unknown, but the pitch type is expected to be either fixed or proportional, and not variable. The OCRMax function will determine the best type.

2 = Fixed

Specifies that pitch is fixed, which means that the pitch between any pair of characters is constant, e.g. regardless of the width of the character rectangles of the characters. The pitch is measured based on the Character Pitch Position parameter.

4 = Proportional

Specifies that the pitch is proportional, which means that the pitch between any pair of characters depends on the particular characters. For example, the distance between two "i" lowercase letters may be much less than the distance between two "M" uppercase letters.

Note:
  • Although no pitch measurement is constant throughout a string, typically the inter-character gap, which is the distance from the right-side of one character's character region to the left-side of the adjacent character's character region, is approximately constant.
  • The terms "left" and "right" are relative to the coordinate axis defined by the ROI, i.e. "right" is equal to the positive X direction.

8 = Variable

Specifies that no character-to-character distance metric is consistent throughout a string, e.g. character placement is erratic, and the pitch is neither fixed nor proportional. The OCRMax function should not try to determine the type of pitch.

Note: Variable pitch is different from Auto, since Auto assumes that the pitch is either fixed or proportional, but it is not known which.

Advanced Classification

Specifies parameters that adjust the Classification process, i.e. how the function classifies characters based on the difference between the trained character and the character in an acquired image.

Classifier Template Size Width

Specifies an acceptable X-scale factor (5 - 50; default = 10) between the trained character rectangle and the run-time character rectangles. When enabled, only instances that satisfy the scale constraints are computed.

Classifier Template Size Height

Specifies an acceptable Y-scale factor (5 - 50; default = 18) between the trained character rectangle and the run-time character rectangles. When enabled, only instances that satisfy the scale constraints are computed.

Maintain Character Aspect Ratio

Specifies whether or not the function should maintain the aspect ratio when resampling the unwrapped image at run-time.

0 = Off (default)

The function will not maintain the aspect ratio when resampling the unwrapped image at run-time.

1 = On

The function will maintain the aspect ratio when resampling the unwrapped image at run-time.

Skip Additional Character Validation

Specifies whether or not the function will perform additional character validation during classification.

0 = Off

The function will perform additional character validation during classification; this reduces the chances for misreads (i.e. false acceptance of a character).

1 = On (default)

The function will not perform additional character validation during classification. However, this option introduces the possibility for misreads.

Image Preprocessing

Specifies an image preprocessing function to perform prior to classification.

0 = None

The function will not perform any image preprocessing prior to classification.

1 = Normalized Histogram (default)

The function will normalize the image based on a histogram.

2 = Subtract Median

The function will subtract based on the median of a local neighborhood.

3 = Normalized Histogram & Subtract Median

The function will perform both a normalized histogram and subtract median image preprocessing during classification.

Note: When this option is selected, the histogram normalization is performed before the median subtraction.

Fielding

Fielding is used to provide information about what characters are expected at different positions in the string.

Ignore Unfielded Characters

Specifies whether or not, at each character position, to constrain the results to only include characters specified by the character's fielding. When enabled, all other characters in the font will be ignored, regardless of their classification score.

0 = Off (default)

The function will not constrain results based on the character's fielding.

1 = On

The function will constrain results based on the character's fielding.

String Length

Specifies whether or not fielding will be run in fixed-length or variable-length mode.

0 = Variable

The function will operate in variable-length mode.

1 = Fixed (default)

The function will operate in fixed-length mode.

Maximum String Length

Specifies the maximum acceptable string length (0 - 100; default = 25).

Note: This parameter is disabled if the String Length parameter is set to Fixed.

Minimum String Length

Specifies the minimum acceptable string length (0 - 100; default = 1).

Note: This parameter is disabled if the String Length parameter is set to Fixed.

Maximum First Fielded Index

Specifies the fielding subsequences to be considered, which must start at a position that is no greater than this index value (0 - 100; default = 100).

Note: This parameter is disabled if the String Length parameter is set to Fixed.

Minimum Last Fielded Index

Specifies the fielding subsequences to be considered, which must end at a position that is no less than this index value (0 - 100; default = 0).

Note: This parameter is disabled if the String Length parameter is set to Fixed.
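Because the function takes 39 positional arguments, it can help to assemble the argument list programmatically. The Python sketch below is a hypothetical helper, not a Cognex API: it records each parameter's documented values and default in the order of the Syntax line (names follow the parameter headings above), checks any overrides against them, and produces the formula text. It assumes that all values are passed as integers, and how the resulting text is sent to a device is outside its scope.

```python
ON_OFF = (0, 1)

# (name, allowed values, default), in the positional order of the Syntax line.
PARAMETERS = [
    ("Segmentation.Character Polarity", (1, 2, 4), 4),
    ("Segmentation.Character Width Type", (1, 2, 4), 1),
    ("Segmentation.Minimum Character Width", range(1, 1001), 3),
    ("Segmentation.Use Maximum Character Width", ON_OFF, 0),
    ("Segmentation.Maximum Character Width", range(1, 5001), 100),
    ("Segmentation.Minimum Character Height", range(1, 1001), 3),
    ("Segmentation.Use Maximum Character Height", ON_OFF, 0),
    ("Segmentation.Maximum Character Height", range(1, 5001), 100),
    ("Segmentation.Use Minimum Character Aspect Ratio", ON_OFF, 1),
    ("Segmentation.Minimum Character Aspect Ratio", range(0, 501), 80),
    ("Segmentation.Angle Range", range(0, 46), 0),
    ("Segmentation.Skew Range", range(0, 46), 0),
    ("Segmentation.Character Fragment Merge Mode", (1, 2, 4), 1),
    ("Segmentation.Minimum Character Fragment Overlap", range(0, 101), 0),
    ("Segmentation.Max Intra-Character Gap", range(0, 1001), 5),
    ("Segmentation.Min Inter-Character Gap", range(0, 1001), 0),
    ("Segmentation.Minimum Character Fragment Size", range(0, 1001), 15),
    ("Segmentation.Minimum Character Size", range(0, 5001), 30),
    ("Advanced Segmentation.Normalization Mode", (1, 2, 4, 8), 4),
    ("Advanced Segmentation.Use Stroke Width Filter", ON_OFF, 1),
    ("Advanced Segmentation.Ignore Border Fragments", ON_OFF, 0),
    ("Advanced Segmentation.Binarization Threshold", range(0, 101), 50),
    ("Advanced Segmentation.Character Fragment Contrast Threshold", range(0, 256), 30),
    ("Advanced Segmentation.Maximum Fragment Distance to Mainline", range(0, 1001), 0),
    ("Advanced Segmentation.Segmentation Analysis Mode", (1, 2), 2),
    ("Advanced Segmentation.Minimum Pitch", range(0, 1001), 0),
    ("Advanced Segmentation.Character Pitch Position", (1, 2, 4, 8), 1),
    ("Advanced Segmentation.Character Pitch Type", (1, 2, 4, 8), 1),
    ("Advanced Classification.Classifier Template Size Width", range(5, 51), 10),
    ("Advanced Classification.Classifier Template Size Height", range(5, 51), 18),
    ("Advanced Classification.Maintain Character Aspect Ratio", ON_OFF, 0),
    ("Advanced Classification.Skip Additional Character Validation", ON_OFF, 1),
    ("Advanced Classification.Image Preprocessing", (0, 1, 2, 3), 1),
    ("Fielding.Ignore Unfielded Characters", ON_OFF, 0),
    ("Fielding.String Length", ON_OFF, 1),
    ("Fielding.Maximum String Length", range(0, 101), 25),
    ("Fielding.Minimum String Length", range(0, 101), 1),
    ("Fielding.Maximum First Fielded Index", range(0, 101), 100),
    ("Fielding.Minimum Last Fielded Index", range(0, 101), 0),
]


def build_formula(overrides=None) -> str:
    """Return the OCRMaxSettings(...) text, using defaults where not overridden."""
    overrides = dict(overrides or {})
    unknown = set(overrides) - {name for name, _, _ in PARAMETERS}
    if unknown:
        raise KeyError(f"Unknown parameter(s): {sorted(unknown)}")
    values = []
    for name, allowed, default in PARAMETERS:
        value = overrides.get(name, default)
        # Assumption: only integer values are accepted.
        if not isinstance(value, int) or value not in allowed:
            raise ValueError(f"{name}: {value!r} is not a documented value")
        values.append(str(value))
    return "OCRMaxSettings(" + ",".join(values) + ")"


print(build_formula({"Segmentation.Character Polarity": 1,
                     "Segmentation.Angle Range": 10}))
```

Checking values before the formula is built surfaces out-of-range inputs early, rather than leaving them to appear as #ERR in the function's output.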

OCRMaxSettings Outputs

Returns

A Settings data structure containing the character string that was read, or #ERR if any of the input parameters are invalid.

For more information, see OCV/OCR, OCRMax, or Configure the OCRMax Function.