Blue Read Tool – Labeling

To measure the performance of the Blue Read tool on your images, you need to compare the characters identified by the tool with the actual character values in the image. Labeling lets you specify the positions and values of the characters in your images. Once your image set is partially or completely labeled, you can do two important things:

  • You can compute statistical measures of the performance of the tool on your images, including Recall, Confusion, Precision and F-Score.
  • You can perform incremental training of your tool, improving its performance by giving it examples of how specific characters appear in your images.
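As a rough illustration of how such measures follow from labels, the sketch below compares labeled characters with the characters a tool reports and computes Precision, Recall and F-Score using their standard textbook definitions. The (slot, value) representation, the exact-match rule and the function name score_characters are assumptions made for this example; they are not the tool's own data format or matching logic.

```python
from collections import Counter


def score_characters(labeled, found):
    """Compare labeled characters with found characters.

    Both arguments are lists of (slot, value) pairs. This is a
    hypothetical representation chosen for the example, not the
    tool's internal format.
    """
    labeled_counts = Counter(labeled)
    found_counts = Counter(found)

    # A found character counts as correct only if a label with the
    # same slot and the same value exists.
    true_positives = sum((labeled_counts & found_counts).values())
    total_found = sum(found_counts.values())
    total_labeled = sum(labeled_counts.values())

    precision = true_positives / total_found if total_found else 0.0
    recall = true_positives / total_labeled if total_labeled else 0.0
    if precision + recall:
        f_score = 2 * precision * recall / (precision + recall)
    else:
        f_score = 0.0
    return precision, recall, f_score


# Four labeled characters; the tool reads three, one with a wrong value.
labeled = [(0, "A"), (1, "B"), (2, "8"), (3, "D")]
found = [(0, "A"), (1, "B"), (2, "B")]

precision, recall, f_score = score_characters(labeled, found)
print(f"precision={precision:.3f} recall={recall:.3f} f_score={f_score:.3f}")
```

Here two of the three found characters match a label, so Precision is 2/3; two of the four labeled characters were read correctly, so Recall is 1/2.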

The images in your image set are either labeled or unlabeled. Labeled images are indicated by the presence of green graphics, either features or strings, on the display.

When you label an image, label every character in it. Partially labeled images cause two problems. First, they invalidate the statistical measurements, because the tool's correct reads of unlabeled characters are counted as “spurious” or “unexpected” characters. Second, they cause incremental training to reduce, rather than improve, the accuracy of the tool, because the tool assumes that anything left unlabeled in a labeled image is not a character.
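The effect of partial labeling on the statistics can be seen in a small sketch. It assumes a hypothetical image with four characters that the tool reads perfectly, and it scores the same result against complete and incomplete labels. The (slot, value) pairs and the helper name precision_against are illustrative assumptions, not part of the tool.

```python
def precision_against(labeled, found):
    """Fraction of found characters that match a label exactly.

    Characters are (slot, value) pairs, a hypothetical representation
    used only for this example.
    """
    labeled_set = set(labeled)
    matched = sum(1 for item in found if item in labeled_set)
    return matched / len(found) if found else 0.0


# The image really contains four characters, and the tool reads all of them.
actual = [(0, "A"), (1, "B"), (2, "8"), (3, "D")]
found = list(actual)

# Fully labeled: every read matches a label.
full_precision = precision_against(actual, found)

# Partially labeled: only the first two characters were labeled, so the
# two remaining correct reads are scored as spurious characters.
partial_precision = precision_against(actual[:2], found)

print(full_precision, partial_precision)
```

The tool's output is identical in both cases; only the labels differ, yet the measured precision drops from 1.0 to 0.5 when half of the characters are left unlabeled.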