Digitization in the corporate back office starts with text recognition. In many companies, invoices, delivery notes, applications and other documents still arrive on paper. Before the information in these documents can be used in data processing, it has to be captured manually and entered into the system. Text recognition software, known as OCR (Optical Character Recognition), speeds up this laborious and error-prone process. It converts the content of scanned paper documents into machine-readable data. In this form, the data can be made available to all departments for further processing, used for evaluations and archived.
HOW DOES OCR WORK?
On closer inspection, the term “text recognition” is a bit of an exaggeration for OCR. The process actually recognizes individual characters, which is where it gets its name. Here’s how it works: when paper documents are scanned or digitally photographed, so-called raster graphics are created. These contain no letters as such, only the presence or absence of color dots arranged in rows and columns. The color dots form patterns, and these are compared with reference patterns in a database so that they can be interpreted as letters, numbers or punctuation marks.
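The pattern comparison described above can be illustrated with a minimal sketch. It assumes tiny 3x5 dot grids and a hypothetical three-entry pattern database (`TEMPLATES`); real OCR engines work on far larger images and use more robust matching, so this only shows the principle of picking the reference pattern that agrees with the scanned dots in the most positions.

```python
# Hypothetical reference patterns: '#' = color dot present, '.' = absent.
TEMPLATES = {
    "I": ["###", ".#.", ".#.", ".#.", "###"],
    "L": ["#..", "#..", "#..", "#..", "###"],
    "T": ["###", ".#.", ".#.", ".#.", ".#."],
}

def to_bits(rows):
    """Flatten a grid of rows into a list of booleans, one per dot position."""
    return [ch == "#" for row in rows for ch in row]

def match_score(glyph, template):
    """Fraction of dot positions on which glyph and template agree."""
    a, b = to_bits(glyph), to_bits(template)
    return sum(x == y for x, y in zip(a, b)) / len(a)

def recognize(glyph):
    """Return the best-matching character and its agreement score."""
    scores = {char: match_score(glyph, rows) for char, rows in TEMPLATES.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]

# A "T" with one dot missing, as might result from an unclean printout.
noisy_t = ["###", ".#.", ".#.", "...", ".#."]
print(recognize(noisy_t))
```

In this sketch the damaged glyph is still read as "T" because only one of its fifteen positions differs from the reference. The more dots are lost or added, the closer the scores of similar characters move together, which is where misreadings come from.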
The process is error-prone, because similar-looking characters are easily confused during interpretation. In addition, physical flaws in the original document, such as an unclean printout, creases in the paper or soiling, affect how recognizable the patterns are and thus how reliably they can be matched to the database patterns. Early OCR was so error-prone that special fonts were developed whose characters were designed to be clearly distinguishable from one another. The aim was to rule out errors as far as possible in sensitive areas such as payment transactions. In other application areas, when a word such as “payment”, “order” or “cancellation” is misread, manual reworking is required to arrive at the correct result.
WHAT DOES ICR DO DIFFERENTLY?
As OCR has evolved, processing steps at the pixel level and pattern recognition have improved significantly. But it is only with ICR (Intelligent Character Recognition), the AI-based further development of OCR, that recognition reaches a new dimension. At the character level, ICR explores the scope for interpreting a pattern and checks whether it should be read, for example, as “8” or “B”, “6” or “b”, “m” or “rn”. To do this, the procedure takes the character’s surroundings into account and considers the entire group of characters. It resolves the interpretation with the help of statistical methods (how often does a character occur in the context of the surrounding characters?) as well as reference patterns or dictionaries. The output is the character that the system judges most likely to be correct; if necessary, it changes the parameters and checks the pattern and character again.
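The contextual check described above can be sketched in a few lines. This is an illustrative assumption, not how any particular ICR product works: the confusable pairs are the ones named in the text, while the word list with its counts (`WORD_COUNTS`) and the function names are hypothetical stand-ins for a real dictionary and statistical model.

```python
from itertools import product

# Confusable single characters named in the text, each with its alternative.
CONFUSIONS = {"8": ["8", "B"], "B": ["B", "8"], "6": ["6", "b"], "b": ["b", "6"]}

# Hypothetical dictionary with word frequencies (assumed values).
WORD_COUNTS = {"payment": 120, "order": 95, "bill": 60, "modern": 30}

def split_units(token):
    """Split a token into units, each with its list of possible readings."""
    units, i = [], 0
    while i < len(token):
        if token.startswith("rn", i):      # "rn" may really be "m"
            units.append(["rn", "m"])
            i += 2
        elif token[i] == "m":              # "m" may really be "rn"
            units.append(["m", "rn"])
            i += 1
        else:
            units.append(CONFUSIONS.get(token[i], [token[i]]))
            i += 1
    return units

def correct(token):
    """Pick the most frequent dictionary word among all possible readings."""
    readings = ("".join(parts) for parts in product(*split_units(token)))
    known = [r for r in readings if r.lower() in WORD_COUNTS]
    if not known:
        return token  # no reading is in the dictionary: leave unchanged
    return max(known, key=lambda r: WORD_COUNTS[r.lower()])

for raw in ["payrnent", "6ill", "modem", "order"]:
    print(raw, "->", correct(raw))
```

The sketch enumerates every combination of readings, which is only practical for short tokens with few ambiguous characters. It also decides purely on whole-word frequency; weighing how often a character occurs next to its neighbors, as the text describes, would need an additional character-level model.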
Through contextual interpretation, ICR achieves a very high degree of accuracy in character recognition and thus creates the basis for further processing and evaluating the information with almost no human intervention.