AI used to extract financial data from multi-language financial reports

Annual and other financial reports are usually voluminous documents containing text, tables, and columns that do not follow any standard format. Extracting key financial information is typically a manual process that is costly, slow, and error-prone.

Infrrd IDP (Intelligent Data Capture) Platform uses Optical Character Recognition (OCR), Computer Vision and Artificial Intelligence to extract and translate the financial data. The four-step process is designed for speed and accuracy.

  1. Pre-processing

The documents are imported in PDF format. The system corrects any misalignments, removes background noise, and identifies tables and columns using Machine Learning algorithms.

  2. Content Extraction

The IDP OCR engine, which supports over 90 languages, extracts the required text.

  3. Creating a Searchable Library

The extracted text is transformed into a searchable HTML format while preserving the original table format.

  4. Language Translation

Text translation is performed using Deep Neural Networks (a type of Artificial Intelligence). Translations are performed from languages with a wide range of character sets, such as Hebrew, Russian and Vietnamese.
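To make step 3 (the searchable library) more concrete, here is a minimal Python sketch of how extracted table rows might be turned into HTML while keeping the table structure, plus a simple keyword search over the rows. It is an illustration only, not Infrrd's implementation: the function names `table_to_html` and `search_rows` and the sample data are hypothetical, and the sketch assumes the OCR step has already produced rows of cell strings.

```python
from html import escape


def table_to_html(rows, header=True):
    """Render rows (lists of cell strings) as an HTML table.

    The first row is treated as a header row when `header` is True.
    Cell text is escaped so characters such as '&' stay valid HTML.
    """
    parts = ["<table>"]
    for i, row in enumerate(rows):
        tag = "th" if header and i == 0 else "td"
        cells = "".join(f"<{tag}>{escape(str(cell))}</{tag}>" for cell in row)
        parts.append(f"  <tr>{cells}</tr>")
    parts.append("</table>")
    return "\n".join(parts)


def search_rows(rows, term):
    """Return the rows in which any cell contains `term` (case-insensitive)."""
    needle = term.casefold()
    return [
        row for row in rows
        if any(needle in str(cell).casefold() for cell in row)
    ]


# Hypothetical OCR output for one table (made-up figures).
extracted = [
    ["Item", "2019", "2018"],
    ["Revenue", "1,200", "1,050"],
    ["R&D expenses", "300", "280"],
]

html_table = table_to_html(extracted)
matches = search_rows(extracted, "revenue")
```

Keeping each row as a list of cells until the final rendering step is what lets the original table layout survive the conversion. A production system would more likely index the text in a dedicated search engine than scan rows in memory.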

A financial research company processed over 2 million annual reports written in 37 languages, saving 63% of time and costs while delivering 100% accurate data.
