DIGI-XTRACT

A fully automated data extraction solution that can eliminate the need for human intervention.

An Optical Character Recognition (OCR) solution built on Machine Learning and Deep Learning technology that performs document classification, data extraction, and quality control across various document types.

Processed document types: Invoices, application forms, historical documents, handwritten forms, etc. 

DIGI-XTRACT supports multiple languages and can also be customized for special document types specific to the client’s business. The service can be hosted securely and remotely at DIGI-TEXX’s Data Center or deployed at the client’s premises using state-of-the-art technologies.

DIGI-XTRACT COMPONENTS

AUTO CLASSIFY

DIGI-XTRACT recognizes and classifies various document types automatically using the Auto Classify component, which detects document types based on vectorization and big data analytics. The system then routes each document to the Auto Extract component, which extracts the data for an optimized accuracy rate.

Common document types from individuals/organizations include:

  • Application Forms and Historical Documents (Marriage Certificate, Death Certificate, etc.)
  • Identity Documents: ID Card, Passport, Birth Certificate
  • Employment Documents: Labor Contract, Work Permit, Confirmation Letter
  • Income Documents: Bank Statement, Payslip
  • Financial and Accounting Documents: Invoice, Contract, Financial Statement, etc.
  • Other Document types: Examination Sheet, Catalog Book, Land Register Book
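
The idea of classifying by vectorization can be illustrated with a minimal sketch. It assumes the document text is already available from OCR, and it uses a simple bag-of-words vector with cosine similarity as a stand-in technique; the source does not describe the actual model. The `EXAMPLES` keywords and all function names are hypothetical.

```python
import math
import re
from collections import Counter

def vectorize(text):
    """Turn text into a bag-of-words vector (word -> count)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Hypothetical reference vocabulary per document type (illustrative only).
EXAMPLES = {
    "invoice": "invoice number total amount due vat payment terms",
    "payslip": "payslip employee gross salary net pay deductions",
    "passport": "passport nationality date of birth place of issue",
}

def classify(text, examples):
    """Return the best-matching document type and its similarity score."""
    vec = vectorize(text)
    scores = {label: cosine(vec, vectorize(sample))
              for label, sample in examples.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]
```

For example, `classify("Invoice No. 42, total amount due: 100 EUR", EXAMPLES)` selects the `"invoice"` type. A production system would likely learn its vectors from large volumes of labelled documents rather than from keyword lists.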

AUTO EXTRACT

With predefined data fields, the Auto Extract component locates each data field on the image and runs the extraction securely on the snipped image only. As a result, the full information on client documents is never seen by or shared with any third party.

Auto Extract produces a confidence score for each data field. The Auto QC component then uses this score in its rule set to determine the quality of the extraction.

AUTO QC

Auto QC runs the quality control based on a complex scoring combination:

  • Common rules such as the format of IBAN Number, ID Card, Postal Code, Age, Gender, Date/Time, etc.
  • Business rules based on the client’s business domain
  • Data Field Relationships such as {age, gender, disease}, {title, salary, business}, {hospital, treatment, age, gender}, etc.
  • Image Quality Analytics: clear/unclear, blurred, skewed, flipped, distorted, low resolution
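
A common format rule can be sketched as follows. The IBAN check uses the standard mod-97 checksum from ISO 13616; the date rule, the `RULES` registry, and the field names `iban` and `date` are hypothetical and only illustrate how such rules might be combined.

```python
import re
from datetime import datetime

def is_valid_iban(iban):
    """Check the IBAN structure and its ISO 13616 mod-97 checksum."""
    s = iban.replace(" ", "").upper()
    if not re.fullmatch(r"[A-Z]{2}[0-9]{2}[A-Z0-9]{11,30}", s):
        return False
    rearranged = s[4:] + s[:4]  # move country code and check digits to the end
    digits = "".join(str(int(ch, 36)) for ch in rearranged)  # A=10 ... Z=35
    return int(digits) % 97 == 1

def is_valid_date(value, fmt="%Y-%m-%d"):
    """Check that the value is a real calendar date in the assumed format."""
    try:
        datetime.strptime(value, fmt)
        return True
    except ValueError:
        return False

# Hypothetical rule registry: field name -> validation function.
RULES = {"iban": is_valid_iban, "date": is_valid_date}

def failed_rules(record):
    """Return the names of the fields that break a common rule."""
    return [name for name, value in record.items()
            if name in RULES and not RULES[name](value)]
```

With these assumptions, a record holding a valid IBAN and the date `2023-02-30` would be flagged on the `date` field only.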

Traditional quality control relies on various methodologies that involve human review. With Auto QC, the process is broken down into data levels and tracked by metadata at each step. Auto QC checks 100 percent of the processed data and flags potential errors.

Using the score, Auto QC detects potential errors and controls the Straight-Through-Rate (STR): the system decides whether to let the data pass through or to send it to the data correction step for quality enhancement.
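
The pass-or-correct decision can be sketched as a simple threshold on the field confidence scores. This is a minimal illustration, assuming scores between 0 and 1 and assuming STR means the share of documents that pass without correction; the `Field` type, the function names, and the 0.9 threshold are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Field:
    name: str
    value: str
    confidence: float  # assumed range: 0.0 to 1.0

def route(fields, threshold=0.9):
    """Return 'pass' if every field meets the threshold, else 'correct'."""
    if all(f.confidence >= threshold for f in fields):
        return "pass"
    return "correct"

def straight_through_rate(documents, threshold=0.9):
    """Share of documents that pass without the data correction step."""
    if not documents:
        return 0.0
    passed = sum(1 for fields in documents if route(fields, threshold) == "pass")
    return passed / len(documents)
```

Raising the threshold sends more documents to data correction and lowers the STR, while lowering it lets more data go straight through. That trade-off is what the score allows the system to control.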

DIGI-XTRACT FEATURES

Automated extraction of unstructured/semi-structured data

Manual data entry elimination

API gateway integration

Web Monitoring Services for real-time tracking and automatic reporting functions

High performance and quality

High availability of back-end processing systems

PROCESS OF THE PRODUCT

STRAIGHT-THROUGH PROCESS (STP)/AUTOMATION PROCESS

GUARANTEED PROCESS/AUTOMATION PROCESS WITH HUMAN TOUCH

ACCURACY RATE

Our accuracy rate is based on a confidence score that measures the certainty of the data extracted from the original image. A higher accuracy rate, which depends on the quality of the assessed document, brings better data quality and supports analytic purposes.

DIGI-XTRACT uses intelligent engines to ensure the accuracy rate meets the client’s expectations.

The accuracy rate can be measured by various units such as character, word, field, and line.
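
The difference between these units can be shown with a short sketch that compares extracted data against a verified reference. It assumes accuracy is defined as one minus the edit distance divided by the reference length, which is a common convention and not necessarily the formula DIGI-XTRACT uses; all function names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (x != y)))  # substitution
        prev = curr
    return prev[-1]

def accuracy(reference, extracted):
    """Accuracy of the extracted sequence against the reference, 0.0 to 1.0."""
    if not reference:
        return 1.0 if not extracted else 0.0
    return max(0.0, 1 - edit_distance(reference, extracted) / len(reference))

def char_accuracy(ref, out):
    return accuracy(ref, out)

def word_accuracy(ref, out):
    return accuracy(ref.split(), out.split())

def line_accuracy(ref, out):
    return accuracy(ref.splitlines(), out.splitlines())

def field_accuracy(ref, out):
    """Share of reference fields whose extracted value matches exactly."""
    if not ref:
        return 1.0
    return sum(out.get(k) == v for k, v in ref.items()) / len(ref)
```

A single misread character illustrates why the unit matters: reading `INVOICE 1234` as `INV0ICE 1234` gives a character accuracy of about 92 percent but a word accuracy of only 50 percent.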

CLIENT SUPPORT

DIGI-XTRACT is supported and delivered by an excellent onboarding team partnered with the client experience team.

All projects are monitored 24/7 by our Network Operating Center to ensure optimal service availability.  

DIGI-TEXX provides an end-to-end client experience, from the first step of analysis to the final step of implementation and enhancement. On top of that, the Support Team accompanies clients throughout the whole operation phase to ensure a smooth transition and successful delivery.

WHAT MAKES US DIFFERENT?

01

AUTOMATION WITH 24/7 MONITORING

Fully automated solution with no human intervention and a transparent process with Web Monitoring Services that provide data status for each step.

02

EASY INTEGRATION AND QUICK SETUP

Customized transfer methods (Secure Transfer Protocols, API, Email) are available on demand to fit the client’s system. Setup takes 2-4 weeks.

03

FLEXIBLE PRICING RATES

Attractive pricing model based on transactions, subscriptions, and fixed volumes to suit all clients’ needs.

CASE STUDIES

Straight-Through Process for Customer Onboarding

An automated solution that requires no manual intervention and drives operational efficiency.

Automated Insurance Claims

An intelligent automation solution that reduces complex claims document processing time from days to minutes.

Digital Inspection System

Spend less time collecting data from paperwork and more time improving your inspection performance.