As part of the Digital Transformation trend, there is a growing need to convert unstructured data from different document formats into structured digital form for information management.

DIGI-XTRACT is a solution developed by the DIGI-TEXX software team using cutting-edge technologies such as Machine Learning, Deep Learning, Computer Vision, and Natural Language Processing.

DIGI-XTRACT is an intelligent automation solution that performs document classification, data extraction, and quality control on various document types such as invoices, application forms, historical documents, and handwritten forms. It supports multiple languages, including English, German, Japanese, and Vietnamese, and can also be customized for special document types and languages according to the client’s business.

The product is built from three main components: AutoClassify, AutoExtract, and AutoQC.

DIGI-XTRACT recognizes and classifies various document types automatically using the AutoClassify component, which detects document types based on vectorization and big data analytics. The system then routes the document to the AutoExtract component to extract data for an optimized accuracy rate.
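To illustrate what classification based on vectorization can look like, here is a minimal sketch. It is not the AutoClassify implementation: the bag-of-words vectors, the cosine similarity measure, the `REFERENCE` texts, and the function names are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical reference texts, one per document type (illustration only).
REFERENCE = {
    "invoice": "invoice number total amount due vat payment terms",
    "application_form": "applicant name date of birth address signature application",
}


def vectorize(text):
    """Turn text into a bag-of-words term-frequency vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def classify(text):
    """Return the best-matching document type and its similarity score."""
    vec = vectorize(text)
    scores = {label: cosine(vec, vectorize(ref)) for label, ref in REFERENCE.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]
```

For example, `classify("Invoice number 2024-17, total amount due 120 EUR")` selects the `invoice` type because it shares the most terms with that reference text. A production system would typically learn its vectors from many labeled documents rather than from a single reference text per type.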

Common document types from individual/organization include:

AutoClassify is applied:

With predefined data fields, the AutoExtract component locates the correct data field on the image and processes the extraction securely based on the snipped image. As a result, the full content of a client document is never seen by or shared with any third party.

AutoExtract produces a confidence score for each data field. The score can then be used by the rules of the AutoQC component to determine the quality of the extraction.

AutoExtract is applied:

AutoQC runs quality control based on a complex scoring combination:

  • Common rules such as the format of IBAN Number, ID Card, Postal Code, Age, Gender, Date/Time, etc.
  • Business rules based on the client business domain
  • Data Field Relationships such as {age, gender, disease}, {title, salary, business}, {hospital, treatment, age, gender}, etc.
  • Image Quality Analytics: clear/unclear, blurred, skewed, flipped, distorted, low resolution
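The rule types above can be sketched in a few lines. This is only an illustration under stated assumptions, not the AutoQC rule set: the field names, the `YYYY-MM-DD` date format, the age range of 0 to 130, and the age/disease relationship rule are hypothetical. The IBAN check uses the standard mod-97 checksum.

```python
import re
from datetime import datetime


def valid_iban(iban):
    """Common rule: IBAN structure plus the standard mod-97 checksum."""
    s = iban.replace(" ", "").upper()
    if not re.fullmatch(r"[A-Z]{2}\d{2}[A-Z0-9]{11,30}", s):
        return False
    rearranged = s[4:] + s[:4]  # move country code and check digits to the end
    digits = "".join(str(int(ch, 36)) for ch in rearranged)  # A=10 ... Z=35
    return int(digits) % 97 == 1


def valid_date(value):
    """Common rule: the date must parse as YYYY-MM-DD (assumed format)."""
    try:
        datetime.strptime(value, "%Y-%m-%d")
        return True
    except ValueError:
        return False


def check_record(record):
    """Return the names of the rules that the record fails."""
    failures = []
    if not valid_iban(record.get("iban", "")):
        failures.append("iban")
    if not valid_date(record.get("date", "")):
        failures.append("date")
    age = record.get("age")
    if not (isinstance(age, int) and 0 <= age <= 130):
        failures.append("age_range")
    # Hypothetical data field relationship: this diagnosis implies an infant.
    elif record.get("disease") == "neonatal jaundice" and age > 1:
        failures.append("age_disease_relationship")
    return failures
```

A record that passes every rule yields an empty list, while each failed rule adds its name, which gives a later scoring step something concrete to weigh.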

Traditional quality control approaches rely on different methodologies with human involvement. With AutoQC, the process is broken down into data levels and tracked by metadata at each step. AutoQC runs through 100 percent of the processed data and points out potential errors.

Based on the scoring, AutoQC identifies potential errors and controls the Straight-Through-Rate (STR), so that the system can decide whether to let the data go through or transfer it to the data correction step for quality enhancement.
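The routing decision can be pictured with a short sketch. It assumes, for illustration only, that each document carries one confidence score per field, that a single threshold separates the two paths, and that the STR is the share of documents that pass without correction. The function names and the default threshold of 0.9 are hypothetical.

```python
def route(fields, threshold=0.9):
    """Decide the path for one document.

    fields maps each field name to its confidence score in [0, 1].
    Returns the chosen path and the fields that fell below the threshold.
    """
    low = [name for name, score in fields.items() if score < threshold]
    if low:
        return "data_correction", low
    return "straight_through", []


def straight_through_rate(batch, threshold=0.9):
    """Share of documents in the batch that need no correction."""
    if not batch:
        return 0.0
    passed = sum(
        1 for fields in batch if route(fields, threshold)[0] == "straight_through"
    )
    return passed / len(batch)
```

Raising the threshold sends more documents to data correction and lowers the STR, while lowering it does the opposite, which is the trade-off between automation and assured quality.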

AutoQC is applied:

DIGI-XTRACT Process

Straight-through Process (STP) / Automation Process

Guaranteed Process / Automation Process with Human Touch

Product Features 

Key Product Values

Deployment and Pricing Model

Accuracy Rate

Accuracy Rate is a calculation based on the machine score (confidence score) that measures the certainty of the data extracted from an original image. A high accuracy rate brings better data quality and supports analytic purposes.

The accuracy rate depends on the document quality. DIGI-XTRACT is equipped with intelligent engines to ensure that the quality meets the client’s expectations.

The accuracy rate can be measured in various units, such as character, word, field, and line.
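To show how the unit of measurement changes the figure, the sketch below scores one extraction at field level and at character level. It rests on an assumption that differs from the confidence-based calculation described above: the extracted values are compared against a manually verified reference. The function names and the use of edit distance for character errors are illustrative choices, not the DIGI-XTRACT formula.

```python
def edit_distance(a, b):
    """Levenshtein distance: fewest single-character edits turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def field_accuracy(extracted, reference):
    """Share of reference fields whose extracted value matches exactly."""
    correct = sum(1 for key, value in reference.items() if extracted.get(key) == value)
    return correct / len(reference)


def character_accuracy(extracted, reference):
    """One minus character errors divided by total reference characters."""
    total = sum(len(value) for value in reference.values())
    errors = sum(
        edit_distance(extracted.get(key, ""), value)
        for key, value in reference.items()
    )
    return max(0.0, 1 - errors / total)
```

With a reference of `{"name": "Anna Schmidt", "city": "Berlin"}` and an extraction that reads the city as `Ber1in`, one of two fields is wrong, so field accuracy is 0.5, but only one of 18 characters is wrong, so character accuracy is about 0.94. The same extraction therefore looks very different depending on the unit agreed with the client.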

Client Support

DIGI-XTRACT is supported and delivered by an excellent onboarding team together with the client experience team.

All projects are monitored 24/7 by our Network Operating Center to ensure optimal service availability.  

DIGI-TEXX provides an end-to-end client experience from the first step of the analysis to the final step of implementation and enhancement. On top of that, the Support Team accompanies our clients throughout the whole operation phase to ensure a smooth transition and successful delivery.
