Data Extraction Solution for Customer Onboarding Straight-Through Process

A data extraction solution powered by machine learning improves the quality of input data, resulting in highly accurate output.

SERVICE OFFERS: Data Extraction Automation

BUSINESS CHALLENGES

Our Client

We serve a leading international insurance and financial services company with over 1.5 million customers operating in Asia, Canada, and the United States. In the Vietnam market, their network of 80 offices provides financial advice, insurance, wealth management, and asset management services for individuals, groups, and institutions.

Project Challenges

Limited OCR Capture Capacity

The client’s current OCR engine is built to capture ID cards only, yet the onboarding process now accepts various ID document types (ID cards, passports, birth certificates, military ID cards, etc.). As a result, the OCR engine can process only a subset of the incoming documents, and more staff are needed for manual verification.


Project Objective

  • Shorten the processing time per document to under 1 minute.
  • Facilitate an end-to-end automatic approval process while ensuring data accuracy at the highest level.

Project Scope

Build a straight-through process for customer and agency onboarding by enhancing the OCR engine’s extraction capacity.

  • Document types:
    • Identity documents (ID cards, Passports, Birth certificates, Military ID cards, etc.)
    • Application forms
  • Languages: English and Vietnamese 
  • Service time: 24/7 
  • Committed accuracy rate: 95%

SOLUTION

Data Extraction Solution

The quality of the input data plays a significant role in determining output quality. DIGI-TEXX has therefore developed a three-step data extraction process that requires no human verification.


DIGI-TEXX applies Image Quality Enhancement technology in the pre-processing step to transform the images and make them more OCR-friendly in later processing stages. 

This technology identifies the key features and details of the images, then adjusts them using digital image processing techniques such as:

  • Remove image background noise
  • Adjust skew and rotation
  • Crop the excess areas
  • Tune the brightness, sharpness, and other color settings
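
To make the pre-processing idea concrete, here is a minimal sketch in plain Python of three of the listed techniques (background noise removal, cropping, and a brightness/contrast adjustment) applied to a tiny grayscale image stored as a list of rows. It is an illustration only, not DIGI-TEXX's implementation: the function names, the `background_floor` value, and the list-of-lists image representation are all assumptions, and skew/rotation correction is left out because it normally relies on an imaging library.

```python
from typing import List

# Assumption: a grayscale image as rows of ints, 0 (black) to 255 (white).
Image = List[List[int]]


def remove_background_noise(img: Image, background_floor: int = 200) -> Image:
    """Set every pixel at or above background_floor to pure white."""
    return [[255 if p >= background_floor else p for p in row] for row in img]


def crop_to_content(img: Image, white: int = 255) -> Image:
    """Crop away the all-white border around the darker content."""
    rows = [r for r, row in enumerate(img) if any(p < white for p in row)]
    cols = [c for c in range(len(img[0])) if any(row[c] < white for row in img)]
    if not rows or not cols:
        return img  # nothing but background; leave unchanged
    return [row[cols[0]:cols[-1] + 1] for row in img[rows[0]:rows[-1] + 1]]


def stretch_contrast(img: Image) -> Image:
    """Linearly rescale pixel values so they span the full 0-255 range."""
    flat = [p for row in img for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return img  # flat image; nothing to stretch
    return [[round((p - lo) * 255 / (hi - lo)) for p in row] for row in img]


def preprocess(img: Image) -> Image:
    """Hypothetical pipeline: denoise, crop, then stretch contrast."""
    return stretch_contrast(crop_to_content(remove_background_noise(img)))


sample = [
    [230, 240, 235, 250],
    [220,  60, 100, 245],
    [225, 100,  60, 240],
    [235, 245, 250, 230],
]
cleaned = preprocess(sample)
```

On the sample above, the light border is treated as background and cropped away, and the remaining 2×2 region is stretched to full contrast. A production pipeline would typically use a dedicated image library rather than hand-written loops.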

The pre-processed documents are then handled by DIGI-XTRACT, a document processing service built by DIGI-TEXX’s software development team.

DIGI-XTRACT is powered by Machine Learning (ML) and Deep Learning (DL) technology, which extends high-quality data extraction to more document types, such as birth certificates, passports, military IDs, and bank statements.


Auto QC runs quality control based on a confidence level, a composite score that combines the following checks to ensure the highest output quality:

  • Common rules such as the format of ID cards, Postal Code, Age, Gender, Date/Time, etc.
  • Business rules based on the client’s business domain
  • Data Field Relationships
  • Image Quality Analytics: clear/unclear, blurred, skewed, flipped, distorted, low resolution.

When the confidence level of the extracted data falls below a predefined threshold, a notification is sent to the client for further steps.
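
The sketch below shows, under stated assumptions, how rule checks and a confidence threshold of this kind might be combined. It is not the actual Auto QC scoring, which the case study does not describe in detail. The 12-digit ID pattern, the field names, the 0.90 threshold, and the "lowest field confidence, zeroed on any rule failure" scoring are all hypothetical choices made for illustration.

```python
import re
from datetime import date
from typing import Dict, List

ID_PATTERN = re.compile(r"^\d{12}$")   # assumed ID format: exactly 12 digits
CONFIDENCE_THRESHOLD = 0.90            # hypothetical predefined threshold


def rule_checks(fields: Dict[str, str], today: date) -> List[str]:
    """Return the names of the common and relationship rules that fail."""
    failures = []
    # Common rule: ID number format.
    if not ID_PATTERN.match(fields.get("id_number", "")):
        failures.append("id_number format")
    # Common rule: date of birth must be a valid ISO date (YYYY-MM-DD).
    try:
        dob = date.fromisoformat(fields.get("date_of_birth", ""))
    except ValueError:
        failures.append("date_of_birth format")
        return failures
    # Common rule: plausible age.
    age = today.year - dob.year - ((today.month, today.day) < (dob.month, dob.day))
    if not 0 <= age <= 120:
        failures.append("age out of range")
    # Data field relationship: a stated age must agree with the date of birth.
    if "age" in fields and fields["age"] != str(age):
        failures.append("age does not match date_of_birth")
    return failures


def review(fields: Dict[str, str], ocr_confidence: Dict[str, float], today: date) -> dict:
    """Combine rule results and per-field OCR confidence into one score."""
    failures = rule_checks(fields, today)
    # Simplified score: lowest field confidence, or 0.0 if any rule fails.
    score = 0.0 if failures else min(ocr_confidence.values(), default=0.0)
    return {
        "score": score,
        "failures": failures,
        "notify_client": score < CONFIDENCE_THRESHOLD,
    }


today = date(2024, 6, 1)
good = review(
    {"id_number": "012345678901", "date_of_birth": "1990-05-20", "age": "34"},
    {"id_number": 0.98, "date_of_birth": 0.95},
    today,
)
bad = review(
    {"id_number": "12345", "date_of_birth": "1990-05-20"},
    {"id_number": 0.99, "date_of_birth": 0.99},
    today,
)
```

Here `good` passes every rule and stays above the threshold, so no notification is raised, while `bad` fails the ID format rule and is flagged for the client. Business rules and image quality analytics would plug in as additional checks in the same way.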

BUSINESS OUTCOME

  • Processing time per document is shortened from 3 minutes to 5 seconds.
  • Accuracy rate increased from 60% to 97% (at field level).
  • The client’s document processing capacity increased from 95,000 pages/month to 3 million pages/month.
  • Output data quality no longer depends on human verification.

RELATED CASE STUDIES


Data Preparation Service On ERP Systems

DIGI-TEXX’s client is a retail department store chain with over 90 locations in Germany. Our client needs clean, accurate, and accessible data to ensure proper data management in the SAP system, make informed decisions, and optimize operations.


Data Annotation and Labeling Social Media Data To Predict The Pandemic

DIGI-TEXX provided a robust text annotation service with human-in-the-loop, which combined the power of machine learning, natural language processing (NLP)...


Online Historical Obituary Data Collection With Web Scraping Solution

A web scraping solution to automate collecting and processing historical obituary data across public digital newspaper archives and open-source sites.
