DIGI-XTRACT automatically recognizes and classifies a range of document types using the AutoClassify component, which detects the document type through vectorization and big data analytics. The system then routes each document to the AutoXtract component, which extracts the data using the detected type to optimize the accuracy rate.
Common document types from individuals and organizations include:
Application Forms and Historical Documents (Marriage Certificate, Death Certificate, etc.)
Identity Documents: ID Cards, Passport, Birth Certificate
Employment Documents: Labor Contract, Work Permit, Confirmation Letter
Income Documents: Bank Statement, Payslip
Financial and Accounting Documents: Invoice, Contract, Financial Statement, etc.
Other Document Types: Examination Sheet, Catalog Book, Land Register Book
AutoClassify applies the following technologies:
Computer Vision
Deep Learning (DL)
Machine Learning (ML)
Elasticsearch (Similarity Search)
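
The classify-then-route flow described above can be illustrated with a small sketch. This is not the DIGI-XTRACT implementation: the reference texts, the function names (`vectorize`, `classify`, `route`) and the threshold are all hypothetical, and simple word counts with cosine similarity stand in for the learned vectors and similarity search a production system would likely use.

```python
import math
from collections import Counter

# Hypothetical reference snippets, one per document type (illustrative only).
REFERENCE_TEXTS = {
    "invoice": "invoice number total amount due tax payment terms",
    "payslip": "payslip employee salary gross net pay deductions period",
    "passport": "passport nationality date of birth place of issue expiry",
}


def vectorize(text):
    """Turn text into a sparse term-frequency vector (word -> count)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(text, threshold=0.1):
    """Return (document_type, score) for the most similar reference text.

    Falls back to "unknown" when no reference reaches the (assumed) threshold.
    """
    query = vectorize(text)
    best_type, best_score = "unknown", 0.0
    for doc_type, reference in REFERENCE_TEXTS.items():
        score = cosine_similarity(query, vectorize(reference))
        if score > best_score:
            best_type, best_score = doc_type, score
    if best_score < threshold:
        return "unknown", best_score
    return best_type, best_score


def route(text):
    """Classify a document, then decide the next step (hypothetical step names)."""
    doc_type, score = classify(text)
    next_step = "manual_review" if doc_type == "unknown" else "extract"
    return {"document_type": doc_type, "score": score, "next_step": next_step}


if __name__ == "__main__":
    print(route("Invoice number 1042 total amount due 250.00"))
    print(route("hello world"))
```

In this sketch a document whose text matches no reference well enough is sent to manual review instead of extraction, which is one plausible way to keep misclassified documents from reaching a type-specific extractor.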