BUSINESS CHALLENGES
Our Client
Our client is a network of leading schools and reputable educational institutions across Europe. These schools are known for their academic excellence and strong commitment to fairness and accuracy in standardized testing.
Each year, they conduct the large-scale examinations required by their academic programs. Accurate and fair grading is crucial to preserving the credibility of the results and the confidence of both students and educators.
Project Challenges
Manual Grading Process Is Overwhelmed When Handling Large Volumes of Exams
On average, schools must grade between 50,000 and 80,000 exams per semester. With the manual grading process, it takes approximately 7 minutes to complete both the grading and quality checking for each exam. This creates several significant challenges, including:
| Challenge | Operational Impact |
| High volume of examination papers. | Manual grading becomes extremely time-consuming. |
| Manual grading process. | Increased susceptibility to human error and mistakes during evaluation. |
| Unclear or inconsistent student handwriting. | Further complicates grading and slows down the overall evaluation. |
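
To put these volumes in perspective, the figures quoted above can be turned into a rough workload estimate. The sketch below uses only the numbers stated in this section (50,000 to 80,000 exams, about 7 minutes each); the function name is illustrative and not part of any DIGI-TEXX tool.

```python
MINUTES_PER_EXAM = 7  # manual grading plus quality check, as stated above


def manual_grading_hours(exam_count: int, minutes_per_exam: int = MINUTES_PER_EXAM) -> float:
    """Return the total manual effort in hours for a given number of exams."""
    return exam_count * minutes_per_exam / 60


for exams in (50_000, 80_000):
    print(f"{exams:,} exams -> {manual_grading_hours(exams):,.0f} hours")
# 50,000 exams -> 5,833 hours
# 80,000 exams -> 9,333 hours
```

At these volumes, manual grading alone represents roughly 5,800 to 9,300 person-hours per semester.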

Project Scope
The project builds an automated grading process: automatic data extraction technology captures the data from mixed-format exams, and the complete test data is then transferred automatically to the grading system for processing.
This solution ensures high accuracy even in cases with erased or modified answers, so that the final scores accurately reflect the student’s performance.
- Processing Volume: Automatically extract data from 50,000 to 80,000 multiple-choice tests per exam session.
- Document Type: Paper-based test combining multiple-choice and essay questions.
- Languages: English and German.
- Service time: 24/7.
- Accuracy: Achieves 99% accuracy in data extraction, including precise detection of erased and modified areas.
SOLUTION
Automated Data Extraction Solution with Multi-Layered QC
Accurate data extraction plays a critical role in truly reflecting students’ performance. To meet this need, DIGI-TEXX has developed a three-step automated data extraction process that eliminates manual intervention and significantly enhances the reliability of the automated grading process.
1. Image Processing
Before data extraction, DIGI-TEXX applies Image Quality Enhancement technology to optimize the visual quality of test papers, improving readability and the accuracy of information recognition.
This technology leverages digital image processing techniques to:
- Remove background noise and unnecessary blur
- Correct skew and properly rotate the test paper
- Crop out excess or irrelevant areas around the answer sheet
- Adjust brightness, sharpness, and color settings to highlight critical data zones
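
Two of the operations listed above, cropping and brightness adjustment, can be illustrated with a deliberately small sketch. It is not DIGI-TEXX's implementation: it assumes a grayscale page stored as a plain list of pixel rows, the function names and the threshold of 128 are hypothetical, and noise removal and skew correction are left out. A production pipeline would most likely rely on a dedicated image library.

```python
from typing import List

Image = List[List[int]]  # grayscale pixels, 0 = black, 255 = white


def crop_to_content(img: Image, dark_threshold: int = 128) -> Image:
    """Crop away margins that contain no pixel darker than dark_threshold."""
    rows = [r for r, row in enumerate(img) if any(p < dark_threshold for p in row)]
    cols = [c for c in range(len(img[0])) if any(row[c] < dark_threshold for row in img)]
    if not rows or not cols:
        return img  # nothing dark found; leave the page untouched
    return [row[cols[0]:cols[-1] + 1] for row in img[rows[0]:rows[-1] + 1]]


def stretch_contrast(img: Image) -> Image:
    """Linearly rescale pixels so the darkest becomes 0 and the brightest 255."""
    lo = min(min(row) for row in img)
    hi = max(max(row) for row in img)
    if hi == lo:
        return img  # flat image, nothing to stretch
    return [[round((p - lo) * 255 / (hi - lo)) for p in row] for row in img]


# A 4x4 toy "scan": a light border around a darker 2x2 answer zone.
page = [
    [200, 200, 200, 200],
    [200,  90, 120, 200],
    [200, 110,  90, 200],
    [200, 200, 200, 200],
]
cleaned = stretch_contrast(crop_to_content(page))
print(cleaned)  # [[0, 255], [170, 0]]
```

After cropping, the remaining pixel range (90 to 120) is stretched to the full 0 to 255 scale, which makes marked and unmarked areas easier to separate in later steps.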
2. Automated Data Extraction
After the image preprocessing step, the tests are processed by DIGI-XTRACT, an automated document processing platform developed by the DIGI-TEXX software team.
DIGI-XTRACT utilizes Machine Learning (ML) and Deep Learning (DL) models to accurately detect marked answer areas (bubbled answers), even in cases with erasures or corrections.
The system is capable of:
- Identifying and extracting the final selected answers
- Distinguishing clearly between crossed-out, corrected, or lightly shaded choices
- Standardizing the output into a format ready for transfer to the automated grading system

The data is extracted with a confidence level in each field
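
How a final answer and its confidence level might be derived from the bubbles of one question can be sketched as follows. DIGI-XTRACT uses ML and DL models whose internals are not described here, so this is only a simplified rule-based stand-in: it assumes each option already has a fill ratio between 0 and 1, and the thresholds `FILLED` and `FAINT`, the function name, and the confidence formula are all hypothetical.

```python
from typing import Dict, Optional, Tuple

FILLED = 0.60  # hypothetical: fill ratio at or above this counts as a firm mark
FAINT = 0.25   # hypothetical: below this a bubble is treated as empty or erased


def pick_answer(fill: Dict[str, float]) -> Tuple[Optional[str], float]:
    """Return (answer, confidence) for one question with at least two options.

    Confidence is the gap between the strongest and second-strongest bubble,
    so a clean single mark scores high and an ambiguous double mark scores low.
    """
    ranked = sorted(fill.items(), key=lambda kv: kv[1], reverse=True)
    (best, best_ratio), (_, runner_up) = ranked[0], ranked[1]
    if best_ratio < FAINT:
        return None, round(1.0 - best_ratio, 2)  # nothing marked
    confidence = best_ratio - runner_up
    if best_ratio < FILLED:
        confidence *= 0.5  # lightly shaded: keep the answer but lower confidence
    return best, round(confidence, 2)


# C was erased (residue 0.30) and B was chosen instead.
print(pick_answer({"A": 0.05, "B": 0.92, "C": 0.30, "D": 0.04}))  # ('B', 0.62)
# A is only lightly shaded.
print(pick_answer({"A": 0.40, "B": 0.10, "C": 0.05, "D": 0.02}))  # ('A', 0.15)
# No bubble is marked.
print(pick_answer({"A": 0.05, "B": 0.03, "C": 0.02, "D": 0.04}))  # (None, 0.95)
```

Fields with a low confidence value are the natural candidates for human review in the quality control step.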
3. Multi-layered Auto Quality Control
To ensure the extracted results accurately reflect each student’s responses, DIGI-TEXX implements a Multi-layered Auto Quality Control process consisting of two to three levels:
- Level 1 – Automated Scoring: Executes quality control based on confidence levels, applying a complex scoring method to ensure the highest possible automated output quality.
- Level 2 – Randomized Verification: A reviewer performs random quality checks on machine-verified data to detect potential algorithmic oversights and ensure consistent accuracy across the dataset.
- Level 3 – Targeted Expertise (if required): For exams with abnormal indicators or low-confidence levels, the data is escalated to a second reviewer for deeper inspection.
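
One possible way to route an extracted answer sheet through these levels is sketched below. The source does not specify the scoring method, the confidence cut-off, or the sampling rate, so `AUTO_ACCEPT`, `SAMPLE_RATE`, the function name, and the order of the checks are assumptions made for illustration only.

```python
import random
from typing import Dict

AUTO_ACCEPT = 0.95  # hypothetical confidence cut-off for automated acceptance
SAMPLE_RATE = 0.05  # hypothetical share of accepted sheets sent to a reviewer


def route_sheet(field_confidence: Dict[str, float], draw: float) -> str:
    """Decide which QC level handles one sheet.

    field_confidence maps each extracted field to a confidence in [0, 1];
    draw is a random number in [0, 1) used for the Level 2 spot check.
    """
    if min(field_confidence.values()) < AUTO_ACCEPT:
        return "level_3_targeted_review"  # low confidence: escalate
    if draw < SAMPLE_RATE:
        return "level_2_random_check"     # spot check of machine-verified data
    return "level_1_auto_accepted"


sheet = {"student_id": 0.99, "q1": 0.98, "q2": 0.97}
print(route_sheet(sheet, random.random()))
print(route_sheet({**sheet, "q2": 0.40}, random.random()))  # level_3_targeted_review
```

Passing the random draw in as an argument keeps the routing rule deterministic and easy to audit.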
As a part of the Auto Quality Control process, we integrate detailed validation criteria to ensure input data accuracy and consistency, including:
- Format validation: Ensures fields like student ID, test code, and date follow required formats.
- Data logic checks: Verifies logical consistency across fields, such as making sure the total number of selected answers doesn’t exceed the number of questions.
- Image quality re-assessment: Ensures that issues such as blurriness, skewing, or incorrect page size do not compromise the integrity of the extracted data.
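
The format and logic checks above could look roughly like the following sketch. The actual field formats are not given in this case study, so the 8-digit student ID, the two-letter-plus-three-digit test code, the ISO date format, and the dictionary layout are all hypothetical; image quality re-assessment is not covered.

```python
import re
from datetime import datetime
from typing import Any, Dict, List

# Hypothetical formats chosen for illustration only.
STUDENT_ID = re.compile(r"\d{8}")
TEST_CODE = re.compile(r"[A-Z]{2}\d{3}")


def validate_sheet(sheet: Dict[str, Any], question_count: int) -> List[str]:
    """Return a list of validation errors; an empty list means the sheet passed."""
    errors = []
    if not STUDENT_ID.fullmatch(sheet["student_id"]):
        errors.append("student_id format")
    if not TEST_CODE.fullmatch(sheet["test_code"]):
        errors.append("test_code format")
    try:
        datetime.strptime(sheet["date"], "%Y-%m-%d")
    except ValueError:
        errors.append("date format")
    # Logic check: a sheet cannot contain more answers than questions.
    if len(sheet["answers"]) > question_count:
        errors.append("more answers than questions")
    return errors


good = {"student_id": "20240017", "test_code": "EN101",
        "date": "2024-06-12", "answers": ["A", "C", "B"]}
bad = {"student_id": "2024-17", "test_code": "EN101",
       "date": "12/06/2024", "answers": ["A", "C", "B", "D"]}
print(validate_sheet(good, question_count=3))  # []
print(validate_sheet(bad, question_count=3))
# ['student_id format', 'date format', 'more answers than questions']
```

Returning every failed rule, rather than stopping at the first, gives a reviewer the full picture of what is wrong with a sheet.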

We ensure data quality and accuracy through Multi-layered Quality Control
BUSINESS OUTCOME
- Processing Time
| Metric | Traditional Process | Optimized Process with Auto QC |
| Processing Time | 7 minutes per test | Under 1 minute per test |
| Workflow Depth | Single-layer manual process | Three-layer comprehensive verification |
| Quality Control | 1 layer (Standard) | 1 layer of System Auto QC + 2 additional layers of Expert QC |
- Data extraction accuracy per exam increased from 80% to 99.97%, ensuring that results accurately reflect students’ abilities and maintain the school’s reputation.
- Efficiently processes between 50,000 and 80,000 tests per exam period, increasing productivity by 45% and reducing the manual grading workload by 68%.
- Eliminates manual data entry processes, optimizes resources, and reduces dependence on grading staff.
- Meets urgent and continuous grading deadlines with a 24/7 operational system.




