This is a problem that has stumped DocumentIntelligence (85%), Sonnet 3.5 (~90%) and other models. It’s pretty simple:
Can you OCR this table (just the first page is hard) and get back to structured data?
With GOT and Sonnet-3.5, we can build a more generic pipeline with just rendered images that achieves 98.79%.
Here’s the approach:
Here’s what we get:
Here is a table image and the OCR output. Can you use the OCR output and put it into a proper markdown table without any mistakes?
Certainly. I'll create a markdown table using the OCR output, correcting any obvious errors and formatting it properly. Here's the markdown table:
Sr. No. | Station Name | Anurag Singh Thakur | Satpal Raizada | Hem Raj | Arun Ankesh Syal | Subedar Major Kulwant Singh | Jagdeep Kumar | Sumit Rana | Garib Dass Katoch | Gopi Chand Attari | Nand Lal | Ramesh Chand Sarthi | Surender Kumar Gautam | Total valid votes | Rejected votes | NOTA | Total votes | No. of Tendered Votes |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 1-SAYAR | 393 | 213 | 9 | 0 | 2 | 12 | 0 | 0 | 0 | 0 | 2 | 1 | 632 | 0 | 8 | 640 | 0 |
2 | 2-KAROT | 313 | 172 | 20 | 2 | 1 | 1 | 0 | 0 | 1 | 1 | 1 | 0 | 512 | 0 | 3 | 515 | 0 |
3 | 3-RANIKOTLA | 640 | 281 | 0 | 1 | 4 | 17 | 0 | 0 | 0 | 0 | 1 | 3 | 947 | 0 | 2 | 949 | 0 |
4 | 4-SAI BRAHMA | 364 | 249 | 1 | 0 | 1 | 10 | 0 | 0 | 1 | 1 | 0 | 6 | 633 | 0 | 6 | 639 | 0 |
5 | 5-BATHOH | 331 | 250 | 3 | 1 | 0 | 21 | 0 | 2 | 0 | 0 | 1 | 3 | 612 | 0 | 2 | 614 | 0 |
6 | 6-BHOLI | 364 | 313 | 12 | 0 | 3 | 4 | 1 | 0 | 1 | 2 | 4 | 3 | 707 | 0 | 6 | 713 | 0 |
7 | 7-KHARSI KNAITTA | 316 | 96 | 2 | 0 | 1 | 4 | 0 | 0 | 0 | 1 | 0 | 0 | 420 | 0 | 3 | 423 | 0 |
8 | 8-SOHRA BYUS | 254 | 116 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 2 | 1 | 0 | 375 | 0 | 2 | 377 | 0 |
9 | 9-SUI | 457 | 240 | 4 | 1 | 3 | 3 | 0 | 1 | 2 | 1 | 0 | 1 | 713 | 0 | 5 | 718 | 0 |
10 | 10-MALOKHAR | 494 | 206 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 1 | 1 | 1 | 705 | 0 | 7 | 712 | 0 |
11 | 11-SOLDHA | 445 | 275 | 5 | 0 | 3 | 6 | 0 | 0 | 0 | 2 | 2 | 1 | 739 | 0 | 5 | 744 | 0 |
12 | 12-CHANDPUR | 254 | 167 | 1 | 0 | 0 | 7 | 1 | 1 | 0 | 2 | 1 | 5 | 439 | 0 | 6 | 445 | 0 |
13 | 13-KYARAN | 301 | 243 | 2 | 0 | 0 | 7 | 0 | 0 | 0 | 3 | 0 | 1 | 557 | 0 | 2 | 559 | 0 |
14 | 14-CHHAKOH | 542 | 261 | 4 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 811 | 0 | 5 | 816 | 0 |
15 | 15-GALAU | 317 | 360 | 4 | 0 | 1 | 2 | 1 | 0 | 0 | 1 | 2 | 1 | 689 | 0 | 3 | 692 | 0 |
16 | 16-SAI NOWDAN | 147 | 115 | 0 | 0 | 0 | 13 | 0 | 0 | 0 | 0 | 0 | 0 | 275 | 0 | 3 | 278 | 0 |
17 | 17-JANLAGHAT | 296 | 336 | 2 | 0 | 0 | 17 | 0 | 0 | 0 | 1 | 0 | 0 | 652 | 0 | 1 | 653 | 0 |
18 | 18-SOHRI | 263 | 182 | 1 | 1 | 3 | 8 | 2 | 0 | 0 | 0 | 0 | 2 | 462 | 0 | 4 | 466 | 0 |
19 | 19-BANDU | 199 | 167 | 2 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 369 | 0 | 1 | 370 | 0 |
20 | 20-ASHA MAJARI | 224 | 157 | 0 | 0 | 1 | 4 | 0 | 0 | 1 | 0 | 0 | 1 | 388 | 0 | 0 | 388 | 0 |
21 | 21-JUKHALA-1 | 195 | 198 | 2 | 0 | 1 | 17 | 1 | 0 | 0 | 1 | 1 | 2 | 418 | 0 | 4 | 422 | 0 |
22 | 22-GASSAUR | 322 | 262 | 1 | 0 | 1 | 6 | 1 | 0 | 0 | 1 | 0 | 4 | 598 | 0 | 3 | 601 | 0 |
23 | 23-JUKHALA-2 | 281 | 238 | 0 | 0 | 0 | 7 | 0 | 0 | 0 | 0 | 1 | 0 | 527 | 0 | 2 | 529 | 0 |
24 | 24-SYCHLA | 450 | 359 | 1 | 1 | 1 | 48 | 5 | 0 | 0 | 2 | 0 | 0 | 867 | 0 | 9 | 876 | 1 |
25 | 25-KOTLA | 510 | 318 | 3 | 0 | 1 | 12 | 1 | 0 | 1 | 1 | 0 | 0 | 847 | 0 | 8 | 855 | 0 |
26 | 26-HIHARKHAN BASLA | 309 | 320 | 4 | 0 | 0 | 18 | 1 | 0 | 1 | 0 | 0 | 2 | 655 | 0 | 4 | 659 | 0 |
27 | 27-PALOG | 67 | 139 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 2 | 209 | 0 | 1 | 210 | 0 |
28 | 28-BAGKHURD | 210 | 235 | 1 | 1 | 9 | 22 | 2 | 0 | 0 | 0 | 1 | 0 | 481 | 0 | 0 | 481 | 0 |
29 | 29-SOSAN | 273 | 323 | 4 | 0 | 5 | 1 | 1 | 0 | 0 | 1 | 1 | 3 | 612 | 0 | 4 | 616 | 0 |
30 | 30-GHYAL | 186 | 440 | 2 | 1 | 0 | 3 | 0 | 0 | 0 | 0 | 1 | 1 | 634 | 0 | 7 | 641 | 0 |
31 | 31-TEPARA | 76 | 166 | 0 | 0 | 2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 245 | 0 | 1 | 246 | 0 |
32 | 32-DABAR | 199 | 315 | 0 | 0 | 1 | 4 | 0 | 0 | 0 | 0 | 1 | 1 | 521 | 0 | 0 | 521 | 0 |
33 | 33-KOTHI | 139 | 127 | 0 | 0 | 0 | 4 | 0 | 0 | 0 | 0 | 0 | 1 | 271 | 0 | 1 | 272 | 0 |
34 | 34-LUHARDA | 169 | 136 | 0 | 0 | 1 | 6 | 0 | 0 | 0 | 0 | 0 | 0 | 312 | 0 | 2 | 314 | 0 |
35 | 35-KAHLI | 113 | 59 | 4 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 177 | 0 | 1 | 178 | 0 |
This table represents the voting results for various polling stations, including the number of votes received by each candidate and other relevant information.
This is the image we’re using: