Apurba Bangla Optical Character Recognition (OCR)

Data Source

NumtaDB

NumtaDB is compiled by Bengali.ai, hosted on the Bengali.ai website, and can be downloaded from here.


A dataset of more than 85,000 digit images was compiled.

Sample digits from the NumtaDB database.

Distribution of the different digit classes in the NumtaDB database
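
A class distribution like the one in the figure can be computed by counting how often each digit label occurs. The snippet below is a minimal sketch, not the code used for this project: the `class_distribution` helper and the small `labels` list are hypothetical stand-ins, and in practice the labels would be read from the dataset's label files.

```python
from collections import Counter


def class_distribution(labels, num_classes=10):
    """Return {digit: (count, share)} for digits 0..num_classes-1.

    `labels` is any iterable of integer digit labels. Digits that never
    occur are reported with a count of 0 so that gaps stay visible.
    """
    labels = list(labels)
    counts = Counter(labels)
    total = len(labels)
    return {
        digit: (counts.get(digit, 0), counts.get(digit, 0) / total if total else 0.0)
        for digit in range(num_classes)
    }


# Hypothetical labels, used only to illustrate the call.
labels = [0, 1, 1, 2, 9, 9, 9]
dist = class_distribution(labels)

for digit, (count, share) in dist.items():
    print(f"digit {digit}: {count} samples ({share:.1%})")
```

Reporting the share next to the raw count makes it easier to compare the class balance of datasets of different sizes, such as NumtaDB and BDRW.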

Principal component analysis to reduce dimensionality on NumtaDB

Bengali Digit Recognition in the Wild (BDRW)

BDRW was released for a machine vision challenge hosted by IEEE TechSym 2016 and can be downloaded from here.


Sample digits from the BDRW dataset

Distribution of the different digit classes in the BDRW dataset

Principal component analysis to reduce dimensionality on BDRW
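
The principal component analysis step used for both datasets can be sketched as follows. This is an illustrative example under stated assumptions rather than the project's actual pipeline: it assumes the digit images have already been resized to a common shape (28x28 here, chosen arbitrarily), uses random arrays in place of real images, and the `pca_reduce` helper is a hypothetical name. It relies only on NumPy's singular value decomposition.

```python
import numpy as np


def pca_reduce(images, n_components=2):
    """Project flattened images onto their top principal components.

    Returns (projected, components, explained_ratio), where `projected`
    has shape (n_samples, n_components).
    """
    # Flatten each image into one row vector.
    X = np.asarray(images, dtype=np.float64).reshape(len(images), -1)

    # Center the data; PCA is defined on mean-centered features.
    X_centered = X - X.mean(axis=0)

    # Rows of Vt are the principal directions, ordered by singular value.
    _, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]

    # Coordinates of every sample in the reduced space.
    projected = X_centered @ components.T

    # Share of the total variance captured by each kept component.
    variance = S ** 2
    total = variance.sum()
    ratio = variance / total if total > 0 else np.zeros_like(variance)
    return projected, components, ratio[:n_components]


# Random stand-in for 200 grayscale digit images (assumption: 28x28 pixels).
rng = np.random.default_rng(0)
images = rng.random((200, 28, 28))

projected, components, ratio = pca_reduce(images, n_components=2)
print(projected.shape)  # (200, 2)
print(ratio)
```

With two components, the projected points can be drawn as a scatter plot colored by digit label, which is one common way to produce a figure like the one captioned above.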