Punjabi University to develop OCR system for Indian languages
Patiala, February 9
Punjabi University, along with the IIIT, Hyderabad, the IIT, Delhi, the IIT, Bombay, and the IIT, Jodhpur, will develop optical character recognition (OCR) systems and applications for Indian languages. The work falls under the National Language Translation Mission (NLTM): BHASHINI, a Ministry of Electronics and Information Technology project to build technologies for recognising spoken and written text in Indian languages and translating it into other languages.
The ministry will spend Rs 495.51 crore on the project over three years, of which Punjabi University, along with others, has been allocated a grant-in-aid of Rs 14.7 crore to build APIs for recognition of printed and handwritten material.
The project team, which is led by IIIT Hyderabad, will develop high-accuracy recognisers for printed, handwritten and scene text for all 22 scheduled Indian languages. This will open up opportunities for technologies that use Indian language OCR systems. The technologies will be made available to start-ups and industries, state and Central Governments, banks, service centres, NGOs, researchers and students.
Gurpreet Singh Lehal, Director, Research Centre for Technical Development of Punjabi on the campus and co-investigator of the project, said Punjabi University will focus mainly on developing recognisers for books, newspapers, journals, theses and other printed material. “We will develop recognisers for recognising printed text in scanned documents for 13 Indian languages, including Punjabi, Hindi, Sindhi, Kashmiri, Urdu, Marathi, Sanskrit, Dogri, Konkani, Nepali, Maithili, Santhali and Bodo. We will also develop technologies for creation of Braille printed books in these languages”, he said.