Developing Text-to-Speech (TTS) and Speech-to-Text (ASR) Recognition System for the Kashmiri Language
Funded by DST – TIDE Grant for Accessibility
Grant Details
Funding Agency
DST – TIDE Grant (Technology Interventions for Elderly and Disabled Persons)
Approved Budget
₹20.54 Lakhs
Duration
2 Years
Current Status
Ongoing
Principal Investigator
Co-Principal Investigator
Project Mandate and Focus
This project is mandated to develop an integrated Text-to-Speech (TTS) and Speech-to-Text (Automatic Speech Recognition, ASR) system for the Kashmiri language. Its primary goal is to improve digital accessibility for elderly and disabled individuals, in line with the objectives of the DST – TIDE programme.
Text-to-Speech (TTS) Component
The TTS system focuses on generating highly intelligible and natural-sounding Kashmiri speech:
- Recording of natural speech samples to build a neural TTS voice bank for Kashmiri.
- Implementation of neural acoustic models such as Tacotron and FastSpeech, paired with a neural vocoder such as WaveRNN, for realistic speech synthesis.
- Special emphasis on clear articulation, slower speech modes, and high intelligibility to assist elderly users and persons with visual or speech impairments.
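One way a slower speech mode can be realised in a FastSpeech-style model is by scaling the predicted phoneme durations before they are expanded into frames. The sketch below is illustrative only and is not the project's implementation: the function name `length_regulate`, the parameter `alpha`, and the phoneme labels are hypothetical placeholders, and a real system would repeat hidden-state vectors rather than strings.

```python
from typing import List, Sequence


def length_regulate(
    phonemes: Sequence[str],
    durations: Sequence[int],
    alpha: float = 1.0,
) -> List[str]:
    """Expand phoneme-level items to frame level, scaling durations by alpha.

    alpha > 1.0 lengthens every phoneme (slower speech);
    alpha < 1.0 shortens every phoneme (faster speech).
    """
    if len(phonemes) != len(durations):
        raise ValueError("phonemes and durations must have the same length")
    if alpha <= 0:
        raise ValueError("alpha must be positive")

    frames: List[str] = []
    for phoneme, duration in zip(phonemes, durations):
        # Scale the predicted duration and round to a whole number of frames.
        scaled = max(0, round(duration * alpha))
        frames.extend([phoneme] * scaled)
    return frames


# Placeholder labels and durations, not real Kashmiri phoneme data.
example_phonemes = ["k", "a", "sh"]
example_durations = [2, 3, 1]

normal = length_regulate(example_phonemes, example_durations, alpha=1.0)
slow = length_regulate(example_phonemes, example_durations, alpha=2.0)
print(len(normal), len(slow))
```

With `alpha=2.0` each phoneme occupies twice as many frames, so the utterance is spoken at half speed while the phoneme sequence itself is unchanged.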
Speech-to-Text (ASR) Component
The ASR system is designed for robustness across various speakers and accents:
- Collection of regionally diverse speech data representing major Kashmiri dialects.
- Development of robust ASR models using Transformer architectures, CTC-based training, and encoder–decoder networks.
- Focus on high accuracy for spontaneous and accented speech, ensuring usability by elderly speakers.
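To illustrate how the output of a CTC-based model is turned into text, the sketch below shows greedy CTC decoding: take the best-scoring label in each frame, merge consecutive repeats, then drop the blank symbol. It is a minimal example under stated assumptions, not the project's decoder: the blank index `0`, the function name `ctc_greedy_decode`, and the toy score matrix are all hypothetical.

```python
from typing import List, Optional, Sequence

BLANK = 0  # assumed index of the CTC blank symbol


def ctc_greedy_decode(
    frame_scores: Sequence[Sequence[float]],
    blank: int = BLANK,
) -> List[int]:
    """Greedy CTC decoding over per-frame label scores."""
    # Step 1: pick the highest-scoring label index in every frame.
    best_path = [
        max(range(len(scores)), key=scores.__getitem__)
        for scores in frame_scores
    ]

    # Step 2: merge consecutive repeats, then remove blanks.
    decoded: List[int] = []
    previous: Optional[int] = None
    for label in best_path:
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return decoded


# Toy scores for 7 frames over 3 labels (0 = blank); not real model output.
toy_scores = [
    [0.10, 0.80, 0.10],  # label 1
    [0.20, 0.70, 0.10],  # label 1 (repeat, merged)
    [0.90, 0.05, 0.05],  # blank
    [0.10, 0.60, 0.30],  # label 1 (kept: separated by a blank)
    [0.10, 0.20, 0.70],  # label 2
    [0.10, 0.10, 0.80],  # label 2 (repeat, merged)
    [0.90, 0.05, 0.05],  # blank
]
print(ctc_greedy_decode(toy_scores))  # [1, 1, 2]
```

The blank between the second and fourth frames is what lets the same label appear twice in a row in the output. A production system would typically replace greedy decoding with beam search, often combined with a language model.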
Alignment with TIDE Programme
The project directly supports the goals of the TIDE scheme by providing foundational accessibility capabilities:
- Voice-based digital interactions for persons with disabilities.
- Assistive reading and communication tools for the visually impaired.
- Hands-free interfaces for elderly users who face challenges with text-based systems.
Impact and Applications
By creating foundational speech technology for Kashmiri, the project advances inclusive AI, enhances digital accessibility, and supports the preservation and modernization of the language. Expected applications include:
- Voice-enabled e-governance services.
- Assistive speech technologies for disability support.
- Kashmiri learning tools and digital content accessibility.