Leveraging Synthetic Data for Improved Manipuri-English Code-Switched ASR
Blog Article
Accurately recognizing code-switched speech is a significant challenge in automatic speech recognition (ASR), particularly for low-resource regional languages. In this work, we investigate several approaches to enhance ASR performance for code-switched Manipuri-English speech: generating synthetic code-switched sentences from parallel language-pair datasets and employing audio augmentation techniques. We propose a hybrid model that combines a transformer with a pointer-generator network to generate high-quality code-switched sentences, taking parallel monolingual texts as input. The transformer attends to long-term dependencies in the input, while the pointer generator copies words from the source and still respects the code-switching constraints. We test our data augmentation method on two parallel text corpora that we have developed.
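To make the copy mechanism concrete, here is a minimal sketch of how a pointer-generator output step is commonly formulated: the final word distribution mixes the decoder's vocabulary distribution with the attention weights over source tokens. This is the standard textbook formulation, not necessarily the exact model used in this work; the function name `pointer_generator_mix`, the token strings, and all numbers are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def pointer_generator_mix(p_gen, vocab_dist, attention, source_tokens):
    """Mix a generation distribution with a copy distribution.

    p_gen         -- probability of generating from the vocabulary (0..1)
    vocab_dist    -- dict mapping vocabulary word -> probability
    attention     -- attention weights, one per source token
    source_tokens -- the source sentence tokens (copy candidates)
    """
    if not 0.0 <= p_gen <= 1.0:
        raise ValueError("p_gen must lie in [0, 1]")
    if len(attention) != len(source_tokens):
        raise ValueError("need one attention weight per source token")

    final = defaultdict(float)
    # Generation path: scale the vocabulary distribution by p_gen.
    for word, prob in vocab_dist.items():
        final[word] += p_gen * prob
    # Copy path: each source token receives its attention mass,
    # scaled by (1 - p_gen). Repeated tokens accumulate mass.
    for token, weight in zip(source_tokens, attention):
        final[token] += (1.0 - p_gen) * weight
    return dict(final)


# Illustrative values only (assumed, not taken from the article).
mixed = pointer_generator_mix(
    p_gen=0.5,
    vocab_dist={"the": 0.5, "market": 0.5},
    attention=[0.75, 0.25],
    source_tokens=["SRC_TOKEN", "market"],
)
for word, prob in sorted(mixed.items()):
    print(word, round(prob, 3))
```

Because the copy path can assign probability to a source token that is absent from the output vocabulary (here the placeholder `SRC_TOKEN`), this kind of mixture lets a generator carry words over from either monolingual input sentence.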
Using text generated by our proposed model, we achieve significant reductions in perplexity (PPL) on a language modeling task, outperforming text generated by other baseline models. We also explore different strategies for training the language model (LM) to improve ASR performance. Moreover, we observe that combining audio augmentation techniques outperforms each individual method.
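For readers unfamiliar with the metric, perplexity is the exponential of the average negative log-likelihood that a language model assigns to the tokens of a held-out text, so lower is better. The sketch below shows that definition only; the helper name `perplexity` and the example probabilities are assumptions for illustration and are not results from this work.

```python
import math


def perplexity(token_log_probs):
    """Return the perplexity given natural-log probabilities, one per token."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    # Average negative log-likelihood per token.
    avg_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_nll)


# Hypothetical model that assigns probability 0.25 to each of four tokens.
example_log_probs = [math.log(0.25)] * 4
print(perplexity(example_log_probs))
```

A model that spreads probability uniformly over k choices at every step has perplexity k, which is why the metric is often read as an effective branching factor.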