Deep Learning based Multilingual Speech Synthesis using Multi Feature Fusion Methods (Just Accepted)

Praveena Nuthakki, Madhavi Katamaneni, Chandra Sekhar J. N., Kumari Gubbala, Bullarao Domathoti, Venkata Rao Maddumala, Kumar Raja Jetti

ACM Transactions on Asian and Low-Resource Language Information Processing (2023)

Abstract
Traditional concatenative speech synthesis suffers from two major problems: poor intelligibility and unnatural-sounding output. Context-based deep learning approaches built on CNNs are not robust enough for sensitive speech synthesis. Our proposed approach may meet these needs and handle the complexities of voice synthesis. Its minimal aperiodic distortion makes the proposed model an excellent candidate for a communication recognition model. Although synthesized speech still has a number of audible flaws, the output of our proposed method is as close to human speech as possible. Considerable work also remains in incorporating sentiment analysis into text categorization using natural language processing. The intensity of expressed feeling varies greatly from country to country. To improve their voice synthesis outputs, models need to add more hidden layers and nodes to the updated mixture density network. For our proposed algorithm to perform at its best, it needs a more robust network foundation and better optimization methods. We hope that, after reading this article and trying out the example data provided, both experienced researchers and those just starting out will have a better grasp of the steps involved in creating a deep learning approach. The model is making progress in overcoming fitting issues when trained on less data. The DL-based method requires more space to hold the input parameters.
Keywords
Natural Language Processing,Deep Learning,Machine Learning,Speech to Text