What can phone attractors in RPS tell us? A study of dynamic information in speech signals for phone classification purposes

Applied Acoustics (2023)

Abstract
The speech production system is time-varying, multidimensional, and nonlinear. Most techniques for spoken feature extraction (SFE), which are tools for extracting information from speech signals, rely on the linear aspects of this system. In the past two decades, several techniques have been developed to account for the nonlinear characteristics of the system using embedded speech attractors in the reconstructed phase space (RPS). However, despite the clear benefits of speech representation in the RPS domain, only a few studies have successfully applied it for classification purposes. The main goal of this study is to develop an RPS-based framework that uses dynamic information of the embedded speech attractors in the RPS domain and outperforms the time-domain SFE techniques. The extracted features are based on multivariate linear prediction models of phone trajectories that show the dynamic information of the embedded speech attractor in the RPS. Several experiments on the FARSDAT and TIMIT databases test the phone classification accuracy of the proposed framework and show that the dynamic information of the phone attractors can significantly improve phone classification accuracy.
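As an illustration of the reconstructed phase space mentioned in the abstract, the following is a minimal sketch of standard time-delay embedding, which turns a one-dimensional signal into a trajectory of vectors. It is not the authors' implementation: the function name `delay_embed`, the embedding dimension `dim=3`, the lag `lag=4`, and the sinusoidal test frame are all assumptions chosen for illustration, since the abstract does not state the embedding parameters.

```python
import math


def delay_embed(signal, dim, lag):
    """Return time-delay vectors [x[n], x[n+lag], ..., x[n+(dim-1)*lag]].

    Hypothetical helper for illustration; dim and lag are free parameters
    that the abstract does not specify.
    """
    if dim < 1 or lag < 1:
        raise ValueError("dim and lag must be positive integers")
    n_points = len(signal) - (dim - 1) * lag
    if n_points <= 0:
        raise ValueError("signal too short for this dim and lag")
    # Each row is one point of the trajectory in the reconstructed phase space.
    return [
        [signal[n + k * lag] for k in range(dim)]
        for n in range(n_points)
    ]


# Toy stand-in for a speech frame: a sampled sinusoid (not real speech data).
frame = [math.sin(2 * math.pi * 5 * t / 100) for t in range(100)]
trajectory = delay_embed(frame, dim=3, lag=4)
```

Each row of `trajectory` is one point on the embedded attractor. A predictive model fitted to successive rows, such as the multivariate linear prediction the abstract describes, would then summarize how the trajectory evolves over time.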
Keywords
Signal embedding, Reconstructed phase space, Phone trajectory, Dynamic information, Feature extraction, Linear prediction, Phone classification