VoiceFlow: Efficient text-to-speech with rectified flow matching
Yiwei Guo, Chenpeng Du, Ziyang Ma, Xie Chen, Kai Yu. (2024). "VoiceFlow: Efficient text-to-speech with rectified flow matching." In Proc. IEEE ICASSP, 2024, pp. 11121-11125.
Chenpeng Du, Yiwei Guo, Feiyu Shen, Zhijun Liu, Zheng Liang, Xie Chen, Shuai Wang, Hui Zhang, Kai Yu. (2024). "UniCATS: A unified context-aware text-to-speech framework with contextual VQ-diffusion and vocoding." In Proc. AAAI, 2024, vol. 38, no. 16, pp. 17924-17932.
Chenpeng Du, Yiwei Guo, Xie Chen, Kai Yu. (2023). "Speaker adaptive text-to-speech with timbre-normalized vector-quantized feature." IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2023, vol. 31, pp. 3446-3456.
Yiwei Guo, Chenpeng Du, Xie Chen, Kai Yu. (2023). "EmoDiff: Intensity controllable emotional text-to-speech with soft-label guidance." In Proc. IEEE ICASSP, 2023.
Chenpeng Du, Yiwei Guo, Xie Chen, Kai Yu. (2022). "VQTTS: High-fidelity text-to-speech synthesis with self-supervised VQ acoustic feature." In Proc. ISCA Interspeech, 2022, pp. 1596-1600.
Yiwei Guo, Chenpeng Du, Kai Yu. (2022). "Unsupervised word-level prosody tagging for controllable speech synthesis." In Proc. IEEE ICASSP, 2022, pp. 7597-7601.
Tutorial at NCMMSC 2024, Xinjiang, China