Posts by Collection

portfolio

publications

Speaker adaptive text-to-speech with timbre-normalized vector-quantized feature

Published in IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2023

This paper proposes TN-VQTTS, which leverages timbre-normalized vector-quantized acoustic features to adapt a TTS system to new speakers with little data.

Recommended citation: Chenpeng Du, Yiwei Guo, Xie Chen, Kai Yu. (2023). "Speaker adaptive text-to-speech with timbre-normalized vector-quantized feature." IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 31, pp. 3446-3456.

UniCATS: A unified context-aware text-to-speech framework with contextual vq-diffusion and vocoding

Published in AAAI, 2024

This paper proposes a context-aware TTS system with strong zero-shot TTS and speech editing abilities, built from two components: a contextual token vocoder (CTX-vec2wav) and a discrete diffusion-based model (CTX-txt2vec).

Recommended citation: Chenpeng Du, Yiwei Guo, Feiyu Shen, Zhijun Liu, Zheng Liang, Xie Chen, Shuai Wang, Hui Zhang, Kai Yu. (2024). "UniCATS: A unified context-aware text-to-speech framework with contextual vq-diffusion and vocoding." Proc. AAAI, vol. 38, no. 16, pp. 17924-17932.

talks

teaching
