Sitemap
A list of all the posts and pages found on the site. For you robots out there, an XML version is also available for digesting.
Pages
Posts
Future Blog Post
Published:
This post will show up by default. To disable scheduling of future posts, edit config.yml and set future: false.
Blog Post number 4
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Blog Post number 3
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Blog Post number 2
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Blog Post number 1
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
portfolio
Portfolio item number 1
Short description of portfolio item number 1
Portfolio item number 2
Short description of portfolio item number 2
publications
Unsupervised word-level prosody tagging for controllable speech synthesis
Published in IEEE ICASSP, 2022
This paper aims to enhance word-level prosody controllability in TTS models through decision-tree-based clustering.
Recommended citation: Yiwei Guo, Chenpeng Du, Kai Yu. (2022). "Unsupervised word-level prosody tagging for controllable speech synthesis." In Proc. IEEE ICASSP, 2022, pp. 7597-7601.
VQTTS: High-fidelity text-to-speech synthesis with self-supervised VQ acoustic feature
Published in ISCA Interspeech, 2022
This paper is the first to successfully integrate discrete SSL features into TTS, producing a competitive high-fidelity TTS system.
Recommended citation: Chenpeng Du, Yiwei Guo, Xie Chen, Kai Yu. (2022). "VQTTS: High-fidelity text-to-speech synthesis with self-supervised VQ acoustic feature." In Proc. ISCA Interspeech, 2022, pp. 1596-1600.
EmoDiff: Intensity controllable emotional text-to-speech with soft-label guidance
Published in IEEE ICASSP, 2023
This paper designs an emotion-intensity-controllable TTS model using a new soft-label guidance algorithm in the diffusion paradigm.
Recommended citation: Yiwei Guo, Chenpeng Du, Xie Chen, Kai Yu. (2023). "EmoDiff: Intensity controllable emotional text-to-speech with soft-label guidance." In Proc. IEEE ICASSP, 2023.
Speaker adaptive text-to-speech with timbre-normalized vector-quantized feature
Published in IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2023
This paper proposes TN-VQTTS, which leverages timbre-normalized vector-quantized acoustic features for TTS speaker adaptation with little data.
Recommended citation: Chenpeng Du, Yiwei Guo, Xie Chen, Kai Yu. (2023). "Speaker adaptive text-to-speech with timbre-normalized vector-quantized feature." IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2023, vol. 31, pp. 3446-3456.
UniCATS: A unified context-aware text-to-speech framework with contextual vq-diffusion and vocoding
Published in AAAI, 2024
This paper proposes a context-aware TTS system with strong zero-shot TTS and speech editing abilities, using the contextual token vocoder CTX-vec2wav and the discrete diffusion-based CTX-txt2vec.
Recommended citation: Chenpeng Du, Yiwei Guo, Feiyu Shen, Zhijun Liu, Zheng Liang, Xie Chen, Shuai Wang, Hui Zhang, Kai Yu. (2024). "UniCATS: A unified context-aware text-to-speech framework with contextual vq-diffusion and vocoding." In Proc. AAAI, 2024, vol. 38, no. 16, pp. 17924-17932.
VoiceFlow: Efficient text-to-speech with rectified flow matching
Published in IEEE ICASSP, 2024
This paper applies the rectified flow matching algorithm to improve the efficiency of TTS systems in the differential-equation family (e.g., diffusion and flow matching).
Recommended citation: Yiwei Guo, Chenpeng Du, Ziyang Ma, Xie Chen, Kai Yu. (2024). "VoiceFlow: Efficient text-to-speech with rectified flow matching." In Proc. IEEE ICASSP, 2024, pp. 11121-11125.
talks
teaching
Teaching experience 1
Undergraduate course, University 1, Department, 2014
This is a description of a teaching experience. You can use markdown like any other post.
Teaching experience 2
Workshop, University 1, Department, 2015
This is a description of a teaching experience. You can use markdown like any other post.