Papers
- SonarGuard2: Ultrasonic Face Liveness Detection Based on Adaptive Doppler Effect Feature Extraction.
Xiaoming Zhang, Keyue Zhang, Taiping Yao, Songjun Cao, Shouhong Ding, Long Ma. INTERSPEECH 2025.
- MPE-TTS: Customized Emotion Zero-Shot Text-To-Speech Using Multi-Modal Prompt.
Zhichao Wu, Yueteng Kang, Songjun Cao, Long Ma, Qiulin Li, Qun Yang. INTERSPEECH 2025.
- Monotonic Attention for Robust Text-to-Speech Synthesis in Large Language Model Frameworks.
Yike Zhang, Yiming Li, Jie Chen, Qinghua Wu, Songjun Cao, Long Ma. INTERSPEECH 2025.
- DiffCSS: Diverse and Expressive Conversational Speech Synthesis with Diffusion Models.
Weihao Wu*, Zhiwei Lin*, Yixuan Zhou, Jingbei Li, Rui Niu, Qinghua Wu, Songjun Cao, Long Ma, Zhiyong Wu. ICASSP 2025.
- M-MoE: Mixture of Mixture-of-Expert Model for CTC-based Streaming Multilingual ASR.
Songjun Cao, Xiong Wang, Yike Zhang, Xiaoming Zhang, Long Ma. ICASSP 2025.
- A Transcription Prompt-based Efficient Audio Large Language Model for Robust Speech Recognition.
Yangze Li*, Xiong Wang*, Songjun Cao, Yike Zhang, Long Ma, Lei Xie. INTERSPEECH 2024.
- DistillW2V2: A Small and Streaming Wav2vec 2.0 Based ASR Model.
Yanzhe Fu*, Yueteng Kang*, Songjun Cao, Long Ma. arXiv 2023.
- Censer: Curriculum Semi-supervised Learning for Speech Recognition Based on Self-supervised Pre-training.
Bowen Zhang*, Songjun Cao*, Xiaoming Zhang, Yike Zhang, Long Ma, Takahiro Shinozaki. INTERSPEECH 2022.
- Improving CTC-based Speech Recognition via Knowledge Transferring from Pre-trained Language Models.
Keqi Deng*, Songjun Cao*, Yike Zhang, Long Ma, Gaofeng Cheng, Ji Xu, Pengyuan Zhang. ICASSP 2022.
- A Practical Framework for Multi-domain Speech Recognition and an Instance Sampling Method to Neural Language Modeling.
Yike Zhang, Xiaobing Feng, Yi Liu, Songjun Cao, Long Ma. arXiv 2021.
- Improving Hybrid CTC/Attention End-to-end Speech Recognition with Pretrained Acoustic and Language Models.
Keqi Deng*, Songjun Cao*, Yike Zhang, Long Ma. ASRU 2021.
- Improving Streaming Transformer Based ASR Under a Framework of Self-supervised Learning.
Songjun Cao, Yueteng Kang, Yanzhe Fu, Xiaoshuo Xu, Sining Sun, Yike Zhang, Long Ma. INTERSPEECH 2021.
- Improving Accent Identification and Accented Speech Recognition Under a Framework of Self-supervised Learning.
Keqi Deng*, Songjun Cao*, Long Ma. INTERSPEECH 2021.
- Explore Wav2vec 2.0 for Mispronunciation Detection.
Xiaoshuo Xu, Yueteng Kang, Songjun Cao, Binghuai Lin, Long Ma. INTERSPEECH 2021.
- Improving Speech Recognition Accuracy of Local POI Using Geographical Models.
Songjun Cao*, Yike Zhang*, Xiaobing Feng, Long Ma. SLT 2021.
- Multi-head Monotonic Chunkwise Attention For Online Speech Recognition.
Baiji Liu, Songjun Cao, Sining Sun, Weibin Zhang, Long Ma. arXiv 2020.