Academic service
- Reviewer
  - JASA Express Letters
  - Transactions on Audio, Speech and Language Processing (TASLP)
  - Transactions on Asian and Low-Resource Language Information Processing (TALLIP)
  - IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  - International Speech Communication Association (Interspeech)
  - National Conference on Man-Machine Speech Communication (NCMMSC)
  - Engineering Applications of Artificial Intelligence (EAAI)
  - International Joint Conference on Neural Networks (IJCNN)
- Executive Committee of CCF TCSDAP (Speech Dialogue and Auditory Processing)
Papers
- DiffCSS: Diverse and Expressive Conversational Speech Synthesis with Diffusion Models.
Weihao Wu*, Zhiwei Lin*, Yixuan Zhou, Jingbei Li, Rui Niu, Qinghua Wu, Songjun Cao, Long Ma, Zhiyong Wu. ICASSP 2025.
- M-MoE: Mixture of Mixture-of-Expert Model for CTC-based Streaming Multilingual ASR.
Songjun Cao, Xiong Wang, Yike Zhang, Xiaoming Zhang, Long Ma. ICASSP 2025.
- A Transcription Prompt-based Efficient Audio Large Language Model for Robust Speech Recognition.
Yangze Li*, Xiong Wang*, Songjun Cao, Yike Zhang, Long Ma, Lei Xie. INTERSPEECH 2024.
- DistillW2V2: A Small and Streaming Wav2vec 2.0 Based ASR Model.
Yanzhe Fu*, Yueteng Kang*, Songjun Cao, Long Ma. arXiv 2023.
- Censer: Curriculum Semi-supervised Learning for Speech Recognition Based on Self-supervised Pre-training.
Bowen Zhang*, Songjun Cao*, Xiaoming Zhang, Yike Zhang, Long Ma, Takahiro Shinozaki. INTERSPEECH 2022.
- Improving CTC-based Speech Recognition via Knowledge Transferring from Pre-trained Language Models.
Keqi Deng*, Songjun Cao*, Yike Zhang, Long Ma, Gaofeng Cheng, Ji Xu, Pengyuan Zhang. ICASSP 2022.
- A Practical Framework for Multi-domain Speech Recognition and an Instance Sampling Method to Neural Language Modeling.
Yike Zhang, Xiaobing Feng, Yi Liu, Songjun Cao, Long Ma. arXiv 2021.
- Improving Hybrid CTC/Attention End-to-end Speech Recognition with Pretrained Acoustic and Language Models.
Keqi Deng*, Songjun Cao*, Yike Zhang, Long Ma. ASRU 2021.
- Improving Streaming Transformer Based ASR Under a Framework of Self-supervised Learning.
Songjun Cao, Yueteng Kang, Yanzhe Fu, Xiaoshuo Xu, Sining Sun, Yike Zhang, Long Ma. INTERSPEECH 2021.
- Improving Accent Identification and Accented Speech Recognition Under a Framework of Self-supervised Learning.
Keqi Deng*, Songjun Cao*, Long Ma. INTERSPEECH 2021.
- Explore Wav2vec 2.0 for Mispronunciation Detection.
Xiaoshuo Xu, Yueteng Kang, Songjun Cao, Binghuai Lin, Long Ma. INTERSPEECH 2021.
- Improving Speech Recognition Accuracy of Local POI Using Geographical Models.
Songjun Cao*, Yike Zhang*, Xiaobing Feng, Long Ma. SLT 2021.
- Multi-head Monotonic Chunkwise Attention For Online Speech Recognition.
Baiji Liu, Songjun Cao, Sining Sun, Weibin Zhang, Long Ma. arXiv 2020.