Modeling, Summarizing and Translating Speech

by Dr. Pascale Fung, Human Language Technology Center, Department of Electronic & Computer Engineering, Hong Kong University of Science & Technology

Date  :  09 May 2005 (Mon)
Time  :  1:00pm
Venue  :  Cheung On Tak Lecture Theater (LTE), HKUST

The maturing of speech recognition and corpus-based natural language processing has led to many practical applications in human-machine and human-human interaction that use both technologies. Speech processing is ultimately about detecting, finding and translating pertinent information from the spoken input rather than word-by-word transcription.

In this talk, I will give an overview of our research at HLTC, which combines automatic speech recognition and natural language processing for spontaneous speech modeling, speech topic detection and summarization, and speech translation. The main challenge of these tasks lies in discovering critical information from large amounts of unstructured, spontaneous, often accented, and multilingual speech. To this end, we propose that:

* A common acoustic model for speech recognition of multiple languages can be achieved by bootstrapping from a single language.
* Spontaneous and accented speech recognition can be best achieved by differentiating between phonetic and acoustic changes.
* Spontaneous and colloquial speech recognition can be made efficient by statistical learning of a spontaneous speech grammar.
* The best context information for translation disambiguation in a mixed language query is the most salient trigger word.
* Topic detection and summarization of multilingual, multimodal and multiple documents can be efficiently achieved by a unified segmental HMM framework.
* Fixed-point front-end processing, discrete HMMs, and unambiguous inversion transduction grammars provide the optimal tradeoff between performance and speed for speech translation on portable devices.

I will also discuss our contributions in mining and collecting large amounts of speech and text data for the above research.

Pascale Fung received her PhD from Columbia University in 1997. She is one of the founding faculty members of the Human Language Technology Center (HLTC) at HKUST. She is the co-editor of the Special Issue on Learning in Speech and Language Technologies of the Machine Learning Journal. She has been a board member of SIGDAT of the Association for Computational Linguistics (ACL), and has served as area chair for ACL and chair for the Conference on Empirical Methods in Natural Language Processing (EMNLP), as well as co-chair of SemaNet 2002 at Coling. Pascale was the team leader for Pronunciation Modeling of Mandarin Casual Speech at the 2000 Johns Hopkins Summer Workshop. She has served as a program committee member for numerous international conferences and technical publications. She has also served as a panelist for the National Science Foundation in the US and as a reviewer for the Hong Kong Research Grants Council. She is a Senior Member of the Institute of Electrical and Electronics Engineers (IEEE) and a Member of the Association for Computational Linguistics (ACL).


*** All are Welcome ***