GUAN-LIN CHAO

Machine Learning Scientist
Microsoft Azure AI, Speech Group

[email] guanlinchao [AT] gmail.com
[CV] [Google Scholar] [GitHub]

Bio

I am a Senior Machine Learning Scientist in the Speech Group at Microsoft Azure AI. I received my PhD in Electrical and Computer Engineering from Carnegie Mellon University (advisors: Prof. Ian Lane and Prof. John Shen) and my bachelor’s degree in Electrical Engineering from National Taiwan University.

During my PhD, I interned at Google Research with Abhinav Rastogi (2019, 2018), Srinivas Sunkara (2019), Dilek Hakkani-Tür (2018), Jindong Chen (2018) and Heng-Tze Cheng (2017).

My research interests include Speech Recognition and Multimodal Machine Learning across Speech, Text and Vision. I’ve also worked on Robotics projects at National Taiwan University.

Selected Publications

End-to-End Multimodal Learning for Situated Dialogue Systems
Guan-Lin Chao
PhD Thesis

Human-Agent Collaboration Strategies for Vision-Grounded Instruction Following
Guan-Lin Chao, Ian Lane
In IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) 2021

Curriculum Learning for Vision-Grounded Instruction Following
Guan-Lin Chao, Ian Lane
In NAACL 2021 Visually Grounded Interaction and Language (ViGIL) Workshop

BERT-DST: Scalable End-to-End Dialogue State Tracking with Bidirectional Encoder Representations from Transformer
Guan-Lin Chao, Ian Lane
In INTERSPEECH 2019
[bib] [pdf] [code] [slides]

DEEPCOPY: Grounded Response Generation with Hierarchical Pointer Networks
Semih Yavuz, Abhinav Rastogi, Guan-Lin Chao, Dilek Hakkani-Tür
In SIGDIAL 2019 and NeurIPS 2018 Conversational AI Workshop (Best Paper)
[bib] [pdf]

Learning Question-Guided Video Representation for Multi-Turn Video Question Answering
Guan-Lin Chao, Abhinav Rastogi, Semih Yavuz, Dilek Hakkani-Tür, Jindong Chen, Ian Lane
In SIGDIAL 2019 and SIGIR 2019 Workshop on Conversational Interaction Systems
[bib] [pdf] [slides] [poster]

Audio-Visual TED Corpus: Enhancing the TED-LIUM Corpus with Facial Information, Contextual Text and Object Recognition
Guan-Lin Chao, Chih Chi Hu, Bing Liu, John Paul Shen, Ian Lane
In UbiComp 2019 Workshop on Continual and Multimodal Learning for Internet of Things
[bib] [pdf]

Deep Speaker Embedding for Speaker-Targeted Automatic Speech Recognition
Guan-Lin Chao, John Paul Shen, Ian Lane
In International Conference on Natural Language Processing and Information Retrieval (NLPIR) 2019 (Best Paper)
[bib] [pdf]

Speaker-Targeted Audio-Visual Models for Speech Recognition in Cocktail-Party Environments
Guan-Lin Chao, William Chan, Ian Lane
In INTERSPEECH 2016
[bib] [pdf] [slides]

City-Identification of Flickr Videos Using Semantic Acoustic Features
Benjamin Elizalde, Guan-Lin Chao, Ming Zeng, Ian Lane
In IEEE International Conference on Multimedia Big Data (BigMM) 2016
[bib] [pdf]

Scene Feature Recognition-Enabled Framework for Mobile Service Information Query System
Yi-Chong Zeng, Ya-Hui Chan, Ting-Yu Lin, Meng-Jung Shih, Pei-Yu Hsieh, Guan-Lin Chao
In International Conference on Human Interface and the Management of Information (HIMI) 2015
[bib] [pdf]