
Machine Learning Scientist
Microsoft Azure AI, Speech Group
[email] guanlinchao [AT] gmail.com
[CV] [Google Scholar] [github]
Bio
I am a Senior Machine Learning Scientist at Microsoft Azure AI, Speech Group. I received my PhD in Electrical and Computer Engineering from Carnegie Mellon University (advisors: Prof. Ian Lane and Prof. John Shen) and my bachelor’s degree in Electrical Engineering from National Taiwan University.
During my PhD, I interned at Google Research with Abhinav Rastogi (2019, 2018), Srinivas Sunkara (2019), Dilek Hakkani-Tür (2018), Jindong Chen (2018) and Heng-Tze Cheng (2017).
My research interests include Speech Recognition and Multimodal Machine Learning across Speech, Text and Vision. I’ve also worked on Robotics projects at National Taiwan University.
Selected Publications
- End-to-End Multimodal Learning for Situated Dialogue Systems
Guan-Lin Chao
PhD Thesis
- Human-Agent Collaboration Strategies for Vision-Grounded Instruction Following
Guan-Lin Chao, Ian Lane
In IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) 2021
- Curriculum Learning for Vision-Grounded Instruction Following
Guan-Lin Chao, Ian Lane
In NAACL 2021 Visually Grounded Interaction and Language (ViGIL) Workshop
- BERT-DST: Scalable End-to-End Dialogue State Tracking with Bidirectional Encoder Representations from Transformer
Guan-Lin Chao, Ian Lane
In INTERSPEECH 2019
[bib] [pdf] [code] [slides]
- DEEPCOPY: Grounded Response Generation with Hierarchical Pointer Networks
Semih Yavuz, Abhinav Rastogi, Guan-Lin Chao, Dilek Hakkani-Tür
In SIGDIAL 2019 and NeurIPS 2018 Conversational AI Workshop (Best Paper)
[bib] [pdf]
- Learning Question-Guided Video Representation for Multi-Turn Video Question Answering
Guan-Lin Chao, Abhinav Rastogi, Semih Yavuz, Dilek Hakkani-Tür, Jindong Chen, Ian Lane
In SIGDIAL 2019 and SIGIR 2019 Workshop on Conversational Interaction Systems
[bib] [pdf] [slides] [poster]
- Audio-Visual TED Corpus: Enhancing the TED-LIUM Corpus with Facial Information, Contextual Text and Object Recognition
Guan-Lin Chao, Chih Chi Hu, Bing Liu, John Paul Shen, Ian Lane
In UbiComp 2019 Workshop on Continual and Multimodal Learning for Internet of Things
[bib] [pdf]
- Deep Speaker Embedding for Speaker-Targeted Automatic Speech Recognition
Guan-Lin Chao, John Paul Shen, Ian Lane
In International Conference on Natural Language Processing and Information Retrieval (NLPIR) 2019 (Best Paper)
[bib] [pdf]
- Speaker-Targeted Audio-Visual Models for Speech Recognition in Cocktail-Party Environments
Guan-Lin Chao, William Chan, Ian Lane
In INTERSPEECH 2016
[bib] [pdf] [slides]
- City-Identification of Flickr Videos Using Semantic Acoustic Features
Benjamin Elizalde, Guan-Lin Chao, Ming Zeng, Ian Lane
In IEEE International Conference on Multimedia Big Data (BigMM) 2016
[bib] [pdf]
- Scene Feature Recognition-Enabled Framework for Mobile Service Information Query System
Yi-Chong Zeng, Ya-Hui Chan, Ting-Yu Lin, Meng-Jung Shih, Pei-Yu Hsieh, Guan-Lin Chao
In International Conference on Human Interface and the Management of Information (HIMI) 2015
[bib] [pdf]