Deep Learning 101 – Speech with Catarina Botelho & Alberto Abad
Deep learning has had a significant impact on speech research. In fact, it has revolutionized several speech processing tasks, including automatic speech recognition, speech synthesis, speaker verification, emotion detection, and more. In this talk we will highlight the positive outcomes of leveraging deep learning methods in speech research, the off-the-shelf tools already available for a beginner in the field, and the challenges and open problems that are currently driving the research. We will illustrate this talk with examples from different speech processing tasks, particularly automatic speech recognition.
Furthermore, we will discuss the dual nature of speech, encompassing both the linguistic content and rich paralinguistic information. This includes insights into the environment where the recording took place, and into speaker characteristics, such as demographic information, emotional states and health indicators. We will conclude this talk with our work exploring speech’s potential as a tool to support medical diagnosis, highlighting its intersection with deep learning, and the associated challenges and opportunities.
Bios:
Catarina Botelho received the Biomedical Engineering degree from Instituto Superior Técnico (IST), University of Lisbon, and has been a PhD student at INESC-ID / IST since 2019. Her research topic is “Speech as a biomarker for speech affecting diseases”, focusing on the use of speech for medical diagnosis. In particular, she has worked with obstructive sleep apnea, Parkinson’s disease, Alzheimer’s disease, depression and COVID-19, and with multimodal signals including EMG and visual speech. Her MSc and PhD work has been distinguished by two awards from IST and University of Lisbon.
She was a research intern at Google AI, Toronto, and a visiting researcher at the Cognitive Systems Lab, University of Bremen. She has been involved in the student advisory committee of the International Speech Communication Association (ISCA-SAC) since 2020, acting as Coordinator in 2022.
Alberto Abad received the Telecommunication Engineering degree from the Technical University of Catalonia (UPC), Barcelona, Spain, in 2002 and the Ph.D. degree from UPC in 2007. Currently, he is an Associate Professor at the Department of Computer Science and Engineering (DEI) of Instituto Superior Técnico (IST) and a researcher at INESC-ID. He is the coordinator of the Human Language Technologies laboratory at INESC-ID and the deputy coordinator of the Master in Computer Science and Engineering of IST. He is also an IEEE Senior Member.
Alberto Abad has developed his research career in the area of human language technologies for more than 20 years. His research interests include robust speech recognition, speaker and language characterization, applied machine learning, health-care applications, and privacy-preserving speech processing and machine learning.
Feedback form:
https://docs.google.com/forms/d/e/1FAIpQLSdjXGRq9r_XyjG7_FXG3Ks7SCqzLKkyCiHWq0_FAS6GYPwViw/viewform
Find more about us at:
https://deeplearningpt.github.io