
# 9: Automatic speech recognition

September 19, 19:00

Berlin, Germany
Mozilla Berlin, Schlesische Straße 27, 10997 Berlin

Talk 1: Traditional hybrid ASR systems
Abstract: I am going to give an overview of typical production-grade speech recognition systems: their general architecture, design choices and implementation details. I will cover acoustic modeling, language modeling and grapheme-to-phoneme conversion. The core of modern speech technology is machine learning (including neural networks); we will look at particular examples in some depth and discuss trade-offs. The convenient formalism of weighted finite-state transducers (WFSTs) as search spaces will be introduced and explained. Practical issues of ramping up and fine-tuning speech recognition systems will also be discussed.
Bio: Ilya Edrenkin works for Yandex, the most popular Russian web search engine and internet portal. For several years he served as the head of voice technology development there, responsible for speech recognition, text-to-speech and voice biometry. Now he is leading self-driving technology research and development at Yandex. He holds a PhD in Neuroscience from Moscow State University.

Talk 2: Mozilla's work on end-to-end ASR
Abstract: In the last ten years deep learning has revolutionized numerous fields, and speech recognition is no exception. In this talk I will give an overview of the architecture and implementation of the deep-learning-based speech recognition engine Mozilla is developing. I'll cover its neural network architecture (and variants), language model integration, and the CTC algorithm (and variants). I'll also touch upon the dearth of open training data sets and what we are doing to help through Project Common Voice.
Bio: Kelly Davis has many irons in the fire. He studied Mathematics and Physics at MIT, then went on to do graduate work in Superstring Theory/M-Theory. He then jumped ship, coding at a startup that eventually went public in the late '90s. When the bubble burst, he returned to an academic setting and joined the Max Planck Institute for Gravitational Physics, where he worked on software systems used to help simulate black hole mergers. Jumping ship yet again, he went back into industry, writing 3D rendering software at Mental Images/NVIDIA. When that lost its charm, he founded an NLU startup, 42, that created a system, based on IBM's Watson, able to answer general knowledge questions. After a brief stint as the Director of Machine Learning at another Berlin startup, he joined Mozilla, where he now leads the machine learning group.

Berlin NLP
