Published on October 28, 2015
IEEE AR SPS - Lectures ‘The Challenges of Pattern Recognition for Speech Signals’ and ‘25 Years of Audio Coding: How we arrived at audio playback on iPhone and its underlying technology’
Douglas O'Shaughnessy and Akihiko Sugiyama - October 30, 2015, at ITBA, CABA
The Argentina Chapter of the IEEE Signal Processing Society invites you to the lectures ‘The Challenges of Pattern Recognition for Speech Signals’ and ‘25 Years of Audio Coding: How we arrived at audio playback on iPhone and its underlying technology’.
They will be given by Douglas O'Shaughnessy, IEEE Fellow, Director for Latin America and Past Chair of the IEEE SPS, and member of the Centre Énergie Matériaux Télécommunications of INRS, a research university in Canada, and by Akihiko Sugiyama, IEEE Fellow and IEEE SPS Distinguished Lecturer, affiliated with NEC Information and Media Processing Labs.
The event will take place on Friday, October 30, 2015, starting at 18:00, at ITBA (Instituto Tecnológico de Buenos Aires), Avda. E. Madero 399, CABA.
Advance registration, free of charge, is appreciated at http://eventioz.com.ar/2015-itba-sps
Inquiries: cursos@ieee.org.ar
* The Challenges of Pattern Recognition for Speech Signals
Speaker: Douglas O’Shaughnessy
Abstract
Speech coding has found great success in today’s widespread usage of cell phones. In addition, people are increasingly accustomed to hearing and accepting synthetic voices when they access information by phone. A third major system used for speech, automatic speech recognition (ASR), is also seeing significant usage, but still has major limitations, falling far short of what human listeners can do.
This talk will examine the modern techniques applied for recognition of the information present in speech: its textual content, the identity of the speaker, emotional state, and the language used.
We will examine ways to extract relevant parameters, while ignoring channel distortions and extraneous sounds that may also be present in the signal.
A brief history of ASR development will show the evolution of usage of Fourier analysis, linear prediction, cepstrum, and neural networks.
The strengths and weaknesses of the modern approach to ASR, which uses mel-frequency cepstral coefficients (MFCCs), hidden Markov models (HMMs), and language models, will be discussed.
Short Biography
Douglas O’Shaughnessy (Ph.D., Massachusetts Institute of Technology (MIT), 1976) has been a professor at INRS (University of Quebec) and an adjunct professor at McGill University in Montreal, Canada, since 1977.
He is a Fellow of the Acoustical Society of America (1992) and of the IEEE (2006).
He is a Regional Director of the IEEE Signal Processing Society and a member of its Board of Governors.
He is the founding Editor-in-Chief of the EURASIP Journal on Audio, Speech, and Music Processing.
He is Secretary of the International Speech Communication Association (ISCA), and Past Chair of the IEEE Signal Processing Society's Speech and Language Technical Committee.
He has presented tutorials on speech recognition at ICASSP-1996, ICASSP-2001, ICC-2003, and ICASSP-2009. He is the author of the textbook Speech Communications: Human and Machine (1986, Addison-Wesley; revised 2000, IEEE Press).
* 25 Years of Audio Coding: How we arrived at audio playback on iPhone and its underlying technology
Speaker: Akihiko Sugiyama
Abstract
This lecture presents the 30-year history of audio coding technology.
Focusing on MPEG Audio Coding, the most widely used international standard in daily life, the lecture explains some important techniques we contributed along the way.
Recent standardization activities are briefly touched on to show the unlimited potential of audio coding.
The highlight of this lecture is an encounter with the Silicon Audio, developed in 1994 and the real ancestor of the iPod, which cannot be experienced elsewhere.
The audience will see how the iPod's functionality began 20 years ago.
Short Biography
Akihiko Sugiyama (a.k.a. Ken Sugiyama), affiliated with NEC Information and Media Processing Labs., has been engaged in a wide variety of research projects in signal processing, such as audio coding and interference/noise control. His team developed the world's first Silicon Audio in 1994, the ancestor of the iPod. He has served as Chair of the Audio and Acoustic Signal Processing Technical Committee, IEEE Signal Processing Society (SPS) [2011-2012], as an associate editor for several journals such as IEEE Trans. SP [1994-1996], as the Secretary and a Member at Large of the Conference Board of SPS [2010-2011], as a member of the Awards Board of SPS [2015- ], and as the Chair of the Japan Chapter of SPS [2010-2011]. He was a Technical Program Chair for ICASSP2012. He has contributed to 16 book chapters and is the inventor of over 150 registered patents, with more applications pending, in the field of signal processing in Japan and overseas. He has received 13 awards, including the 2002 IEICE Best Paper Award, the 2006 IEICE Achievement Award, and the 2013 Ichimura Industry Award. He is a Fellow of IEEE and IEICE, and a Distinguished Lecturer for IEEE SPS. He is also known for having hosted over 70 internship students in total.