Binaural Hearing for Robots

Description

Robots have gradually moved from factory floors to populated areas. Therefore, there is a crucial need to endow robots with perceptual and interaction skills enabling them to communicate with people in the most natural way. With auditory signals distinctively characterizing physical environments and speech being the most effective means of communication among people, robots must be able to fully extract the rich auditory information from their environment.

This course will address fundamental issues in robot hearing; it will describe methodologies requiring two or more microphones embedded into a robot head, thus enabling sound-source localization, sound-source separation, and fusion of auditory and visual information.

The course will start by briefly describing the role of hearing in human-robot interaction, overviewing the human binaural system, and introducing the computational auditory scene analysis paradigm. Then, it will describe in detail sound propagation models, audio signal processing techniques, geometric models for source localization, and unsupervised and supervised machine learning techniques for characterizing binaural hearing, fusing acoustic and visual data, and designing practical algorithms. The course will be illustrated with numerous videos shot in the author’s laboratory.

Who can attend this course?

The course is intended for Master of Science students with a good background in signal processing and machine learning. It is also valuable to PhD students, researchers, and practitioners who work in signal and image processing, machine learning, robotics, or human-machine interaction, and who wish to acquire skills in binaural hearing methodologies.

The course material will allow attendees to design and develop robot and machine hearing algorithms.

Recommended Background

Introductory courses in digital signal processing, probability and statistics, and computer science.

Course Syllabus

Part 1: Introduction to Robot Hearing

Part 2: Methodological Foundations

Part 3: Sound-Source Localization

Part 4: Machine Learning and Binaural Hearing

Part 5: Fusion of Audio and Vision

The material for this course comes from a MOOC delivered on France Université Numérique:

https://www.fun-mooc.fr/courses/inria/41004/session01/about

Collections

collection

9 videos

5: Fusion of Audio and Vision

HORAUD Radu

The fifth and last part of the video lectures addresses the fusion of auditory and visual data. We will start with the motivation behind audio-visual fusion, followed by a short overview of the visual features that are likely to be used. We will describe audio-visual fusion in the temporal and in the spectral domains, and we will present a few examples using the audio-visual head of the NAO robot. This part completes the course, which has described the main methodologies needed to perform hearing with a binaural robot head.

Programming and modeling

16.03.2015


collection

10 videos

4: Machine Learning and Binaural Hearing

HORAUD Radu

During the fourth part, we will cover binaural features, the mapping of sounds onto their directions, the collection of training data, the binaural manifold, the prediction of the direction of speech, and some principles of sound separation.
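As an illustration of what binaural features can look like, here is a minimal sketch. It assumes two commonly used cues, the interaural level difference (ILD) and the interaural phase difference (IPD), computed per frequency bin from one frame of a two-microphone recording. The function name `binaural_features` and the `eps` parameter are hypothetical, and this is not necessarily the feature set used in the lectures.

```python
import numpy as np

def binaural_features(left, right, eps=1e-12):
    """Return per-frequency ILD (dB) and IPD (radians) for one signal frame.

    `left` and `right` are equal-length 1-D arrays, one per microphone.
    `eps` (an assumption of this sketch) avoids division by zero and log(0).
    """
    spec_left = np.fft.rfft(left)
    spec_right = np.fft.rfft(right)
    # Level difference: ratio of spectral magnitudes, in decibels.
    ild = 20.0 * np.log10((np.abs(spec_right) + eps) / (np.abs(spec_left) + eps))
    # Phase difference: phase of the cross-spectrum, wrapped to (-pi, pi].
    ipd = np.angle(spec_right * np.conj(spec_left))
    return ild, ipd
```

Stacking the ILD and IPD vectors of many frames recorded from known directions would give one possible form of training data for a mapping from sounds to directions.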

Programming and modeling

16.03.2015


collection

10 videos

3: Sound-Source Localization

HORAUD Radu

During the third part, we will study sound-source localization: the time difference of arrival (TDOA), the estimation of the TDOA by cross-correlation in the temporal and spectral domains, the geometry of multiple microphones, the embedding of microphones in a robot head, the prediction of the direction of a sound with a robot head, and an example of sound direction estimation.
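TDOA estimation by cross-correlation can be sketched in a few lines. The example below is a minimal illustration under simplifying assumptions (a single source, no reverberation, an integer-sample delay); the function names are hypothetical and the code is not taken from the course. The first function finds the peak of the cross-correlation in the temporal domain, the second obtains the same correlation in the spectral domain from the cross-spectrum of the two signals.

```python
import numpy as np

def tdoa_time_domain(left, right, fs):
    """Delay of `right` relative to `left`, in seconds (temporal domain)."""
    # corr[k] = sum_n right[n + k] * left[n], for lags -(len(left)-1) .. len(right)-1
    corr = np.correlate(right, left, mode="full")
    lags = np.arange(-(len(left) - 1), len(right))
    return lags[np.argmax(corr)] / fs

def tdoa_spectral_domain(left, right, fs):
    """Same estimate, computed from the cross-spectrum of the two signals."""
    n = len(left) + len(right) - 1          # zero-pad to avoid circular wrap-around
    cross = np.fft.rfft(right, n) * np.conj(np.fft.rfft(left, n))
    corr = np.fft.irfft(cross, n)
    max_neg = len(left) - 1                 # number of negative lags
    # Negative lags are stored at the end of the inverse FFT output; reorder them.
    corr = np.concatenate((corr[n - max_neg:], corr[:len(right)]))
    lags = np.arange(-max_neg, len(right))
    return lags[np.argmax(corr)] / fs

# Synthetic check: the right channel is the left channel delayed by 5 samples.
fs = 16000
rng = np.random.default_rng(0)
left = rng.standard_normal(1024)
right = np.concatenate((np.zeros(5), left[:-5]))
print(tdoa_time_domain(left, right, fs) * fs)
print(tdoa_spectral_domain(left, right, fs) * fs)
```

A positive value means the sound reached the left microphone first. Converting the delay into a direction additionally requires the microphone geometry, which this sketch leaves out.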

Programming and modeling

16.03.2015


collection

10 videos

2: Methodological Foundations

HORAUD Radu

In this second part, we will cover the methodological foundations: robot heads and acoustic laboratories, the binaural processing pipeline, the continuous-time Fourier transform, discrete-time signals, the spectrogram of an acoustic signal, ...
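To make the spectrogram of an acoustic signal concrete, here is a minimal sketch of a short-time Fourier analysis with NumPy. The frame length, hop size, and Hann window are assumptions chosen for illustration, and the function name `spectrogram` is hypothetical; the lectures may use different settings.

```python
import numpy as np

def spectrogram(signal, fs, frame_len=512, hop=256):
    """Return (freqs, times, power) for a 1-D discrete-time signal.

    `power` has shape (n_frames, frame_len // 2 + 1): one power spectrum
    per windowed frame. The signal must contain at least `frame_len` samples.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Cut the signal into overlapping frames and apply the window to each.
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / fs)           # bin frequencies in Hz
    times = (np.arange(n_frames) * hop + frame_len / 2) / fs  # frame centers in s
    return freqs, times, power

# Example: one second of a 1 kHz tone sampled at 16 kHz.
fs = 16000
t = np.arange(fs) / fs
freqs, times, power = spectrogram(np.sin(2 * np.pi * 1000 * t), fs)
```

For a binaural recording, the same analysis would be applied to the left and right channels separately, giving one spectrogram per microphone.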

Programming and modeling

16.03.2015


collection

7 videos

1: Introduction to Robot Hearing

HORAUD Radu

Welcome to this course, "Binaural audition for robots"! This first part gives an introduction to robot hearing.

Programming and modeling

16.03.2015


Speaker

Horaud Radu

France

Romania

Author of a docteur-ingénieur thesis in automatic control (Grenoble INPG, 1981). - Research director at CNRS, then at INRIA, Laboratoire d'informatique fondamentale et d'intelligence artificielle of the Institut national polytechnique de Grenoble (in 1993). - Thesis supervisor at Université Joseph Fourier de Grenoble and at Grenoble INPG (-1990-1994-). - Consultant

Programming and modeling