Multi-modal speech processing methods: an overview and future research directions using a MATLAB based audio-visual toolbox

Research Output

This paper presents an overview of the main multi-modal speech enhancement methods reported to date. In particular, a new MATLAB-based toolbox developed by Barbosa et al. (2007) for processing audio-visual data is reviewed and its performance potential evaluated. It is shown that the tool does not represent a complete and comprehensive speech processing solution, but rather serves as a standardised yet versatile base to build upon with further research. To demonstrate this versatility, preliminary examples that apply these computational procedures to an audio-visual corpus are presented. Finally, some future research directions in the area of multi-modal speech processing are outlined, including work that the authors aim to carry out with the aid of this toolbox, such as toolbox customisation and the processing of noisy speech in real-world environments.

Date: 31 December 2009

Publication Status: Published

DOI: 10.1007/978-3-642-00525-1_12

Funders: Historic Funder (pre-Worktribe)

http://researchrepository.napier.ac.uk/output/1793520

Citation

Abel, A., & Hussain, A. (2009). Multi-modal speech processing methods: an overview and future research directions using a MATLAB based audio-visual toolbox. In Multimodal Signals: Cognitive and Algorithmic Issues (pp. 121-129). https://doi.org/10.1007/978-3-642-00525-1_12

Authors

Prof Amir Hussain

Professor
School of Computing Engineering and the Built Environment

0131 455 2239

A.Hussain@napier.ac.uk

Keywords

Discrete Cosine Transform, Gaussian Mixture Model, Audio Signal, Blind Source Separation, Speech Enhancement

Available Documents

Files are currently unavailable for download; please contact A.Hussain@napier.ac.uk to request a copy.