Research Output
A data driven approach to audiovisual speech mapping
  The use of visual information as part of audio speech processing has attracted significant recent interest. This paper presents a data-driven approach to estimating audio speech acoustics from temporal visual information alone, without recourse to linguistic features such as phonemes and visemes. Audio (log filterbank) and visual (2D-DCT) features are extracted, and various MLP configurations and datasets are evaluated to identify the best-performing setup, showing that, given a sequence of prior visual frames, a reasonably accurate estimate of the corresponding audio frame can be produced.
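As a minimal illustrative sketch of the mapping described above, the pipeline below extracts 2D-DCT features from a window of prior visual frames and passes them through a single-hidden-layer MLP to estimate one log filterbank audio frame. All dimensions, the hidden-layer size, and the use of random (untrained) weights are assumptions for illustration, not details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (assumptions, not from the paper):
N_PRIOR = 5    # number of prior visual frames in the input window
VIS_DIM = 50   # 2D-DCT coefficients kept per visual frame
AUD_DIM = 23   # log filterbank channels per audio frame
HIDDEN = 100   # MLP hidden units

def dct2_features(frame, k=VIS_DIM):
    """Crude 2D-DCT feature extraction: apply a DCT-II along both
    axes and keep the first k coefficients (zig-zag ordering
    omitted for brevity)."""
    def dct_mat(n):
        i = np.arange(n)
        return np.cos(np.pi * (2 * i[None, :] + 1) * i[:, None] / (2 * n))
    h, w = frame.shape
    coeffs = dct_mat(h) @ frame @ dct_mat(w).T
    return coeffs.ravel()[:k]

def mlp_forward(x, W1, b1, W2, b2):
    """Single-hidden-layer MLP: visual feature window -> one audio frame."""
    h = np.tanh(W1 @ x + b1)
    return W2 @ h + b2  # linear output: estimated log filterbank frame

# Random weights stand in for a trained network.
W1 = rng.standard_normal((HIDDEN, N_PRIOR * VIS_DIM)) * 0.01
b1 = np.zeros(HIDDEN)
W2 = rng.standard_normal((AUD_DIM, HIDDEN)) * 0.01
b2 = np.zeros(AUD_DIM)

# A window of prior lip-region frames (e.g. 32x32 grayscale).
frames = rng.random((N_PRIOR, 32, 32))
x = np.concatenate([dct2_features(f) for f in frames])
audio_est = mlp_forward(x, W1, b1, W2, b2)
print(audio_est.shape)  # one estimated log filterbank frame
```

In practice the network weights would be learned by regression from paired audiovisual training data, with the window of prior visual frames as input and the aligned audio frame as the target.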


Abel, A., Marxer, R., Barker, J., Watt, R., Whitmer, B., Derleth, P., & Hussain, A. (2016). A data driven approach to audiovisual speech mapping. In Advances in Brain Inspired Cognitive Systems (pp. 331-342).
