Research Output
A data driven approach to audiovisual speech mapping
  The use of visual information as part of audio speech processing has attracted significant recent interest. This paper presents a data-driven approach to estimating audio speech acoustics from temporal visual information alone, without recourse to linguistic features such as phonemes and visemes. Audio (log filterbank) and visual (2D-DCT) features are extracted, and various MLP configurations and datasets are evaluated to identify the best-performing setup, showing that, given a sequence of prior visual frames, a reasonably accurate estimate of the corresponding audio frame can be produced.
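As a minimal illustrative sketch of the mapping described above, the pipeline below extracts 2D-DCT features from a window of prior visual frames and passes them through a single-hidden-layer MLP to estimate one log filterbank audio frame. All dimensions, the hidden-layer size, and the use of random (untrained) weights are assumptions for illustration, not details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (assumptions, not from the paper):
N_PRIOR = 5    # number of prior visual frames in the input window
VIS_DIM = 50   # 2D-DCT coefficients kept per visual frame
AUD_DIM = 23   # log filterbank channels per audio frame
HIDDEN = 100   # MLP hidden units

def dct2_features(frame, k=VIS_DIM):
    """Crude 2D-DCT feature extraction: apply a DCT-II along both
    axes and keep the first k coefficients (zig-zag ordering
    omitted for brevity)."""
    def dct_mat(n):
        i = np.arange(n)
        return np.cos(np.pi * (2 * i[None, :] + 1) * i[:, None] / (2 * n))
    h, w = frame.shape
    coeffs = dct_mat(h) @ frame @ dct_mat(w).T
    return coeffs.ravel()[:k]

def mlp_forward(x, W1, b1, W2, b2):
    """Single-hidden-layer MLP: visual feature window -> one audio frame."""
    h = np.tanh(W1 @ x + b1)
    return W2 @ h + b2  # linear output: estimated log filterbank frame

# Random weights stand in for a trained network.
W1 = rng.standard_normal((HIDDEN, N_PRIOR * VIS_DIM)) * 0.01
b1 = np.zeros(HIDDEN)
W2 = rng.standard_normal((AUD_DIM, HIDDEN)) * 0.01
b2 = np.zeros(AUD_DIM)

# A window of prior lip-region frames (e.g. 32x32 grayscale).
frames = rng.random((N_PRIOR, 32, 32))
x = np.concatenate([dct2_features(f) for f in frames])
audio_est = mlp_forward(x, W1, b1, W2, b2)
print(audio_est.shape)  # one estimated log filterbank frame
```

In practice the network weights would be learned by regression from paired audiovisual training data, with the window of prior visual frames as input and the aligned audio frame as the target.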


Abel, A., Marxer, R., Barker, J., Watt, R., Whitmer, B., Derleth, P., & Hussain, A. (2016). A data driven approach to audiovisual speech mapping. In Advances in Brain Inspired Cognitive Systems (pp. 331-342).
