Visual Speech Recognition with Lightweight Psychologically Motivated Gabor Features



Zhang X, Xu Y, Abel AK, Smith LS, Watt R, Hussain A & Gao C (2020) Visual Speech Recognition with Lightweight Psychologically Motivated Gabor Features. Entropy, 22 (12), Art. No.: 1367.

Extraction of relevant lip features is of continuing interest in the visual speech domain. Using end-to-end feature extraction can produce good results, but at the cost of the results being difficult for humans to comprehend and relate to. We present a new, lightweight feature extraction approach, motivated by human-centric glimpse-based psychological research into facial barcodes, and demonstrate that these simple, easy-to-extract 3D geometric features (produced using Gabor-based image patches) can successfully be used for speech recognition with LSTM-based machine learning. This approach can successfully extract low-dimensionality lip parameters with a minimum of processing. One key difference between using these Gabor-based features and using other features, such as traditional DCT or the current fashion for CNN features, is that these are human-centric features that can be visualised and analysed by humans. This means that it is easier to explain and visualise the results. They can also be used for reliable speech recognition, as demonstrated using the Grid corpus. Results for overlapping speakers using our lightweight system gave a recognition rate of over 82%, which compares well to less explainable features in the literature.
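For readers unfamiliar with the Gabor-based image patches mentioned in the abstract, the following is a minimal sketch of a generic real-valued Gabor kernel (a Gaussian envelope multiplied by a cosine carrier) and its response to an image patch. It is not the authors' implementation: the function names `gabor_kernel` and `filter_response` and all parameter values are hypothetical and chosen only for illustration.

```python
import math


def gabor_kernel(size, wavelength, theta, sigma, gamma=0.5, psi=0.0):
    """Build a size x size real Gabor kernel as a list of rows.

    size should be odd so the kernel has a single centre pixel.
    All parameter values are illustrative assumptions, not taken from the paper.
    """
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates by the filter orientation theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            # Gaussian envelope, stretched along yr by the aspect ratio gamma.
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            # Sinusoidal carrier along the rotated x axis.
            carrier = math.cos(2 * math.pi * xr / wavelength + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


def filter_response(patch, kernel):
    """Sum of element-wise products between a patch and a kernel of equal size."""
    return sum(p * k for prow, krow in zip(patch, kernel) for p, k in zip(prow, krow))


# Example: a 9 x 9 kernel with hypothetical parameters.
kernel = gabor_kernel(size=9, wavelength=4.0, theta=0.0, sigma=2.0)
```

A patch whose intensity pattern matches the kernel's orientation and wavelength gives a large response, which is what makes such filters easy to visualise and interpret.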

Keywords: Speech Recognition; Image Processing; Gabor Features; Lip Reading; Explainable

Entropy: Volume 22, Issue 12

Funders: EPSRC Engineering and Physical Sciences Research Council
Publication date: 31/12/2020
Publication date online: 03/12/2020
Date accepted by journal: 23/11/2020

People (2)


Professor Leslie Smith

Emeritus Professor, Computing Science

Professor Roger Watt

Emeritus Professor, Psychology

