10. Automatic recognition and generation of Cued Speech using deep learning

Beneficiary: Centre National de la Recherche Scientifique (GIPSA-lab), France

In the technology domain, access to communication tools for people with sensory disabilities is a priority. Relay services have been created for people with hearing or speech impairment, who use telecommunication devices to contact hearing interpreters working in Sign Language, Cued Speech and spoken language at a remote centre. To introduce automation into this telecommunication chain, applications based on the automatic recognition of iconic gestural signs will be developed to complement voice and touch commands on mobile phones and tablet computers. To this end, the project will develop models that automatically recognise iconic signs derived from Sign Language and/or Cued Speech gestures and convert them into text and/or speech. This work will inform the development of new algorithms, based on recent deep learning techniques, for multimodal communication (including text, speech, lipreading and manual gestures) between hearing participants and participants with hearing impairment. In particular, Convolutional Neural Networks (CNNs) could automatically extract relevant features from the videos, and Recurrent Neural Networks such as Long Short-Term Memory (LSTM) networks could be applied to handle the large desynchronisation that can occur between the hands and the lips. A first application of these methods at CNRS-GIPSA reached a score of 72.67 % for Cued Speech recognition of phonemes in the context of continuous speech (PhD thesis of Li Liu, 2018). The applications based on automatic recognition of iconic signs will be developed for telecommunication devices in connection with the IVèS telecommunication platform. They will thus improve the accessibility of telecommunication for people with hearing impairment, including children whose text-based skills are still developing, while taking into account each user's preferred means of communication.

Supervisors: Denis Beautemps and Thomas Hueber