Lip reading using deep learning in Turkish language


Pourmousa H., ÖZEN Ü.

IAES International Journal of Artificial Intelligence, vol.13, no.3, pp.3250-3261, 2024 (Scopus)

Abstract

Computer vision is one of the most important areas of artificial intelligence, and lip reading is one of the most important areas of computer vision. Lip reading, which is especially important in noisy environments or where no audio stream is available, is a research area that can help hearing-impaired people. Although lip-reading datasets exist at the alphabet, word, and sentence levels in various languages, no such dataset exists for Turkish. The dataset for this study was created by the author, with video data collected from 72 people for 71 words. Audio streams were removed from the collected videos, and the dataset was built using only images. Because the dataset was small, the data were replicated with the Camtasia application. After the research model was designed and trained, it was tested on the adjective, noun, and verb datasets, achieving success rates of 71.8%, 71.88%, and 79.69%, respectively.