LipVision: A Deep Learning Approach

Parth Khetarpal; Riaz Moradian; Shayan Sadar; Sunny Doultani; Salma Pathan

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 179 - Number 8

Year of Publication: 2017

10.5120/ijca2017916029

Parth Khetarpal, Riaz Moradian, Shayan Sadar, Sunny Doultani, Salma Pathan. LipVision: A Deep Learning Approach. International Journal of Computer Applications. 179, 8 (Dec 2017), 34-36. DOI=10.5120/ijca2017916029

@article{ 10.5120/ijca2017916029,

author = { Parth Khetarpal and Riaz Moradian and Shayan Sadar and Sunny Doultani and Salma Pathan },

title = { LipVision: A Deep Learning Approach },

journal = { International Journal of Computer Applications },

issue_date = { Dec 2017 },

volume = { 179 },

number = { 8 },

month = { Dec },

year = { 2017 },

issn = { 0975-8887 },

pages = { 34-36 },

numpages = {9},

url = { https://ijcaonline.org/archives/volume179/number8/28759-2017916029/ },

doi = { 10.5120/ijca2017916029 },

publisher = {Foundation of Computer Science (FCS), NY, USA},

address = {New York, USA}

}

%0 Journal Article

%1 2024-02-07T00:54:50.216591+05:30

%A Parth Khetarpal

%A Riaz Moradian

%A Shayan Sadar

%A Sunny Doultani

%A Salma Pathan

%T LipVision: A Deep Learning Approach

%J International Journal of Computer Applications

%@ 0975-8887

%V 179

%N 8

%P 34-36

%D 2017

%I Foundation of Computer Science (FCS), NY, USA

Abstract

Lip-reading is the task of interpreting what a person is saying by analysing the movement of the mouth during speech. This paper surveys previous work on lip-reading, discussing the classifiers used, their efficiency and the accuracy obtained. Lip-reading has applications in fields such as medicine, communication and gaming. The proposed system will use the GRID corpus dataset, whose videos are recorded from 33 speakers. OpenCV and dlib will be used for face and mouth detection, and the mouth region of interest (ROI) will then be annotated with facial landmarks using the iBug tool. The architecture consists of Convolutional Neural Networks, built and trained in TensorFlow (an open-source software library), whose outputs are passed through Connectionist Temporal Classification (CTC). A saliency visualisation technique will then be used to interpret and match the learned behaviour and generate text.
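The abstract does not give implementation details, so the following is only a minimal sketch of two steps it names, written in plain Python. It assumes the landmarks follow the 68-point iBUG layout used by dlib's shape predictor (indices 48 to 67 outlining the mouth) and that text is read out of the network with greedy best-path CTC decoding. The function names `mouth_roi` and `ctc_greedy_decode`, the `margin` parameter, and the toy symbol table are hypothetical and are not taken from the paper.

```python
from itertools import groupby
from typing import List, Sequence, Tuple

# Assumption: 68-point iBUG layout, where points 48..67 outline the mouth.
MOUTH_START, MOUTH_END = 48, 68


def mouth_roi(
    landmarks: Sequence[Tuple[int, int]],
    frame_w: int,
    frame_h: int,
    margin: int = 10,
) -> Tuple[int, int, int, int]:
    """Return (x0, y0, x1, y1) around the mouth landmarks, clamped to the frame."""
    if len(landmarks) < MOUTH_END:
        raise ValueError("expected at least 68 (x, y) landmarks")
    mouth = landmarks[MOUTH_START:MOUTH_END]
    xs = [x for x, _ in mouth]
    ys = [y for _, y in mouth]
    x0 = max(min(xs) - margin, 0)
    y0 = max(min(ys) - margin, 0)
    x1 = min(max(xs) + margin, frame_w)
    y1 = min(max(ys) + margin, frame_h)
    return x0, y0, x1, y1


def ctc_greedy_decode(
    frame_scores: Sequence[Sequence[float]],
    symbols: Sequence[str],
    blank: int = 0,
) -> str:
    """Best-path CTC decoding: argmax per frame, merge repeats, drop blanks."""
    # Most likely class index for each video frame.
    best: List[int] = [
        max(range(len(scores)), key=scores.__getitem__) for scores in frame_scores
    ]
    # Merge runs of the same index; a blank between two equal symbols keeps both.
    collapsed = [index for index, _ in groupby(best)]
    return "".join(symbols[i] for i in collapsed if i != blank)
```

In a full pipeline the landmark list would come from a face landmark detector and the per-frame scores from the trained network; here both are ordinary Python sequences so the logic can be checked in isolation. Greedy decoding is the simplest CTC read-out; a beam search could replace it without changing the surrounding steps.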

References

Yannis M. Assael, Brendan Shillingford, Shimon Whiteson and Nando de Freitas, “LipNet: End-to-end sentence-level lipreading”, arXiv:1611.01599, 2016.
Jithin George, Ronan Keane and Conor Zellmer, “Estimating speech from lip dynamics”, arXiv:1708.01198, 2017.
Salma Pathan and Archana Ghotkar, “Recognition of spoken English phrases using visual features extraction and classification”, International Journal of Computer Science and Information Technologies, Vol. 6 (4), pp. 3716-3719, 2015.
Bor-Shing Lin, Yu-Hsien Yao, Ching-Feng Liu, Ching-Feng Lien and Bor-Shyh Lin, “Development of Novel Lip-reading Recognition Algorithm”, IEEE Access, Vol. 5, pp. 794-801, 2017.
Amit Garg, Jonathan Noyola and Sameep Bagadia, “Lip reading using CNN and LSTM”, 2016.
OpenCV documentation, https://www.docs.opencv.org
GRID Corpus Dataset, http://spandh.dcs.shef.ac.uk/gridcorpus/

Index Terms

Computer Science

Information Sciences

Keywords

Computer Vision, Deep Learning, Pattern Recognition