REAL TIME SPEECH DRIVEN FACE ANIMATION SYSTEM

dc.contributor.author Kayani, Izhar us Salam Reg # 36567
dc.contributor.author Asim, Muhammad Ahmer Reg # 36581
dc.contributor.author Abbassi, Hassan Shahab Reg # 36563
dc.contributor.author Bhimani, Rakesh Kumar Reg # 36598
dc.date.accessioned 2020-12-12T00:57:29Z
dc.date.available 2020-12-12T00:57:29Z
dc.date.issued 2018
dc.identifier.uri http://hdl.handle.net/123456789/10444
dc.description Supervised by Asia Samreen en_US
dc.description.abstract We have adopted the paper “LipNet: End-to-End Sentence-Level Lipreading” as the base paper for our Final Year Project. Lip-reading is the task of decoding text from the movement of a speaker’s mouth. Traditional approaches separated the problem into two stages: designing or learning visual features, and prediction. Newer deep lip-reading approaches are end-to-end trainable (Wand et al., 2016; Chung & Zisserman, 2016a). However, existing models trained end-to-end perform only word classification, rather than sentence-level sequence prediction. Studies have shown that human lip-reading performance improves for longer words (Easton & Basala, 1982), indicating the importance of features that capture temporal context in an ambiguous communication channel. Motivated by this observation, our project presents a model that maps a sequence of video frames to text, making use of spatiotemporal convolutions, a recurrent neural network, and the connectionist temporal classification (CTC) loss, trained entirely end-to-end. The result is an end-to-end sentence-level lip-reading model that simultaneously learns spatiotemporal visual features and a sequence model (an illustrative code sketch of this architecture follows the record below). en_US
dc.language.iso en_US en_US
dc.publisher Bahria University Karachi Campus en_US
dc.relation.ispartofseries BS CS;MFN BSCS 109
dc.title REAL TIME SPEECH DRIVEN FACE ANIMATION SYSTEM en_US
dc.type Thesis en_US
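
The abstract above describes an architecture built from spatiotemporal convolutions, a recurrent network, and the CTC loss. The following is a minimal, illustrative sketch of such a model in PyTorch; the class name LipReadingNet, the layer sizes, and the vocabulary size are assumptions chosen for illustration, not code taken from the thesis or the LipNet paper.

import torch
import torch.nn as nn

class LipReadingNet(nn.Module):
    def __init__(self, vocab_size=28):  # e.g. 26 letters + space + CTC blank (assumed)
        super().__init__()
        # Spatiotemporal (3D) convolutions extract visual features from
        # the stack of video frames of the speaker's mouth region.
        self.conv = nn.Sequential(
            nn.Conv3d(3, 32, kernel_size=(3, 5, 5), padding=(1, 2, 2)),
            nn.ReLU(),
            nn.MaxPool3d((1, 2, 2)),  # pool spatially, keep the time axis
            nn.Conv3d(32, 64, kernel_size=(3, 5, 5), padding=(1, 2, 2)),
            nn.ReLU(),
            nn.MaxPool3d((1, 2, 2)),
        )
        # A recurrent network models temporal context across frames.
        self.gru = nn.GRU(64, 128, bidirectional=True, batch_first=True)
        self.fc = nn.Linear(256, vocab_size)

    def forward(self, x):
        # x: (batch, channels=3, time, height, width)
        f = self.conv(x)               # (batch, 64, time, h', w')
        f = f.mean(dim=(3, 4))         # average over the spatial dimensions
        f = f.permute(0, 2, 1)         # (batch, time, 64)
        out, _ = self.gru(f)           # (batch, time, 256)
        return self.fc(out).log_softmax(-1)  # per-frame character log-probabilities

# The CTC loss aligns per-frame predictions with the target sentence
# without requiring frame-level labels. Shapes below are dummy values.
model = LipReadingNet()
frames = torch.randn(2, 3, 75, 50, 100)       # 2 clips of 75 mouth-crop frames
log_probs = model(frames).permute(1, 0, 2)    # CTCLoss expects (time, batch, classes)
targets = torch.randint(1, 28, (2, 20))       # dummy character-index targets
loss = nn.CTCLoss(blank=0)(log_probs, targets,
                           input_lengths=torch.full((2,), 75),
                           target_lengths=torch.full((2,), 20))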

