DSpace Repository

Classification of Synthetic Acoustic Data

dc.contributor.author Mudassir Ahmed Khan, 01-249222-015
dc.date.accessioned 2025-02-21T06:45:33Z
dc.date.available 2025-02-21T06:45:33Z
dc.date.issued 2024
dc.identifier.uri http://hdl.handle.net/123456789/19122
dc.description Supervised by Dr. Sumaira Kausar en_US
dc.description.abstract Speech synthesizers can produce extremely accurate sounds that mimic a person’s voice. This capability can be exploited to make false audio recordings, making it impossible to tell real communications from fake ones. Modern speech synthesis technologies, such as deep learning-based models, can produce synthetic speech that is virtually identical to the human voice. These techniques can reproduce not only the tone and rhythm of a voice but also sophisticated speech patterns, making detection difficult for both human listeners and automated systems. The continuous growth of speech synthesis technologies raises another challenge: the available datasets are outdated and limited. Older datasets might not accurately reflect the state of synthesized speech today, which reduces their diversity and applicability. To address this, we extended the dataset by combining the ASVspoof 2019 dataset with 3,000 synthesized audio samples that we generated using the ElevenLabs API. We also developed a classifier to distinguish real from synthesized audio. We used Mel spectrograms as the input to our model, which shows promising results, achieving 99.5% accuracy on our dataset. We also compared the results with previously available models for synthesized speech classification, and our model achieves the lowest equal error rate (EER), 1.02%. These findings highlight the potential of our technique to advance the field of synthetic speech classification. Our model improves detection accuracy while also providing a framework for future study into constructing more resilient defenses against increasingly advanced synthetic speech generation technologies. The study’s findings contribute to a larger effort to protect communication authenticity and improve security in the digital age. en_US
dc.language.iso en en_US
dc.publisher Computer Sciences en_US
dc.relation.ispartofseries MS (DS);T-957
dc.subject Classification en_US
dc.subject Synthetic en_US
dc.subject Acoustic Data en_US
dc.title Classification of Synthetic Acoustic Data en_US
dc.type MS Thesis en_US