DSpace Repository

UDOC QA-Urdu Document-Based Question Answering

dc.contributor.author Misbah Zafar, 01-249212-006
dc.date.accessioned 2023-12-18T10:43:17Z
dc.date.available 2023-12-18T10:43:17Z
dc.date.issued 2023
dc.identifier.uri http://hdl.handle.net/123456789/16831
dc.description Supervised by Dr. Arif ur Rahman en_US
dc.description.abstract In today’s data-driven world, Document AI and Machine Reading Comprehension (MRC) have emerged as pivotal technologies with profound implications. This abstract explores their significant impact and the compelling reasons for their necessity in contemporary applications. It delves into the historical context, emphasizing the reliance on specific models and techniques, particularly in low-resource languages like Urdu, which have been relatively uncharted territory in the realm of question answering. Traditionally, the field of document AI and MRC predominantly relied on state-of-the-art models and techniques, often leaving low-resource languages underrepresented and underserved. In response to this gap, our research initiative sought to address the challenges faced in Urdu question answering. To this end, we embarked on the creation of a dedicated Urdu dataset by translating the well-established MLQA dataset. Our study introduces two distinct methodologies tailored to enhance Urdu question answering performance. The first methodology involves feature extraction combined with a predictive model, while the second method focuses on fine-tuning state-of-the-art models on our newly crafted dataset. Through a rigorous comparative analysis, we aimed to discern which approach yields superior results in the context of Urdu question answering. The findings of our research indicate that the feature extraction methodology surpasses the fine-tuning of state-of-the-art models when applied to our Urdu dataset. This conclusion highlights the potential for innovative techniques in low-resource language applications within the domain of Document AI and MRC, showcasing the significance of such endeavors in bridging linguistic and technological gaps. en_US
dc.language.iso en en_US
dc.publisher Computer Sciences en_US
dc.relation.ispartofseries MS (DS);T-1107
dc.subject UDOC en_US
dc.subject Urdu Document-Based en_US
dc.subject Question Answering en_US
dc.title UDOC QA-Urdu Document-Based Question Answering en_US
dc.type Thesis en_US

