DSpace Repository

Extraction Of User Defined Information from PDF


dc.contributor.author 03-134202-102 Rohaan Nadeem, 03-134211-033 Noor Fatima
dc.date.accessioned 2026-04-22T08:24:49Z
dc.date.available 2026-04-22T08:24:49Z
dc.date.issued 2025-01-01
dc.identifier.uri http://hdl.handle.net/123456789/21068
dc.description Mr. Tahir Iqbal en_US
dc.description.abstract Problem and Significance: Organizing and using the information contained in PDF files is difficult because such documents vary widely in structure. As shown in Figure 3.3.1, the adopted methodology takes a systematic approach to these challenges [Polak & Morgan, 2024]. Large documents are increasingly available in PDF format, yet there is no efficient mechanism for uploading them and querying their contents automatically, without supervised intervention, to extract user-specified information; this project responds to that need. Its relevance lies in offering an intelligent and efficient way to handle research materials in PDF format and retrieve the desired information from them, thereby improving research analysis and information processing across numerous disciplines. Method/Tool/Technology and Solution: The proposed system derives text data from uploaded PDFs and applies NLP techniques powered by LLMs to automate the extraction of pertinent data. Text is extracted from PDFs with PyMuPDF, meta-analyses are produced with Matplotlib, large queries are chunked, and user queries are analyzed by an LLM, specifically through the ChatGPT API. The front end is built with React, while the back end is implemented with Express and Python. The system stores file paths, chats, and user data, and the whole application runs in a secure, user-friendly environment. The solution saves users time and effort when searching for specific data in PDFs and makes the search process more efficient. en_US
dc.language.iso en_US en_US
dc.relation.ispartofseries ;BULC1346
dc.title Extraction Of User Defined Information from PDF en_US
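
Illustrative sketch (not part of the deposited item): the abstract names PyMuPDF text extraction and chunking as steps that precede the LLM query. A minimal Python sketch of those two steps might look like the following. The function names, the character-based chunk size, and the overlap are assumptions made for illustration only; the record does not state how the project actually splits text or which limits it uses, and the ChatGPT API call itself is left out.

```python
from typing import List


def extract_pdf_text(path: str) -> str:
    """Return the plain text of every page of the PDF at `path`."""
    # PyMuPDF is imported lazily so chunk_text() can be used without it installed.
    import fitz  # PyMuPDF

    with fitz.open(path) as doc:
        return "\n".join(page.get_text() for page in doc)


def chunk_text(text: str, max_chars: int = 2000, overlap: int = 200) -> List[str]:
    """Split `text` into pieces of at most `max_chars` characters.

    Consecutive pieces share `overlap` characters so that a sentence cut at
    a boundary still appears whole in one of the two neighbouring chunks.
    Both defaults are hypothetical values, not taken from the project.
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    if not 0 <= overlap < max_chars:
        raise ValueError("overlap must satisfy 0 <= overlap < max_chars")

    chunks: List[str] = []
    step = max_chars - overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + max_chars])
        if start + max_chars >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks
```

Under these assumptions, each chunk returned by `chunk_text(extract_pdf_text("paper.pdf"))` could be sent to the LLM together with the user's question, where `paper.pdf` is a placeholder file name.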

