Entities, Terms and Phrases for Recovering Dead Broken Links (T-0735) (MFN 6879)

dc.contributor.author Saeed ur Rehman, 01-241161-013
dc.date.accessioned 2018-08-29T07:18:08Z
dc.date.available 2018-08-29T07:18:08Z
dc.date.issued 2018
dc.identifier.uri http://hdl.handle.net/123456789/7365
dc.description Supervised by Dr. Muhammad Asfand e Yar en_US
dc.description.abstract The Web is dynamic, and many changes occur daily. New Web sites, together with updates, deletions, and modifications of existing pages, make it more dynamic still. These rapid changes in Web pages increase the number of broken links, which users encounter when they search for required information. The research community has presented several approaches to solve this problem. The approaches, discussed in Chapter 2, use various sources to recover broken links, including the source page, the anchor text, the URL, and the text surrounding the anchor text. In this thesis, the research study is based on data mining approaches to identify the most suitable query terms for broken links. Previous research indicates that not all the words/terms in a page's text are useful, and that source page text can be efficient enough to retrieve the broken page. During the research work it was observed that not only single terms but also phrases and entities in the page text increase the chance of recovering broken links. Therefore, to test the approach, terms, phrases, and entities are first extracted from the source pages and classified as good or bad query terms. These queries are submitted manually to the search engine Google. Furthermore, data mining approaches are applied to classify the extracted query candidates into two classes (i.e., good and bad). The results reveal that among the three sources of queries (terms, phrases, and entities), only phrases proved to be a good source of queries compared to the others. en_US
dc.language.iso en en_US
dc.publisher Software Engineering, Bahria University Engineering School Islamabad en_US
dc.relation.ispartofseries MS SE;T-0735
dc.subject Software Engineering en_US
dc.title Entities, Terms and Phrases for Recovering Dead Broken Links (T-0735) (MFN 6879) en_US
dc.type MS Thesis en_US
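
The abstract describes extracting terms, phrases, and entities from a source page as candidate queries. The following is only a minimal sketch of what such an extraction step could look like, not the thesis's actual method: the stopword list, the function names, the n-gram phrase rule, and the capitalized-word entity heuristic are all assumptions made for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not taken from the thesis).
STOPWORDS = {"a", "an", "and", "the", "of", "in", "on", "to", "for", "is", "are", "with"}

def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def candidate_terms(text, top_k=5):
    """Return the most frequent non-stopword single terms."""
    counts = Counter(t for t in tokenize(text) if t not in STOPWORDS)
    return [term for term, _ in counts.most_common(top_k)]

def candidate_phrases(text, n=2):
    """Return n-word phrases made only of consecutive non-stopword tokens."""
    tokens = tokenize(text)
    phrases = []
    for i in range(len(tokens) - n + 1):
        window = tokens[i:i + n]
        if all(t not in STOPWORDS for t in window):
            phrases.append(" ".join(window))
    return phrases

def candidate_entities(text):
    """Naive entity heuristic (hypothetical): runs of two or more capitalized words."""
    return re.findall(r"\b(?:[A-Z][a-z]+\s)+[A-Z][a-z]+\b", text)

if __name__ == "__main__":
    page_text = "Broken links on the Web: recovering broken links with anchor text"
    print(candidate_terms(page_text, top_k=2))
    print(candidate_phrases(page_text))
    print(candidate_entities("Thesis from Bahria University Engineering School in Islamabad"))
```

Each candidate produced this way would still need to be submitted to a search engine and labeled as a good or bad query before any classifier could be trained on it, as the abstract outlines.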

