Producing efficient retrievability ranks of documents using normalized retrievability scoring function

dc.contributor.author Shariq Bashir
dc.contributor.author Akmal Saeed Khattak
dc.date.accessioned 2018-01-04T14:15:42Z
dc.date.available 2018-01-04T14:15:42Z
dc.date.issued 2014
dc.identifier.uri http://hdl.handle.net/123456789/5237
dc.description.abstract In this paper, we perform a number of experiments with large-scale queries to analyze the retrieval bias of standard retrieval models. These experiments analyze how much different retrieval models differ in the retrieval bias that they impose on the collection. Along with the retrieval bias analysis, we also examine a limitation of the standard retrievability scoring function and propose a normalized retrievability scoring function. The results of the retrieval bias experiments show that when a collection has a highly skewed distribution, the standard retrievability scoring function does not take into account the differences in vocabulary richness across the documents of the collection. In such a case, documents with a large vocabulary produce many more queries, and such documents thus have a theoretically higher probability of being retrieved via a much larger number of queries. We therefore propose a normalized retrievability scoring function that tries to mitigate this effect by normalizing the retrievability score of each document relative to its total number of queries. This provides an unbiased representation of the retrieval bias that could occur due to vocabulary differences between the documents of the collection, without automatically penalizing retrieval models that favor or disfavor long documents. Finally, to examine which retrievability scoring function is more effective at correctly producing the retrievability ranks of documents, we compare the two functions using the known-item search method. Experiments on known-item search show that normalized retrievability en_US
dc.language.iso en en_US
dc.publisher Bahria University Islamabad Campus en_US
dc.subject Department of Computer Science CS en_US
dc.title Producing efficient retrievability ranks of documents using normalized retrievability scoring function en_US
dc.type Article en_US
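
The normalization described in the abstract can be illustrated with a minimal sketch. The abstract does not give the scoring formulas, so everything here is an assumption: the standard score is taken to be the cumulative form (the number of queries that return a document within a rank cutoff), and the normalized score divides that count by the number of queries generated from the document's own vocabulary. The function names, the `cutoff` parameter, and the toy data are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def retrievability(results, cutoff):
    """Assumed standard score: count the queries that return each
    document within the top `cutoff` ranks.

    results: dict mapping a query string to its ranked list of doc ids.
    """
    scores = defaultdict(int)
    for ranked_docs in results.values():
        for doc_id in ranked_docs[:cutoff]:
            scores[doc_id] += 1
    return dict(scores)


def normalized_retrievability(results, queries_per_doc, cutoff):
    """Assumed normalized score: the standard score divided by the number
    of queries generated from each document (hypothetical input).

    queries_per_doc: dict mapping a doc id to its total number of queries.
    """
    raw = retrievability(results, cutoff)
    return {
        doc_id: raw.get(doc_id, 0) / n_queries
        for doc_id, n_queries in queries_per_doc.items()
        if n_queries > 0  # skip documents that produced no queries
    }


# Toy data: document "A" has a rich vocabulary and produced four queries,
# while "B" and "C" produced one query each.
results = {
    "q1": ["A", "B"],
    "q2": ["A", "C"],
    "q3": ["A", "B"],
    "q4": ["B", "A"],
}
queries_per_doc = {"A": 4, "B": 1, "C": 1}

raw_scores = retrievability(results, cutoff=1)
norm_scores = normalized_retrievability(results, queries_per_doc, cutoff=1)
```

In this toy data, "A" has the highest raw score simply because it produced the most queries. After normalization, "B" ranks above "A", which is the kind of vocabulary-size effect the abstract says the normalized function tries to mitigate.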

