Abstract:
The amount of data on the World Wide Web is larger than ever before and is growing rapidly every day. A huge amount of data has been duplicated on the web since 2010. This massive volume of data is known as big data. Systems that retrieve information in response to user queries are known as search engines or search systems.
Users are mainly interested in the top results. With advances in information technology, users' queries have also become more complicated. The end user is principally concerned with getting the most relevant answers to a query, rather than with discovering a huge number of weakly or partially relevant answers. Such search queries are called top-k queries, in which the user is interested in obtaining the k best answers. To handle these queries, the information retrieval domain offers algorithms known as top-k algorithms or ranking algorithms. A top-k algorithm applies a join procedure to the data and obtains the k most relevant results without exploring the entire data set. The algorithm maintains a global threshold that is updated with every iteration over the data. Join results collected after each iteration are stored in an output buffer. Join results whose score is greater than or equal to the threshold value are moved to the final output list. If k results have been reported, the algorithm stops; otherwise, the procedure is repeated until the top-k results are complete.
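The threshold-based procedure described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes two inputs already sorted by descending score, a join on equal keys, and a combined score equal to the sum of the two input scores. The function name top_k_join and the sample data are hypothetical.

```python
import heapq


def top_k_join(left, right, k):
    """Illustrative threshold-based top-k join (assumed setup, see lead-in).

    left, right: lists of (key, score) sorted by score in descending order.
    Returns up to k (key, combined_score) pairs, best first.
    """
    if k <= 0 or not left or not right:
        return []
    inputs = (left, right)
    seen = ({}, {})                     # join key -> scores read so far, per input
    pos = [0, 0]                        # next unread position in each input
    first = [left[0][1], right[0][1]]   # best score of each input
    last = [left[0][1], right[0][1]]    # most recently read score of each input
    buffer = []                         # output buffer: max-heap via negated scores
    output = []                         # final output list
    turn = 0
    while len(output) < k:
        # Alternate between the inputs, skipping one that is exhausted.
        if pos[turn] >= len(inputs[turn]):
            turn = 1 - turn
            if pos[turn] >= len(inputs[turn]):
                break
        key, score = inputs[turn][pos[turn]]
        pos[turn] += 1
        last[turn] = score
        seen[turn].setdefault(key, []).append(score)
        # Join the new tuple with matching tuples already read from the other input.
        for other_score in seen[1 - turn].get(key, []):
            heapq.heappush(buffer, (-(score + other_score), key))
        # Global threshold: upper bound on the score of any join result
        # that involves at least one tuple not read yet.
        threshold = max(first[0] + last[1], last[0] + first[1])
        # Move buffered results that reach the threshold to the final output.
        while buffer and len(output) < k and -buffer[0][0] >= threshold:
            neg_score, out_key = heapq.heappop(buffer)
            output.append((out_key, -neg_score))
        turn = 1 - turn
    # Both inputs are exhausted: the remaining buffered results are final.
    while buffer and len(output) < k:
        neg_score, out_key = heapq.heappop(buffer)
        output.append((out_key, -neg_score))
    return output


# Hypothetical sample data with integer scores.
left = [("a", 90), ("b", 80), ("c", 30)]
right = [("b", 95), ("a", 60), ("c", 50)]
print(top_k_join(left, right, 2))  # [('b', 175), ('a', 150)]
```

In this example the two best results are reported after five of the six input tuples have been read, which illustrates how the threshold lets the algorithm stop without exploring the whole data set.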
When data is fetched through web services, every service call takes a non-negligible amount of time, and with each successive fetch the scores of the retrieved values may gradually decrease. As a result, join results stored in the output buffer are presented with a time delay. Our proposed study addresses the problem of efficiently reporting join results on the basis of probability. We compute this probability under the assumption that the data is normally distributed.
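As a rough illustration of probability-based reporting, the following sketch estimates how likely it is that a buffered join result will not be exceeded by a result that has not been fetched yet. The abstract does not specify the exact formulation, so everything here is an assumption: unseen join scores are modeled as a normal distribution with a known mean and standard deviation, and the function names and the 0.95 confidence level are hypothetical.

```python
from statistics import NormalDist


def prob_not_exceeded(score, mean, stdev):
    """P(X <= score) for X ~ N(mean, stdev^2).

    Assumption: X models the score of a join result not fetched yet.
    """
    return NormalDist(mean, stdev).cdf(score)


def can_report_early(score, mean, stdev, confidence=0.95):
    """Report a buffered result early if it is unlikely to be exceeded.

    The confidence level is a hypothetical tuning parameter.
    """
    return prob_not_exceeded(score, mean, stdev) >= confidence


# Hypothetical values: unseen scores assumed to follow N(100, 30^2).
print(can_report_early(175, mean=100, stdev=30))  # True
print(can_report_early(120, mean=100, stdev=30))  # False
```

Under these assumptions, a result scoring 2.5 standard deviations above the mean could be reported without waiting for further service calls, while a result closer to the mean would stay in the output buffer.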