Abstract:
A large and rapidly growing volume of distributed documents is available on the web, which makes it difficult to find informative documents related to a user query. Because of the high demand for information retrieval (IR) systems, this field is an emerging research area, and there is a need for efficient programs that can extract useful knowledge from distributed documents. Applying traditional information retrieval techniques to a large dataset can take a long time because these techniques scan the whole dataset. Recently proposed cluster-based information retrieval techniques are much faster than traditional information retrieval systems, but the quality of the documents they retrieve is lower. This work addresses the problem of answering a user query by selecting related information from clustered document data, and proposes a cluster-based IR system for this purpose. A preprocessing step is performed to analyze the documents and extract similar patterns in the dataset, and the resulting dataset is represented with the Vector Space Model (VSM). The work is carried out in two phases. In the first phase, a process is designed to group the documents into clusters. In the second phase, an information retrieval process is run for the user query on the ranked clusters produced in the first phase. The results are then evaluated with the classical precision and recall measures, giving precision at 5 (P@5) of 0.690 and precision at 10 (P@10) of 0.680.
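The two-phase pipeline described above can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the clustering algorithm, so a simple single-pass "leader" clustering is assumed here, along with a smoothed TF-IDF weighting for the VSM. All function names, the similarity threshold, and the toy documents are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

def tokenize(text):
    # Minimal preprocessing (assumption): lowercase, keep alphabetic tokens only.
    return [w for w in text.lower().split() if w.isalpha()]

def tfidf_vectors(docs):
    # Vector Space Model: one sparse TF-IDF vector (a dict) per document.
    tokenized = [tokenize(d) for d in docs]
    df = Counter(t for toks in tokenized for t in set(toks))
    n = len(docs)
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [{t: tf * idf[t] for t, tf in Counter(toks).items()}
               for toks in tokenized]
    return vectors, idf

def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def centroid(vectors):
    total = Counter()
    for v in vectors:
        for t, w in v.items():
            total[t] += w
    return {t: w / len(vectors) for t, w in total.items()}

def leader_cluster(vectors, threshold=0.2):
    # Phase 1 (assumed algorithm): add each document to the most similar
    # existing cluster, or start a new cluster if none reaches the threshold.
    clusters = []
    for i, v in enumerate(vectors):
        best, best_sim = None, threshold
        for c in clusters:
            sim = cosine(v, centroid([vectors[j] for j in c]))
            if sim >= best_sim:
                best, best_sim = c, sim
        if best is None:
            clusters.append([i])
        else:
            best.append(i)
    return clusters

def search(query, vectors, idf, clusters, top_clusters=1, k=5):
    # Phase 2: rank clusters by centroid similarity to the query, then rank
    # only the documents inside the best clusters instead of the whole dataset.
    q = {t: tf * idf[t] for t, tf in Counter(tokenize(query)).items() if t in idf}
    ranked = sorted(clusters,
                    key=lambda c: cosine(q, centroid([vectors[j] for j in c])),
                    reverse=True)
    candidates = [j for c in ranked[:top_clusters] for j in c]
    return sorted(candidates, key=lambda j: cosine(q, vectors[j]), reverse=True)[:k]

def precision_at_k(retrieved, relevant, k):
    # P@k: fraction of the top-k retrieved documents that are relevant.
    return sum(1 for d in retrieved[:k] if d in relevant) / k

# Toy usage with made-up documents (not data from this work).
docs = [
    "cats and dogs are pets",
    "dogs chase cats",
    "stock markets fell today",
    "investors sold stock as markets fell",
]
vectors, idf = tfidf_vectors(docs)
clusters = leader_cluster(vectors)
hits = search("stock markets", vectors, idf, clusters, top_clusters=1, k=2)
print(clusters, hits, precision_at_k(hits, {2, 3}, 2))
```

Restricting the second phase to the top-ranked clusters is what gives the speed advantage over scanning the full dataset. It is also where retrieval quality can drop, because relevant documents placed in lower-ranked clusters are never scored.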