Cluster-based collection selection for information retrieval
Details
The focus of this research is collection selection for distributed information retrieval. The collection descriptions needed to select the most relevant collections are often created from information gathered by random sampling. Collection selection based on an incomplete index built from random samples, rather than on a full index, leads to inferior results. We propose using collection clustering to compensate for the incompleteness of the indexes. With collection clustering, we select not only the collections considered relevant based on their collection descriptions, but also collections with similar content in their indexes. We describe a new clustering algorithm that lets us specify the sizes of the produced clusters instead of the number of clusters. Our experiments show that collection clustering can indeed improve the performance of distributed information retrieval systems that use random sampling. There is little difference in retrieval performance between our clustering algorithm and the well-known k-means algorithm. We suggest using our proposed algorithm because it is more scalable.
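The selection step described above can be illustrated with a minimal sketch. It is not the author's implementation: the function name `expand_selection`, the collection labels, and the cluster assignments are all hypothetical, and the clustering itself (by the proposed algorithm or by k-means) is assumed to have been done already. The sketch only shows how a selection based on collection descriptions might be widened to include collections from the same clusters.

```python
from collections import defaultdict


def expand_selection(selected, cluster_of):
    """Return the selected collections plus all collections sharing a cluster.

    selected   -- collections chosen from their (sampled) descriptions, in rank order
    cluster_of -- hypothetical mapping from collection name to cluster id
    """
    # Group collections by cluster id.
    members = defaultdict(list)
    for collection, cluster in cluster_of.items():
        members[cluster].append(collection)

    expanded = list(selected)
    seen = set(selected)
    for collection in selected:
        if collection not in cluster_of:
            continue  # unclustered collection: nothing to add
        for mate in members[cluster_of[collection]]:
            if mate not in seen:
                seen.add(mate)
                expanded.append(mate)
    return expanded


# Illustrative data only: five collections in three clusters.
cluster_of = {"A": 0, "B": 0, "C": 1, "D": 1, "E": 2}
selected = ["A", "E"]
result = expand_selection(selected, cluster_of)
print(result)
```

Here collection B is added because it shares a cluster with the selected collection A, even though its own sampled description did not rank it as relevant.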
About the author
Bertold van Voorst, MSc. studied computer science at Twente University, Netherlands, where he specialized in the field of Information Retrieval.
Further information
- General information
- Language English
- Publisher LAP LAMBERT Academic Publishing
- Weight 143g
- Author Bertold van Voorst
- Title Cluster-based collection selection for information retrieval
- Publication date 11 March 2011
- ISBN 3844318852
- Format Paperback
- EAN 9783844318852
- Year 2011
- Dimensions H220mm x W150mm x D6mm
- Number of pages 84
- GTIN 09783844318852