XML: Study of Controlling Overlap
The direct application of standard ranking techniques to retrieve individual elements from a collection of XML documents often produces a result set in which the top ranks are dominated by a large number of elements taken from a small number of highly relevant documents. This paper presents and evaluates an algorithm that re-ranks this result set, with the aim of minimizing redundant content while preserving the benefits of element retrieval, including the benefit of identifying topic-focused components contained within relevant documents. The test collection developed by the Initiative for the Evaluation of XML Retrieval (INEX) forms the basis for the evaluation.
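
To make the idea of overlap-aware re-ranking concrete, the following is a minimal sketch and not the algorithm evaluated in the paper, whose details are not given in this abstract. It assumes that each retrieved element carries a document identifier, an XPath-like location, and an initial score, and that two elements overlap when one contains the other. The names `Element`, `overlaps`, `rerank`, and the `penalty` parameter are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    doc: str      # document identifier (assumed field)
    path: str     # XPath-like location, e.g. "/article/sec[1]/p[2]" (assumed field)
    score: float  # initial retrieval score (assumed field)


def overlaps(a: Element, b: Element) -> bool:
    """Return True if one element contains the other (or they are the same)."""
    if a.doc != b.doc:
        return False
    # A trailing slash keeps "sec[1]" from matching as a prefix of "sec[10]".
    pa = a.path.rstrip("/") + "/"
    pb = b.path.rstrip("/") + "/"
    return pa.startswith(pb) or pb.startswith(pa)


def rerank(elements, penalty=0.5):
    """Greedy re-ranking: repeatedly pick the element with the best adjusted score.

    An element's score is multiplied by `penalty` once for every already
    ranked element it overlaps. This is an illustrative heuristic only.
    """
    remaining = list(elements)
    ranked = []
    while remaining:
        def adjusted(e):
            n = sum(overlaps(e, r) for r in ranked)
            return e.score * (penalty ** n)

        best = max(remaining, key=adjusted)
        remaining.remove(best)
        ranked.append(best)
    return ranked


# Hypothetical result set: three nested elements from d1, one element from d2.
results = [
    Element("d1", "/article", 0.9),
    Element("d1", "/article/sec[1]", 0.8),
    Element("d2", "/article/sec[2]", 0.7),
    Element("d1", "/article/sec[1]/p[1]", 0.75),
]
reranked = rerank(results)
```

In this toy input, the element from `d2` moves ahead of the nested elements from `d1`, because each of those is penalized for overlapping the already ranked `/article` element. Nested elements are demoted rather than discarded, so a focused component can still surface lower in the list.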