Package it.unimi.di.big.mg4j.search.visitor
Visitors for composite document iterators.
Composites and visitors
A DocumentIterator (in particular, those provided by MG4J in the package it.unimi.di.big.mg4j.search) is usually structured as a composite, with operators as internal nodes and IndexIterators as leaves. A composite can be explored using a visitor: thus, the DocumentIterator interface provides two methods, accept(DocumentIteratorVisitor) and acceptOnTruePaths(DocumentIteratorVisitor), that let a DocumentIteratorVisitor visit the composite structure.
A DocumentIteratorVisitor provides methods for visiting all internal nodes in preorder and in postorder. Leaves have two visit methods, DocumentIteratorVisitor.visit(it.unimi.di.big.mg4j.index.IndexIterator) and DocumentIteratorVisitor.visit(it.unimi.di.big.mg4j.index.MultiTermIndexIterator).
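The composite/visitor arrangement described above can be sketched in a few lines. This is a minimal, self-contained model and not MG4J code: QueryNode, TermLeaf, Operator, NodeVisitor and RenderingVisitor are hypothetical stand-ins, and the convention that a null result interrupts the visit is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for DocumentIteratorVisitor (not the MG4J interface).
interface NodeVisitor<T> {
    boolean visitPre(Operator node);              // before the children; false stops the visit
    T visitPost(Operator node, List<T> results);  // after the children, with their results
    T visit(TermLeaf leaf);                       // on a leaf
}

// Hypothetical stand-in for DocumentIterator.
interface QueryNode {
    <T> T accept(NodeVisitor<T> visitor);
}

// A leaf of the composite: a single term.
final class TermLeaf implements QueryNode {
    final String term;
    TermLeaf(String term) { this.term = term; }
    @Override public <T> T accept(NodeVisitor<T> visitor) { return visitor.visit(this); }
}

// An internal node of the composite: an operator applied to subqueries.
final class Operator implements QueryNode {
    final String name;
    final List<QueryNode> children;
    Operator(String name, QueryNode... children) {
        this.name = name;
        this.children = Arrays.asList(children);
    }
    @Override public <T> T accept(NodeVisitor<T> visitor) {
        if (!visitor.visitPre(this)) return null;   // visit interrupted (assumption of this sketch)
        List<T> results = new ArrayList<>();
        for (QueryNode child : children) {
            T result = child.accept(visitor);
            if (result == null) return null;        // propagate the interruption
            results.add(result);
        }
        return visitor.visitPost(this, results);
    }
}

// Example visitor: renders the composite as a string.
final class RenderingVisitor implements NodeVisitor<String> {
    @Override public boolean visitPre(Operator node) { return true; }
    @Override public String visitPost(Operator node, List<String> results) {
        return node.name + "(" + String.join(", ", results) + ")";
    }
    @Override public String visit(TermLeaf leaf) { return leaf.term; }
}
```

With these definitions, new Operator("AND", new TermLeaf("a"), new Operator("OR", new TermLeaf("b"), new TermLeaf("c"))).accept(new RenderingVisitor()) yields the string AND(a, OR(b, c)): leaves are rendered by visit, and each operator is assembled in the postorder visit from the results of its children.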
Note that a DocumentIteratorVisitor must be (re)usable after each call to prepare().
The abstract class AbstractDocumentIteratorVisitor provides stubs implementing internal visits and prepare() as no-ops for visitors that do not return values.
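The role of no-op stubs and of prepare() can be illustrated with a small adapter. The types below (SimpleVisitor, SimpleVisitorAdapter, LeafCollector) are hypothetical and deliberately simpler than the MG4J ones: nodes are represented by plain strings, and having prepare() return the visitor itself is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified visitor interface (not the MG4J one): visits return nothing.
interface SimpleVisitor {
    SimpleVisitor prepare();            // resets internal state so the visitor can be reused
    boolean visitPre(String operator);  // before the children of an internal node
    void visitPost(String operator);    // after the children of an internal node
    void visit(String term);            // on a leaf
}

// Adapter in the spirit of AbstractDocumentIteratorVisitor: internal visits and
// prepare() are no-ops, so subclasses override only what they need.
abstract class SimpleVisitorAdapter implements SimpleVisitor {
    @Override public SimpleVisitor prepare() { return this; }
    @Override public boolean visitPre(String operator) { return true; }
    @Override public void visitPost(String operator) {}
}

// A stateful visitor: it must clear its state in prepare() to remain reusable.
final class LeafCollector extends SimpleVisitorAdapter {
    final List<String> terms = new ArrayList<>();
    @Override public SimpleVisitor prepare() { terms.clear(); return this; }
    @Override public void visit(String term) { terms.add(term); }
}
```

A LeafCollector that visits the leaves a and b holds [a, b]; after another call to prepare() it starts again from an empty list, which is what makes the same instance safe to reuse across visits.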
Computing true terms
A simple example of a visitor is TrueTermsCollectionVisitor, which collects all terms that make a query true for the current document.
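To make the idea of collecting true terms concrete, here is a sketch of a visit restricted to true paths. It assumes that a true path passes only through subformulas that hold for the current document; Formula, TermFormula, BooleanFormula and TrueTerms are hypothetical names, and documents are modelled as plain integers rather than through real iterators.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Consumer;

// Hypothetical model of a query subformula (not an MG4J type).
abstract class Formula {
    abstract boolean isTrueFor(int document);
    // Visits leaves, descending only through subformulas that are true for the document.
    abstract void acceptOnTruePaths(int document, Consumer<String> leafVisitor);
}

final class TermFormula extends Formula {
    final String term;
    final Set<Integer> documents;   // documents containing the term
    TermFormula(String term, Integer... documents) {
        this.term = term;
        this.documents = new HashSet<>(Arrays.asList(documents));
    }
    @Override boolean isTrueFor(int document) { return documents.contains(document); }
    @Override void acceptOnTruePaths(int document, Consumer<String> leafVisitor) {
        if (isTrueFor(document)) leafVisitor.accept(term);
    }
}

final class BooleanFormula extends Formula {
    final boolean conjunction;      // true for AND, false for OR
    final List<Formula> children;
    BooleanFormula(boolean conjunction, Formula... children) {
        this.conjunction = conjunction;
        this.children = Arrays.asList(children);
    }
    @Override boolean isTrueFor(int document) {
        return conjunction
            ? children.stream().allMatch(c -> c.isTrueFor(document))
            : children.stream().anyMatch(c -> c.isTrueFor(document));
    }
    @Override void acceptOnTruePaths(int document, Consumer<String> leafVisitor) {
        if (!isTrueFor(document)) return;   // not on a true path: prune the whole subtree
        for (Formula child : children) child.acceptOnTruePaths(document, leafVisitor);
    }
}

final class TrueTerms {
    // Collects, in sorted order, the terms reached along true paths.
    static Set<String> collect(Formula query, int document) {
        Set<String> terms = new TreeSet<>();
        query.acceptOnTruePaths(document, terms::add);
        return terms;
    }
}
```

For the query (a AND b) OR c, with a in documents 1 and 2, b in document 1 and c in documents 2 and 3, the collected terms are [a, b] for document 1 and [c] for document 2. The term a occurs in document 2 but is not collected there, because the conjunction containing it is false and so a does not lie on a true path.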
Counting term occurrences
Another example of the utility of visitors for document iterators is given by term counting: using a number of coordinated visitors, it is possible to compute a count for each term appearing in a query, no matter how complex. The count can be used as an input for counting-based scoring schemes, such as BM25 or cosine-based measures. For more information, please read the documentation of CounterCollectionVisitor.
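The coordination between visitors can be pictured as three phases: collect the distinct terms of the query, set up one counter per term, then fill the counters for the current document. The sketch below only models that division of labour and is not the MG4J implementation; CountedTerm and TermCounting are hypothetical names, the leaves are given as a flat list, and summing the counts of a term that appears more than once in the query is a choice made for this sketch.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical leaf: a query term with its occurrence count in the current document.
final class CountedTerm {
    final String term;
    final int count;
    CountedTerm(String term, int count) { this.term = term; this.count = count; }
}

final class TermCounting {
    // Phase 1: assign each distinct term a position, in order of first appearance.
    static Map<String, Integer> collectTerms(List<CountedTerm> leaves) {
        Map<String, Integer> index = new LinkedHashMap<>();
        for (CountedTerm leaf : leaves) index.putIfAbsent(leaf.term, index.size());
        return index;
    }

    // Phase 2: allocate one counter per distinct term.
    static int[] setUpCounters(Map<String, Integer> index) {
        return new int[index.size()];
    }

    // Phase 3: for the current document, reset the counters and accumulate the counts.
    static void collectCounts(List<CountedTerm> leaves, Map<String, Integer> index, int[] counters) {
        Arrays.fill(counters, 0);
        for (CountedTerm leaf : leaves) counters[index.get(leaf.term)] += leaf.count;
    }
}
```

The first two phases run once per query, while the third runs once per document; the resulting array is the kind of per-term input that a counting-based scorer would consume.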
Interface Summary
Interface                          Description
DocumentIteratorVisitor<T>         A visitor for the tree defined by a DocumentIterator.

Class Summary
Class                              Description
AbstractDocumentIteratorVisitor    An abstract implementation of a DocumentIteratorVisitor without return values.
CounterCollectionVisitor           A visitor collecting the counts of terms in a DocumentIterator tree.
CounterSetupVisitor                A visitor using the information collected by a TermCollectionVisitor to set up term frequencies and counters.
TermCollectionVisitor              A visitor collecting information about terms appearing in a DocumentIterator.
TrueTermsCollectionVisitor         A visitor collecting terms that satisfy a query for the current document.