Interface DocumentIterator
-
- All Known Subinterfaces:
IndexIterator
- All Known Implementing Classes:
AbstractCompositeDocumentIterator, AbstractDocumentIterator, AbstractIndexIterator, AbstractIntersectionDocumentIterator, AbstractIntervalDocumentIterator, AbstractOrderedIntervalDocumentIterator, AbstractUnionDocumentIterator, AlignDocumentIterator, AndDocumentIterator, AnnotationDocumentIterator, BitStreamHPIndexReader.BitStreamHPIndexReaderIndexIterator, BitStreamIndexReader.BitStreamIndexReaderIndexIterator, CachingDocumentIterator, ConsecutiveDocumentIterator, ContainmentDocumentIterator, DifferenceDocumentIterator, DocumentalConcatenatedClusterDocumentIterator, DocumentalConcatenatedClusterIndexIterator, DocumentalMergedClusterDocumentIterator, DocumentalMergedClusterIndexIterator, FalseDocumentIterator, GammaDeltaGammaDeltaBitStreamHPIndexReader.BitStreamHPIndexReaderIndexIterator, GammaDeltaGammaDeltaBitStreamIndexReader.BitStreamIndexReaderIndexIterator, InclusionDocumentIterator, Index.EmptyIndexIterator, LowPassDocumentIterator, MultiTermIndexIterator, NotDocumentIterator, OrderedAndDocumentIterator, OrDocumentIterator, PayloadPredicateDocumentIterator, QuasiSuccinctIndexReader.AbstractQuasiSuccinctIndexIterator, QuasiSuccinctIndexReader.EliasFanoIndexIterator, QuasiSuccinctIndexReader.RankedIndexIterator, RemappingDocumentIterator, SkipGammaDeltaGammaDeltaBitStreamIndexReader.BitStreamIndexReaderIndexIterator, TrueDocumentIterator
public interface DocumentIterator
An iterator over documents (pointers) and their intervals.

Warning: from MG4J 5.0, this class does not implement IntIterator. Moreover, nextDocument() no longer returns -1 to denote end of iteration, but rather END_OF_LIST.

Each call to nextDocument() will return a document pointer, or END_OF_LIST if no more documents are available. Just after the call to nextDocument(), intervalIterator(Index) will return an interval iterator enumerating intervals in the last returned document for the specified index. The latter method may return, as a special result, a special TRUE value: this means that although the current document satisfies the query, there is only a generic empty witness to prove it (see TRUE for some elaboration).

A document iterator is usually structured as a composite, with operators as internal nodes and IndexIterators as leaves. The methods accept(DocumentIteratorVisitor) and acceptOnTruePaths(DocumentIteratorVisitor) implement the visitor pattern.

The dispose() method is intended to recursively release all resources associated with a composite document iterator. Note that this is not always what you want, as you might be, say, pooling index readers to reduce the number of file open/close operations. For this reason, we intentionally avoid calling the method “close”.

Warning: interval enumeration can be carried out only just after a call to nextDocument(). Subsequent calls to nextDocument() will reset the internal state of the iterator.
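
The iteration protocol described above can be sketched as follows. This is only an illustration under stated assumptions, not MG4J code: PointerSource and ArrayPointerSource are hypothetical stand-ins that mirror nothing but nextDocument(), the sentinel is assumed to be Long.MAX_VALUE (the actual value is listed under Constant Field Values), and the stand-in omits the IOException that the real method declares.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in mirroring only nextDocument(); NOT the MG4J interface.
interface PointerSource {
    // Assumed sentinel: larger than any valid document pointer.
    long END_OF_LIST = Long.MAX_VALUE;

    long nextDocument();
}

// Hypothetical array-backed source, used only to make the sketch runnable.
final class ArrayPointerSource implements PointerSource {
    private final long[] pointers; // strictly increasing document pointers
    private int next = 0;

    ArrayPointerSource(long... pointers) {
        this.pointers = pointers;
    }

    @Override
    public long nextDocument() {
        return next < pointers.length ? pointers[next++] : END_OF_LIST;
    }
}

final class NextDocumentLoop {
    // Collects every pointer, stopping at END_OF_LIST rather than at -1.
    static List<Long> drain(PointerSource it) {
        List<Long> result = new ArrayList<>();
        long d;
        while ((d = it.nextDocument()) != PointerSource.END_OF_LIST) {
            result.add(d);
        }
        return result;
    }
}
```

Code written against the pre-5.0 convention would have compared the result with -1; the loop above compares it with the sentinel instead.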
-
-
Field Summary
Modifier and Type | Field | Description
static long | END_OF_LIST | A special value denoting that the end of the list has been reached.
-
Method Summary
Modifier and Type | Method | Description
<T> T | accept(DocumentIteratorVisitor<T> visitor) | Accepts a visitor.
<T> T | acceptOnTruePaths(DocumentIteratorVisitor<T> visitor) | Accepts a visitor after a call to nextDocument(), limiting recursion to true paths.
void | dispose() | Disposes this document iterator, releasing all resources.
long | document() | Returns the last document returned by nextDocument().
ReferenceSet<Index> | indices() | Returns the set of indices over which this iterator is built.
IntervalIterator | intervalIterator() | Returns the interval iterator of this document iterator for single-index queries.
IntervalIterator | intervalIterator(Index index) | Returns the interval iterator of this document iterator for the given index.
Reference2ReferenceMap<Index,IntervalIterator> | intervalIterators() | Returns an unmodifiable map from indices to interval iterators.
boolean | mayHaveNext() | Returns whether there may be a next document, possibly with false positives.
long | nextDocument() | Returns the next document provided by this document iterator, or END_OF_LIST if no more documents are available.
long | skipTo(long n) | Skips all documents smaller than n.
double | weight() | Returns the weight associated with this iterator.
DocumentIterator | weight(double weight) | Sets the weight of this document iterator.
-
-
-
Field Detail
-
END_OF_LIST
static final long END_OF_LIST
A special value denoting that the end of the list has been reached.
- See Also:
- Constant Field Values
-
-
Method Detail
-
intervalIterator
IntervalIterator intervalIterator() throws IOException
Returns the interval iterator of this document iterator for single-index queries.
This is a convenience method that can be used only for queries built over a single index.
- Returns:
- an interval iterator.
- Throws:
IllegalStateException - if this document iterator is not built on a single index.
IOException
- See Also:
intervalIterator(Index)
-
intervalIterator
IntervalIterator intervalIterator(Index index) throws IOException
Returns the interval iterator of this document iterator for the given index.
After a call to nextDocument(), this iterator can be used to retrieve the intervals in the current document (the one returned by nextDocument()) for the index index.
Note that if all indices have positions, it is guaranteed that at least one index will return an interval. However, for disjunctive queries it cannot be guaranteed that all indices will return an interval.
Indices without positions always return IntervalIterators.TRUE. Thus, in the presence of indices without positions it is possible that no intervals at all are available.
- Parameters:
index - an index (must be one over which the query was built).
- Returns:
- an interval iterator over the current document in index.
- Throws:
IOException
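
Since an index without positions yields the TRUE sentinel instead of real intervals, a caller will typically test for it before enumerating. The sketch below illustrates that check with hypothetical stand-ins: SpanIterator, SpanIterators and WitnessReport are not MG4J types, spans are modelled as {left, right} pairs, and what the real sentinel yields when enumerated is deliberately not modelled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an interval iterator; NOT the MG4J type.
interface SpanIterator {
    // Returns the next {left, right} pair, or null when none remain.
    int[] nextSpan();
}

final class SpanIterators {
    // Hypothetical counterpart of the TRUE sentinel, identified by reference.
    // What the real sentinel yields when enumerated is not modelled here.
    static final SpanIterator TRUE = () -> null;

    private SpanIterators() {}

    // Hypothetical helper that enumerates a fixed list of spans.
    static SpanIterator of(int[]... spans) {
        return new SpanIterator() {
            private int next = 0;

            @Override
            public int[] nextSpan() {
                return next < spans.length ? spans[next++] : null;
            }
        };
    }
}

final class WitnessReport {
    // Describes the witnesses for the current document in one index.
    static List<String> describe(SpanIterator it) {
        List<String> out = new ArrayList<>();
        if (it == SpanIterators.TRUE) {
            // The document matches, but there are no positions to report.
            out.add("match without positional witness");
            return out;
        }
        int[] span;
        while ((span = it.nextSpan()) != null) {
            out.add("[" + span[0] + ".." + span[1] + "]");
        }
        return out;
    }
}
```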
-
intervalIterators
Reference2ReferenceMap<Index,IntervalIterator> intervalIterators() throws IOException
Returns an unmodifiable map from indices to interval iterators.
After a call to nextDocument(), this map can be used to retrieve the intervals in the current document. An invocation of Map.get(java.lang.Object) on this map with argument index yields the same result as intervalIterator(index).
- Returns:
- a map from indices to interval iterators over the current document.
- Throws:
UnsupportedOperationException - if this index does not contain positions.
IOException
- See Also:
intervalIterator(Index)
-
indices
ReferenceSet<Index> indices()
Returns the set of indices over which this iterator is built.
- Returns:
- the set of indices over which this iterator is built.
-
nextDocument
long nextDocument() throws IOException
Returns the next document provided by this document iterator, or END_OF_LIST if no more documents are available.
- Returns:
- the next document, or END_OF_LIST if no more documents are available.
- Throws:
IOException
-
mayHaveNext
boolean mayHaveNext()
Returns whether there may be a next document, possibly with false positives.
- Returns:
- true if there may be a next document; false if there is certainly no next document.
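
Because a true result may be a false positive, a loop guarded by mayHaveNext() still has to check what nextDocument() returns. The sketch below shows one way such a false positive can arise. EvenPointerSource is a hypothetical filtering source invented for illustration, not an MG4J class, and the sentinel value is an assumption.

```java
// Hypothetical filtering source: keeps only even pointers from a sorted array.
final class EvenPointerSource {
    static final long END_OF_LIST = Long.MAX_VALUE; // assumed sentinel value
    private final long[] pointers;
    private int next = 0;

    EvenPointerSource(long... pointers) {
        this.pointers = pointers;
    }

    // Cheap test: true whenever unread input remains, even if none of it is even.
    boolean mayHaveNext() {
        return next < pointers.length;
    }

    long nextDocument() {
        while (next < pointers.length) {
            long candidate = pointers[next++];
            if (candidate % 2 == 0) return candidate;
        }
        return END_OF_LIST;
    }
}

final class MayHaveNextLoop {
    // Counts documents; mayHaveNext() is only a hint, so the sentinel is still checked.
    static int count(EvenPointerSource it) {
        int n = 0;
        while (it.mayHaveNext()) {
            if (it.nextDocument() == EvenPointerSource.END_OF_LIST) break;
            n++;
        }
        return n;
    }
}
```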
-
document
long document()
Returns the last document returned by nextDocument().
- Returns:
- the last document returned by nextDocument(), -1 if no document has been returned yet, and END_OF_LIST if the list of results has been exhausted.
-
skipTo
long skipTo(long n) throws IOException
Skips all documents smaller than n.
Define the current document k associated with this document iterator as follows:
- -1, if nextDocument() and this method have never been called;
- END_OF_LIST, if a call to this method or to nextDocument() returned END_OF_LIST;
- the last value returned by a call to nextDocument() or this method, otherwise.
If k is larger than or equal to n, then this method does nothing and returns k. Otherwise, a call to this method is equivalent to
while( ( k = nextDocument() ) < n ); return k;
Thus, when a result k ≠ END_OF_LIST is returned, the state of this iterator will be exactly the same as after a call to nextDocument() that returned k. In particular, the first document larger than or equal to n (when returned by this method) will not be returned by the next call to nextDocument().
- Parameters:
n - a document pointer.
- Returns:
- a document pointer larger than or equal to n if available, END_OF_LIST otherwise.
- Throws:
IOException
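
The contract above can be made concrete with a linear reference version. This is a minimal sketch under stated assumptions: LinearSkipSource is a hypothetical array-backed class, not an MG4J implementation (real implementations are free to skip faster, for instance through skip structures), and END_OF_LIST is assumed to be larger than every valid pointer, which the equivalent loop requires in order to terminate.

```java
// Hypothetical array-backed iterator illustrating the skipTo() contract.
final class LinearSkipSource {
    static final long END_OF_LIST = Long.MAX_VALUE; // assumed sentinel value
    private final long[] pointers; // strictly increasing document pointers
    private int next = 0;
    private long current = -1;     // the "current document" k of the contract

    LinearSkipSource(long... pointers) {
        this.pointers = pointers;
    }

    long nextDocument() {
        current = next < pointers.length ? pointers[next++] : END_OF_LIST;
        return current;
    }

    long skipTo(long n) {
        if (current >= n) return current; // already at or past n: do nothing
        long k;
        while ((k = nextDocument()) < n) {
            // discard documents smaller than n
        }
        return k;
    }
}
```

With pointers 1, 4, 7, 10, a call to skipTo(5) returns 7, a following skipTo(3) leaves the iterator untouched and returns 7 again, and the next nextDocument() returns 10 rather than 7.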
-
accept
<T> T accept(DocumentIteratorVisitor<T> visitor) throws IOException
Accepts a visitor.
A document iterator is usually structured as a composite, with operators as internal nodes and IndexIterators as leaves. This method implements the visitor pattern.
- Parameters:
visitor - the visitor.
- Returns:
- an object resulting from the visit, or null if the visit was interrupted.
- Throws:
IOException
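
To illustrate the composite-plus-visitor structure, here is a simplified sketch. QueryNodeVisitor, QueryNode and LeafCounter are hypothetical types invented for this example and deliberately do not reproduce the DocumentIteratorVisitor API. The only behaviour borrowed from the description above is that a null result signals an interrupted visit.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified visitor; NOT the DocumentIteratorVisitor API.
interface QueryNodeVisitor<T> {
    T visitLeaf(String term); // returning null interrupts the visit

    T visitOperator(String operator, List<T> childResults);
}

// Hypothetical composite node: operators inside, term leaves at the bottom.
abstract class QueryNode {
    abstract <T> T accept(QueryNodeVisitor<T> visitor);

    static QueryNode leaf(String term) {
        return new QueryNode() {
            @Override
            <T> T accept(QueryNodeVisitor<T> visitor) {
                return visitor.visitLeaf(term);
            }
        };
    }

    static QueryNode operator(String name, QueryNode... children) {
        return new QueryNode() {
            @Override
            <T> T accept(QueryNodeVisitor<T> visitor) {
                List<T> results = new ArrayList<>();
                for (QueryNode child : children) {
                    T r = child.accept(visitor);
                    if (r == null) return null; // propagate the interruption
                    results.add(r);
                }
                return visitor.visitOperator(name, results);
            }
        };
    }
}

// Example visitor: counts the leaves of the composite.
final class LeafCounter implements QueryNodeVisitor<Integer> {
    @Override
    public Integer visitLeaf(String term) {
        return 1;
    }

    @Override
    public Integer visitOperator(String operator, List<Integer> childResults) {
        int sum = 0;
        for (int c : childResults) sum += c;
        return sum;
    }
}
```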
-
acceptOnTruePaths
<T> T acceptOnTruePaths(DocumentIteratorVisitor<T> visitor) throws IOException
Accepts a visitor after a call to nextDocument(), limiting recursion to true paths.
After a call to nextDocument(), a document iterator is positioned over a document. This call is equivalent to accept(DocumentIteratorVisitor), but visits only along true paths.
We define a true path as a path from the root of the composite that passes only through nodes whose associated subtree is positioned on the same document as the root. Note that OrDocumentIterators detach exhausted iterators from the composite tree, so true paths define the subtree that is causing the current document to satisfy the query represented by this document iterator.
For more elaboration, and the main application of this method, see CounterCollectionVisitor.
- Parameters:
visitor - the visitor.
- Returns:
- an object resulting from the visit, or null if the visit was interrupted.
- Throws:
IOException
- See Also:
accept(DocumentIteratorVisitor), CounterCollectionVisitor
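
The definition of a true path can be illustrated with a small tree walk. The sketch below assumes a hypothetical PositionedNode that simply records the document each subtree is positioned on. It is not an MG4J class, and it collects leaf labels directly instead of dispatching to a visitor, to keep the example short.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical positioned node; NOT an MG4J class.
final class PositionedNode {
    final String label;
    final long document;                 // document this subtree is positioned on
    final List<PositionedNode> children; // empty for leaves

    PositionedNode(String label, long document, PositionedNode... children) {
        this.label = label;
        this.document = document;
        this.children = Arrays.asList(children);
    }
}

final class TruePaths {
    // Collects the labels of the leaves reachable from the root along true paths.
    static List<String> leaves(PositionedNode root) {
        List<String> out = new ArrayList<>();
        collect(root, root.document, out);
        return out;
    }

    private static void collect(PositionedNode node, long rootDocument, List<String> out) {
        if (node.document != rootDocument) return; // not on the root's document: prune
        if (node.children.isEmpty()) {
            out.add(node.label);
            return;
        }
        for (PositionedNode child : node.children) {
            collect(child, rootDocument, out);
        }
    }
}
```

For a disjunction positioned on document 3 whose children are a leaf a on 3, a conjunction on 5, and a leaf d on 3, only a and d lie on true paths: they are the parts of the query that make document 3 a match.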
-
weight
double weight()
Returns the weight associated with this iterator.
The number returned by this method has no fixed semantics: different scorers might choose different interpretations, or even ignore it.
- Returns:
- the weight associated with this iterator.
-
weight
DocumentIterator weight(double weight)
Sets the weight of this document iterator.
- Parameters:
weight - the weight of this document iterator.
- Returns:
- this document iterator.
-
dispose
void dispose() throws IOException
Disposes this document iterator, releasing all resources.
This method should propagate down to the underlying index iterators, where it should release resources such as open files and network connections. If you're doing your own resource tracking and pooling, then you do not need to call this method.
- Throws:
IOException
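
The propagation described above might look roughly like the following. DisposableNode and DisposeUsage are hypothetical classes written for this sketch, not MG4J code. A boolean flag stands in for the files or connections a real leaf would hold, and the stand-in dispose() does not declare IOException.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical resource-holding node; NOT an MG4J class.
class DisposableNode {
    private final List<DisposableNode> children;
    private boolean released = false;

    DisposableNode(DisposableNode... children) {
        this.children = Arrays.asList(children);
    }

    // Propagates to the children first, then releases this node's own resources.
    void dispose() {
        for (DisposableNode child : children) child.dispose();
        released = true; // a real leaf would close files or connections here
    }

    boolean isReleased() {
        return released;
    }
}

final class DisposeUsage {
    // Runs some work on the composite and always disposes it afterwards.
    static void runAndDispose(DisposableNode root, Runnable work) {
        try {
            work.run();
        } finally {
            root.dispose();
        }
    }
}
```

A caller that pools index readers would skip the finally clause and return the readers to its own pool instead.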
-
-