it.unimi.di.mg4j.index
Interface IndexIterator

All Superinterfaces:
DocumentIterator
All Known Implementing Classes:
AbstractIndexIterator, BitStreamHPIndexReader.BitStreamHPIndexReaderIndexIterator, BitStreamIndexReader.BitStreamIndexReaderIndexIterator, DocumentalConcatenatedClusterIndexIterator, DocumentalMergedClusterIndexIterator, Index.EmptyIndexIterator, MultiTermIndexIterator, QuasiSuccinctIndexReader.AbstractQuasiSuccinctIndexIterator, QuasiSuccinctIndexReader.EliasFanoIndexIterator, QuasiSuccinctIndexReader.RankedIndexIterator

public interface IndexIterator
extends DocumentIterator

An iterator over an inverted list.

An index iterator scans the inverted list of an indexed term. Each integer returned by nextDocument() is the index of a document containing the term. If the index contains counts, they can be obtained after each call to DocumentIterator.nextDocument() using count(). Then, if the index contains positions, they can be obtained by calling nextPosition() repeatedly until it returns END_OF_POSITIONS.
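The scanning protocol described above can be sketched as follows. This is a minimal, self-contained illustration that does not use MG4J itself: ToyIndexIterator is a hypothetical in-memory stand-in that mirrors only nextDocument(), count() and nextPosition(), it omits the IOException declared by the real methods, and the sentinel values are assumptions chosen for the sketch rather than the library's actual constants.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory stand-in for an index iterator; NOT the MG4J class. */
class ToyIndexIterator {
    // Assumed sentinel values, valid for this sketch only.
    static final int END_OF_LIST = Integer.MAX_VALUE;
    static final int END_OF_POSITIONS = Integer.MAX_VALUE;

    private final int[] documents;   // document indices, in increasing order
    private final int[][] positions; // positions[i] = positions of the term in documents[i]
    private int current = -1;        // cursor into documents; -1 before the first call
    private int nextPos;             // cursor into positions[current]

    ToyIndexIterator(int[] documents, int[][] positions) {
        this.documents = documents;
        this.positions = positions;
    }

    /** Moves to the next document, or returns END_OF_LIST when the list is exhausted. */
    int nextDocument() {
        if (current + 1 >= documents.length) {
            current = documents.length;
            return END_OF_LIST;
        }
        current++;
        nextPos = 0;
        return documents[current];
    }

    /** Number of occurrences of the term in the current document. */
    int count() {
        return positions[current].length;
    }

    /** Next position in the current document, or END_OF_POSITIONS when none are left. */
    int nextPosition() {
        return nextPos < positions[current].length ? positions[current][nextPos++] : END_OF_POSITIONS;
    }
}

class ScanExample {
    /** Scans the whole list and renders one "doc:count[positions]" entry per document. */
    static List<String> scan(ToyIndexIterator it) {
        List<String> out = new ArrayList<>();
        int doc;
        while ((doc = it.nextDocument()) != ToyIndexIterator.END_OF_LIST) {
            StringBuilder sb = new StringBuilder(doc + ":" + it.count() + "[");
            boolean first = true;
            int pos;
            while ((pos = it.nextPosition()) != ToyIndexIterator.END_OF_POSITIONS) {
                if (!first) sb.append(',');
                sb.append(pos);
                first = false;
            }
            out.add(sb.append(']').toString());
        }
        return out;
    }
}
```

With a real index, the same loop shape would apply to an iterator obtained from IndexReader.documents(CharSequence), with the sentinels taken from DocumentIterator.END_OF_LIST and END_OF_POSITIONS.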

Warning: from MG4J 5.0, the plethora of non-lazy methods to access positions (positionArray(), etc.) has been replaced by static methods in IndexIterators.
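To illustrate what a non-lazy accessor built on top of the lazy nextPosition() might look like, here is a small sketch. It is not the code of IndexIterators, whose actual signatures are not reproduced here: PositionSource and PositionArrays are hypothetical names, the sentinel value is an assumption, and the stand-in omits the IOException that the real nextPosition() declares.

```java
import java.util.Arrays;

/** Hypothetical stand-in exposing only the lazy position accessor; NOT the MG4J interface. */
interface PositionSource {
    int END_OF_POSITIONS = Integer.MAX_VALUE; // assumed sentinel, for this sketch only

    int nextPosition();
}

class PositionArrays {
    /** Drains the remaining positions of the current document into a right-sized array. */
    static int[] positionArray(PositionSource source) {
        int[] buffer = new int[4];
        int n = 0;
        for (int p; (p = source.nextPosition()) != PositionSource.END_OF_POSITIONS; ) {
            // Double the buffer when it is full.
            if (n == buffer.length) buffer = Arrays.copyOf(buffer, n * 2);
            buffer[n++] = p;
        }
        // Trim the buffer to the number of positions actually read.
        return Arrays.copyOf(buffer, n);
    }
}
```

Because the helper consumes the lazy stream, a second call on the same document returns an empty array.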

Note that this interface extends DocumentIterator. The intervals returned for a document are exactly the length-one intervals corresponding to the positions returned by nextPosition(). If the index to which an instance of this class refers does not contain positions, an UnsupportedOperationException will be thrown.

Additionally, this interface strengthens DocumentIterator.weight(double) so that it returns an index iterator.


Field Summary
static int END_OF_POSITIONS
          A special value denoting that the end of the position list has been reached.
 
Fields inherited from interface it.unimi.di.mg4j.search.DocumentIterator
END_OF_LIST
 
Method Summary
 int count()
          Returns the count, that is, the number of occurrences of the term in the current document.
 int frequency()
          Returns the frequency, that is, the number of documents that will be returned by this iterator.
 int id()
          Returns the id of this index iterator.
 IndexIterator id(int id)
          Sets the id of this index iterator.
 Index index()
          Returns the index over which this iterator is built.
 int nextPosition()
          Returns the next position at which the term appears in the current document.
 Payload payload()
          Returns the payload, if any, associated with the current document.
 String term()
          Returns the term whose inverted list is returned by this index iterator.
 IndexIterator term(CharSequence term)
          Sets the term whose inverted list is returned by this index iterator.
 int termNumber()
          Returns the number of the term whose inverted list is returned by this index iterator.
 IndexIterator weight(double weight)
          Sets the weight of this index iterator.
 
Methods inherited from interface it.unimi.di.mg4j.search.DocumentIterator
accept, acceptOnTruePaths, dispose, document, indices, intervalIterator, intervalIterator, intervalIterators, mayHaveNext, nextDocument, skipTo, weight
 

Field Detail

END_OF_POSITIONS

static final int END_OF_POSITIONS
A special value denoting that the end of the position list has been reached.

See Also:
Constant Field Values

Method Detail

index

Index index()
Returns the index over which this iterator is built.

Returns:
the index over which this iterator is built.

termNumber

int termNumber()
Returns the number of the term whose inverted list is returned by this index iterator.

Usually, the term number is automatically set by IndexReader.documents(CharSequence) or IndexReader.documents(int).

Returns:
the number of the term over which this iterator is built.
Throws:
IllegalStateException - if no term was set when the iterator was created.
See Also:
term()

term

String term()
Returns the term whose inverted list is returned by this index iterator.

Usually, the term is automatically set by IndexReader.documents(CharSequence) or IndexReader.documents(int), but you can supply your own term with term(CharSequence).

Returns:
the term over which this iterator is built, as a string.
Throws:
IllegalStateException - if no term was set when the iterator was created.
See Also:
termNumber()

term

IndexIterator term(CharSequence term)
Sets the term whose inverted list is returned by this index iterator.

Usually, the term is automatically set by Index.documents(CharSequence) or by IndexReader.documents(CharSequence), but you can use this method to ensure that term() doesn't throw an exception.

Parameters:
term - a character sequence (that will be defensively copied) that will be assumed to be the term whose inverted list is returned by this index iterator.
Returns:
this index iterator.

frequency

int frequency()
              throws IOException
Returns the frequency, that is, the number of documents that will be returned by this iterator.

Returns:
the number of documents that will be returned by this iterator.
Throws:
IOException

payload

Payload payload()
                throws IOException
Returns the payload, if any, associated with the current document.

Returns:
the payload associated with the current document.
Throws:
IOException

count

int count()
          throws IOException
Returns the count, that is, the number of occurrences of the term in the current document.

Returns:
the count (number of occurrences) of the term in the current document.
Throws:
UnsupportedOperationException - if the index of this iterator does not contain counts.
IOException

nextPosition

int nextPosition()
                 throws IOException
Returns the next position at which the term appears in the current document.

Returns:
the next position of the current document in which the current term appears, or END_OF_POSITIONS if there are no more positions.
Throws:
UnsupportedOperationException - if the index of this iterator does not contain positions.
IOException

id

IndexIterator id(int id)
Sets the id of this index iterator.

The id is an integer associated with each index iterator. It has no specific semantics, and can be used differently in different contexts. A typical usage pattern, for instance, is using it to assign a unique number to the index iterators contained in a composite document iterator (say, numbering consecutively the leaves of the composite).

Parameters:
id - the new id for this index iterator.
Returns:
this index iterator.
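
The usage pattern mentioned above, numbering consecutively the leaves of a composite, can be sketched as follows. The classes Leaf, Composite and LeafNumbering are hypothetical stand-ins written for this illustration and are not part of MG4J; only the chainable id(int) setter and the id() getter mirror this interface.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical leaf with a chainable id setter; NOT the MG4J interface. */
class Leaf {
    private final String term;
    private int id;

    Leaf(String term) { this.term = term; }

    /** Sets the id and returns this leaf, so that calls can be chained. */
    Leaf id(int id) { this.id = id; return this; }

    int id() { return id; }

    String term() { return term; }
}

/** Hypothetical composite node holding leaves and/or nested composites. */
class Composite {
    final List<Object> children = new ArrayList<>();

    Composite add(Object child) { children.add(child); return this; }
}

class LeafNumbering {
    /** Assigns consecutive ids to the leaves in depth-first order; returns the next free id. */
    static int number(Object node, int nextId) {
        if (node instanceof Leaf) {
            ((Leaf) node).id(nextId);
            return nextId + 1;
        }
        for (Object child : ((Composite) node).children) {
            nextId = number(child, nextId);
        }
        return nextId;
    }
}
```

Calling LeafNumbering.number(root, 0) on a tree with three leaves assigns them the ids 0, 1 and 2 in left-to-right order and returns 3.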

id

int id()
Returns the id of this index iterator.

Returns:
the id of this index iterator.
See Also:
id(int)

weight

IndexIterator weight(double weight)
Sets the weight of this index iterator.

Specified by:
weight in interface DocumentIterator
Parameters:
weight - the new weight of this index iterator.
Returns:
this index iterator.
See Also:
DocumentIterator.weight(double)