it.unimi.di.mg4j.index
Class BitStreamIndexReader.BitStreamIndexReaderIndexIterator

java.lang.Object
  extended by it.unimi.di.mg4j.index.AbstractIndexIterator
      extended by it.unimi.di.mg4j.index.BitStreamIndexReader.BitStreamIndexReaderIndexIterator
All Implemented Interfaces:
IndexIterator, DocumentIterator
Enclosing class:
BitStreamIndexReader

protected static final class BitStreamIndexReader.BitStreamIndexReaderIndexIterator
extends AbstractIndexIterator
implements IndexIterator


Field Summary
protected  int b
          The parameter b for Golomb coding of pointers.
protected  int count
          The current count (if this index contains counts).
protected  CompressionFlags.Coding countCoding
          The cached copy of index.countCoding.
protected  int currentDocument
          The last document pointer we read from the current list, -1 if we just read the frequency, DocumentIterator.END_OF_LIST if we are beyond the end of the list.
protected  int currentPosition
          The index of the next position to be returned by nextPosition().
protected  int currentTerm
          The current term.
protected  int frequency
          The current frequency.
protected  boolean hasCounts
          The cached copy of index.hasCounts.
protected  boolean hasPayloads
          The cached copy of index.hasPayloads.
protected  boolean hasPointers
          Whether the current term has pointers at all (this happens when the frequency is smaller than the number of documents).
protected  boolean hasPositions
          The cached copy of index.hasPositions.
protected  boolean hasSkips
          Whether the underlying index has skips.
 int height
          The parameter h (the maximum height of a skip tower).
protected  InputBitStream ibs
          The underlying input bit stream.
protected  BitStreamIndex index
          The reference index.
protected  int log2b
          The parameter log2b for Golomb coding of pointers; it is the most significant bit of b.
protected  int numberOfDocumentRecord
          The number of the document record we are going to read inside the current inverted list.
protected  Payload payload
          The payload, in case the index of this reader has payloads, or null.
protected  CompressionFlags.Coding pointerCoding
          The cached copy of index.pointerCoding.
protected  int[] positionCache
          The cached position array.
protected  CompressionFlags.Coding positionCoding
          The cached copy of index.positionCoding.
 int quantum
          The quantum.
 int quantumDivisionShift
          The shift giving the result of the division by quantum.
 int quantumModuloMask
          The bit mask giving the remainder of the division by quantum.
protected  int state
          This variable tracks the current state of the reader.
 
Fields inherited from class it.unimi.di.mg4j.index.AbstractIndexIterator
id, term, weight
 
Fields inherited from interface it.unimi.di.mg4j.index.IndexIterator
END_OF_POSITIONS
 
Fields inherited from interface it.unimi.di.mg4j.search.DocumentIterator
END_OF_LIST
 
Constructor Summary
BitStreamIndexReader.BitStreamIndexReaderIndexIterator(BitStreamIndexReader parent, InputBitStream ibs)
           
 
Method Summary
protected  IndexIterator advance()
           
 int count()
          Returns the count, that is, the number of occurrences of the term in the current document.
 void dispose()
          Disposes this document iterator, releasing all resources.
 int document()
          Returns the last document returned by DocumentIterator.nextDocument().
 int frequency()
          Returns the frequency, that is, the number of documents that will be returned by this iterator.
 Index index()
          Returns the index over which this iterator is built.
 ReferenceSet<Index> indices()
          Returns the set of indices over which this iterator is built.
 IntervalIterator intervalIterator()
          Returns the interval iterator of this document iterator for single-index queries.
 IntervalIterator intervalIterator(Index index)
          Returns the interval iterator of this document iterator for the given index.
 Reference2ReferenceMap<Index,IntervalIterator> intervalIterators()
          Returns an unmodifiable map from indices to interval iterators.
 boolean mayHaveNext()
          Returns whether there may be a next document, possibly with false positives.
 int nextDocument()
          Returns the next document provided by this document iterator, or DocumentIterator.END_OF_LIST if no more documents are available.
 int nextPosition()
          Returns the next position at which the term appears in the current document.
 Payload payload()
          Returns the payload, if any, associated with the current document.
protected  void position(int term)
          Positions the index on the inverted list of a given term.
 int skipTo(int p)
          Skips all documents smaller than p.
 int termNumber()
          Returns the number of the term whose inverted list is returned by this index iterator.
 String toString()
           
protected  void updatePositionCache()
          Reads the positions, assuming state <= BEFORE_POSITIONS.
 
Methods inherited from class it.unimi.di.mg4j.index.AbstractIndexIterator
accept, acceptOnTruePaths, id, id, term, term, weight, weight
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 
Methods inherited from interface it.unimi.di.mg4j.index.IndexIterator
id, id, term, term, weight
 
Methods inherited from interface it.unimi.di.mg4j.search.DocumentIterator
accept, acceptOnTruePaths, weight
 

Field Detail

index

protected final BitStreamIndex index
The reference index.


ibs

protected final InputBitStream ibs
The underlying input bit stream.


hasPositions

protected final boolean hasPositions
The cached copy of index.hasPositions.


hasCounts

protected final boolean hasCounts
The cached copy of index.hasCounts.


hasPayloads

protected final boolean hasPayloads
The cached copy of index.hasPayloads.


hasSkips

protected final boolean hasSkips
Whether the underlying index has skips.


pointerCoding

protected final CompressionFlags.Coding pointerCoding
The cached copy of index.pointerCoding.


countCoding

protected final CompressionFlags.Coding countCoding
The cached copy of index.countCoding.


positionCoding

protected final CompressionFlags.Coding positionCoding
The cached copy of index.positionCoding.


payload

protected final Payload payload
The payload, in case the index of this reader has payloads, or null.


b

protected int b
The parameter b for Golomb coding of pointers.


log2b

protected int log2b
The parameter log2b for Golomb coding of pointers; it is the most significant bit of b.
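
To illustrate how b and log2b relate, here is a minimal sketch; it is not MG4J's own code. The class GolombSketch and both of its methods are hypothetical. It assumes that log2b is the index of the most significant bit of b (that is, the floor of the base-2 logarithm of b) and that the code follows the textbook Golomb layout of a unary quotient followed by a minimal binary remainder.

```java
// Hypothetical helper, not part of MG4J: relates b and log2b for Golomb coding.
final class GolombSketch {
    /** Index of the most significant bit of b, i.e. floor(log2(b)); requires b > 0. */
    static int mostSignificantBit(int b) {
        if (b <= 0) throw new IllegalArgumentException("b must be positive");
        return 31 - Integer.numberOfLeadingZeros(b);
    }

    /** Number of bits used by a textbook Golomb code for x >= 0 with modulus b. */
    static int golombLength(int x, int b) {
        if (x < 0) throw new IllegalArgumentException("x must be nonnegative");
        int log2b = mostSignificantBit(b);
        int quotient = x / b;
        int remainder = x % b;
        // Remainders below this threshold take log2b bits, the others log2b + 1.
        int threshold = (int) ((1L << (log2b + 1)) - b);
        int remainderBits = remainder < threshold ? log2b : log2b + 1;
        return (quotient + 1) + remainderBits; // unary quotient plus remainder
    }

    public static void main(String[] args) {
        System.out.println("msb(6) = " + mostSignificantBit(6));
        System.out.println("length of 4 with b = 3: " + golombLength(4, 3));
    }
}
```

Caching log2b next to b avoids recomputing the most significant bit for every pointer that is decoded.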


currentTerm

protected int currentTerm
The current term.


frequency

protected int frequency
The current frequency.


hasPointers

protected boolean hasPointers
Whether the current term has pointers at all (this happens when the frequency is smaller than the number of documents).
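
The condition in this description can be pictured with the following hypothetical sketch, which is not MG4J code. It rests on an inference rather than on anything stated here: when a term occurs in every document, its pointers are necessarily 0, 1, ..., N - 1, so they need not be stored. The names PointerSketch, numberOfDocuments, storedPointers and pointerOfRecord are illustrative assumptions.

```java
// Hypothetical model of the hasPointers condition; not part of MG4J.
final class PointerSketch {
    /** Pointers are needed only if the term does not occur in every document. */
    static boolean hasPointers(int frequency, int numberOfDocuments) {
        return frequency < numberOfDocuments;
    }

    /** Pointer of the i-th record; storedPointers is consulted only when pointers exist. */
    static int pointerOfRecord(int i, int frequency, int numberOfDocuments, int[] storedPointers) {
        return hasPointers(frequency, numberOfDocuments) ? storedPointers[i] : i;
    }

    public static void main(String[] args) {
        // A term occurring in 2 documents out of 5 needs explicit pointers.
        System.out.println(pointerOfRecord(1, 2, 5, new int[] {0, 3}));
        // A term occurring in all 5 documents does not.
        System.out.println(pointerOfRecord(4, 5, 5, null));
    }
}
```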


count

protected int count
The current count (if this index contains counts).


currentDocument

protected int currentDocument
The last document pointer we read from the current list, -1 if we just read the frequency, DocumentIterator.END_OF_LIST if we are beyond the end of the list.


numberOfDocumentRecord

protected int numberOfDocumentRecord
The number of the document record we are going to read inside the current inverted list.


state

protected int state
This variable tracks the current state of the reader.


height

public final int height
The parameter h (the maximum height of a skip tower).


quantum

public int quantum
The quantum.


quantumModuloMask

public int quantumModuloMask
The bit mask giving the remainder of the division by quantum.


quantumDivisionShift

public int quantumDivisionShift
The shift giving the result of the division by quantum.
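
The three quantum fields fit together as in the sketch below, which assumes that the quantum is a power of two (the existence of a mask and a shift suggests this, but this page does not state it). The class QuantumSketch and its methods are hypothetical and only mirror the field names.

```java
// Hypothetical illustration of the quantum mask and shift; not MG4J code.
final class QuantumSketch {
    final int quantum;
    final int quantumModuloMask;
    final int quantumDivisionShift;

    QuantumSketch(int quantum) {
        // Assumption: the quantum is a positive power of two.
        if (quantum <= 0 || Integer.bitCount(quantum) != 1)
            throw new IllegalArgumentException("quantum must be a power of two");
        this.quantum = quantum;
        this.quantumModuloMask = quantum - 1;
        this.quantumDivisionShift = Integer.numberOfTrailingZeros(quantum);
    }

    /** Same as record % quantum for nonnegative record. */
    int remainder(int record) {
        return record & quantumModuloMask;
    }

    /** Same as record / quantum for nonnegative record. */
    int quotient(int record) {
        return record >> quantumDivisionShift;
    }

    public static void main(String[] args) {
        QuantumSketch q = new QuantumSketch(8);
        System.out.println(q.remainder(21) + " " + q.quotient(21));
    }
}
```

Replacing the division and the modulo with a shift and a mask keeps the per-record bookkeeping cheap, which is presumably why both values are precomputed.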


positionCache

protected int[] positionCache
The cached position array.


currentPosition

protected int currentPosition
The index of the next position to be returned by nextPosition().

Constructor Detail

BitStreamIndexReader.BitStreamIndexReaderIndexIterator

public BitStreamIndexReader.BitStreamIndexReaderIndexIterator(BitStreamIndexReader parent,
                                                              InputBitStream ibs)
Method Detail

position

protected void position(int term)
                 throws IOException
Positions the index on the inverted list of a given term.

This method can be called at any time. Note that it is always possible to call this method with argument 0, even if offsets have not been loaded.

Parameters:
term - a term.
Throws:
IOException

termNumber

public int termNumber()
Description copied from interface: IndexIterator
Returns the number of the term whose inverted list is returned by this index iterator.

Usually, the term number is automatically set by IndexReader.documents(CharSequence) or IndexReader.documents(int).

Specified by:
termNumber in interface IndexIterator
Returns:
the number of the term over which this iterator is built.
See Also:
IndexIterator.term()

advance

protected IndexIterator advance()
                         throws IOException
Throws:
IOException

index

public Index index()
Description copied from interface: IndexIterator
Returns the index over which this iterator is built.

Specified by:
index in interface IndexIterator
Returns:
the index over which this iterator is built.

frequency

public int frequency()
Description copied from interface: IndexIterator
Returns the frequency, that is, the number of documents that will be returned by this iterator.

Specified by:
frequency in interface IndexIterator
Returns:
the number of documents that will be returned by this iterator.

document

public int document()
Description copied from interface: DocumentIterator
Returns the last document returned by DocumentIterator.nextDocument().

Specified by:
document in interface DocumentIterator
Returns:
the last document returned by DocumentIterator.nextDocument(), -1 if no document has been returned yet, and DocumentIterator.END_OF_LIST if the list of results has been exhausted.

payload

public Payload payload()
                throws IOException
Description copied from interface: IndexIterator
Returns the payload, if any, associated with the current document.

Specified by:
payload in interface IndexIterator
Returns:
the payload associated with the current document.
Throws:
IOException

count

public int count()
          throws IOException
Description copied from interface: IndexIterator
Returns the count, that is, the number of occurrences of the term in the current document.

Specified by:
count in interface IndexIterator
Returns:
the count (number of occurrences) of the term in the current document.
Throws:
IOException

updatePositionCache

protected void updatePositionCache()
                            throws IOException
Reads the positions, assuming state <= BEFORE_POSITIONS.

Throws:
IOException

nextPosition

public int nextPosition()
                 throws IOException
Description copied from interface: IndexIterator
Returns the next position at which the term appears in the current document.

Specified by:
nextPosition in interface IndexIterator
Returns:
the next position of the current document in which the current term appears, or IndexIterator.END_OF_POSITIONS if there are no more positions.
Throws:
IOException
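
The contract of nextPosition() can be modelled with a cached array and an index, much like the positionCache and currentPosition fields. The sketch below is a hypothetical stand-in, not the actual reader: PositionSketch, drain and the value chosen for END_OF_POSITIONS are assumptions, since this page names the constant but not its value.

```java
// Hypothetical array-backed model of nextPosition(); not MG4J code.
final class PositionSketch {
    // Assumed sentinel value for illustration only.
    static final int END_OF_POSITIONS = Integer.MAX_VALUE;

    private final int[] positionCache; // positions of the term in the current document
    private final int count;           // number of occurrences in the current document
    private int currentPosition;       // index of the next position to return

    PositionSketch(int[] positions) {
        this.positionCache = positions.clone();
        this.count = positions.length;
    }

    int nextPosition() {
        return currentPosition < count ? positionCache[currentPosition++] : END_OF_POSITIONS;
    }

    /** Reads positions until the sentinel and joins them with spaces. */
    static String drain(PositionSketch it) {
        StringBuilder sb = new StringBuilder();
        for (int p; (p = it.nextPosition()) != END_OF_POSITIONS; ) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(p);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(drain(new PositionSketch(new int[] {3, 7, 12})));
    }
}
```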

nextDocument

public int nextDocument()
                 throws IOException
Description copied from interface: DocumentIterator
Returns the next document provided by this document iterator, or DocumentIterator.END_OF_LIST if no more documents are available.

Specified by:
nextDocument in interface DocumentIterator
Returns:
the next document, or DocumentIterator.END_OF_LIST if no more documents are available.
Throws:
IOException
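
A typical caller loops on nextDocument() until the sentinel appears. The following sketch models that contract, together with the values reported by document(), on a plain array. DocumentSketch and countDocuments are hypothetical names, and the sentinel value is an assumption made for the example.

```java
// Hypothetical array-backed model of nextDocument() and document(); not MG4J code.
final class DocumentSketch {
    // Assumed sentinel value; this page names the constant but not its value.
    static final int END_OF_LIST = Integer.MAX_VALUE;

    private final int[] pointers;       // strictly increasing document pointers
    private int numberOfDocumentRecord; // index of the next record to read
    private int currentDocument = -1;   // -1 until a document has been returned

    DocumentSketch(int[] pointers) {
        this.pointers = pointers.clone();
    }

    int document() {
        return currentDocument;
    }

    int nextDocument() {
        currentDocument = numberOfDocumentRecord < pointers.length
                ? pointers[numberOfDocumentRecord++]
                : END_OF_LIST;
        return currentDocument;
    }

    /** Visits every remaining document and returns how many were seen. */
    static int countDocuments(DocumentSketch it) {
        int seen = 0;
        while (it.nextDocument() != END_OF_LIST) seen++;
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(countDocuments(new DocumentSketch(new int[] {1, 4, 6})));
    }
}
```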

skipTo

public int skipTo(int p)
           throws IOException
Description copied from interface: DocumentIterator
Skips all documents smaller than p.

Define the current document k associated with this document iterator as follows: k is -1 if no document has been returned yet, DocumentIterator.END_OF_LIST if the list of results has been exhausted, and the last document returned otherwise.

If k is larger than or equal to p, then this method does nothing and returns k. Otherwise, a call to this method is equivalent to

 while( ( k = nextDocument() ) < p );
 return k;
 

Thus, when a result k ≠ DocumentIterator.END_OF_LIST is returned, the state of this iterator will be exactly the same as after a call to DocumentIterator.nextDocument() that returned k. In particular, the first document larger than or equal to p (when returned by this method) will not be returned by the next call to DocumentIterator.nextDocument().

Specified by:
skipTo in interface DocumentIterator
Parameters:
p - a document pointer.
Returns:
a document pointer larger than or equal to p if available, DocumentIterator.END_OF_LIST otherwise.
Throws:
IOException
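
The equivalence stated above can be checked on a small model. The sketch below implements skipTo() literally as the loop over nextDocument(); it is a hypothetical linear-scan stand-in and says nothing about how the real reader uses skip towers. SkipSketch and the sentinel value are assumptions.

```java
// Hypothetical linear-scan model of the skipTo() contract; not MG4J code.
final class SkipSketch {
    // Assumed sentinel value for illustration only.
    static final int END_OF_LIST = Integer.MAX_VALUE;

    private final int[] pointers; // strictly increasing document pointers
    private int next;             // index of the next record to read
    private int k = -1;           // current document, -1 before the first call

    SkipSketch(int[] pointers) {
        this.pointers = pointers.clone();
    }

    int nextDocument() {
        k = next < pointers.length ? pointers[next++] : END_OF_LIST;
        return k;
    }

    int skipTo(int p) {
        if (k >= p) return k;             // already at or beyond p: do nothing
        while ((k = nextDocument()) < p); // otherwise scan forward
        return k;
    }

    public static void main(String[] args) {
        SkipSketch it = new SkipSketch(new int[] {2, 5, 9, 14});
        System.out.println(it.skipTo(6));
        System.out.println(it.nextDocument());
    }
}
```

In this model, the document returned by skipTo(6) is not returned again: the following nextDocument() moves on to the next pointer.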

dispose

public void dispose()
             throws IOException
Description copied from interface: DocumentIterator
Disposes this document iterator, releasing all resources.

This method should propagate down to the underlying index iterators, where it should release resources such as open files and network connections. If you're doing your own resource tracking and pooling, then you do not need to call this method.

Specified by:
dispose in interface DocumentIterator
Throws:
IOException

mayHaveNext

public boolean mayHaveNext()
Description copied from interface: DocumentIterator
Returns whether there may be a next document, possibly with false positives.

Specified by:
mayHaveNext in interface DocumentIterator
Returns:
true if there may be a next document; false if there is certainly no next document.

toString

public String toString()
Overrides:
toString in class Object

intervalIterators

public Reference2ReferenceMap<Index,IntervalIterator> intervalIterators()
                                                                 throws IOException
Description copied from interface: DocumentIterator
Returns an unmodifiable map from indices to interval iterators.

After a call to DocumentIterator.nextDocument(), this map can be used to retrieve the intervals in the current document. An invocation of Map.get(java.lang.Object) on this map with argument index yields the same result as intervalIterator(index).

Specified by:
intervalIterators in interface DocumentIterator
Returns:
a map from indices to interval iterators over the current document.
Throws:
IOException
See Also:
DocumentIterator.intervalIterator(Index)

intervalIterator

public IntervalIterator intervalIterator()
                                  throws IOException
Description copied from interface: DocumentIterator
Returns the interval iterator of this document iterator for single-index queries.

This is a convenience method that can be used only for queries built over a single index.

Specified by:
intervalIterator in interface DocumentIterator
Returns:
an interval iterator.
Throws:
IOException
See Also:
DocumentIterator.intervalIterator(Index)

intervalIterator

public IntervalIterator intervalIterator(Index index)
                                  throws IOException
Description copied from interface: DocumentIterator
Returns the interval iterator of this document iterator for the given index.

After a call to DocumentIterator.nextDocument(), this iterator can be used to retrieve the intervals in the current document (the one returned by DocumentIterator.nextDocument()) for the index index.

Note that if all indices have positions, it is guaranteed that at least one index will return an interval. However, for disjunctive queries it cannot be guaranteed that all indices will return an interval.

Indices without positions always return IntervalIterators.TRUE. Thus, in the presence of indices without positions, it is possible that no intervals at all are available.

Specified by:
intervalIterator in interface DocumentIterator
Parameters:
index - an index (must be one over which the query was built).
Returns:
an interval iterator over the current document in index.
Throws:
IOException

indices

public ReferenceSet<Index> indices()
Description copied from interface: DocumentIterator
Returns the set of indices over which this iterator is built.

Specified by:
indices in interface DocumentIterator
Returns:
the set of indices over which this iterator is built.