it.unimi.di.mg4j.index
Class BitStreamHPIndexReader.BitStreamHPIndexReaderIndexIterator

java.lang.Object
  extended by it.unimi.di.mg4j.index.AbstractIndexIterator
      extended by it.unimi.di.mg4j.index.BitStreamHPIndexReader.BitStreamHPIndexReaderIndexIterator
All Implemented Interfaces:
IndexIterator, DocumentIterator
Enclosing class:
BitStreamHPIndexReader

protected static final class BitStreamHPIndexReader.BitStreamHPIndexReaderIndexIterator
extends AbstractIndexIterator
implements IndexIterator


Field Summary
protected  int b
          The parameter b for Golomb coding of pointers.
protected  int count
          The current count (if this index contains counts).
protected  CompressionFlags.Coding countCoding
          The cached copy of index.countCoding.
protected  int currentDocument
          The last document pointer we read from the current list; -1 if we just read the frequency, DocumentIterator.END_OF_LIST if we are beyond the end of the list.
protected  int currentPosition
          The index of the next position to be returned by nextPosition().
protected  int frequency
          The current frequency.
protected  boolean hasPointers
          Whether the current term has pointers at all (this happens when the frequency is smaller than the number of documents).
 int height
          The parameter h (the maximum height of a skip tower).
protected  InputBitStream ibs
          The underlying input bit stream.
protected  BitStreamHPIndex index
          The reference index.
protected  long lastPositionsOffset
          The offset of the positions of the current term.
protected  int log2b
          The parameter log2b for Golomb coding of pointers; it is the index of the most significant bit of b.
protected  int numberOfDocumentRecord
          The number of the document record we are going to read inside the current inverted list.
protected  CompressionFlags.Coding pointerCoding
          The cached copy of index.pointerCoding.
protected  int[] positionCache
          The cached position array.
protected  CompressionFlags.Coding positionCoding
          The cached copy of index.positionCoding.
protected  InputBitStream positions
          The underlying positions input bit stream.
protected  boolean positionsUnread
          Whether the positions for the current document pointer have not been fetched yet.
 int quantum
          The quantum.
protected  int state
          This variable tracks the current state of the reader.
protected  int term
          The current term.
 
Fields inherited from class it.unimi.di.mg4j.index.AbstractIndexIterator
id, weight
 
Fields inherited from interface it.unimi.di.mg4j.index.IndexIterator
END_OF_POSITIONS
 
Fields inherited from interface it.unimi.di.mg4j.search.DocumentIterator
END_OF_LIST
 
Constructor Summary
BitStreamHPIndexReader.BitStreamHPIndexReaderIndexIterator(BitStreamHPIndexReader parent, InputBitStream ibs, InputBitStream positions)
           
 
Method Summary
protected  IndexIterator advance()
           
 int count()
          Returns the count, that is, the number of occurrences of the term in the current document.
 void dispose()
          Disposes this document iterator, releasing all resources.
 int document()
          Returns the last document returned by DocumentIterator.nextDocument().
 int frequency()
          Returns the frequency, that is, the number of documents that will be returned by this iterator.
 Index index()
          Returns the index over which this iterator is built.
 ReferenceSet<Index> indices()
          Returns the set of indices over which this iterator is built.
 IntervalIterator intervalIterator()
          Returns the interval iterator of this document iterator for single-index queries.
 IntervalIterator intervalIterator(Index index)
          Returns the interval iterator of this document iterator for the given index.
 Reference2ReferenceMap<Index,IntervalIterator> intervalIterators()
          Returns an unmodifiable map from indices to interval iterators.
 boolean mayHaveNext()
          Returns whether there may be a next document, possibly with false positives.
 int nextDocument()
          Returns the next document provided by this document iterator, or DocumentIterator.END_OF_LIST if no more documents are available.
 int nextPosition()
          Returns the next position at which the term appears in the current document.
 Payload payload()
          Returns the payload, if any, associated with the current document.
protected  void position(int term)
          Positions the index on the inverted list of a given term.
 int skipTo(int p)
          Skips all documents smaller than p.
 int termNumber()
          Returns the number of the term whose inverted list is returned by this index iterator.
 String toString()
           
protected  void updatePositionCache()
           
 
Methods inherited from class it.unimi.di.mg4j.index.AbstractIndexIterator
accept, acceptOnTruePaths, id, id, term, term, weight, weight
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 
Methods inherited from interface it.unimi.di.mg4j.index.IndexIterator
id, id, term, term, weight
 
Methods inherited from interface it.unimi.di.mg4j.search.DocumentIterator
accept, acceptOnTruePaths, weight
 

Field Detail

index

protected final BitStreamHPIndex index
The reference index.


ibs

protected final InputBitStream ibs
The underlying input bit stream.


positions

protected final InputBitStream positions
The underlying positions input bit stream.


pointerCoding

protected final CompressionFlags.Coding pointerCoding
The cached copy of index.pointerCoding.


countCoding

protected final CompressionFlags.Coding countCoding
The cached copy of index.countCoding.


positionCoding

protected final CompressionFlags.Coding positionCoding
The cached copy of index.positionCoding.


b

protected int b
The parameter b for Golomb coding of pointers.


log2b

protected int log2b
The parameter log2b for Golomb coding of pointers; it is the index of the most significant bit of b (that is, the floor of the base-2 logarithm of b).
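
As an illustration of how b and log2b work together, the following is a minimal sketch of textbook Golomb coding: the quotient x / b is written in unary and the remainder x % b in minimal binary, whose code length depends on log2b. The class GolombSketch, its method names, the string-of-bits output and the unary convention (zeros closed by a one) are assumptions made for this example; the sketch does not claim to reproduce the exact bit layout read by this iterator.

```java
final class GolombSketch {
    /** Index of the most significant bit of b, i.e. floor(log2(b)); b must be positive. */
    static int mostSignificantBit(int b) {
        return 31 - Integer.numberOfLeadingZeros(b);
    }

    /** Appends the k lowest bits of v, most significant bit first. */
    private static void appendBits(StringBuilder out, int v, int k) {
        for (int i = k - 1; i >= 0; i--) out.append((v >>> i) & 1);
    }

    /** Golomb code of x >= 0 with modulus b (1 <= b < 2^30), as a string of '0' and '1'. */
    static String encode(int x, int b) {
        final int log2b = mostSignificantBit(b);
        // Remainders below this threshold take log2b bits, the others log2b + 1 bits.
        final int threshold = (1 << (log2b + 1)) - b;
        final StringBuilder out = new StringBuilder();
        // Quotient in unary: q zeros closed by a one (assumed convention).
        for (int q = x / b; q > 0; q--) out.append('0');
        out.append('1');
        // Remainder in minimal binary.
        final int r = x % b;
        if (r < threshold) appendBits(out, r, log2b);
        else appendBits(out, r + threshold, log2b + 1);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode(7, 3)); // quotient 2 -> "001", remainder 1 -> "10"
    }
}
```

Caching log2b next to b avoids recomputing the most significant bit for every pointer that is decoded.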


term

protected int term
The current term.


frequency

protected int frequency
The current frequency.


hasPointers

protected boolean hasPointers
Whether the current term has pointers at all (this happens when the frequency is smaller than the number of documents).
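
One way to read this flag: if a term occurs in every document, its frequency equals the number of documents and the pointers are necessarily 0, 1, 2, and so on, so they need not be stored. The sketch below illustrates that idea only; the class ImplicitPointersSketch, its constructor and the value of END_OF_LIST are assumptions for the example, not this class's actual code.

```java
final class ImplicitPointersSketch {
    static final int END_OF_LIST = Integer.MAX_VALUE; // assumed sentinel value

    private final int frequency;
    private final boolean hasPointers;
    private final int[] storedPointers; // stand-in for decoded pointers; null when implicit
    private int numberOfDocumentRecord = 0;

    ImplicitPointersSketch(int numberOfDocuments, int[] pointersOfTerm) {
        frequency = pointersOfTerm.length;
        // Pointers are needed only if the term does not occur in every document.
        hasPointers = frequency < numberOfDocuments;
        storedPointers = hasPointers ? pointersOfTerm.clone() : null;
    }

    boolean hasPointers() {
        return hasPointers;
    }

    int nextDocument() {
        if (numberOfDocumentRecord == frequency) return END_OF_LIST;
        final int record = numberOfDocumentRecord++;
        // Without pointers, the record number itself is the document pointer.
        return hasPointers ? storedPointers[record] : record;
    }
}
```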


count

protected int count
The current count (if this index contains counts).


currentDocument

protected int currentDocument
The last document pointer we read from the current list; -1 if we just read the frequency, DocumentIterator.END_OF_LIST if we are beyond the end of the list.


numberOfDocumentRecord

protected int numberOfDocumentRecord
The number of the document record we are going to read inside the current inverted list.


state

protected int state
This variable tracks the current state of the reader.


height

public final int height
The parameter h (the maximum height of a skip tower).


quantum

public int quantum
The quantum.


positionsUnread

protected boolean positionsUnread
Whether the positions for the current document pointer have not been fetched yet.


positionCache

protected int[] positionCache
The cached position array.


currentPosition

protected int currentPosition
The index of the next position to be returned by nextPosition().


lastPositionsOffset

protected long lastPositionsOffset
The offset of the positions of the current term.

Constructor Detail

BitStreamHPIndexReader.BitStreamHPIndexReaderIndexIterator

public BitStreamHPIndexReader.BitStreamHPIndexReaderIndexIterator(BitStreamHPIndexReader parent,
                                                                  InputBitStream ibs,
                                                                  InputBitStream positions)
Method Detail

position

protected void position(int term)
                 throws IOException
Positions the index on the inverted list of a given term.

This method can be called at any time. Note that it is always possible to call this method with argument 0, even if offsets have not been loaded.

Parameters:
term - a term.
Throws:
IOException

termNumber

public int termNumber()
Description copied from interface: IndexIterator
Returns the number of the term whose inverted list is returned by this index iterator.

Usually, the term number is automatically set by IndexReader.documents(CharSequence) or IndexReader.documents(int).

Specified by:
termNumber in interface IndexIterator
Returns:
the number of the term over which this iterator is built.
See Also:
IndexIterator.term()

advance

protected IndexIterator advance()
                         throws IOException
Throws:
IOException

index

public Index index()
Description copied from interface: IndexIterator
Returns the index over which this iterator is built.

Specified by:
index in interface IndexIterator
Returns:
the index over which this iterator is built.

frequency

public int frequency()
Description copied from interface: IndexIterator
Returns the frequency, that is, the number of documents that will be returned by this iterator.

Specified by:
frequency in interface IndexIterator
Returns:
the number of documents that will be returned by this iterator.

document

public int document()
Description copied from interface: DocumentIterator
Returns the last document returned by DocumentIterator.nextDocument().

Specified by:
document in interface DocumentIterator
Returns:
the last document returned by DocumentIterator.nextDocument(), -1 if no document has been returned yet, and DocumentIterator.END_OF_LIST if the list of results has been exhausted.

payload

public Payload payload()
                throws IOException
Description copied from interface: IndexIterator
Returns the payload, if any, associated with the current document.

Specified by:
payload in interface IndexIterator
Returns:
the payload associated with the current document.
Throws:
IOException

count

public int count()
          throws IOException
Description copied from interface: IndexIterator
Returns the count, that is, the number of occurrences of the term in the current document.

Specified by:
count in interface IndexIterator
Returns:
the count (number of occurrences) of the term in the current document.
Throws:
IOException

updatePositionCache

protected void updatePositionCache()
                            throws IOException
Throws:
IOException

nextPosition

public int nextPosition()
                 throws IOException
Description copied from interface: IndexIterator
Returns the next position at which the term appears in the current document.

Specified by:
nextPosition in interface IndexIterator
Returns:
the next position of the current document in which the current term appears, or IndexIterator.END_OF_POSITIONS if there are no more positions.
Throws:
IOException
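
The fields positionsUnread, positionCache and currentPosition suggest that positions are decoded lazily, on the first call to nextPosition() for a document. The following is a minimal sketch of that pattern under stated assumptions: the class LazyPositionsSketch, the in-memory array standing in for the positions bit stream, the gap encoding (each stored value is the distance from the previous position minus one) and the value of END_OF_POSITIONS are all hypothetical.

```java
final class LazyPositionsSketch {
    static final int END_OF_POSITIONS = Integer.MAX_VALUE; // assumed sentinel value

    private final int[][] gapsPerRecord; // stand-in for the positions bit stream
    private int record = -1;
    private int count;
    private boolean positionsUnread;
    private int[] positionCache = new int[2];
    private int currentPosition;
    int decodeCalls; // how many times positions were actually decoded

    LazyPositionsSketch(int[][] gapsPerRecord) {
        this.gapsPerRecord = gapsPerRecord;
    }

    /** Moves to the next document record without touching its positions. */
    void nextRecord() {
        record++;
        count = gapsPerRecord[record].length;
        positionsUnread = true;
        currentPosition = 0;
    }

    /** Decodes the gaps of the current record into absolute positions. */
    private void updatePositionCache() {
        if (positionCache.length < count) {
            positionCache = new int[Math.max(positionCache.length * 2, count)];
        }
        int previous = -1;
        for (int i = 0; i < count; i++) {
            previous += gapsPerRecord[record][i] + 1;
            positionCache[i] = previous;
        }
        positionsUnread = false;
        decodeCalls++;
    }

    int nextPosition() {
        if (positionsUnread) updatePositionCache();
        return currentPosition < count ? positionCache[currentPosition++] : END_OF_POSITIONS;
    }
}
```

With this arrangement, a caller that only needs document pointers never pays for decoding positions.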

nextDocument

public int nextDocument()
                 throws IOException
Description copied from interface: DocumentIterator
Returns the next document provided by this document iterator, or DocumentIterator.END_OF_LIST if no more documents are available.

Specified by:
nextDocument in interface DocumentIterator
Returns:
the next document, or DocumentIterator.END_OF_LIST if no more documents are available.
Throws:
IOException

skipTo

public int skipTo(int p)
           throws IOException
Description copied from interface: DocumentIterator
Skips all documents smaller than p.

Define the current document k associated with this document iterator as the value that document() would return: -1 if no document has been returned yet, DocumentIterator.END_OF_LIST if the list of results has been exhausted, and the last document returned by nextDocument() or by this method otherwise.

If k is larger than or equal to p, then this method does nothing and returns k. Otherwise, a call to this method is equivalent to

 while( ( k = nextDocument() ) < p );
 return k;
 

Thus, when a result k ≠ DocumentIterator.END_OF_LIST is returned, the state of this iterator will be exactly the same as after a call to DocumentIterator.nextDocument() that returned k. In particular, the first document larger than or equal to p (when returned by this method) will not be returned by the next call to DocumentIterator.nextDocument().

Specified by:
skipTo in interface DocumentIterator
Parameters:
p - a document pointer.
Returns:
a document pointer larger than or equal to p if available, DocumentIterator.END_OF_LIST otherwise.
Throws:
IOException
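
The contract above can be checked on a toy iterator. The sketch below implements skipTo() exactly as the equivalent loop, over an in-memory array of pointers; the class SkipToSketch and the value of END_OF_LIST are assumptions for the example. An index reader with skip towers would reach the same result without scanning every record.

```java
final class SkipToSketch {
    static final int END_OF_LIST = Integer.MAX_VALUE; // assumed sentinel value

    private final int[] pointers; // strictly increasing document pointers
    private int next = 0;
    private int current = -1; // -1 until a document has been returned

    SkipToSketch(int... pointers) {
        this.pointers = pointers;
    }

    int document() {
        return current;
    }

    int nextDocument() {
        if (current == END_OF_LIST) return END_OF_LIST;
        return current = next < pointers.length ? pointers[next++] : END_OF_LIST;
    }

    int skipTo(int p) {
        // Nothing to do if the current document is already large enough.
        if (current >= p) return current;
        int k;
        while ((k = nextDocument()) < p);
        return k;
    }

    public static void main(String[] args) {
        SkipToSketch it = new SkipToSketch(1, 4, 9, 12);
        System.out.println(it.skipTo(5));       // first pointer >= 5
        System.out.println(it.nextDocument());  // the pointer after it
    }
}
```

Note that the document returned by skipTo() becomes the current document, so the following nextDocument() call moves past it.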

dispose

public void dispose()
             throws IOException
Description copied from interface: DocumentIterator
Disposes this document iterator, releasing all resources.

This method should propagate down to the underlying index iterators, where it should release resources such as open files and network connections. If you're doing your own resource tracking and pooling, then you do not need to call this method.

Specified by:
dispose in interface DocumentIterator
Throws:
IOException

mayHaveNext

public boolean mayHaveNext()
Description copied from interface: DocumentIterator
Returns whether there may be a next document, possibly with false positives.

Specified by:
mayHaveNext in interface DocumentIterator
Returns:
true if there may be a next document; false if there is certainly no next document.
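
Because a true result may be a false positive, a caller still has to check the value returned by nextDocument(). The sketch below shows one such consumer loop; the interface Docs, the toy iterator returned by evensOf() and the value of END_OF_LIST are hypothetical and exist only to make the example self-contained.

```java
import java.util.ArrayList;
import java.util.List;

final class MayHaveNextSketch {
    static final int END_OF_LIST = Integer.MAX_VALUE; // assumed sentinel value

    /** Hypothetical two-method view of a document iterator. */
    interface Docs {
        boolean mayHaveNext();
        int nextDocument();
    }

    /** Returns the even candidates; it cannot tell cheaply whether any even value remains. */
    static Docs evensOf(final int[] candidates) {
        return new Docs() {
            private int i = 0;

            public boolean mayHaveNext() {
                return i < candidates.length; // true may be a false positive
            }

            public int nextDocument() {
                while (i < candidates.length) {
                    final int c = candidates[i++];
                    if (c % 2 == 0) return c;
                }
                return END_OF_LIST;
            }
        };
    }

    /** Collects all documents, guarding against false positives of mayHaveNext(). */
    static List<Integer> drain(Docs it) {
        final List<Integer> out = new ArrayList<>();
        while (it.mayHaveNext()) {
            final int d = it.nextDocument();
            if (d == END_OF_LIST) break; // mayHaveNext() was a false positive
            out.add(d);
        }
        return out;
    }
}
```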

toString

public String toString()
Overrides:
toString in class Object

intervalIterators

public Reference2ReferenceMap<Index,IntervalIterator> intervalIterators()
                                                                 throws IOException
Description copied from interface: DocumentIterator
Returns an unmodifiable map from indices to interval iterators.

After a call to DocumentIterator.nextDocument(), this map can be used to retrieve the intervals in the current document. An invocation of Map.get(java.lang.Object) on this map with argument index yields the same result as intervalIterator(index).

Specified by:
intervalIterators in interface DocumentIterator
Returns:
a map from indices to interval iterators over the current document.
Throws:
IOException
See Also:
DocumentIterator.intervalIterator(Index)

intervalIterator

public IntervalIterator intervalIterator()
                                  throws IOException
Description copied from interface: DocumentIterator
Returns the interval iterator of this document iterator for single-index queries.

This is a convenience method that can be used only for queries built over a single index.

Specified by:
intervalIterator in interface DocumentIterator
Returns:
an interval iterator.
Throws:
IOException
See Also:
DocumentIterator.intervalIterator(Index)

intervalIterator

public IntervalIterator intervalIterator(Index index)
                                  throws IOException
Description copied from interface: DocumentIterator
Returns the interval iterator of this document iterator for the given index.

After a call to DocumentIterator.nextDocument(), this iterator can be used to retrieve the intervals in the current document (the one returned by DocumentIterator.nextDocument()) for the index index.

Note that if all indices have positions, it is guaranteed that at least one index will return an interval. However, for disjunctive queries it cannot be guaranteed that all indices will return an interval.

Indices without positions always return IntervalIterators.TRUE. Thus, in presence of indices without positions it is possible that no intervals at all are available.

Specified by:
intervalIterator in interface DocumentIterator
Parameters:
index - an index (must be one over which the query was built).
Returns:
an interval iterator over the current document in index.
Throws:
IOException

indices

public ReferenceSet<Index> indices()
Description copied from interface: DocumentIterator
Returns the set of indices over which this iterator is built.

Specified by:
indices in interface DocumentIterator
Returns:
the set of indices over which this iterator is built.