it.unimi.di.mg4j.tool
Class Scan.PayloadAccumulator

java.lang.Object
  extended by it.unimi.di.mg4j.tool.Scan.PayloadAccumulator
Enclosing class:
Scan

protected static class Scan.PayloadAccumulator
extends Object

An accumulator for payloads.

This class is essentially a stripped-down version of Scan that just accumulates payloads in a bitstream and releases them in batches. The main difference is that neither sizes nor occurrences are saved (as they would not make much sense).
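The accumulate-and-release behaviour described above can be illustrated with a minimal sketch. This is not the MG4J implementation: BatchingAccumulatorSketch and its fields are hypothetical, batches are kept in memory rather than written to a compressed bitstream, and it assumes documents arrive with consecutive pointers starting at 0.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in (not MG4J code): buffers payloads and releases them in fixed-size batches. */
public class BatchingAccumulatorSketch {
    private final int documentsPerBatch;
    /** Payloads accumulated since the last release. */
    private final List<Object> buffer = new ArrayList<>();
    /** Batch boundaries: starts with 0, then records the document count reached at each release. */
    public final List<Integer> cutPoints = new ArrayList<>();
    /** Released batches; an in-memory substitute for batch files on disk. */
    public final List<List<Object>> batches = new ArrayList<>();
    /** Number of documents seen so far (assumes consecutive pointers from 0). */
    private int totalDocuments = 0;

    public BatchingAccumulatorSketch(int documentsPerBatch) {
        this.documentsPerBatch = documentsPerBatch;
        cutPoints.add(0);
    }

    /** Buffers one payload and releases a batch once the buffer is full. */
    public void processData(int documentPointer, Object content) {
        buffer.add(content);
        totalDocuments = documentPointer + 1;
        if (buffer.size() == documentsPerBatch) writeData();
    }

    /** Releases the buffered payloads as one batch and records a cutpoint. */
    protected void writeData() {
        batches.add(new ArrayList<>(buffer));
        buffer.clear();
        cutPoints.add(totalDocuments);
    }

    /** Releases any partially filled last batch. */
    public void close() {
        if (!buffer.isEmpty()) writeData();
    }

    public static void main(String[] args) {
        BatchingAccumulatorSketch acc = new BatchingAccumulatorSketch(2);
        for (int d = 0; d < 5; d++) acc.processData(d, "payload-" + d);
        acc.close();
        System.out.println(acc.cutPoints);      // prints [0, 2, 4, 5]
        System.out.println(acc.batches.size()); // prints 3
    }
}
```

With five documents and two documents per batch, the sketch releases two full batches and one final batch of a single document when close() is called.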


Field Summary
protected  IntArrayList cutPoints
          The cutpoints of the batches (for later building a ContiguousDocumentalStrategy).
 
Constructor Summary
Scan.PayloadAccumulator(IOFactory ioFactory, String basename, Payload payload, String field, Scan.IndexingType indexingType, int documentsPerBatch, File batchDir)
          Creates a new accumulator.
 
Method Summary
 void close()
          Closes this accumulator, releasing all resources.
 void processData(int documentPointer, Object content)
          Processes the payload of a given document.
protected  void writeData()
          Writes the currently accumulated data in compressed form.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

cutPoints

protected final IntArrayList cutPoints
The cutpoints of the batches (for later building a ContiguousDocumentalStrategy).
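As an illustration of how a list of cutpoints can partition documents into contiguous batches, the sketch below maps a document pointer to the batch that contains it. It assumes the convention that n batches are described by n + 1 strictly increasing cutpoints, the first being 0 and the last the number of documents; CutPointLookupSketch and batchOf are hypothetical names, not part of MG4J.

```java
import java.util.Arrays;

/** Hypothetical helper (not MG4J code): locates the batch containing a document. */
public class CutPointLookupSketch {
    /** Returns k such that cutPoints[k] <= documentPointer < cutPoints[k + 1]. */
    static int batchOf(int[] cutPoints, int documentPointer) {
        if (documentPointer < 0 || documentPointer >= cutPoints[cutPoints.length - 1])
            throw new IndexOutOfBoundsException("No batch contains document " + documentPointer);
        int pos = Arrays.binarySearch(cutPoints, documentPointer);
        // On a miss, binarySearch returns -(insertionPoint) - 1;
        // the containing batch is the one just before the insertion point.
        return pos >= 0 ? pos : -pos - 2;
    }

    public static void main(String[] args) {
        int[] cutPoints = { 0, 2, 4, 5 }; // three batches: [0..2), [2..4), [4..5)
        System.out.println(batchOf(cutPoints, 1)); // prints 0
        System.out.println(batchOf(cutPoints, 2)); // prints 1
        System.out.println(batchOf(cutPoints, 4)); // prints 2
    }
}
```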

Constructor Detail

Scan.PayloadAccumulator

public Scan.PayloadAccumulator(IOFactory ioFactory,
                               String basename,
                               Payload payload,
                               String field,
                               Scan.IndexingType indexingType,
                               int documentsPerBatch,
                               File batchDir)
Creates a new accumulator.

Parameters:
ioFactory - the factory that will be used to perform I/O.
basename - the basename (usually a global filename followed by the field name, separated by a dash).
payload - the payload stored by this accumulator.
field - the name of the accumulated field.
indexingType - the type of indexing procedure.
documentsPerBatch - the number of documents in each batch.
batchDir - a directory for batch files; batch names will be relativised to this directory if it is not null.
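The naming conventions mentioned for basename and batchDir can be sketched as follows. The dash-separated basename follows the parameter description; the "@" batch suffix and the helper names fieldBasename and batchName are assumptions made for illustration only and may not match what the class does internally.

```java
import java.io.File;

/** Hypothetical helpers (not MG4J code) illustrating basename and batch-name composition. */
public class BatchNameSketch {
    /** Joins a global filename and a field name with a dash, e.g. "corpus-date". */
    static String fieldBasename(String globalName, String field) {
        return globalName + "-" + field;
    }

    /** Builds a per-batch name, resolved against batchDir when it is not null. */
    static String batchName(String basename, int batch, File batchDir) {
        String name = basename + "@" + batch; // the "@" suffix is an assumption
        return batchDir != null ? new File(batchDir, name).toString() : name;
    }

    public static void main(String[] args) {
        String basename = fieldBasename("corpus", "date");
        System.out.println(batchName(basename, 3, null));               // prints corpus-date@3
        System.out.println(batchName(basename, 3, new File("batches"))); // separator is platform-dependent
    }
}
```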
Method Detail

writeData

protected void writeData()
                  throws IOException,
                         ConfigurationException
Writes the currently accumulated data in compressed form.

Throws:
IOException
ConfigurationException

processData

public void processData(int documentPointer,
                        Object content)
                 throws IOException
Processes the payload of a given document.

Parameters:
documentPointer - the document pointer.
content - the payload.
Throws:
IOException

close

public void close()
           throws ConfigurationException,
                  IOException
Closes this accumulator, releasing all resources.

Throws:
ConfigurationException
IOException