Package it.unimi.di.big.mg4j.tool
Class Scan.PayloadAccumulator
- java.lang.Object
  - it.unimi.di.big.mg4j.tool.Scan.PayloadAccumulator
- Enclosing class: Scan
protected static class Scan.PayloadAccumulator extends Object
An accumulator for payloads. This class is essentially a stripped-down version of Scan that just accumulates payloads in a bitstream and releases them in batches. The main difference is that neither sizes nor occurrences are saved (as they would not make much sense).
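The batching behaviour described above can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the MG4J implementation: BatchingSketch and its in-memory batches list are hypothetical stand-ins (the real class writes compressed batch files through an IOFactory), and the convention that the cutpoint list starts at 0 and gains one entry per released batch is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in (not the MG4J class): buffers payloads and releases them in batches. */
final class BatchingSketch {
    private final int documentsPerBatch;
    /** Payloads accumulated since the last release. */
    private final List<Object> buffer = new ArrayList<>();
    /** Released batches, kept in memory here purely for illustration. */
    final List<List<Object>> batches = new ArrayList<>();
    /** Assumed convention: starts at 0, then one entry per released batch. */
    final List<Long> cutPoints = new ArrayList<>();
    /** Pointer of the document following the last one processed. */
    private long nextDocument = 0;

    BatchingSketch(int documentsPerBatch) {
        if (documentsPerBatch <= 0) throw new IllegalArgumentException("documentsPerBatch must be positive");
        this.documentsPerBatch = documentsPerBatch;
        cutPoints.add(0L);
    }

    /** Buffers the payload of one document and releases a batch when the buffer is full. */
    void processData(int documentPointer, Object content) {
        buffer.add(content);
        nextDocument = documentPointer + 1L;
        if (buffer.size() == documentsPerBatch) writeData();
    }

    /** Releases the buffered payloads as one batch and records its end as a cutpoint. */
    void writeData() {
        if (buffer.isEmpty()) return;
        batches.add(new ArrayList<>(buffer));
        buffer.clear();
        cutPoints.add(nextDocument);
    }

    /** Releases whatever is still buffered as a final, possibly shorter, batch. */
    void close() {
        writeData();
    }
}
```

With documentsPerBatch set to 2 and five documents (pointers 0 to 4) processed before close(), this sketch releases three batches and ends with the cutpoints 0, 2, 4, 5.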
-
Field Summary
Fields
- protected LongArrayList cutPoints: The cutpoints of the batches (for later building a ContiguousDocumentalStrategy).
-
Constructor Summary
Constructors
- PayloadAccumulator(IOFactory ioFactory, String basename, Payload payload, String field, Scan.IndexingType indexingType, int documentsPerBatch, File batchDir): Creates a new accumulator.
-
Method Summary
Methods
- void close(): Closes this accumulator, releasing all resources.
- void processData(int documentPointer, Object content): Processes the payload of a given document.
- protected void writeData(): Writes in compressed form the data currently accumulated.
-
Field Detail
-
cutPoints
protected final LongArrayList cutPoints
The cutpoints of the batches (for later building a ContiguousDocumentalStrategy).
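As an illustration of how such cutpoints might be consumed, the following sketch maps a document pointer to the batch that contains it by binary search. CutPointLookup and batchOf are hypothetical names, not part of MG4J; the sketch assumes the cutpoints are strictly increasing, start at 0, and end at the total number of documents.

```java
import java.util.Arrays;

/** Hypothetical helper (not part of MG4J): finds the batch holding a document. */
final class CutPointLookup {
    /**
     * Returns the index i such that cutPoints[i] <= documentPointer < cutPoints[i + 1].
     * Assumes strictly increasing cutpoints, so batch i covers that half-open interval.
     */
    static int batchOf(long[] cutPoints, long documentPointer) {
        if (cutPoints.length < 2) throw new IllegalArgumentException("need at least two cutpoints");
        if (documentPointer < cutPoints[0] || documentPointer >= cutPoints[cutPoints.length - 1])
            throw new IndexOutOfBoundsException("document " + documentPointer + " is outside all batches");
        int pos = Arrays.binarySearch(cutPoints, documentPointer);
        // Exact hit: the document is the first one of batch pos.
        // Miss: binarySearch returns -(insertionPoint) - 1, and the batch is insertionPoint - 1.
        return pos >= 0 ? pos : -pos - 2;
    }
}
```

For the cutpoints 0, 2, 4, 5, document 3 falls in batch 1 and document 4 in batch 2.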
-
Constructor Detail
-
PayloadAccumulator
public PayloadAccumulator(IOFactory ioFactory, String basename, Payload payload, String field, Scan.IndexingType indexingType, int documentsPerBatch, File batchDir)
Creates a new accumulator.
Parameters:
ioFactory - the factory that will be used to perform I/O.
basename - the basename (usually a global filename followed by the field name, separated by a dash).
payload - the payload stored by this accumulator.
field - the name of the accumulated field.
indexingType - the type of indexing procedure.
documentsPerBatch - the number of documents in each batch.
batchDir - a directory for batch files; batch names will be relativised to this directory if it is not null.
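One plausible reading of the batchDir parameter is sketched below: a batch name is built from the basename and a batch number, then resolved against the directory when one is given. The helper name batchName and the "@" separator are assumptions made for illustration; the naming scheme actually used by Scan is not stated on this page.

```java
import java.io.File;

/** Hypothetical helper (not the MG4J API): builds a per-batch name. */
final class BatchNames {
    /**
     * Joins basename and batch number with an assumed "@" separator and,
     * if batchDir is not null, places the result inside that directory.
     */
    static String batchName(int batch, String basename, File batchDir) {
        String name = basename + "@" + batch;
        return batchDir != null ? new File(batchDir, name).toString() : name;
    }
}
```

Under these assumptions, a null batchDir leaves the name untouched, while a non-null one yields a path inside that directory using the platform's separator.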
-
Method Detail
-
writeData
protected void writeData() throws IOException, org.apache.commons.configuration.ConfigurationException
Writes in compressed form the data currently accumulated.
Throws:
IOException
org.apache.commons.configuration.ConfigurationException
-
processData
public void processData(int documentPointer, Object content) throws IOException
Processes the payload of a given document.
Parameters:
documentPointer - the document pointer.
content - the payload.
Throws:
IOException
-
close
public void close() throws org.apache.commons.configuration.ConfigurationException, IOException
Closes this accumulator, releasing all resources.
Throws:
org.apache.commons.configuration.ConfigurationException
IOException