Class Scan.PayloadAccumulator

  • Enclosing class:
    Scan

    protected static class Scan.PayloadAccumulator
    extends Object
    An accumulator for payloads.

    This class is essentially a stripped-down version of Scan that just accumulates payloads in a bitstream and releases them in batches. The main difference is that neither sizes nor occurrences are saved (as they would not make much sense for payloads).
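
    The accumulate-then-release behaviour described above can be illustrated with a minimal, self-contained sketch. It is not the actual implementation: the class name BatchingAccumulatorSketch, the in-memory batch list, and the use of plain int payloads are all assumptions made for illustration, and the data is written uncompressed to memory rather than as a compressed bitstream on disk.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of batch-wise payload accumulation (not the library's code). */
public class BatchingAccumulatorSketch {
    private final int documentsPerBatch;
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    private final DataOutputStream out = new DataOutputStream(buffer);
    // Assumption: released batches are kept in memory instead of being written to files.
    private final List<byte[]> batches = new ArrayList<>();
    private int documentsInBatch;

    public BatchingAccumulatorSketch(int documentsPerBatch) {
        if (documentsPerBatch <= 0) throw new IllegalArgumentException("documentsPerBatch must be positive");
        this.documentsPerBatch = documentsPerBatch;
    }

    /** Appends one (pointer, payload) pair; releases a batch once it is full. */
    public void processData(int documentPointer, int payload) throws IOException {
        out.writeInt(documentPointer);
        out.writeInt(payload);
        if (++documentsInBatch == documentsPerBatch) writeData();
    }

    /** Releases the data accumulated so far as one batch; does nothing if empty. */
    protected void writeData() throws IOException {
        if (documentsInBatch == 0) return;
        out.flush();
        batches.add(buffer.toByteArray());
        buffer.reset();
        documentsInBatch = 0;
    }

    /** Flushes the last (possibly partial) batch and closes the stream. */
    public void close() throws IOException {
        writeData();
        out.close();
    }

    public List<byte[]> batches() {
        return batches;
    }

    public static void main(String[] args) throws IOException {
        BatchingAccumulatorSketch acc = new BatchingAccumulatorSketch(2);
        for (int doc = 0; doc < 5; doc++) acc.processData(doc, doc * 10);
        acc.close();
        System.out.println(acc.batches().size() + " batches"); // prints "3 batches"
    }
}
```

    Note that no per-document sizes or occurrence counts are tracked in the sketch either: only the document pointer and its payload are recorded, and close() takes care of the final partial batch.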

    • Constructor Detail

      • PayloadAccumulator

        public PayloadAccumulator(IOFactory ioFactory,
                                  String basename,
                                  Payload payload,
                                  String field,
                                  Scan.IndexingType indexingType,
                                  int documentsPerBatch,
                                  File batchDir)
        Creates a new accumulator.
        Parameters:
        ioFactory - the factory that will be used to perform I/O.
        basename - the basename (usually a global filename followed by the field name, separated by a dash).
        payload - the payload stored by this accumulator.
        field - the name of the accumulated field.
        indexingType - the type of indexing procedure.
        documentsPerBatch - the number of documents in each batch.
        batchDir - a directory for batch files; batch names will be relativised to this directory if it is not null.
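
        As a rough illustration of the basename and batchDir parameters, the sketch below builds a per-field basename and resolves a batch name against an optional directory. The helper names fieldBasename and batchName, and the "@" separator before the batch number, are assumptions made for this example and are not taken from this documentation.

```java
import java.io.File;

/** Hypothetical helpers showing how basenames and batch names might be composed. */
public class BatchNameSketch {
    /** Joins a global filename and a field name with a dash, e.g. "corpus-date". */
    static String fieldBasename(String globalName, String field) {
        return globalName + "-" + field;
    }

    /**
     * Returns the name of the given batch. If batchDir is not null the name is
     * resolved against it; otherwise it is returned unchanged.
     * Assumption: batches are numbered with an "@" suffix.
     */
    static String batchName(File batchDir, String basename, int batchNumber) {
        String name = basename + "@" + batchNumber;
        return batchDir != null ? new File(batchDir, name).getPath() : name;
    }

    public static void main(String[] args) {
        String basename = fieldBasename("corpus", "date");
        System.out.println(batchName(null, basename, 0));
        System.out.println(batchName(new File("batches"), basename, 0));
    }
}
```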
    • Method Detail

      • writeData

        protected void writeData()
                          throws IOException,
                                 org.apache.commons.configuration.ConfigurationException
        Writes the currently accumulated data in compressed form.
        Throws:
        IOException
        org.apache.commons.configuration.ConfigurationException
      • processData

        public void processData(int documentPointer,
                                Object content)
                         throws IOException
        Processes the payload of a given document.
        Parameters:
        documentPointer - the document pointer.
        content - the payload.
        Throws:
        IOException
      • close

        public void close()
                   throws org.apache.commons.configuration.ConfigurationException,
                          IOException
        Closes this accumulator, releasing all resources.
        Throws:
        org.apache.commons.configuration.ConfigurationException
        IOException