java.lang.Object
  it.unimi.di.mg4j.tool.Combine
    it.unimi.di.mg4j.tool.Concatenate
public final class Concatenate extends Combine
Concatenates several indices.
This implementation of Combine concatenates the involved indices: document 0 of the first index is document 0 of the final collection, whereas document 0 of the second index takes as its number the number of documents in the first index, and so on. The resulting index is exactly what you would obtain by indexing the concatenation of the document sequences from which each index was built.
Note that this class can also be used with a single index, which makes it easy to recompress an index using different compression flags.
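The numbering rule described above can be sketched in a few lines. This is a minimal, self-contained illustration and not MG4J code: the class ConcatenationNumbering and its methods offsets and globalDocument are hypothetical names introduced here, and the per-index document counts are passed in as a plain array.

```java
import java.util.Arrays;

/** Hypothetical helper (not part of MG4J) illustrating concatenated document numbering. */
final class ConcatenationNumbering {
    /** Returns, for each index i, the number of documents in all indices preceding i. */
    static long[] offsets(int[] numberOfDocuments) {
        long[] offsets = new long[numberOfDocuments.length];
        long sum = 0;
        for (int i = 0; i < numberOfDocuments.length; i++) {
            offsets[i] = sum;
            sum += numberOfDocuments[i];
        }
        return offsets;
    }

    /** Maps a document of a given input index to its number in the concatenated collection. */
    static long globalDocument(int[] numberOfDocuments, int index, int localDocument) {
        if (localDocument < 0 || localDocument >= numberOfDocuments[index])
            throw new IllegalArgumentException("No such document in index " + index + ": " + localDocument);
        return offsets(numberOfDocuments)[index] + localDocument;
    }

    public static void main(String[] args) {
        int[] numberOfDocuments = { 3, 2, 4 }; // three input indices
        System.out.println(Arrays.toString(offsets(numberOfDocuments))); // prints [0, 3, 5]
        System.out.println(globalDocument(numberOfDocuments, 1, 0));     // prints 3
    }
}
```

With three input indices of 3, 2 and 4 documents, document 0 of the second index becomes document 3 of the concatenated collection, and the combined collection has 9 documents.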
Nested Class Summary |
---|
Nested classes/interfaces inherited from class it.unimi.di.mg4j.tool.Combine |
---|
Combine.GammaCodedIntIterator, Combine.IndexType |
Field Summary |
---|
Fields inherited from class it.unimi.di.mg4j.tool.Combine |
---|
additionalProperties, bufferSize, DEFAULT_BUFFER_SIZE, frequency, hasCounts, hasPayloads, hasPositions, haveSumsMaxPos, index, indexIterator, indexReader, indexWriter, inputBasename, ioFactory, maxCount, metadataOnly, needsSizes, numberOfDocuments, numberOfOccurrences, numIndices, outputBasename, p, positionArray, predictedLengthNumBits, predictedSize, quasiSuccinctIndexWriter, size, sumsMaxPos, termQueue, usedIndex, variableQuantumIndexWriter |
Constructor Summary | |
---|---|
Concatenate(IOFactory ioFactory,
String outputBasename,
String[] inputBasename,
boolean metadataOnly,
int bufferSize,
Map<CompressionFlags.Component,CompressionFlags.Coding> writerFlags,
Combine.IndexType indexType,
boolean skips,
int quantum,
int height,
int skipBufferOrCacheSize,
long logInterval)
Concatenates several indices into one. |
|
Concatenate(IOFactory ioFactory,
String outputBasename,
String[] inputBasename,
IntList delete,
boolean metadataOnly,
int bufferSize,
Map<CompressionFlags.Component,CompressionFlags.Coding> writerFlags,
Combine.IndexType indexType,
boolean skips,
int quantum,
int height,
int skipBufferOrCacheSize,
long logInterval)
Concatenates several indices into one. |
Method Summary | |
---|---|
protected int |
combine(int numUsedIndices,
long occurrency)
Combines several indices. |
protected int |
combineNumberOfDocuments()
Combines the number of documents. |
protected int |
combineSizes(OutputBitStream sizesOutputBitStream)
Combines size lists. |
static void |
main(String[] arg)
|
Methods inherited from class it.unimi.di.mg4j.tool.Combine |
---|
main, run, sizes |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public Concatenate(IOFactory ioFactory, String outputBasename, String[] inputBasename, boolean metadataOnly, int bufferSize, Map<CompressionFlags.Component,CompressionFlags.Coding> writerFlags, Combine.IndexType indexType, boolean skips, int quantum, int height, int skipBufferOrCacheSize, long logInterval) throws IOException, ConfigurationException, URISyntaxException, ClassNotFoundException, SecurityException, InstantiationException, IllegalAccessException, InvocationTargetException, NoSuchMethodException
Parameters:
ioFactory - the factory that will be used to perform I/O.
outputBasename - the basename of the combined index.
inputBasename - the basenames of the input indices.
metadataOnly - if true, we save only metadata (term list, frequencies, global counts).
bufferSize - the buffer size for index readers.
writerFlags - the flags for the index writer.
indexType - the type of the index to build.
skips - whether to insert skips when indexType selects an interleaved index.
quantum - the quantum of skipping structures; if negative, a percentage of space for variable-quantum indices (irrelevant if skips is false).
height - the height of skipping towers (irrelevant if skips is false).
skipBufferOrCacheSize - the size of the buffer used to temporarily hold inverted lists during the construction of the skipping structure, or the size of the bit cache used when building a quasi-succinct index.
logInterval - how often we log.
Throws:
IOException
ConfigurationException
URISyntaxException
ClassNotFoundException
SecurityException
InstantiationException
IllegalAccessException
InvocationTargetException
NoSuchMethodException
public Concatenate(IOFactory ioFactory, String outputBasename, String[] inputBasename, IntList delete, boolean metadataOnly, int bufferSize, Map<CompressionFlags.Component,CompressionFlags.Coding> writerFlags, Combine.IndexType indexType, boolean skips, int quantum, int height, int skipBufferOrCacheSize, long logInterval) throws IOException, ConfigurationException, URISyntaxException, ClassNotFoundException, SecurityException, InstantiationException, IllegalAccessException, InvocationTargetException, NoSuchMethodException
Parameters:
ioFactory - the factory that will be used to perform I/O.
outputBasename - the basename of the combined index.
inputBasename - the basenames of the input indices.
delete - a monotonically increasing list of integers representing documents that will be deleted from the output index, or null.
metadataOnly - if true, we save only metadata (term list, frequencies, global counts).
bufferSize - the buffer size for index readers.
writerFlags - the flags for the index writer.
indexType - the type of the index to build.
skips - whether to insert skips when indexType selects an interleaved index.
quantum - the quantum of skipping structures; if negative, a percentage of space for variable-quantum indices (irrelevant if skips is false).
height - the height of skipping towers (irrelevant if skips is false).
skipBufferOrCacheSize - the size of the buffer used to temporarily hold inverted lists during the construction of the skipping structure, or the size of the bit cache used when building a quasi-succinct index.
logInterval - how often we log.
Throws:
IOException
ConfigurationException
URISyntaxException
ClassNotFoundException
SecurityException
InstantiationException
IllegalAccessException
InvocationTargetException
NoSuchMethodException
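Because the delete argument of this constructor must be monotonically increasing, it can be worth checking a candidate list before passing it in. The sketch below is only an illustration under stated assumptions: it uses a plain int[] instead of IntList to stay self-contained, it assumes "monotonically increasing" means strictly increasing, and it assumes the entries are document numbers in the concatenated numbering. The class DeleteListCheck and the method isValidDeleteList are hypothetical names, not part of MG4J.

```java
/** Hypothetical pre-check (not part of MG4J) for a list of documents to delete. */
final class DeleteListCheck {
    /**
     * Returns true if delete is null (nothing to delete) or a strictly increasing
     * list of document numbers in the range [0, totalDocuments).
     * Strictness and the range check are assumptions of this sketch.
     */
    static boolean isValidDeleteList(int[] delete, long totalDocuments) {
        if (delete == null) return true;
        long previous = -1;
        for (int document : delete) {
            if (document <= previous || document >= totalDocuments) return false;
            previous = document;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isValidDeleteList(new int[] { 1, 4, 7 }, 9)); // prints true
        System.out.println(isValidDeleteList(new int[] { 4, 1 }, 9));    // prints false
    }
}
```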
Method Detail |
---|
protected int combineNumberOfDocuments()
Description copied from class: Combine
Combines the number of documents.
Specified by:
combineNumberOfDocuments in class Combine
protected int combineSizes(OutputBitStream sizesOutputBitStream) throws IOException
Description copied from class: Combine
Combines size lists.
Specified by:
combineSizes in class Combine
Throws:
IOException
protected int combine(int numUsedIndices, long occurrency) throws IOException
Description copied from class: Combine
Combines several indices.
When this method is called, exactly numUsedIndices entries of Combine.usedIndex contain, in increasing order, the indices containing inverted lists for the current term. Implementations of this method must combine the inverted lists and return the total frequency.
Specified by:
combine in class Combine
Parameters:
numUsedIndices - the number of valid entries in Combine.usedIndex.
occurrency - the occurrency of the term (used only when building Combine.IndexType.QUASI_SUCCINCT indices).
Throws:
IOException
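The contract of combine can be illustrated for the concatenation case with a simplified model. The sketch below is not the MG4J implementation: it represents each inverted list as a sorted int[] of document pointers, ignores counts, positions and payloads, and writes to a List instead of an index writer. The class PostingConcatenation, the method concatenate and the postings and offsets arguments are hypothetical names introduced for this example; only usedIndex and numUsedIndices mirror names from the documentation above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model (not part of MG4J) of concatenating the inverted lists of one term. */
final class PostingConcatenation {
    /**
     * Appends to output the document pointers of the current term, shifted by the
     * offset of the index they come from, and returns the total frequency.
     *
     * usedIndex[0..numUsedIndices) lists, in increasing order, the indices that
     * contain the term; postings[i] holds the sorted pointers of index i;
     * offsets[i] is the number of documents in all indices preceding i.
     */
    static int concatenate(int[] usedIndex, int numUsedIndices, int[][] postings,
                           long[] offsets, List<Long> output) {
        int totalFrequency = 0;
        for (int k = 0; k < numUsedIndices; k++) {
            int i = usedIndex[k];
            for (int pointer : postings[i]) output.add(offsets[i] + pointer);
            totalFrequency += postings[i].length;
        }
        return totalFrequency;
    }

    public static void main(String[] args) {
        int[][] postings = { { 0, 2 }, null, { 1 } }; // the term is absent from index 1
        long[] offsets = { 0, 3, 5 };                 // indices with 3, 2 and 4 documents
        List<Long> output = new ArrayList<>();
        int frequency = concatenate(new int[] { 0, 2, 0 }, 2, postings, offsets, output);
        System.out.println(output + " " + frequency); // prints [0, 2, 6] 3
    }
}
```

Because the used indices are visited in increasing order and each list is shifted by a larger offset than the previous one, the output pointers stay sorted without any merging step.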
public static void main(String[] arg) throws ConfigurationException, SecurityException, com.martiansoftware.jsap.JSAPException, IOException, URISyntaxException, ClassNotFoundException, InstantiationException, IllegalAccessException, InvocationTargetException, NoSuchMethodException, IllegalArgumentException
Throws:
ConfigurationException
SecurityException
com.martiansoftware.jsap.JSAPException
IOException
URISyntaxException
ClassNotFoundException
InstantiationException
IllegalAccessException
InvocationTargetException
NoSuchMethodException
IllegalArgumentException