AbstractAggregator |
AbstractBitStreamIndexWriter |
AbstractCompositeDocumentIterator |
An abstract iterator on documents based on a list of component iterators.
AbstractCompositeDocumentIterator.AbstractCompositeIndexIntervalIterator |
AbstractCompositeDocumentIterator.AbstractCompositeIntervalIterator |
An abstract interval iterator.
AbstractDocument |
AbstractDocumentCollection |
AbstractDocumentCollection.PropertyKeys |
AbstractDocumentFactory |
An abstract implementation of a factory, providing a protected method to check
for field indices.
AbstractDocumentIterator |
AbstractDocumentIterator |
AbstractDocumentIteratorVisitor |
AbstractDocumentSequence |
AbstractIndexClusterIndexReader |
AbstractIndexIterator |
AbstractIndexReader |
AbstractIntersectionDocumentIterator |
An abstract iterator on documents, generating the intersection of the documents returned by
a number of document iterators.
AbstractIntervalDocumentIterator |
An abstract iterator on documents that provides basic support for handling interval iterators.
AbstractOrderedIntervalDocumentIterator |
AbstractPayload |
An abstract payload.
AbstractQueryBuilderVisitor<T> |
AbstractScorer |
An abstract implementation of Scorer.
AbstractSimpleTikaDocumentFactory |
AbstractSnowballTermProcessor |
AbstractTermExpander |
AbstractTikaDocumentFactory |
An abstract document factory that provides the mapping from field names to field indices.
AbstractUnionDocumentIterator |
A document iterator computing the union of the documents returned by a number of document iterators.
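The union described above can be pictured as a k-way merge driven by a priority queue of list indices. The following is a minimal sketch under stated assumptions: the class name UnionSketch, the use of sorted long arrays in place of document iterators, and the use of java.util.PriorityQueue are all illustrative and are not taken from the library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration: merges sorted arrays of document pointers.
final class UnionSketch {
    /** Returns the sorted union (without duplicates) of the given sorted lists. */
    static long[] union(long[][] lists) {
        final int[] pos = new int[lists.length];
        // The heap holds list indices, ordered by the current pointer of each list.
        PriorityQueue<Integer> heap = new PriorityQueue<>(
                (a, b) -> Long.compare(lists[a][pos[a]], lists[b][pos[b]]));
        for (int i = 0; i < lists.length; i++) {
            if (lists[i].length > 0) heap.add(i);
        }
        List<Long> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            int i = heap.poll();
            long document = lists[i][pos[i]];
            // Emit each document once, even if several lists contain it.
            if (out.isEmpty() || out.get(out.size() - 1) != document) out.add(document);
            // Positions change only for lists that are currently outside the heap.
            if (++pos[i] < lists[i].length) heap.add(i);
        }
        return out.stream().mapToLong(Long::longValue).toArray();
    }
}
```

Keeping indices rather than values in the heap is what makes the queue "indirect": the ordering is read from a separate array of current pointers.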
AbstractUnionDocumentIterator.IntHeapSemiIndirectPriorityQueue |
AbstractUnionDocumentIterator.LongHeapSemiIndirectPriorityQueue |
AbstractWeightedScorer |
Align |
A node representing the alignment of the two iterators.
AlignDocumentIterator |
A document iterator that aligns the results of two iterators over
different indices.
Among |
AnchorExtractor |
A callback extracting anchor text.
AnchorExtractor.Anchor |
A class representing an anchor.
And |
A node representing the logical and of the underlying queries.
AndDocumentIterator |
A document iterator that returns the AND of a number of document iterators.
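At the level of document pointers, the AND of several iterators amounts to intersecting sorted lists. The sketch below illustrates that idea only; the class name IntersectionSketch and the representation of each iterator as a sorted long array are assumptions made for the example, and interval handling is left out entirely.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration: intersects sorted arrays of document pointers.
final class IntersectionSketch {
    /** Returns the documents that appear in every one of the given sorted lists. */
    static long[] intersect(long[][] lists) {
        if (lists.length == 0) return new long[0];
        int[] pos = new int[lists.length];
        List<Long> out = new ArrayList<>();
        outer:
        while (pos[0] < lists[0].length) {
            long candidate = lists[0][pos[0]];
            boolean inAll = true;
            for (int i = 1; i < lists.length; i++) {
                // Skip pointers smaller than the candidate.
                while (pos[i] < lists[i].length && lists[i][pos[i]] < candidate) pos[i]++;
                if (pos[i] == lists[i].length) break outer; // one list is exhausted
                if (lists[i][pos[i]] != candidate) { inAll = false; break; }
            }
            if (inAll) out.add(candidate);
            pos[0]++;
        }
        return out.stream().mapToLong(Long::longValue).toArray();
    }
}
```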
Annotation |
A node representing a low-pass filtering of the only underlying query.
AnnotationDocumentIterator |
A (temporary) document iterator that interprets an index iterator as an annotation and unpacks
the position list into an interval list.
ArithmeticCoder |
An arithmetic coder.
ArithmeticDecoder |
An arithmetic decoder.
AutoDetectDocumentFactory |
A document factory that automatically detects the type of the document content.
BitStreamHPIndex |
BitStreamHPIndexReader |
BitStreamHPIndexReader.BitStreamHPIndexReaderIndexIterator |
BitStreamHPIndexWriter |
Writes a bitstream-based high-performance index.
BitStreamHPIndexWriter.TowerData |
A structure maintaining statistical data about tower construction.
BitStreamIndex |
BitStreamIndex.PropertyKeys |
BitStreamIndexReader |
BitStreamIndexReader.BitStreamIndexReaderIndexIterator |
BitStreamIndexWriter |
Writes a bitstream-based interleaved index.
BM25FScorer |
A scorer that implements the BM25F ranking scheme.
BM25FScorer.Visitor |
BM25Scorer |
A scorer that implements the BM25 ranking scheme.
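For orientation, a common textbook form of the BM25 per-term score can be sketched as follows. This is not claimed to be the formula used by BM25Scorer: the class name Bm25Sketch, the parameter names, and the particular (non-negative) IDF variant are assumptions chosen for the example, and the library's constants and defaults may differ.

```java
// Hypothetical illustration of a textbook BM25 term score.
final class Bm25Sketch {
    /** A non-negative inverse-document-frequency variant. */
    static double idf(long numDocs, long docFreq) {
        return Math.log(1.0 + (numDocs - docFreq + 0.5) / (docFreq + 0.5));
    }

    /**
     * Score contribution of one term in one document.
     * count: occurrences of the term in the document;
     * k1 (assumed positive) controls count saturation;
     * b in [0, 1] controls document-length normalisation.
     */
    static double termScore(int count, int docLength, double avgDocLength,
                            long numDocs, long docFreq, double k1, double b) {
        double norm = k1 * (1.0 - b + b * docLength / avgDocLength);
        return idf(numDocs, docFreq) * (count * (k1 + 1.0)) / (count + norm);
    }
}
```

The document score is then the sum of termScore over the query terms; larger counts raise the score with diminishing returns, and longer documents are penalised when b is greater than zero.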
BrowseItem |
An instance of this class is used to pack the results gathered by QueryServlet
in such a way that they are easily accessible from the Velocity Template Language.
ByteArrayPostingList |
CachingDocumentIterator |
A decorator that caches the intervals produced by the underlying document iterator.
CachingOutputBitStream |
A special output bit stream with an additional
method CachingOutputBitStream.buffer() that returns the internal buffer
if the internal buffer contains all that has been written since
the last call to position(0).
ChainedLexicalClusteringStrategy |
A lexical clustering strategy that uses a chain of responsibility to choose the local index:
term maps out of a given list are inquired
until one contains the given term.
CheckForSelectQueryVisitor |
ClarkeCormackScorer |
Computes the Clarke–Cormack score of all interval iterators of a document.
ClusteringStrategy |
A common ancestor interface for all clustering strategies.
Combine |
Combines several indices.
Combine.GammaCodedIntIterator |
A partial IntIterator implementation based on γ-coded integers.
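As a reminder of what γ coding looks like, here is a minimal sketch of the textbook Elias γ code for positive integers, written over strings of '0' and '1' for readability. The class name GammaCodingSketch is hypothetical, and the conventions used by the library (for example, how zero is handled or how bits are packed) may differ from this illustration.

```java
// Hypothetical illustration of textbook Elias gamma coding over bit strings.
final class GammaCodingSketch {
    /** Encodes x >= 1 as floor(log2 x) zeros followed by x in binary. */
    static String encode(int x) {
        if (x < 1) throw new IllegalArgumentException("x must be positive");
        String binary = Integer.toBinaryString(x);
        StringBuilder sb = new StringBuilder();
        for (int i = 1; i < binary.length(); i++) sb.append('0');
        return sb.append(binary).toString();
    }

    /** Decodes one value starting at pos[0] and advances pos[0] past it. */
    static int decode(String bits, int[] pos) {
        int zeros = 0;
        while (bits.charAt(pos[0]) == '0') { zeros++; pos[0]++; }
        int x = 0;
        // Read zeros + 1 bits, the first of which is the leading one.
        for (int i = 0; i <= zeros; i++) x = (x << 1) | (bits.charAt(pos[0]++) - '0');
        return x;
    }
}
```

Because each codeword announces its own length, codewords can be concatenated and read back sequentially, which is what an iterator over γ-coded integers relies on.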
Combine.IndexType |
Composite |
An abstract composite node containing an array of component queries.
CompositeDocumentFactory |
A composite factory that passes the input stream to a sequence of factories in turn.
CompositeDocumentSequence |
A document sequence composing a list of underlying sequences.
CompressionFlags |
A container for constants and enums related to index compression.
CompressionFlags.Coding |
A coding for an index component.
CompressionFlags.Component |
A component of the index.
ComputeNumBitsPositions |
Concatenate |
Concatenates several indices.
ConcatenatedDocumentCollection |
A document collection exhibiting a list of underlying document collections, called segments,
as a single collection.
ConcatenatedDocumentSequence |
A document sequence exhibiting a list of underlying document sequences, called segments,
as a single sequence.
Consecutive |
A node representing the consecutive composition of the underlying queries.
ConsecutiveDocumentIterator |
An iterator returning documents containing consecutive intervals (in query order)
satisfying the underlying queries.
ConstantScorer |
A scorer assigning a constant score (0 by default) to all documents.
Containment |
A node representing the containment of two queries.
ContainmentDocumentIterator |
A document iterator that computes the containment between two given document iterators.
ContiguousDocumentalStrategy |
A documental partitioning and clustering strategy that partitions an index into contiguous segments.
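One way to picture a contiguous partition is as a sorted array of cut points, with a binary search mapping global pointers to segments. The sketch below assumes that representation; the class ContiguousSegmentsSketch and its method names are hypothetical and do not reproduce the library's interfaces.

```java
import java.util.Arrays;

// Hypothetical illustration: segment k covers global pointers
// cutPoints[k] (inclusive) to cutPoints[k + 1] (exclusive).
final class ContiguousSegmentsSketch {
    private final long[] cutPoints;

    ContiguousSegmentsSketch(long... cutPoints) {
        if (cutPoints.length < 2 || cutPoints[0] != 0)
            throw new IllegalArgumentException("cut points must start at 0 and define a segment");
        for (int i = 1; i < cutPoints.length; i++)
            if (cutPoints[i] <= cutPoints[i - 1])
                throw new IllegalArgumentException("cut points must be strictly increasing");
        this.cutPoints = cutPoints.clone();
    }

    /** Index of the segment containing the given global pointer. */
    int localIndex(long globalPointer) {
        if (globalPointer < 0 || globalPointer >= cutPoints[cutPoints.length - 1])
            throw new IndexOutOfBoundsException("no such document: " + globalPointer);
        int i = Arrays.binarySearch(cutPoints, globalPointer);
        // A miss returns -(insertionPoint) - 1; the segment is insertionPoint - 1.
        return i >= 0 ? i : -i - 2;
    }

    /** Pointer of the document inside its segment. */
    long localPointer(long globalPointer) {
        return globalPointer - cutPoints[localIndex(globalPointer)];
    }

    /** Inverse mapping, from a segment and local pointer back to a global pointer. */
    long globalPointer(int localIndex, long localPointer) {
        return cutPoints[localIndex] + localPointer;
    }
}
```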
ContiguousLexicalStrategy |
A lexical strategy that partitions terms into contiguous segments.
CounterCollectionVisitor |
CounterSetupVisitor |
A visitor using the information collected by a CounterCollectionVisitor
to set up term frequencies and counters.
CountScorer |
A trivial scorer that computes the score by adding the counts
(the number of occurrences within the current document) of each term
multiplied by the weight of the relative index.
CSVDocumentCollection |
DanishStemmer |
This class was automatically generated by a Snowball to Java compiler.
It implements the stemming algorithm defined by a snowball script.
DatePayload |
A payload containing a date expressed as seconds from the Epoch
and stored using δ coding.
DecreasingDocumentRankScorer |
Computes scores that do not depend on intervals, but that
just assign a fixed score to each document starting from 1; scores are read
from a file whose name is passed to the constructor.
DelegatingScorer |
Difference |
A node representing a difference of two queries.
DifferenceDocumentIterator |
A document iterator that computes the Brouwerian difference between two given document iterators.
DiskBasedIndex |
A static container providing facilities to load an index based on data stored on disk.
DispatchingDocumentFactory |
A document factory that actually dispatches the task of building documents to various factories
according to some strategy.
DispatchingDocumentFactory.DispatchingStrategy |
A strategy that decides which factory is appropriate using the document metadata.
DispatchingDocumentFactory.MetadataKeys |
Case-insensitive keys for metadata.
DispatchingDocumentFactory.StringBasedDispatchingStrategy |
A strategy that is based on trying to match the value of the metadata with a given key with respect to a
certain set of values.
Document |
An indexable document.
DocumentalCluster |
An abstract class representing a cluster of local indices containing separate
sets of documents from the same collection.
DocumentalClusterIndexReader |
DocumentalClusteringStrategy |
A way to associate (quite bidirectionally) local and global document pointers.
DocumentalConcatenatedCluster |
DocumentalConcatenatedClusterDocumentIterator |
A document iterator concatenating iterators from local indices.
DocumentalConcatenatedClusterIndexIterator |
An index iterator concatenating iterators from local indices.
DocumentalMergedCluster |
DocumentalMergedClusterDocumentIterator |
A document iterator merging iterators from local indices.
DocumentalMergedClusterIndexIterator |
An index iterator merging iterators from local indices.
DocumentalPartitioningStrategy |
A way to associate a document with a local index out of a given set and a local document number in the local index.
DocumentalStrategies |
Static utility methods for documental strategies.
DocumentCollection |
A collection of documents.
DocumentCollectionBuilder |
An interface for classes that can build collections during the indexing process.
DocumentFactory |
A factory parsing and building documents of the same type.
DocumentFactory.FieldType |
A field type.
DocumentIterator |
An iterator over documents.
DocumentIterator |
An iterator over documents (pointers) and their intervals.
DocumentIteratorBuilderVisitor |
DocumentIterators |
A class providing static methods and objects that do useful things with document iterators.
DocumentIteratorVisitor<T> |
DocumentRankScorer |
Computes scores that do not depend on intervals, but that
just assign a fixed score to each document; scores are read
from a file whose name is passed to the constructor.
DocumentScoreInfo<T> |
A container used to return scored results with additional information.
DocumentSequence |
A sequence of documents.
DocumentSequenceImmutableSequentialGraph |
Exposes a document sequence as a (sequentially accessible) immutable graph, according to some
virtual field provided by the documents in the sequence.
DowncaseTermProcessor |
A term processor downcasing all characters.
DumpVirtualDocumentFragments |
Scans a document sequence and prints on standard output virtual document fragments as a document specifier (usually, a URL) TAB-separated from the associated text.
DutchStemmer |
This class was automatically generated by a Snowball to Java compiler.
It implements the stemming algorithm defined by a snowball script.
EnglishStemmer |
This class was automatically generated by a Snowball to Java compiler.
It implements the stemming algorithm defined by a snowball script.
EPUBDocumentFactory |
A document factory for the epub format.
False |
A node representing falseness (i.e., no documents are returned).
FalseDocumentIterator |
An empty document iterator.
FileHPIndex |
A file-based high-performance index.
FileIndex |
A file-based index.
FileSetDocumentCollection |
FileSystemItem |
An item serving a file from the file system.
FilterOutWikipediaDuplicates |
Reads a Wikipedia XML dump and outputs the same dump after eliminating
duplicate pages.
FinnishStemmer |
This class was automatically generated by a Snowball to Java compiler.
It implements the stemming algorithm defined by a snowball script.
FrenchStemmer |
This class was automatically generated by a Snowball to Java compiler.
It implements the stemming algorithm defined by a snowball script.
FrequencyLexicalStrategy |
A lexical strategy that creates an index containing a subset of the terms.
GammaDeltaGammaDeltaBitStreamHPIndexReader |
GammaDeltaGammaDeltaBitStreamHPIndexReader.BitStreamHPIndexReaderIndexIterator |
GammaDeltaGammaDeltaBitStreamIndexReader |
GammaDeltaGammaDeltaBitStreamIndexReader.BitStreamIndexReaderIndexIterator |
GenericItem |
A generic item, displaying all document fields.
German2Stemmer |
This class was automatically generated by a Snowball to Java compiler.
It implements the stemming algorithm defined by a snowball script.
GermanStemmer |
This class was automatically generated by a Snowball to Java compiler.
It implements the stemming algorithm defined by a snowball script.
GreedyTikaField |
The set of all Tika metadata represented as a single field inside MG4J.
HadoopFileSystemIOFactory |
HelpPage |
The help page.
HtmlDocumentFactory |
A factory that provides fields for body and title of HTML documents.
HtmlDocumentFactory |
A document factory for the HTML format.
HtmlDocumentFactory.MetadataKeys |
HttpFileServer |
A minimal, singleton server serving the whole filesystem.
HttpQueryServer |
A very basic HTTP server answering queries.
HungarianStemmer |
This class was automatically generated by a Snowball to Java compiler.
It implements the stemming algorithm defined by a snowball script.
IdentityDocumentalStrategy |
A documental strategy that maps identically local to global pointers and vice versa.
IdentityDocumentFactory |
IdentityDocumentFactory.MetadataKeys |
Case-insensitive keys for metadata.
Inclusion |
A node representing the inclusion of two queries.
InclusionDocumentIterator |
A document iterator that computes the inclusion between two given document iterators.
Index |
An abstract representation of an index.
Index.PropertyKeys |
Symbolic names for properties of an Index.
Index.UriKeys |
Keys to be used (downcased) in specifying additional parameters to an MG4J URI.
Index2IntervalIteratorMap |
A simple, brute-force implementation of a fixed-size map from indices
to interval iterators based on two parallel backing arrays.
IndexBuilder |
An index builder.
IndexCluster |
An abstract index cluster.
IndexCluster.PropertyKeys |
IndexIntervalIterator |
An interval iterator returning the positions of the current document as singleton intervals.
IndexIterator |
An iterator over an inverted list.
IndexIterators |
A class providing static methods and objects that do useful things with index iterators.
IndexIterators.PositionsIterator |
IndexReader |
Provides access to an inverted index.
IndexWriter |
An interface for classes that generate indices.
InMemoryHPIndex |
InMemoryIndex |
A local bitstream index loaded in memory.
InputStreamDocumentSequence |
A document sequence obtained by breaking an input stream at a specified separator.
InputStreamItem |
An item serving a raw input stream from the document collection.
IntegerPayload |
A payload containing a long stored using δ coding.
InterpolativeCoding |
Static methods implementing interpolative coding.
IntervalIterator |
IntervalIterators |
A class providing static methods and objects that do useful things with interval iterators.
IntervalIterators.FakeIterator |
IntervalSelector |
A strategy for selecting reasonable intervals to be shown to the user.
IOFactories |
IOFactories.FileLinesIterable |
IOFactories.FileLinesIterable.FileLinesIterator |
IOFactory |
A factory for streams.
ItalianStemmer |
This class was automatically generated by a Snowball to Java compiler.
It implements the stemming algorithm defined by a snowball script.
JavamailDocumentCollection |
JdbcDocumentCollection |
KraaijPohlmannStemmer |
This class was automatically generated by a Snowball to Java compiler.
It implements the stemming algorithm defined by a snowball script.
LexicalCluster |
A cluster exhibiting local indices referring to the same collection, but
containing different sets of terms, as a single index.
LexicalClusterIndexReader |
LexicalClusteringStrategy |
A way to associate a term with a local index out of a given set.
LexicalPartitioningStrategy |
A way to associate a term number with a local index out of a given set and a local term number in the local index.
LexicalStrategies |
Static utility methods for lexical strategies.
LinearAggregator |
An aggregator that computes a linear combination of the component scorers.
LovinsStemmer |
This class was automatically generated by a Snowball to Java compiler.
It implements the stemming algorithm defined by a snowball script.
LowPass |
A node representing a low-pass filtering of the only underlying query.
LowPassDocumentIterator |
A document iterator that filters another document iterator, returning just intervals (and containing
documents) whose length does not exceed a given threshold.
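The filtering rule can be illustrated on a plain list of intervals. In this sketch an interval is an int pair {left, right} with both ends inclusive, and its length is taken to be right - left + 1; that definition of length, the class name LowPassSketch, and the list-based representation are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration: keeps only intervals that are short enough.
final class LowPassSketch {
    /** Returns the intervals whose length does not exceed the threshold, in input order. */
    static List<int[]> lowPass(List<int[]> intervals, int threshold) {
        List<int[]> kept = new ArrayList<>();
        for (int[] interval : intervals) {
            int length = interval[1] - interval[0] + 1; // assumed inclusive ends
            if (length <= threshold) kept.add(interval);
        }
        return kept;
    }
}
```

A document for which no interval survives the filter would simply not be returned by the filtering iterator.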
Marker |
A strategy for marking words.
MarkingMutableString |
A mutable string with a special method to append text that should be marked.
MarkingMutableString.EscapeStrategy |
An escaping strategy.
MemoryMappedHPIndex |
MemoryMappedIndex |
A local memory-mapped bitstream index.
Merge |
Merges several indices.
MG4JClassParser |
A small wrapper around JSAP's standard ClassStringParser.
MimeTypeResolver |
A thin wrapper around a singleton instance of MimetypesFileTypeMap
that tries to load /etc/mime.types into the map.
MSOfficeDocumentFactory |
A document factory for the Microsoft Office format.
MultiIndexTermExpander |
A term expander that replaces every term or prefix with a disjunction of
queries; each query is made by the same term or prefix
preceded by a selection over a different index.
MultiTerm |
A node representing a virtual term obtained by merging the occurrences of the given (possibly weighted) terms.
MultiTermIndexIterator |
A virtual index iterator that merges several component index iterators.
NorwegianStemmer |
This class was automatically generated by a Snowball to Java compiler.
It implements the stemming algorithm defined by a snowball script.
Not |
A node representing the logical not of the underlying query.
NotDocumentIterator |
A document iterator that returns documents not returned by its underlying iterator,
and returns just IntervalIterators.TRUE on all interval iterators.
NullTermProcessor |
A term processor that accepts all terms and does not do any processing.
OOXMLDocumentFactory |
A document factory for the OOXML format.
OpenDocumentDocumentFactory |
A document factory for the Open Document format.
Or |
A node representing the logical or of the underlying queries.
OrderedAnd |
A node representing the logical ordered and of the underlying queries.
OrderedAndDocumentIterator |
An iterator returning documents containing nonoverlapping intervals in query order
satisfying the underlying queries.
OrDocumentIterator |
An iterator on documents that returns the OR of a number of document iterators.
ParseException |
This exception is thrown when parse errors are encountered.
PartitionDocumentally |
Partitions an index documentally.
PartitioningStrategy |
A common ancestor interface for all partitioning strategies.
PartitionLexically |
Partitions an index lexically.
PartitionLexically.LongWordInputBitStream |
Paste |
Pastes several indices.
Payload |
An index payload.
PayloadPredicateDocumentIterator |
A document iterator that filters an IndexIterator, returning just
documents whose payload satisfies a given predicate.
PdfDocumentFactory |
A document factory for the PDF format.
PorterStemmer |
This class was automatically generated by a Snowball to Java compiler.
It implements the stemming algorithm defined by a snowball script.
PortugueseStemmer |
This class was automatically generated by a Snowball to Java compiler.
It implements the stemming algorithm defined by a snowball script.
Prefix |
A node representing a set of terms defined by a common prefix.
PropertyBasedDocumentFactory |
A document factory initialised by default properties.
PropertyBasedDocumentFactory.MetadataKeys |
QuasiSuccinctIndex |
A quasi-succinct index.
QuasiSuccinctIndex.PropertyKeys |
QuasiSuccinctIndexReader |
QuasiSuccinctIndexReader.AbstractQuasiSuccinctIndexIterator |
QuasiSuccinctIndexReader.CountReader |
QuasiSuccinctIndexReader.EliasFanoIndexIterator |
QuasiSuccinctIndexReader.EliasFanoPointerReader |
QuasiSuccinctIndexReader.LongWordBitReader |
QuasiSuccinctIndexReader.PointerReader |
QuasiSuccinctIndexReader.PositionReader |
QuasiSuccinctIndexReader.RankedIndexIterator |
QuasiSuccinctIndexReader.RankedPointerReader |
QuasiSuccinctIndexWriter |
QuasiSuccinctIndexWriter.Accumulator |
QuasiSuccinctIndexWriter.LongWordCache |
QuasiSuccinctIndexWriter.LongWordOutputBitStream |
Queries |
Static methods and objects related to queries.
Query |
A node of a composite representing a query.
Query |
A command-line interpreter to query indices.
Query.Command |
Query.OutputType |
QueryBuilderVisitor<T> |
A visitor for a composite query.
QueryBuilderVisitorException |
A wrapper for unchecked exceptions thrown during a visit.
QueryEngine |
An engine that takes a query and returns results, using a programmable
set of scorers and policies.
QueryParser |
A parser transforming query strings into composite Query objects.
QueryParserException |
A parse exception.
QueryServlet |
A query servlet.
QueryTransformer |
A strategy for transforming queries.
Range |
A node representing a range query on a payload-only index.
Remap |
A node representing an index remapping.
RemappingDocumentIterator |
A decorator that remaps interval iterator requests.
ReplicatedDocumentFactory |
A factory that replicates a given factory several times.
ResultItem |
An instance of this class is used to pack the results gathered by QueryServlet
in such a way that they are easily accessible from the Velocity Template Language.
RTFDocumentFactory |
A document factory for the RTF format.
RunQuery |
A very simple example that shows how to load a couple of indices and run them using
a query engine.
Scan |
Scans a document sequence, dividing it into batches of occurrences and writing for each batch a
corresponding subindex.
Scan.Completeness |
Scan.IndexingType |
Scan.PayloadAccumulator |
An accumulator for payloads.
Scan.VirtualDocumentFragment |
An interface that describes a virtual document fragment.
ScanMetadata |
Scans a document sequence and prints on standard output the corresponding URIs.
ScoredDocumentBoundedSizeQueue<T> |
A queue of scored documents with fixed maximum capacity.
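A bounded queue of this kind is typically built on a min-heap: once the queue is full, a new document is admitted only if it beats the current lowest score. The sketch below shows that idea with java.util.PriorityQueue; the class BoundedTopKSketch, its nested Scored type, and the method names are hypothetical and are not the library's API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration: keeps the top-scoring documents up to a fixed capacity.
final class BoundedTopKSketch {
    static final class Scored {
        final long document;
        final double score;
        Scored(long document, double score) { this.document = document; this.score = score; }
    }

    private final int capacity;
    // Min-heap on the score: the head is the weakest document currently kept.
    private final PriorityQueue<Scored> heap =
            new PriorityQueue<>(Comparator.comparingDouble((Scored s) -> s.score));

    BoundedTopKSketch(int capacity) {
        if (capacity <= 0) throw new IllegalArgumentException("capacity must be positive");
        this.capacity = capacity;
    }

    /** Tries to add a document; returns false if it was rejected. */
    boolean enqueue(long document, double score) {
        if (heap.size() < capacity) { heap.add(new Scored(document, score)); return true; }
        if (score <= heap.peek().score) return false; // not better than the weakest
        heap.poll();
        heap.add(new Scored(document, score));
        return true;
    }

    /** Empties the queue, returning its content by decreasing score. */
    List<Scored> drainDescending() {
        List<Scored> result = new ArrayList<>();
        while (!heap.isEmpty()) result.add(heap.poll());
        Collections.reverse(result);
        return result;
    }
}
```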
Scorer |
Select |
A node representing an index selection.
SelectedInterval |
An interval selected for display.
SelectedInterval.IntervalType |
SemiExternalOffsetBigList |
Provides semi-external random access to offsets of an index.
SimpleCharStream |
An implementation of interface CharStream, where the stream is assumed to
contain only ASCII characters (without Unicode processing).
SimpleCompressedDocumentCollection |
A basic, compressed document collection that can be easily built at indexing time.
SimpleCompressedDocumentCollection.FrequencyCodec |
A simple codec for integers that remaps frequent numbers to smaller numbers.
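The idea of remapping frequent numbers to smaller ones can be sketched as a rank assignment: count how often each value occurs in a sample, then give rank 0 to the most frequent value, rank 1 to the next, and so on. The class FrequencyRemapSketch below is a hypothetical illustration of that idea only; how the actual codec gathers statistics and stores the mapping is not covered by this listing.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration: assigns small codes to frequent values.
final class FrequencyRemapSketch {
    /** Maps each distinct value in the sample to its rank by decreasing frequency. */
    static Map<Integer, Integer> buildCodes(int[] sample) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (int value : sample) counts.merge(value, 1, Integer::sum);

        List<Integer> values = new ArrayList<>(counts.keySet());
        values.sort((a, b) -> {
            int byCount = Integer.compare(counts.get(b), counts.get(a)); // most frequent first
            return byCount != 0 ? byCount : Integer.compare(a, b);      // ties: smaller value first
        });

        Map<Integer, Integer> codes = new HashMap<>();
        for (int rank = 0; rank < values.size(); rank++) codes.put(values.get(rank), rank);
        return codes;
    }
}
```

Such a remapping pays off when the codes are then written with a variable-length code that favours small numbers, since the most common values end up with the shortest codewords.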
SimpleCompressedDocumentCollectionBuilder |
SimpleParser |
A simple parser that transforms a query string into a query.
SimpleParserConstants |
Token literal values and constants.
SimpleParserTokenManager |
Token Manager.
SkipBitStreamIndexWriter |
Writes a bitstream-based interleaved index with skips.
SkipBitStreamIndexWriter.TowerData |
A structure maintaining statistical data about tower construction.
SkipGammaDeltaGammaDeltaBitStreamIndexReader |
SkipGammaDeltaGammaDeltaBitStreamIndexReader.BitStreamIndexReaderIndexIterator |
SpanishStemmer |
This class was automatically generated by a Snowball to Java compiler.
It implements the stemming algorithm defined by a snowball script.
SubDocumentCollection |
A collection that exhibits a contiguous subset of documents from a given collection.
SubDocumentFactory |
A factory that exposes a subset of the fields of a given factory.
SubsetDocumentSequence |
A collection that exhibits a subset of documents (possibly not contiguous) from a given sequence.
SwedishStemmer |
This class was automatically generated by a Snowball to Java compiler.
It implements the stemming algorithm defined by a snowball script.
Term |
A node representing a single term.
TermCollectionVisitor |
TermProcessor |
A term processor, implementing term/prefix transformation and possibly term/prefix filtering.
TextDocumentFactory |
A document factory for the text format; the character set will be autodetected.
TextMarker |
A marker for text/HTML output.
TfIdfScorer |
A scorer that implements the TF/IDF ranking formula.
TikaField |
A Tika field represented inside MG4J.
Token |
Describes the input token stream.
TokenMgrError |
Token Manager Error.
TooManyTermsException |
Thrown to indicate that a prefix query generated too many terms.
TRECDocumentCollection |
A collection for the TREC GOV2 data set.
TRECDocumentCollection.TRECDocumentDescriptor |
A compact description of the location and of the internal segmentation of
a TREC document inside a file.
TRECHeaderDocumentFactory |
A factory without fields that is used to interpret the header of a
TREC GOV2 document.
True |
TrueDocumentIterator |
TrueTermsCollectionVisitor |
A visitor collecting terms that satisfy a query for the current document.
URLMPHVirtualDocumentResolver |
A virtual-document resolver based on document URIs.
VariableQuantumIndexWriter |
An index writer supporting variable quanta.
Verifier |
Verifies that an index matches a collection.
VignaScorer |
Computes the Vigna score of all interval iterators of a document.
VirtualDocumentResolver |
A resolver for virtual documents.
WarcDocumentSequence |
Weight |
A node representing a weight selection.
WikipediaDocumentCollection |
A DocumentCollection corresponding to
a given set of files in the Yahoo! Wikipedia format.
WikipediaDocumentCollection.WhitespaceWordReader |
WikipediaDocumentSequence |
WikipediaDocumentSequence.MetadataKeys |
WikipediaDocumentSequence.SignedRedirectedStringMap |
A wrapper around a signed function that remaps entries exceeding a provided threshold using a specified target array.
WikipediaDocumentSequence.WikipediaHeaderFactory |
XMLDocumentFactory |
A document factory for XML.
ZipDocumentCollection |
ZipDocumentCollection.PropertyKeys |
ZipDocumentCollection.ZipFactory |
ZipDocumentCollectionBuilder |