MG4J (Managing Gigabytes for Java) is a free full-text search engine, written in Java, for large document collections.

Packages 
Package Description
it.unimi.di.big.mg4j.document
This package contains the logic for managing documents and document collections.
it.unimi.di.big.mg4j.document.tika
This package contains classes that expose Tika parsers as MG4J factories.
it.unimi.di.big.mg4j.examples
Example classes.
it.unimi.di.big.mg4j.index
Index generation and access.
it.unimi.di.big.mg4j.index.cluster
Index partitioning and clustering.
it.unimi.di.big.mg4j.index.payload
it.unimi.di.big.mg4j.index.remote
Remote index classes.
it.unimi.di.big.mg4j.index.snowball
it.unimi.di.big.mg4j.io
Bit-level support classes.
it.unimi.di.big.mg4j.query
User interfaces for querying indices.
it.unimi.di.big.mg4j.query.nodes
Composite representation for queries.
it.unimi.di.big.mg4j.query.parser
A simple JavaCC-generated parser used by the Query class.
it.unimi.di.big.mg4j.search
Classes that compose iterators over documents.
it.unimi.di.big.mg4j.search.score
Classes for assigning scores to documents.
it.unimi.di.big.mg4j.search.visitor
Visitors for composite document iterators.
it.unimi.di.big.mg4j.test
it.unimi.di.big.mg4j.tool
Command-line tools for index construction.
it.unimi.di.big.mg4j.util
Utility classes.
it.unimi.di.big.mg4j.util.parser.callback