it.unimi.di.mg4j.index
Class DiskBasedIndex

java.lang.Object
  extended by it.unimi.di.mg4j.index.DiskBasedIndex

public class DiskBasedIndex
extends Object

A static container providing facilities to load an index based on data stored on disk.

This class contains several useful static methods, such as readOffsets(InputBitStream, int), readSizes(CharSequence, int) and loadLongBigList(CharSequence, ByteOrder), and static factory methods, such as getInstance(CharSequence, boolean, boolean, boolean, EnumMap), that take care of reading the properties associated with the index, identifying the correct Index implementation that should be used to load the index, and loading the necessary data into memory.

As an option, a disk-based index can be loaded into main memory (key: Index.UriKeys.INMEMORY) or mapped into main memory (key: Index.UriKeys.MAPPED); the value assigned to the keys is irrelevant.

Note that quasi-succinct indices are memory-mapped by default, and for bitstream indices there is a limit of two gigabytes for in-memory indices.

By default the term-offset list is accessed using a SemiExternalOffsetList with a step of DEFAULT_OFFSET_STEP. This behaviour can be changed using the URI key Index.UriKeys.OFFSETSTEP.
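
The trade-off controlled by the offset step can be pictured with a sampled list: only one offset every step entries is kept in memory, and the others are rebuilt on demand from gap-encoded data. The following is a minimal sketch of that general idea, under the assumption that a larger step means fewer in-memory entries and more decoding work per access. The class SampledOffsets and its methods are hypothetical, and the array named gaps merely stands in for data that would live on disk; this is not the code of SemiExternalOffsetList.

```java
/** Hypothetical sketch of a sampled offset list; it is not MG4J's SemiExternalOffsetList. */
final class SampledOffsets {
    private final long[] gaps;    // stands in for the gap-encoded offsets kept on disk
    private final long[] samples; // one absolute offset every step entries, kept in memory
    private final int step;

    SampledOffsets(long[] offsets, int step) {
        if (step <= 0) throw new IllegalArgumentException("step must be positive");
        this.step = step;
        this.gaps = new long[offsets.length];
        this.samples = new long[(offsets.length + step - 1) / step];
        long previous = 0;
        for (int i = 0; i < offsets.length; i++) {
            gaps[i] = offsets[i] - previous;
            previous = offsets[i];
            if (i % step == 0) samples[i / step] = offsets[i];
        }
    }

    /** Returns the i-th offset, starting from the closest preceding sample and adding at most step - 1 gaps. */
    long get(int i) {
        int sample = i / step;
        long value = samples[sample];
        for (int j = sample * step + 1; j <= i; j++) value += gaps[j];
        return value;
    }

    /** Number of offsets that are held in memory. */
    int inMemoryEntries() {
        return samples.length;
    }
}
```

With five offsets and a step of 2, this sketch keeps three samples in memory and reconstructs the remaining entries by adding one gap.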

Disk-based indices are the workhorse of MG4J. All other indices (clustered, remote, etc.) ultimately rely on disk-based indices to provide results.

Note that not all data produced by Scan and by the other indexing utilities are actually necessary to run a disk-based index. Usually the property file and the index files are sufficient. If random access is needed, the offsets file must also be present; if the compression method requires document sizes, or if sizes are requested explicitly, the sizes file must also be present. A StringMap and possibly a PrefixMap will be fetched automatically by getInstance(CharSequence, boolean, boolean) using standard extensions.
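
The file requirements above can be summarized as a small checklist. The sketch below is illustrative only: the class RequiredIndexFiles is hypothetical, it assumes a plain index stored in a single index file, and the extension strings are assumed values (the authoritative ones are the *_EXTENSION constants of this class). It also ignores the case in which the compression method forces sizes to be loaded.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper listing the files a disk-based index would need for a given usage. */
final class RequiredIndexFiles {
    // Assumed extension values; in real code use the *_EXTENSION constants of DiskBasedIndex.
    static final String PROPERTIES = ".properties";
    static final String INDEX = ".index";
    static final String OFFSETS = ".offsets";
    static final String SIZES = ".sizes";

    static List<String> requiredFiles(String basename, boolean randomAccess, boolean documentSizes) {
        List<String> files = new ArrayList<>();
        files.add(basename + PROPERTIES);                  // always needed
        files.add(basename + INDEX);                       // always needed
        if (randomAccess) files.add(basename + OFFSETS);   // needed only for random access
        if (documentSizes) files.add(basename + SIZES);    // needed only if sizes are requested
        return files;
    }
}
```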

Thread safety

A disk-based index is thread safe as long as the offset list, the size list and the term/prefix map are. The static factory methods provided by this class load offsets and sizes using data structures that are thread safe. If you use a constructor directly, instead, it is your responsibility to pass thread-safe data structures.
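
To illustrate what passing a thread-safe data structure can mean in practice, the sketch below wraps a deliberately stateful reader so that all accesses are serialized on a lock. Everything here is hypothetical (SynchronizedAccess, CursorReader, synchronize, demo) and uses only the standard library; it shows the wrapping technique, not how MG4J synchronizes its own lists.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.IntToLongFunction;

final class SynchronizedAccess {
    /** A stateful reader: each call repositions a shared cursor, so unsynchronized use is racy. */
    static final class CursorReader implements IntToLongFunction {
        private final long[] data;
        private int cursor;

        CursorReader(long[] data) {
            this.data = data;
        }

        @Override
        public long applyAsLong(int index) {
            cursor = index;
            return data[cursor];
        }
    }

    /** Wraps a reader so that calls are serialized on a private lock. */
    static IntToLongFunction synchronize(IntToLongFunction reader) {
        final Object lock = new Object();
        return index -> {
            synchronized (lock) {
                return reader.applyAsLong(index);
            }
        };
    }

    /** Reads every entry from four threads through the wrapper; returns true if all values were correct. */
    static boolean demo() {
        final long[] data = new long[1000];
        for (int i = 0; i < data.length; i++) data[i] = 3L * i;
        final IntToLongFunction safe = synchronize(new CursorReader(data));
        final AtomicBoolean ok = new AtomicBoolean(true);
        Thread[] workers = new Thread[4];
        for (int t = 0; t < workers.length; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < data.length; i++)
                    if (safe.applyAsLong(i) != data[i]) ok.set(false);
            });
            workers[t].start();
        }
        try {
            for (Thread worker : workers) worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return ok.get();
    }
}
```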

Since:
1.1
Author:
Sebastiano Vigna

Field Summary
static int BUFFER_SIZE
           
static String COUNTS_EXTENSION
          The extension for the counts bitstream.
static int DEFAULT_OFFSET_STEP
          The default value for the query parameter Index.UriKeys.OFFSETSTEP.
static String FREQUENCIES_EXTENSION
          Standard extension for the file of frequencies.
static String INDEX_EXTENSION
          Standard extension for the index bitstream.
static String OCCURRENCIES_EXTENSION
          Standard extension for the file of global counts.
static String OFFSETS_EXTENSION
          Standard extension for the file of offsets.
static String OFFSETS_POSTFIX
          The postfix to be added to POINTERS_EXTENSIONS, COUNTS_EXTENSION and POSITIONS_EXTENSION for offsets.
static String POINTERS_EXTENSIONS
          The extension for the pointers bitstream.
static String POSITIONS_EXTENSION
          Standard extension for the positions bitstream of a high-performance index.
static String POSITIONS_NUMBER_OF_BITS_EXTENSION
          Standard extension for the file of lengths of positions.
static String PREFIXMAP_EXTENSION
          Standard extension for the prefix map.
static String PROPERTIES_EXTENSION
          Standard extension for the index properties.
static String SIZES_EXTENSION
          Standard extension for the file of sizes.
static String STATS_EXTENSION
          Standard extension for the stats file.
static String SUMS_MAX_POSITION_EXTENSION
          Standard extension for the file of sums of maximum positions.
static String TERMMAP_EXTENSION
          Standard extension for the term map.
static String TERMS_EXTENSION
          Standard extension for the file of terms.
static String UNSORTED_TERMS_EXTENSION
          Standard extension for the file of terms, unsorted.
 
Method Summary
static ByteOrder byteOrder(String s)
          Parses a ByteOrder value.
static Index getInstance(CharSequence basename)
          Returns a new local index, trying to guess reasonable term and prefix maps from the basename, loading offsets but loading document sizes only if it is necessary.
static Index getInstance(CharSequence basename, boolean randomAccess)
          Returns a new local index, trying to guess reasonable term and prefix maps from the basename, and loading document sizes only if it is necessary.
static Index getInstance(CharSequence basename, boolean randomAccess, boolean documentSizes)
          Returns a new disk-based index, guessing reasonable term and prefix maps from the basename.
static Index getInstance(CharSequence basename, boolean randomAccess, boolean documentSizes, boolean maps)
          Returns a new disk-based index, possibly guessing reasonable term and prefix maps from the basename.
static Index getInstance(CharSequence basename, boolean randomAccess, boolean documentSizes, boolean maps, EnumMap<Index.UriKeys,String> queryProperties)
          Returns a new disk-based index, possibly guessing reasonable term and prefix maps from the basename.
static Index getInstance(CharSequence basename, Properties properties, boolean randomAccess, boolean documentSizes, boolean maps, EnumMap<Index.UriKeys,String> queryProperties)
          Returns a new disk-based index, using preloaded Properties and possibly guessing reasonable term and prefix maps from the basename.
static Index getInstance(CharSequence basename, Properties properties, StringMap<? extends CharSequence> termMap, PrefixMap<? extends CharSequence> prefixMap, boolean randomAccess, boolean documentSizes, EnumMap<Index.UriKeys,String> queryProperties)
          Returns a new disk-based index, loading exactly the specified parts and using preloaded Properties and the IOFactory.FILESYSTEM_FACTORY.
static Index getInstance(IOFactory ioFactory, CharSequence basename, Properties properties, boolean randomAccess, boolean documentSizes, boolean maps, EnumMap<Index.UriKeys,String> queryProperties)
          Returns a new disk-based index, using preloaded Properties and possibly guessing reasonable term and prefix maps from the basename.
static Index getInstance(IOFactory ioFactory, CharSequence basename, Properties properties, StringMap<? extends CharSequence> termMap, PrefixMap<? extends CharSequence> prefixMap, boolean randomAccess, boolean documentSizes, EnumMap<Index.UriKeys,String> queryProperties)
          Returns a new disk-based index, loading exactly the specified parts and using preloaded Properties.
static LongBigArrayBigList loadLongBigList(CharSequence filename, ByteOrder byteOrder)
          Convenience method for loading a big list of binary longs with specified endianness into a long big array using the IOFactory.FILESYSTEM_FACTORY.
static LongBigArrayBigList loadLongBigList(IOFactory ioFactory, CharSequence filename, ByteOrder byteOrder)
          Convenience method for loading a big list of binary longs with specified endianness into a long big array.
static LongBigArrayBigList loadLongBigList(ReadableByteChannel channel, long length, ByteOrder byteOrder)
          Convenience method for loading from a channel a big list of binary longs with specified endianness into a long big array.
static PrefixMap<? extends CharSequence> loadPrefixMap(IOFactory ioFactory, String filename)
          Utility static method that loads a prefix map.
static PrefixMap<? extends CharSequence> loadPrefixMap(String filename)
          Utility static method that loads a prefix map using the IOFactory.FILESYSTEM_FACTORY.
static StringMap<? extends CharSequence> loadStringMap(IOFactory ioFactory, String filename)
          Utility static method that loads a term map.
static StringMap<? extends CharSequence> loadStringMap(String filename)
          Utility static method that loads a term map using the IOFactory.FILESYSTEM_FACTORY.
static LongList offsets(IOFactory ioFactory, String filename, int numberOfTerms, int offsetStep)
          Returns the list of offsets.
static LongList offsets(String filename, int numberOfTerms, int offsetStep)
          Returns the list of offsets using the IOFactory.FILESYSTEM_FACTORY.
static LongList readOffsets(CharSequence filename, int T)
          Utility method to load a compressed offset file into a list using the IOFactory.FILESYSTEM_FACTORY.
static LongList readOffsets(InputBitStream in, int T)
          Utility method to load a compressed offset file into a list.
static LongList readOffsets(IOFactory ioFactory, CharSequence filename, int T)
          Utility method to load a compressed offset file into a list.
static IntList readSizes(CharSequence filename, int N)
          Utility method to load a compressed size file into a list using the IOFactory.FILESYSTEM_FACTORY.
static IntList readSizes(IOFactory ioFactory, CharSequence filename, int N)
          Utility method to load a compressed size file into a list.
static IntList readSizesSuccinct(CharSequence filename, int N)
          Deprecated. This method is an ancestral residue.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

DEFAULT_OFFSET_STEP

public static final int DEFAULT_OFFSET_STEP
The default value for the query parameter Index.UriKeys.OFFSETSTEP.

See Also:
Constant Field Values

INDEX_EXTENSION

public static final String INDEX_EXTENSION
Standard extension for the index bitstream.

See Also:
Constant Field Values

POSITIONS_EXTENSION

public static final String POSITIONS_EXTENSION
Standard extension for the positions bitstream of a high-performance index.

See Also:
Constant Field Values

PROPERTIES_EXTENSION

public static final String PROPERTIES_EXTENSION
Standard extension for the index properties.

See Also:
Constant Field Values

SIZES_EXTENSION

public static final String SIZES_EXTENSION
Standard extension for the file of sizes.

See Also:
Constant Field Values

OFFSETS_EXTENSION

public static final String OFFSETS_EXTENSION
Standard extension for the file of offsets.

See Also:
Constant Field Values

POSITIONS_NUMBER_OF_BITS_EXTENSION

public static final String POSITIONS_NUMBER_OF_BITS_EXTENSION
Standard extension for the file of lengths of positions.

See Also:
Constant Field Values

SUMS_MAX_POSITION_EXTENSION

public static final String SUMS_MAX_POSITION_EXTENSION
Standard extension for the file of sums of maximum positions.

See Also:
Constant Field Values

OCCURRENCIES_EXTENSION

public static final String OCCURRENCIES_EXTENSION
Standard extension for the file of global counts.

See Also:
Constant Field Values

FREQUENCIES_EXTENSION

public static final String FREQUENCIES_EXTENSION
Standard extension for the file of frequencies.

See Also:
Constant Field Values

TERMS_EXTENSION

public static final String TERMS_EXTENSION
Standard extension for the file of terms.

See Also:
Constant Field Values

UNSORTED_TERMS_EXTENSION

public static final String UNSORTED_TERMS_EXTENSION
Standard extension for the file of terms, unsorted.

See Also:
Constant Field Values

TERMMAP_EXTENSION

public static final String TERMMAP_EXTENSION
Standard extension for the term map.

See Also:
Constant Field Values

PREFIXMAP_EXTENSION

public static final String PREFIXMAP_EXTENSION
Standard extension for the prefix map.

See Also:
Constant Field Values

STATS_EXTENSION

public static final String STATS_EXTENSION
Standard extension for the stats file.

See Also:
Constant Field Values

POINTERS_EXTENSIONS

public static final String POINTERS_EXTENSIONS
The extension for the pointers bitstream.

See Also:
Constant Field Values

COUNTS_EXTENSION

public static final String COUNTS_EXTENSION
The extension for the counts bitstream.

See Also:
Constant Field Values

OFFSETS_POSTFIX

public static final String OFFSETS_POSTFIX
The postfix to be added to POINTERS_EXTENSIONS, COUNTS_EXTENSION and POSITIONS_EXTENSION for offsets.

See Also:
Constant Field Values

BUFFER_SIZE

public static final int BUFFER_SIZE
See Also:
Constant Field Values

Method Detail

readOffsets

public static LongList readOffsets(InputBitStream in,
                                   int T)
                            throws IOException
Utility method to load a compressed offset file into a list.

Parameters:
in - the input bit stream providing the offsets (see BitStreamIndexWriter).
T - the number of terms indexed.
Returns:
a list of longs backed by an array; the list has an additional final element of index T that gives the number of bytes of the index file.
Throws:
IOException
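
Because the returned list has T + 1 entries, the extent of the data for each term can be obtained by subtracting consecutive entries. A minimal sketch follows, using a plain long[] in place of a LongList; the class OffsetSpans and its method are hypothetical, and the result is expressed in whatever unit the offsets file uses.

```java
/** Hypothetical helper working on an array of T + 1 offsets. */
final class OffsetSpans {
    /** Returns the distance between the start of the given term and the start of the next one (or the end of the file). */
    static long span(long[] offsets, int term) {
        if (term < 0 || term >= offsets.length - 1) throw new IndexOutOfBoundsException("No such term: " + term);
        return offsets[term + 1] - offsets[term];
    }
}
```

The final entry is what makes the subtraction work for the last term as well.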

readOffsets

public static LongList readOffsets(IOFactory ioFactory,
                                   CharSequence filename,
                                   int T)
                            throws IOException
Utility method to load a compressed offset file into a list.

Parameters:
ioFactory - the factory that will be used to perform I/O.
filename - the file containing the offsets (see BitStreamIndexWriter).
T - the number of terms indexed.
Returns:
a list of longs backed by an array; the list has an additional final element of index T that gives the number of bytes of the index file.
Throws:
IOException

readOffsets

public static LongList readOffsets(CharSequence filename,
                                   int T)
                            throws IOException
Utility method to load a compressed offset file into a list using the IOFactory.FILESYSTEM_FACTORY.

Parameters:
filename - the file containing the offsets (see BitStreamIndexWriter).
T - the number of terms indexed.
Returns:
a list of longs backed by an array; the list has an additional final element of index T that gives the number of bytes of the index file.
Throws:
IOException

readSizes

public static IntList readSizes(IOFactory ioFactory,
                                CharSequence filename,
                                int N)
                         throws IOException
Utility method to load a compressed size file into a list.

Parameters:
ioFactory - the factory that will be used to perform I/O.
filename - the file containing the γ-coded sizes (see BitStreamIndexWriter).
N - the number of documents.
Returns:
a list of integers backed by an array.
Throws:
IOException
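
For readers unfamiliar with γ coding, the sketch below decodes N γ-coded values from a string of '0' and '1' characters. It assumes the convention in which a natural number x is stored as the code of x + 1: a unary prefix (zeros closed by a one) giving the number of low-order bits, followed by those bits. The class GammaDecoder is hypothetical and works on a string for readability; the actual method reads from a bit stream.

```java
/** Hypothetical decoder of γ-coded naturals from a textual bit string. */
final class GammaDecoder {
    static int[] decode(String bits, int n) {
        int[] result = new int[n];
        int pos = 0;
        for (int k = 0; k < n; k++) {
            int zeros = 0;
            while (bits.charAt(pos) == '0') { // unary part: count the zeros before the closing one
                zeros++;
                pos++;
            }
            pos++;                            // skip the closing one
            int value = 1;                    // implicit most significant bit
            for (int b = 0; b < zeros; b++) value = (value << 1) | (bits.charAt(pos++) - '0');
            result[k] = value - 1;            // assumed convention: x is stored as the code of x + 1
        }
        return result;
    }
}
```

Under this convention the codes of 0, 1, 2 and 3 are 1, 010, 011 and 00100, respectively.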

readSizes

public static IntList readSizes(CharSequence filename,
                                int N)
                         throws IOException
Utility method to load a compressed size file into a list using the IOFactory.FILESYSTEM_FACTORY.

Parameters:
filename - the file containing the γ-coded sizes (see BitStreamIndexWriter).
N - the number of documents.
Returns:
a list of integers backed by an array.
Throws:
IOException

readSizesSuccinct

@Deprecated
public static IntList readSizesSuccinct(CharSequence filename,
                                        int N)
                                 throws IOException
Deprecated. This method is an ancestral residue.

Utility method to load a compressed size file into an Elias–Fano compressed list.

Parameters:
filename - the file containing the γ-coded sizes (see BitStreamIndexWriter).
N - the number of documents indexed.
Returns:
a list of integers backed by an Elias–Fano compressed list.
Throws:
IllegalStateException - if ioFactory is not IOFactory.FILESYSTEM_FACTORY.
IOException

loadLongBigList

public static LongBigArrayBigList loadLongBigList(IOFactory ioFactory,
                                                  CharSequence filename,
                                                  ByteOrder byteOrder)
                                           throws IOException
Convenience method for loading a big list of binary longs with specified endianness into a long big array.

Parameters:
ioFactory - the factory that will be used to perform I/O.
filename - the file containing the longs.
byteOrder - the endianness of the longs.
Returns:
a big list of longs containing the longs in file.
Throws:
IOException

loadLongBigList

public static LongBigArrayBigList loadLongBigList(CharSequence filename,
                                                  ByteOrder byteOrder)
                                           throws IOException
Convenience method for loading a big list of binary longs with specified endianness into a long big array using the IOFactory.FILESYSTEM_FACTORY.

Parameters:
filename - the file containing the longs.
byteOrder - the endianness of the longs.
Returns:
a big list of longs containing the longs in file.
Throws:
IOException

loadLongBigList

public static LongBigArrayBigList loadLongBigList(ReadableByteChannel channel,
                                                  long length,
                                                  ByteOrder byteOrder)
                                           throws IOException
Convenience method for loading from a channel a big list of binary longs with specified endianness into a long big array.

Parameters:
channel - the channel.
byteOrder - the endianness of the longs.
Returns:
a big list of longs containing the longs returned by channel.
Throws:
IOException
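
The mechanics of reading binary longs with a given endianness from a channel can be sketched with java.nio alone. The class ChannelLongs below is hypothetical: it fills a plain long[] rather than a big list, and it assumes that the count argument is a number of longs. It illustrates the technique and is not the implementation of this method.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;

final class ChannelLongs {
    /** Reads exactly count longs from the channel, interpreting bytes in the given order. */
    static long[] readLongs(ReadableByteChannel channel, int count, ByteOrder byteOrder) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(8 * 1024).order(byteOrder);
        long[] result = new long[count];
        int filled = 0;
        while (filled < count) {
            // Fail if the channel ends before a whole long is available.
            if (channel.read(buffer) < 0 && buffer.position() < Long.BYTES) throw new EOFException();
            buffer.flip();
            while (buffer.remaining() >= Long.BYTES && filled < count) result[filled++] = buffer.getLong();
            buffer.compact(); // keep any partial long for the next round
        }
        return result;
    }

    /** Round-trips two longs through a little-endian byte array. */
    static long[] demo() {
        ByteBuffer bytes = ByteBuffer.allocate(16).order(ByteOrder.LITTLE_ENDIAN);
        bytes.putLong(1L).putLong(-2L);
        try {
            return readLongs(Channels.newChannel(new ByteArrayInputStream(bytes.array())), 2, ByteOrder.LITTLE_ENDIAN);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```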

byteOrder

public static ByteOrder byteOrder(String s)
Parses a ByteOrder value.

Parameters:
s - a string (either BIG_ENDIAN or LITTLE_ENDIAN).
Returns:
the corresponding byte order (ByteOrder.BIG_ENDIAN or ByteOrder.LITTLE_ENDIAN).
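
A parser with this contract can be sketched in a few lines. The class ByteOrderParsing is hypothetical, and throwing IllegalArgumentException for any other string is an assumption, since the behavior on invalid input is not documented here.

```java
import java.nio.ByteOrder;

final class ByteOrderParsing {
    static ByteOrder parse(String s) {
        if ("BIG_ENDIAN".equals(s)) return ByteOrder.BIG_ENDIAN;
        if ("LITTLE_ENDIAN".equals(s)) return ByteOrder.LITTLE_ENDIAN;
        throw new IllegalArgumentException("Unknown byte order: " + s); // assumed failure mode
    }
}
```

The two accepted strings are the ones returned by ByteOrder.toString(), so a byte order can be written out and parsed back.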

loadStringMap

public static StringMap<? extends CharSequence> loadStringMap(IOFactory ioFactory,
                                                              String filename)
                                                       throws IOException
Utility static method that loads a term map.

Parameters:
ioFactory - the factory that will be used to perform I/O.
filename - the name of the file containing the term map.
Returns:
the map, or null if the file did not exist.
Throws:
IOException - if some IOException (other than FileNotFoundException) occurred.

loadStringMap

public static StringMap<? extends CharSequence> loadStringMap(String filename)
                                                       throws IOException
Utility static method that loads a term map using the IOFactory.FILESYSTEM_FACTORY.

Parameters:
filename - the name of the file containing the term map.
Returns:
the map, or null if the file did not exist.
Throws:
IOException - if some IOException (other than FileNotFoundException) occurred.

loadPrefixMap

public static PrefixMap<? extends CharSequence> loadPrefixMap(IOFactory ioFactory,
                                                              String filename)
                                                       throws IOException
Utility static method that loads a prefix map.

Parameters:
ioFactory - the factory that will be used to perform I/O.
filename - the name of the file containing the prefix map.
Returns:
the map, or null if the file did not exist.
Throws:
IOException - if some IOException (other than FileNotFoundException) occurred.

loadPrefixMap

public static PrefixMap<? extends CharSequence> loadPrefixMap(String filename)
                                                       throws IOException
Utility static method that loads a prefix map using the IOFactory.FILESYSTEM_FACTORY.

Parameters:
filename - the name of the file containing the prefix map.
Returns:
the map, or null if the file did not exist.
Throws:
IOException - if some IOException (other than FileNotFoundException) occurred.

offsets

public static LongList offsets(IOFactory ioFactory,
                               String filename,
                               int numberOfTerms,
                               int offsetStep)
                        throws FileNotFoundException,
                               IOException
Returns the list of offsets.

Parameters:
ioFactory - the factory that will be used to perform I/O.
filename - the file containing the offsets.
numberOfTerms - the number of terms.
offsetStep - the offset step.
Returns:
if offsetStep is less than zero, a memory-mapped, synchronized SemiExternalOffsetList with offset step equal to -offsetStep; if it is zero, an in-memory list; if it is greater than zero, a synchronized SemiExternalOffsetList with offset step equal to offsetStep.
Throws:
FileNotFoundException
IOException
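
The three cases selected by the sign of offsetStep can be written down as a small decision function. The sketch below is purely descriptive: the class OffsetStepPolicy and the enum Kind are hypothetical names, and it assumes that the step actually used by a semi-external list is the absolute value of offsetStep.

```java
/** Hypothetical description of how the sign of offsetStep selects the kind of offset list. */
final class OffsetStepPolicy {
    enum Kind { MEMORY_MAPPED_SEMI_EXTERNAL, IN_MEMORY, SEMI_EXTERNAL }

    static Kind kind(int offsetStep) {
        if (offsetStep < 0) return Kind.MEMORY_MAPPED_SEMI_EXTERNAL;
        if (offsetStep == 0) return Kind.IN_MEMORY;
        return Kind.SEMI_EXTERNAL;
    }

    /** Assumed effective step of the semi-external list; meaningless when offsetStep is zero. */
    static int effectiveStep(int offsetStep) {
        return Math.abs(offsetStep);
    }
}
```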

offsets

public static LongList offsets(String filename,
                               int numberOfTerms,
                               int offsetStep)
                        throws FileNotFoundException,
                               IOException
Returns the list of offsets using the IOFactory.FILESYSTEM_FACTORY.

Parameters:
filename - the file containing the offsets.
numberOfTerms - the number of terms.
offsetStep - the offset step.
Returns:
if offsetStep is less than zero, a memory-mapped, synchronized SemiExternalOffsetList with offset step equal to -offsetStep; if it is zero, an in-memory list; if it is greater than zero, a synchronized SemiExternalOffsetList with offset step equal to offsetStep.
Throws:
FileNotFoundException
IOException

getInstance

public static Index getInstance(IOFactory ioFactory,
                                CharSequence basename,
                                Properties properties,
                                StringMap<? extends CharSequence> termMap,
                                PrefixMap<? extends CharSequence> prefixMap,
                                boolean randomAccess,
                                boolean documentSizes,
                                EnumMap<Index.UriKeys,String> queryProperties)
                         throws ClassNotFoundException,
                                IOException,
                                InstantiationException,
                                IllegalAccessException
Returns a new disk-based index, loading exactly the specified parts and using preloaded Properties.

Parameters:
ioFactory - the factory that will be used to perform I/O.
basename - the basename of the index.
properties - the properties obtained from the given basename.
termMap - the term map for this index, or null for no term map.
prefixMap - the prefix map for this index, or null for no prefix map.
randomAccess - whether the index should be accessible randomly (e.g., if it will be possible to call IndexReader.documents(int) on the index readers returned by the index).
documentSizes - if true, document sizes will be loaded (note that sometimes document sizes might be loaded anyway because the compression method for positions requires it).
queryProperties - a map containing associations between Index.UriKeys and values, or null.
Throws:
ClassNotFoundException
IOException
InstantiationException
IllegalAccessException

getInstance

public static Index getInstance(CharSequence basename,
                                Properties properties,
                                StringMap<? extends CharSequence> termMap,
                                PrefixMap<? extends CharSequence> prefixMap,
                                boolean randomAccess,
                                boolean documentSizes,
                                EnumMap<Index.UriKeys,String> queryProperties)
                         throws ClassNotFoundException,
                                IOException,
                                InstantiationException,
                                IllegalAccessException
Returns a new disk-based index, loading exactly the specified parts and using preloaded Properties and the IOFactory.FILESYSTEM_FACTORY.

Parameters:
basename - the basename of the index.
properties - the properties obtained from the given basename.
termMap - the term map for this index, or null for no term map.
prefixMap - the prefix map for this index, or null for no prefix map.
randomAccess - whether the index should be accessible randomly (e.g., if it will be possible to call IndexReader.documents(int) on the index readers returned by the index).
documentSizes - if true, document sizes will be loaded (note that sometimes document sizes might be loaded anyway because the compression method for positions requires it).
queryProperties - a map containing associations between Index.UriKeys and values, or null.
Throws:
ClassNotFoundException
IOException
InstantiationException
IllegalAccessException

getInstance

public static Index getInstance(IOFactory ioFactory,
                                CharSequence basename,
                                Properties properties,
                                boolean randomAccess,
                                boolean documentSizes,
                                boolean maps,
                                EnumMap<Index.UriKeys,String> queryProperties)
                         throws ClassNotFoundException,
                                IOException,
                                InstantiationException,
                                IllegalAccessException
Returns a new disk-based index, using preloaded Properties and possibly guessing reasonable term and prefix maps from the basename.

Parameters:
ioFactory - the factory that will be used to perform I/O.
basename - the basename of the index.
properties - the properties obtained by stemming basename.
randomAccess - whether the index should be accessible randomly.
documentSizes - if true, document sizes will be loaded.
maps - if true, term and prefix maps will be guessed and loaded.
queryProperties - a map containing associations between Index.UriKeys and values, or null.
Throws:
IllegalAccessException
InstantiationException
ClassNotFoundException
IOException
See Also:
getInstance(CharSequence, Properties, StringMap, PrefixMap, boolean, boolean, EnumMap)

getInstance

public static Index getInstance(CharSequence basename,
                                Properties properties,
                                boolean randomAccess,
                                boolean documentSizes,
                                boolean maps,
                                EnumMap<Index.UriKeys,String> queryProperties)
                         throws ClassNotFoundException,
                                IOException,
                                InstantiationException,
                                IllegalAccessException
Returns a new disk-based index, using preloaded Properties and possibly guessing reasonable term and prefix maps from the basename.

Parameters:
basename - the basename of the index.
properties - the properties obtained by stemming basename.
randomAccess - whether the index should be accessible randomly.
documentSizes - if true, document sizes will be loaded.
maps - if true, term and prefix maps will be guessed and loaded.
queryProperties - a map containing associations between Index.UriKeys and values, or null.
Throws:
IllegalAccessException
InstantiationException
ClassNotFoundException
IOException
See Also:
getInstance(CharSequence, Properties, StringMap, PrefixMap, boolean, boolean, EnumMap)

getInstance

public static Index getInstance(CharSequence basename,
                                boolean randomAccess,
                                boolean documentSizes,
                                boolean maps,
                                EnumMap<Index.UriKeys,String> queryProperties)
                         throws ConfigurationException,
                                ClassNotFoundException,
                                IOException,
                                InstantiationException,
                                IllegalAccessException
Returns a new disk-based index, possibly guessing reasonable term and prefix maps from the basename.

If there is a term map file (basename stemmed with .termmap), it is used as term map and, in case it implements PrefixMap, also as prefix map. Otherwise, we search for a prefix map (basename stemmed with .prefixmap) and, if it implements StringMap and no term map has been found, we use it as term map, too.

Parameters:
basename - the basename of the index.
randomAccess - whether the index should be accessible randomly (e.g., if it will be possible to call IndexReader.documents(int) on the index readers returned by the index).
documentSizes - if true, document sizes will be loaded (note that sometimes document sizes might be loaded anyway because the compression method for positions requires it).
maps - if true, term and prefix maps will be guessed and loaded (this feature might not be available with some kind of index).
queryProperties - a map containing associations between Index.UriKeys and values, or null.
Throws:
ConfigurationException
ClassNotFoundException
IOException
InstantiationException
IllegalAccessException
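
One plausible reading of the map lookup described for this method is sketched below. The interfaces TermMap and PrefMap are hypothetical stand-ins for StringMap and PrefixMap, the two arguments represent whatever was loaded from the term map file and the prefix map file (or null if the file was missing), and the exact order of the checks is an assumption.

```java
/** Hypothetical sketch of how term and prefix maps could be chosen from the two optional files. */
final class MapGuessing {
    interface TermMap {}  // stand-in for StringMap
    interface PrefMap {}  // stand-in for PrefixMap

    /** A map implementing both roles. */
    static final class Both implements TermMap, PrefMap {}

    static final class Maps {
        final TermMap termMap;
        final PrefMap prefixMap;

        Maps(TermMap termMap, PrefMap prefixMap) {
            this.termMap = termMap;
            this.prefixMap = prefixMap;
        }
    }

    static Maps resolve(Object fromTermMapFile, Object fromPrefixMapFile) {
        TermMap termMap = fromTermMapFile instanceof TermMap ? (TermMap) fromTermMapFile : null;
        // A term map that is also a prefix map serves both purposes.
        if (termMap instanceof PrefMap) return new Maps(termMap, (PrefMap) termMap);
        PrefMap prefixMap = fromPrefixMapFile instanceof PrefMap ? (PrefMap) fromPrefixMapFile : null;
        // With no term map, a prefix map that is also a term map fills in.
        if (termMap == null && prefixMap instanceof TermMap) termMap = (TermMap) prefixMap;
        return new Maps(termMap, prefixMap);
    }
}
```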

getInstance

public static Index getInstance(CharSequence basename,
                                boolean randomAccess,
                                boolean documentSizes,
                                boolean maps)
                         throws ConfigurationException,
                                ClassNotFoundException,
                                IOException,
                                InstantiationException,
                                IllegalAccessException
Returns a new disk-based index, possibly guessing reasonable term and prefix maps from the basename.

If there is a term map file (basename stemmed with .termmap), it is used as term map and, in case it implements PrefixMap, also as prefix map. Otherwise, we search for a prefix map (basename stemmed with .prefixmap) and, if it implements StringMap and no term map has been found, we use it as term map, too.

Parameters:
basename - the basename of the index.
randomAccess - whether the index should be accessible randomly (e.g., if it will be possible to call IndexReader.documents(int) on the index readers returned by the index).
documentSizes - if true, document sizes will be loaded (note that sometimes document sizes might be loaded anyway because the compression method for positions requires it).
maps - if true, term and prefix maps will be guessed and loaded (this feature might not be available with some kind of index).
Throws:
ConfigurationException
ClassNotFoundException
IOException
InstantiationException
IllegalAccessException
See Also:
getInstance(CharSequence, boolean, boolean, boolean, EnumMap)

getInstance

public static Index getInstance(CharSequence basename,
                                boolean randomAccess,
                                boolean documentSizes)
                         throws ConfigurationException,
                                ClassNotFoundException,
                                IOException,
                                InstantiationException,
                                IllegalAccessException
Returns a new disk-based index, guessing reasonable term and prefix maps from the basename.

Parameters:
basename - the basename of the index.
randomAccess - whether the index should be accessible randomly (e.g., if it will be possible to call IndexReader.documents(int) on the index readers returned by the index).
documentSizes - if true, document sizes will be loaded (note that sometimes document sizes might be loaded anyway because the compression method for positions requires it).
Throws:
ConfigurationException
ClassNotFoundException
IOException
InstantiationException
IllegalAccessException

getInstance

public static Index getInstance(CharSequence basename,
                                boolean randomAccess)
                         throws ConfigurationException,
                                ClassNotFoundException,
                                IOException,
                                InstantiationException,
                                IllegalAccessException
Returns a new local index, trying to guess reasonable term and prefix maps from the basename, and loading document sizes only if it is necessary.

Parameters:
basename - the basename of the index.
randomAccess - whether the index should be accessible randomly (e.g., if it will be possible to call IndexReader.documents(int) on the index readers returned by the index).
Throws:
ConfigurationException
ClassNotFoundException
IOException
InstantiationException
IllegalAccessException

getInstance

public static Index getInstance(CharSequence basename)
                         throws ConfigurationException,
                                ClassNotFoundException,
                                IOException,
                                InstantiationException,
                                IllegalAccessException
Returns a new local index, trying to guess reasonable term and prefix maps from the basename, loading offsets but loading document sizes only if it is necessary.

Parameters:
basename - the basename of the index.
Throws:
ConfigurationException
ClassNotFoundException
IOException
InstantiationException
IllegalAccessException