Once the index has been created, there are many ways in which you
can improve query resolution time. First of all, an index can be read
from disk, memory-mapped, or directly loaded into main memory. These
three solutions work with increasing speed and increased main memory
usage. The default is to read an index from disk, but you can add
suitable options to the index URI (e.g., mapped=1 or inmemory=1; see
the Index.UriKeys documentation) to force your
preferences. Analogously, offsets are necessary
to locate, inside the index file, the posting list of a certain term. By
default they are read from disk using a SemiExternalOffsetList, but
you can load them in memory if you prefer. If you load sizes (e.g.,
because you want to run a scorer that needs them) there is a suitable
URI option (e.g., succinctsizes=1) that will load sizes in a highly
compact format. This is particularly useful when pasting large
indices.
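As an illustration of how such options might be attached to an index URI, here is a minimal sketch. The helper class IndexUriSketch and the basename bible-text are hypothetical and not part of MG4J, and the sketch assumes that options follow the basename after a question mark and are separated by semicolons; check the Index.UriKeys documentation for the keys and syntax actually supported.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical helper (not part of MG4J) that appends URI options to an index basename. */
final class IndexUriSketch {
    private IndexUriSketch() {}

    /** Returns basename?key=value;key=value, or the bare basename when there are no options. */
    static String withOptions(String basename, Map<String, String> options) {
        if (options.isEmpty()) return basename;
        // Assumption: the query part starts with '?' and options are separated by ';'.
        StringJoiner query = new StringJoiner(";", basename + "?", "");
        options.forEach((key, value) -> query.add(key + "=" + value));
        return query.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the options in insertion order.
        Map<String, String> options = new LinkedHashMap<>();
        options.put("inmemory", "1");      // load the whole index into main memory
        options.put("succinctsizes", "1"); // load sizes in a compact format
        System.out.println(withOptions("bible-text", options));
        // prints "bible-text?inmemory=1;succinctsizes=1"
    }
}
```

The resulting string is what you would pass wherever an index URI is expected. Options such as mapped=1 and inmemory=1 select alternative loading strategies, so it makes sense to specify only one of them for a given index.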
To get more options, you can partition your index. Once you have a cluster formed by several sub-indices, you can decide which sub-indices go to memory, which will be mapped, and so on.
An important source of delay in loading the index is the expansion
of the dump file of an ImmutableExternalPrefixMap, which is the default
term map generated by IndexBuilder. The dump file
must be copied from the serialized representation to a temporary
directory, and for large collections the process can be very slow. The
solution is either to use a different term map (e.g., some kind of
signed hash; see the minimal perfect hash classes of Sux4J) or to
generate (either programmatically or using the main method of
ImmutableExternalPrefixMap) a non-self-contained, synchronized
instance of ImmutableExternalPrefixMap and save it using the
standard suffix for term maps. Such an instance is based on a separate
dump file that must be attached to the deserialized instance before
usage (see the documentation for details). You can attach the dump
stream by invoking
((ImmutableExternalPrefixMap)index.termMap).setDumpStream( filename );
with the appropriate argument.
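Since a non-self-contained instance cannot be used until its dump file has been attached, it can be convenient to check that the file is actually there before attaching it. The following is only a sketch: the class DumpFileCheck and its method are hypothetical helpers, not part of MG4J, and the call shown in the comment is the one given above.

```java
import java.io.FileNotFoundException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper (not part of MG4J) that validates a dump file before it is attached. */
final class DumpFileCheck {
    private DumpFileCheck() {}

    /** Returns the filename unchanged if it names a readable regular file, and throws otherwise. */
    static String requireReadable(String filename) throws FileNotFoundException {
        Path path = Paths.get(filename);
        if (!Files.isRegularFile(path) || !Files.isReadable(path)) {
            throw new FileNotFoundException("Dump file missing or unreadable: " + path.toAbsolutePath());
        }
        return filename;
    }
}

// Intended usage, with index and filename defined as in the text above:
// ((ImmutableExternalPrefixMap)index.termMap).setDumpStream( DumpFileCheck.requireReadable( filename ) );
```

Failing early with the full path in the message makes a misplaced dump file easier to diagnose than an error raised at the first term lookup.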