Once the index has been created, there are several ways to improve
query resolution time. First of all, an index can be read from disk,
memory-mapped, or loaded directly into main memory. These three
solutions are listed in order of increasing speed and increasing main
memory usage. The default for quasi-succinct indices is to map the
index in memory; for all other indices, the default is to read the
index from disk. You can
add suitable options to the index URI (e.g., mapped=1 or inmemory=1;
see the Index.UriKeys documentation) to force your preferences.
Analogously, offsets are necessary to locate, inside the index file,
the posting list of a given term. By default they are read from disk
using a SemiExternalOffsetList, but you can load them into memory if
you prefer by using the offsetstep parameter. If you load sizes (e.g.,
because you want to run a scorer that needs them) there is a suitable
URI option (e.g., succinctsizes=1) that will load sizes in a highly
compact format. This is particularly useful when pasting large
indices.
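As a rough illustration of how such options might be attached to an index URI, here is a minimal sketch in plain Java. It only builds the URI string and calls no MG4J classes. The class name IndexUriSketch, the helper withOptions and the basename mycollection-text are hypothetical, and the assumption that options follow the basename after a "?" and are separated by ";" should be checked against the Index.UriKeys documentation.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class IndexUriSketch {

    /**
     * Appends key=value options to an index basename.
     * ASSUMPTION: options start after '?' and are separated by ';'.
     */
    public static String withOptions(String basename, Map<String, String> options) {
        if (options.isEmpty()) {
            return basename; // no options: the defaults described above apply
        }
        StringJoiner joiner = new StringJoiner(";", basename + "?", "");
        for (Map.Entry<String, String> option : options.entrySet()) {
            joiner.add(option.getKey() + "=" + option.getValue());
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the options in insertion order.
        Map<String, String> options = new LinkedHashMap<>();
        options.put("inmemory", "1");      // load the whole index into main memory
        options.put("succinctsizes", "1"); // load sizes in a compact format
        // "mycollection-text" is a hypothetical index basename.
        System.out.println(withOptions("mycollection-text", options));
        // prints "mycollection-text?inmemory=1;succinctsizes=1"
    }
}
```

The resulting string would then be passed wherever an index URI is accepted. Replacing inmemory=1 with mapped=1 would request memory mapping instead.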
To get more options, you can partition your index (see below). Once you have a cluster formed by several sub-indices, you can decide which sub-indices go to memory, which will be mapped, and so on.