it.unimi.di.mg4j.io
Class InterpolativeCoding

java.lang.Object
  extended by it.unimi.di.mg4j.io.InterpolativeCoding

public final class InterpolativeCoding
extends Object

Static methods implementing interpolative coding.

Interpolative coding is a compression technique that can be applied to increasing sequences of integers. It exploits the fact that elements already known bound the elements between them: for instance, when compressing the sequence 2 5 6, once 2 and 6 are known the middle integer can only be 3, 4, or 5, so a small index selecting one of three candidates is enough to denote 5.
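The idea above can be sketched as a short recursion. This is only an illustration under stated assumptions, not the implementation used by this class: the class name InterpolativeSketch and its encode() helper are hypothetical, the middle-first visiting order is one common choice, and each element is reported as an {index, number of candidates} pair rather than being written as bits.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not part of MG4J.
final class InterpolativeSketch {

    /**
     * Visits data[offset..offset+len) middle-first and appends, for each
     * element, a pair {index among candidates, number of candidates}.
     * Assumes the sequence is strictly increasing and lies in [lo, hi].
     */
    static void encode(int[] data, int offset, int len, int lo, int hi, List<int[]> out) {
        if (len == 0) return;
        int h = len / 2;                  // position of the middle element
        int m = data[offset + h];
        int first = lo + h;               // h smaller elements must fit below m
        int last = hi - (len - h - 1);    // len - h - 1 larger elements must fit above m
        out.add(new int[] { m - first, last - first + 1 });
        encode(data, offset, h, lo, m - 1, out);                   // left half
        encode(data, offset + h + 1, len - h - 1, m + 1, hi, out); // right half
    }

    public static void main(String[] args) {
        List<int[]> out = new ArrayList<>();
        // The sequence 2 5 6 with bounds 2 and 6: the middle element 5
        // is selected among the candidates 3, 4 and 5.
        encode(new int[] { 2, 5, 6 }, 0, 3, 2, 6, out);
        for (int[] p : out) System.out.println("index " + p[0] + " among " + p[1] + " candidates");
    }
}
```

The fewer the candidates, the fewer bits an actual coder needs for the index; an element with a single candidate costs no bits at all.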

The main limitation of interpolative coding is that the entire sequence must be held in an array, both when coding and when decoding. This is rarely a problem when coding the positions of the occurrences of a term in a document, in particular in short documents, which makes the technique very suitable for that purpose.

Since:
0.6
Author:
Sebastiano Vigna

Method Summary
static void read(InputBitStream in, int[] data, int offset, int len, int lo, int hi)
          Reads from a bit stream an increasing sequence of integers coded using interpolative coding.
static int write(OutputBitStream out, int[] data, int offset, int len, int lo, int hi)
          Writes to a bit stream an increasing sequence of integers using interpolative coding.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Method Detail

write

public static int write(OutputBitStream out,
                        int[] data,
                        int offset,
                        int len,
                        int lo,
                        int hi)
                 throws IOException
Writes to a bit stream an increasing sequence of integers using interpolative coding.

Note that the length of the sequence and the arguments lo and hi must be known at decoding time.

Parameters:
out - the output bit stream.
data - the vector containing the integer sequence.
offset - the offset into data where the sequence starts.
len - the number of integers to code.
lo - a lower bound (must be smaller than or equal to the first integer in the sequence).
hi - an upper bound (must be greater than or equal to the last integer in the sequence).
Returns:
the number of written bits.
Throws:
IOException

read

public static void read(InputBitStream in,
                        int[] data,
                        int offset,
                        int len,
                        int lo,
                        int hi)
                 throws IOException
Reads from a bit stream an increasing sequence of integers coded using interpolative coding.

Parameters:
in - the input bit stream.
data - the vector that will store the sequence; it may be null, in which case the integers are discarded.
offset - the offset into data at which the result will be stored.
len - the number of integers to decode.
lo - a lower bound (the same as the one given to write()).
hi - an upper bound (the same as the one given to write()).
Throws:
IOException
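
The following self-contained sketch illustrates why the decoder needs the same len, lo and hi that were used when coding. It is an assumption-laden illustration, not this class's implementation: the class InterpolativeRoundTrip and its methods are hypothetical, bits are kept in a string of '0' and '1' characters instead of a bit stream, and a plain fixed-width binary code stands in for whatever code the library actually uses. The width of each codeword is derived from the current bounds, so a decoder given different bounds would read the wrong number of bits.

```java
import java.util.Arrays;

// Hypothetical illustration; not part of MG4J.
final class InterpolativeRoundTrip {

    /** Bits used by a plain fixed-width code for values in [0, range); 0 when range is 1. */
    static int width(int range) {
        return 32 - Integer.numberOfLeadingZeros(range - 1);
    }

    /** Appends the codewords for data[offset..offset+len), known to lie in [lo, hi]. */
    static void encode(int[] data, int offset, int len, int lo, int hi, StringBuilder bits) {
        if (len == 0) return;
        int h = len / 2;
        int m = data[offset + h];
        int first = lo + h, last = hi - (len - h - 1);
        int w = width(last - first + 1);
        for (int i = w - 1; i >= 0; i--) bits.append(((m - first) >>> i) & 1);
        encode(data, offset, h, lo, m - 1, bits);
        encode(data, offset + h + 1, len - h - 1, m + 1, hi, bits);
    }

    /**
     * Decodes len integers starting at bit position pos and returns the position
     * after the last bit read. If data is null the integers are discarded.
     */
    static int decode(String bits, int pos, int[] data, int offset, int len, int lo, int hi) {
        if (len == 0) return pos;
        int h = len / 2;
        int first = lo + h, last = hi - (len - h - 1);
        int w = width(last - first + 1);   // same width the encoder computed
        int index = 0;
        for (int i = 0; i < w; i++) index = (index << 1) | (bits.charAt(pos++) - '0');
        int m = first + index;
        if (data != null) data[offset + h] = m;
        pos = decode(bits, pos, data, offset, h, lo, m - 1);
        return decode(bits, pos, data, offset + h + 1, len - h - 1, m + 1, hi);
    }

    public static void main(String[] args) {
        int[] positions = { 3, 8, 9, 15 };   // made-up term positions in a 21-word document
        StringBuilder bits = new StringBuilder();
        encode(positions, 0, positions.length, 0, 20, bits);

        int[] decoded = new int[positions.length];
        decode(bits.toString(), 0, decoded, 0, decoded.length, 0, 20);
        System.out.println(bits.length() + " bits, decoded " + Arrays.toString(decoded));
    }
}
```

Passing null as data still advances through the coded bits, which mirrors the documented behaviour of discarding the integers while skipping over them.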