Class InterpolativeCoding


  • public final class InterpolativeCoding
    extends Object
    Static methods implementing interpolative coding.

    Interpolative coding is a compression technique for increasing sequences of integers. It exploits the fact that the neighbours of an element bound its value: when compressing the sequence 2 5 6, for instance, only three integers (3, 4, and 5) lie strictly between 2 and 6, so once 2 and 6 are known a small index suffices to identify 5 among 3, 4, and 5.
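
    The idea above can be sketched in a few lines. This is only an illustration under stated assumptions: the class InterpolativeIdeaSketch and its methods plan() and bits() are hypothetical names that are not part of this API, the sequence is assumed to be strictly increasing, and the fixed-width bit count is just an upper bound, since the bit-level code actually emitted by write() is not described on this page.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration class; not part of the library. */
public final class InterpolativeIdeaSketch {
    /**
     * Appends one step per element, in coding order: {value, index, candidates},
     * where index is the position of value among the candidates still possible.
     * Assumes data[offset..offset+len) is strictly increasing and lies within [lo, hi].
     */
    public static void plan(int[] data, int offset, int len, int lo, int hi, List<int[]> steps) {
        if (len == 0) return;
        final int h = len / 2;               // position of the middle element
        final int m = data[offset + h];      // its value
        final int min = lo + h;              // h smaller elements need room in [lo, m)
        final int max = hi - (len - h - 1);  // len - h - 1 larger elements need room in (m, hi]
        steps.add(new int[] { m, m - min, max - min + 1 });
        plan(data, offset, h, lo, m - 1, steps);                    // left half, narrowed upper bound
        plan(data, offset + h + 1, len - h - 1, m + 1, hi, steps);  // right half, narrowed lower bound
    }

    /** Bits needed by a fixed-width index over the given number of candidates (assumed coding). */
    public static int bits(int candidates) {
        return 32 - Integer.numberOfLeadingZeros(candidates - 1);
    }

    public static void main(String[] args) {
        List<int[]> steps = new ArrayList<>();
        plan(new int[] { 2, 5, 6 }, 0, 3, 2, 6, steps);
        for (int[] s : steps)
            System.out.println(s[0] + ": index " + s[1] + " of " + s[2] + " candidates, " + bits(s[2]) + " bits");
    }
}
```

    With lo = 2 and hi = 6, the middle element 5 is identified by index 2 among three candidates, and the last element 6 costs no bits at all because only one candidate remains for it.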

    The main limitation of interpolative coding is that the entire sequence must be held in an array, both when coding and when decoding. This limitation matters little when coding the positions of the occurrences of a term in a document, in particular in short documents, a task for which interpolative coding is very well suited.

    Since:
    0.6
    Author:
    Sebastiano Vigna
    • Method Summary

      Modifier and Type Method Description
      static void read(InputBitStream in, int[] data, int offset, int len, int lo, int hi)
      Reads from a bit stream an increasing sequence of integers coded using interpolative coding.
      static int write(OutputBitStream out, int[] data, int offset, int len, int lo, int hi)
      Writes to a bit stream an increasing sequence of integers using interpolative coding.
    • Method Detail

      • write

        public static int write(OutputBitStream out,
                                int[] data,
                                int offset,
                                int len,
                                int lo,
                                int hi)
                         throws IOException
        Writes to a bit stream an increasing sequence of integers using interpolative coding.

        Note that the length of the sequence and the arguments lo and hi must be known at decoding time.

        Parameters:
        out - the output bit stream.
        data - the vector containing the integer sequence.
        offset - the offset into data where the sequence starts.
        len - the number of integers to code.
        lo - a lower bound (must be smaller than or equal to the first integer in the sequence).
        hi - an upper bound (must be greater than or equal to the last integer in the sequence).
        Returns:
        the number of written bits.
        Throws:
        IOException
      • read

        public static void read(InputBitStream in,
                                int[] data,
                                int offset,
                                int len,
                                int lo,
                                int hi)
                         throws IOException
        Reads from a bit stream an increasing sequence of integers coded using interpolative coding.
        Parameters:
        in - the input bit stream.
        data - the vector that will store the sequence; it may be null, in which case the integers are discarded.
        offset - the offset into data where to store the result.
        len - the number of integers to decode.
        lo - a lower bound (the same as the one given to write()).
        hi - an upper bound (the same as the one given to write()).
        Throws:
        IOException
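
        The contract shared by write() and read(), namely that the decoder must be given the same len, lo, and hi as the encoder, can be illustrated with a self-contained toy round trip. This is a sketch under assumptions, not the library's implementation: ToyInterpolativeRoundTrip, encode(), and decode() are hypothetical names, the sequence is assumed to be strictly increasing, and the indices are stored as plain integers in a queue instead of being written to a bit stream.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

/** Hypothetical toy coder; it stores indices as ints rather than bits. */
public final class ToyInterpolativeRoundTrip {
    /** Emits, middle element first, the index of each element within its narrowed range. */
    public static void encode(int[] data, int offset, int len, int lo, int hi, Deque<Integer> out) {
        if (len == 0) return;
        final int h = len / 2;
        final int m = data[offset + h];
        final int min = lo + h, max = hi - (len - h - 1);
        if (m < min || m > max)
            throw new IllegalArgumentException("Sequence not strictly increasing within [lo, hi]");
        out.addLast(m - min);
        encode(data, offset, h, lo, m - 1, out);
        encode(data, offset + h + 1, len - h - 1, m + 1, hi, out);
    }

    /** Mirrors encode(): rebuilds each element from its index and the same len, lo, hi. */
    public static void decode(Deque<Integer> in, int[] data, int offset, int len, int lo, int hi) {
        if (len == 0) return;
        final int h = len / 2;
        final int min = lo + h, max = hi - (len - h - 1);
        final int index = in.removeFirst();
        if (index < 0 || index > max - min)
            throw new IllegalStateException("Index out of range: wrong len, lo, or hi?");
        final int m = min + index;
        data[offset + h] = m;
        decode(in, data, offset, h, lo, m - 1);
        decode(in, data, offset + h + 1, len - h - 1, m + 1, hi);
    }

    public static void main(String[] args) {
        int[] positions = { 3, 8, 9, 15, 21 };  // e.g. term positions in a 32-word document
        Deque<Integer> stream = new ArrayDeque<>();
        encode(positions, 0, positions.length, 0, 31, stream);
        int[] decoded = new int[positions.length];
        decode(stream, decoded, 0, decoded.length, 0, 31);
        System.out.println(Arrays.toString(decoded)); // prints [3, 8, 9, 15, 21]
    }
}
```

        Decoding the same indices with a different lower bound, say lo = 1, yields a different sequence, which is why len, lo, and hi have to be stored or otherwise known alongside the coded data.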