it.unimi.di.mg4j.index.cluster
Class ChainedLexicalClusteringStrategy

java.lang.Object
  extended by it.unimi.di.mg4j.index.cluster.ChainedLexicalClusteringStrategy
All Implemented Interfaces:
ClusteringStrategy, LexicalClusteringStrategy, Serializable

public class ChainedLexicalClusteringStrategy
extends Object
implements LexicalClusteringStrategy

A lexical clustering strategy that uses a chain of responsibility to choose the local index: the term maps in a given list are queried in order until one contains the given term.

If the index cluster has Bloom filters, they will be used to reduce useless accesses to term maps.

The intended usage of this class is memory/disk lexical partitioning. Note that a serialised version of this class is empty. It acts just as a placeholder, so that loaders know that they must generate a new instance depending on the indices contained in the cluster.
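
The chained lookup described above can be illustrated with a minimal, self-contained sketch. It is not the MG4J implementation: the class ChainedLookupSketch is hypothetical, plain sets stand in for term maps, and predicates stand in for Bloom filters. The only behaviour assumed is what the description states, namely that term maps are tried in order and that a filter, when present, is consulted first to avoid a useless access.

```java
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

/** Hypothetical stand-in for the chained lookup; not the MG4J implementation. */
public class ChainedLookupSketch {
    /** One set of terms per local index (stands in for a term map). */
    private final List<Set<String>> termMaps;
    /** Optional filters parallel to termMaps (stand in for Bloom filters); may be null. */
    private final List<Predicate<String>> filters;

    public ChainedLookupSketch(List<Set<String>> termMaps, List<Predicate<String>> filters) {
        if (filters != null && filters.size() != termMaps.size())
            throw new IllegalArgumentException("filters must be parallel to termMaps");
        this.termMaps = termMaps;
        this.filters = filters;
    }

    /** Returns the position of the first term map containing the term, or -1 if none does. */
    public int localIndex(CharSequence term) {
        final String t = term.toString();
        for (int i = 0; i < termMaps.size(); i++) {
            // A filter that rejects the term lets us skip the (possibly costly) map access.
            if (filters != null && !filters.get(i).test(t)) continue;
            if (termMaps.get(i).contains(t)) return i;
        }
        return -1;
    }
}
```

Because the maps are tried in order, a term present in more than one map is assigned to the first one in the list, which is why the order of the array matters in a memory/disk setup.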

Author:
Sebastiano Vigna
See Also:
Serialized Form

Constructor Summary
ChainedLexicalClusteringStrategy(Index[] index)
          Creates a new chained lexical clustering strategy.
ChainedLexicalClusteringStrategy(Index[] index, BloomFilter[] termFilter)
          Creates a new chained lexical clustering strategy using additional Bloom filters.
 
Method Summary
 int globalNumber(int localIndex, int localNumber)
          Returns the global term number given a local index and a local term number (optional operation).
 int localIndex(CharSequence term)
          Returns the index to which a given term is mapped by this strategy.
 int numberOfLocalIndices()
          Returns the number of local indices handled by this strategy.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

ChainedLexicalClusteringStrategy

public ChainedLexicalClusteringStrategy(Index[] index,
                                        BloomFilter[] termFilter)
Creates a new chained lexical clustering strategy using additional Bloom filters.

Note that the static type of the parameter index is an array of Index, but the elements of the array must be disk-based indices, or an exception will be thrown.

Parameters:
index - an array of disk-based indices, from which term maps will be extracted.
termFilter - an array, parallel to index, of Bloom filters representing the terms contained in each local index.

ChainedLexicalClusteringStrategy

public ChainedLexicalClusteringStrategy(Index[] index)
Creates a new chained lexical clustering strategy.

Note that the static type of the parameter index is an array of Index, but the elements of the array must be disk-based indices, or an exception will be thrown.

Parameters:
index - an array of disk-based indices, from which term maps will be extracted.

Method Detail

numberOfLocalIndices

public int numberOfLocalIndices()
Description copied from interface: ClusteringStrategy
Returns the number of local indices handled by this strategy.

Specified by:
numberOfLocalIndices in interface ClusteringStrategy
Returns:
the number of local indices handled by this strategy.

localIndex

public int localIndex(CharSequence term)
Description copied from interface: LexicalClusteringStrategy
Returns the index to which a given term is mapped by this strategy.

Specified by:
localIndex in interface LexicalClusteringStrategy
Parameters:
term - a term.
Returns:
the corresponding local index, or -1 if no index contains the term.
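
Since -1 signals that no index contains the term, callers need to check the result before using it as an array position. The following sketch shows one way to do that. The class LocalIndexCallerSketch and its route method are hypothetical, and a ToIntFunction stands in for the localIndex method so that the example compiles without MG4J.

```java
import java.util.OptionalInt;
import java.util.function.ToIntFunction;

/** Hypothetical caller-side helper; not part of MG4J. */
public class LocalIndexCallerSketch {
    /**
     * Wraps the result of a localIndex-like function, turning the
     * -1 sentinel into an empty OptionalInt.
     */
    public static OptionalInt route(ToIntFunction<CharSequence> localIndex, CharSequence term) {
        final int k = localIndex.applyAsInt(term);
        return k == -1 ? OptionalInt.empty() : OptionalInt.of(k);
    }
}
```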

globalNumber

public int globalNumber(int localIndex,
                        int localNumber)
Description copied from interface: LexicalClusteringStrategy
Returns the global term number given a local index and a local term number (optional operation).

This operation is not, in general, necessary for a LexicalCluster to work, as no action on a local index returns local numbers. It is defined here mainly for completeness and for debugging purposes (in case it is implemented).

Specified by:
globalNumber in interface LexicalClusteringStrategy
Parameters:
localIndex - the local index.
localNumber - the local term number.
Returns:
the global term number.
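
By the usual Java convention, an optional operation may throw UnsupportedOperationException when it is not implemented; this page does not state whether this class implements it. The sketch below shows how debugging code might cope with either case. The class GlobalNumberSketch and its nested Strategy interface are hypothetical stand-ins that mirror only the signature documented above.

```java
/** Hypothetical debugging helper; not part of MG4J. */
public class GlobalNumberSketch {
    /** Mirrors only the documented signature of globalNumber. */
    public interface Strategy {
        int globalNumber(int localIndex, int localNumber);
    }

    /** Describes a term by global number if available, otherwise by its local coordinates. */
    public static String describe(Strategy s, int localIndex, int localNumber) {
        try {
            return "global term " + s.globalNumber(localIndex, localNumber);
        } catch (UnsupportedOperationException e) {
            // The optional operation is not implemented: fall back to local coordinates.
            return "local term " + localNumber + " in index " + localIndex;
        }
    }
}
```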