Class ClarkeCormackScorer
- java.lang.Object
  - it.unimi.di.big.mg4j.search.score.AbstractScorer
    - it.unimi.di.big.mg4j.search.score.AbstractWeightedScorer
      - it.unimi.di.big.mg4j.search.score.ClarkeCormackScorer
- All Implemented Interfaces: DelegatingScorer, Scorer, FlyweightPrototype<Scorer>
public class ClarkeCormackScorer extends AbstractWeightedScorer implements DelegatingScorer
Computes the Clarke–Cormack score of all interval iterators of a document. This score function is defined in Charles L.A. Clarke and Gordon V. Cormack, “Shortest-Substring Retrieval and Ranking”, ACM Transactions on Information Systems, 18(1):44−78, 2000, at page 65.
The score for each index depends on two parameters: an integer h and a double α. It is obtained by summing a score assigned to each interval returned by the interval iterator under examination. The score assigned to an interval is 1 if the interval has length smaller than h; otherwise, it is obtained by dividing h by the interval length and raising the result to the power of α.
Note that the score assigned to each interval is between 0 and 1 (the highest scores corresponding to the best intervals). The score assigned to an interval iterator is thus bounded from above by the number of intervals; an alternative version yields normalized scores (in this case, the resulting value is an average instead of a sum). A scorer with similar relative ranks, but inherently (almost) normalised, is provided by VignaScorer.
Typically, one sets h=16 (or a bit larger) and α=1 (or a bit smaller), but the authors say that the method is rather stable with respect to changes in the values of the parameters.
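The scoring rule described above can be illustrated with a small standalone sketch. This is not the MG4J implementation: the class ClarkeCormackSketch and its methods are hypothetical, interval lengths are passed in directly as an array rather than read from an interval iterator, and returning 0 for an empty interval list is an assumption.

```java
final class ClarkeCormackSketch {
    /** Score of a single interval: 1 if shorter than h, otherwise (h / length)^alpha. */
    public static double intervalScore(int length, int h, double alpha) {
        return length < h ? 1.0 : Math.pow((double) h / length, alpha);
    }

    /**
     * Score of a list of interval lengths: the sum of the per-interval scores,
     * or their average when normalize is true.
     * Returning 0 for an empty list is an assumption of this sketch.
     */
    public static double score(int[] lengths, int h, double alpha, boolean normalize) {
        if (lengths.length == 0) return 0.0;
        double sum = 0.0;
        for (int length : lengths) sum += intervalScore(length, h, alpha);
        return normalize ? sum / lengths.length : sum;
    }

    public static void main(String[] args) {
        int[] lengths = {4, 16, 32};
        // With h=16 and alpha=1 the per-interval scores are 1, 1 and 0.5.
        System.out.println(score(lengths, 16, 1.0, false)); // prints 2.5
        System.out.println(score(lengths, 16, 1.0, true));  // the average, 2.5 / 3
    }
}
```

With these inputs the unnormalized score is bounded by the number of intervals (3), while the normalized one stays between 0 and 1, as stated in the description.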
-
-
Field Summary
Fields
Modifier and Type | Field | Description
double | alpha | The parameter alpha.
static int | DEFAULT_H | The default value for h.
int | h | The parameter h.
boolean | normalize | Whether the result should be normalized (i.e., between 0 and 1).
Fields inherited from class it.unimi.di.big.mg4j.search.score.AbstractWeightedScorer
index2Weight
-
Fields inherited from class it.unimi.di.big.mg4j.search.score.AbstractScorer
documentIterator, indexIterator
-
-
Constructor Summary
Constructors
Constructor | Description
ClarkeCormackScorer() | Default constructor, assigning the default values (h=DEFAULT_H, α=1) to the parameters; the resulting scorer is normalized.
ClarkeCormackScorer(int h, double alpha, boolean normalize) | Creates a Clarke–Cormack scorer.
ClarkeCormackScorer(String h, String alpha, String normalize) | Creates a Clarke–Cormack scorer.
-
Method Summary
Modifier and Type | Method | Description
ClarkeCormackScorer | copy() |
double | score(Index index) | Returns a score for the current document of the last document iterator given to Scorer.wrap(DocumentIterator), but considering only a given index (optional operation).
String | toString() |
boolean | usesIntervals() | Returns true.
Methods inherited from class it.unimi.di.big.mg4j.search.score.AbstractWeightedScorer
getWeights, score, setWeights, wrap
-
Methods inherited from class it.unimi.di.big.mg4j.search.score.AbstractScorer
nextDocument
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
-
Methods inherited from interface it.unimi.di.big.mg4j.search.score.Scorer
getWeights, nextDocument, score, setWeights, wrap
-
Field Detail
-
DEFAULT_H
public static final int DEFAULT_H
The default value for h.
- See Also:
- Constant Field Values
-
h
public final int h
The parameter h.
-
alpha
public final double alpha
The parameter alpha.
-
normalize
public final boolean normalize
Whether the result should be normalized (i.e., between 0 and 1).
-
-
Constructor Detail
-
ClarkeCormackScorer
public ClarkeCormackScorer(int h, double alpha, boolean normalize)
Creates a Clarke–Cormack scorer.
- Parameters:
h - the parameter h.
alpha - the parameter α.
normalize - whether the result should be normalized.
-
ClarkeCormackScorer
public ClarkeCormackScorer(String h, String alpha, String normalize)
Creates a Clarke–Cormack scorer.
- Parameters:
h - the parameter h.
alpha - the parameter α.
normalize - whether the result should be normalized.
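The string-based constructor takes the same three parameters in textual form. The sketch below shows one plausible way such strings could be converted to the typed values. This page does not say how MG4J parses them, so the use of Integer.parseInt, Double.parseDouble and Boolean.parseBoolean is an assumption, and the class StringParamsSketch is hypothetical.

```java
final class StringParamsSketch {
    final int h;
    final double alpha;
    final boolean normalize;

    /** Converts textual parameters to typed ones (assumed parsing, not MG4J code). */
    StringParamsSketch(String h, String alpha, String normalize) {
        this.h = Integer.parseInt(h);                     // throws NumberFormatException on bad input
        this.alpha = Double.parseDouble(alpha);           // throws NumberFormatException on bad input
        this.normalize = Boolean.parseBoolean(normalize); // anything other than "true" (ignoring case) is false
    }

    public static void main(String[] args) {
        StringParamsSketch p = new StringParamsSketch("16", "0.9", "true");
        System.out.println(p.h + " " + p.alpha + " " + p.normalize); // prints "16 0.9 true"
    }
}
```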
-
ClarkeCormackScorer
public ClarkeCormackScorer()
Default constructor, assigning the default values (h=DEFAULT_H, α=1) to the parameters; the resulting scorer is normalized.
-
-
Method Detail
-
copy
public ClarkeCormackScorer copy()
- Specified by:
copy in interface FlyweightPrototype<Scorer>
- Specified by:
copy in interface Scorer
-
score
public double score(Index index) throws IOException
Description copied from interface: Scorer
Returns a score for the current document of the last document iterator given to Scorer.wrap(DocumentIterator), but considering only a given index (optional operation).
- Specified by:
score in interface Scorer
- Parameters:
index - the only index to be considered.
- Returns:
- the score.
- Throws:
IOException
-
usesIntervals
public boolean usesIntervals()
Returns true.
- Specified by:
usesIntervals in interface Scorer
- Returns:
- true.