Class ClarkeCormackScorer
java.lang.Object
  it.unimi.di.big.mg4j.search.score.AbstractScorer
    it.unimi.di.big.mg4j.search.score.AbstractWeightedScorer
      it.unimi.di.big.mg4j.search.score.ClarkeCormackScorer
- All Implemented Interfaces:
DelegatingScorer, Scorer, FlyweightPrototype<Scorer>
public class ClarkeCormackScorer extends AbstractWeightedScorer implements DelegatingScorer
Computes the Clarke–Cormack score of all interval iterators of a document. This score function is defined in Charles L.A. Clarke and Gordon V. Cormack, “Shortest-Substring Retrieval and Ranking”, ACM Transactions on Information Systems, 18(1):44−78, 2000, at page 65.
The score for each index depends on two parameters: an integer h and a double α. The score is obtained by summing a per-interval score over all intervals in the interval iterator under examination. The score assigned to an interval is 1 if the interval is shorter than h; otherwise, it is obtained by dividing h by the interval length and raising the result to the power of α.
Note that the score assigned to each interval is between 0 and 1 (higher scores correspond to better intervals). The score assigned to an interval iterator is thus bounded from above by the number of intervals; an alternative version yields normalized scores (in this case, the resulting value is an average instead of a sum). A scorer with similar relative ranks, but inherently (almost) normalised, is provided by VignaScorer.
Typically, one sets h=16 (or a bit larger) and α=1 (or a bit smaller), but the authors say that the method is rather stable with respect to changes in the parameter values.
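The scoring rule described above can be sketched in a few lines of plain Java. This is only an illustration under stated assumptions, not MG4J's implementation: the class ClarkeCormackSketch and its methods are hypothetical, intervals are represented solely by their lengths, and returning 0 when there are no intervals is an assumption.

```java
// Hypothetical, self-contained sketch of the scoring rule; not part of MG4J.
public final class ClarkeCormackSketch {

    /** Score of a single interval: 1 if shorter than h, otherwise (h / length)^alpha. */
    public static double intervalScore(int length, int h, double alpha) {
        return length < h ? 1.0 : Math.pow((double) h / length, alpha);
    }

    /**
     * Sum of the per-interval scores or, if normalize is true, their average.
     * Returning 0 for an empty set of intervals is an assumption of this sketch.
     */
    public static double score(int[] lengths, int h, double alpha, boolean normalize) {
        if (lengths.length == 0) return 0.0;
        double sum = 0.0;
        for (int length : lengths) sum += intervalScore(length, h, alpha);
        return normalize ? sum / lengths.length : sum;
    }

    public static void main(String[] args) {
        int[] lengths = {4, 16, 32};                          // three intervals, given by length
        System.out.println(score(lengths, 16, 1.0, false));  // sum: 1 + 1 + 0.5
        System.out.println(score(lengths, 16, 1.0, true));   // average of the same three scores
    }
}
```

With h=16 and α=1, an interval of length 32 contributes 0.5, while any interval shorter than 16 contributes the maximum score of 1.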
-
-
Field Summary
Fields
Modifier and Type | Field | Description
double | alpha | The parameter alpha.
static int | DEFAULT_H | The default value for h.
int | h | The parameter h.
boolean | normalize | Whether the result should be normalized (i.e., between 0 and 1).
-
Fields inherited from class it.unimi.di.big.mg4j.search.score.AbstractWeightedScorer
index2Weight
-
Fields inherited from class it.unimi.di.big.mg4j.search.score.AbstractScorer
documentIterator, indexIterator
-
-
Constructor Summary
Constructors
Constructor | Description
ClarkeCormackScorer() | Default constructor, assigning the default values (h=DEFAULT_H, α=1) to the parameters; the resulting scorer is normalized.
ClarkeCormackScorer(int h, double alpha, boolean normalize) | Creates a Clarke–Cormack scorer.
ClarkeCormackScorer(String h, String alpha, String normalize) | Creates a Clarke–Cormack scorer.
-
Method Summary
Modifier and Type | Method | Description
ClarkeCormackScorer | copy() |
double | score(Index index) | Returns a score for the current document of the last document iterator given to Scorer.wrap(DocumentIterator), but considering only a given index (optional operation).
String | toString() |
boolean | usesIntervals() | Returns true.
-
Methods inherited from class it.unimi.di.big.mg4j.search.score.AbstractWeightedScorer
getWeights, score, setWeights, wrap
-
Methods inherited from class it.unimi.di.big.mg4j.search.score.AbstractScorer
nextDocument
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
-
Methods inherited from interface it.unimi.di.big.mg4j.search.score.Scorer
getWeights, nextDocument, score, setWeights, wrap
-
-
-
-
Field Detail
-
DEFAULT_H
public static final int DEFAULT_H
The default value for h.
- See Also:
- Constant Field Values
-
h
public final int h
The parameter h.
-
alpha
public final double alpha
The parameter alpha.
-
normalize
public final boolean normalize
Whether the result should be normalized (i.e., between 0 and 1).
-
-
Constructor Detail
-
ClarkeCormackScorer
public ClarkeCormackScorer(int h, double alpha, boolean normalize)
Creates a Clarke–Cormack scorer.
- Parameters:
h - the parameter h.
alpha - the parameter α.
normalize - whether the result should be normalized.
-
ClarkeCormackScorer
public ClarkeCormackScorer(String h, String alpha, String normalize)
Creates a Clarke–Cormack scorer.
- Parameters:
h - the parameter h.
alpha - the parameter α.
normalize - whether the result should be normalized.
-
ClarkeCormackScorer
public ClarkeCormackScorer()
Default constructor, assigning the default values (h=DEFAULT_H, α=1) to the parameters; the resulting scorer is normalized.
-
-
Method Detail
-
copy
public ClarkeCormackScorer copy()
- Specified by:
copy in interface FlyweightPrototype<Scorer>
- Specified by:
copy in interface Scorer
-
score
public double score(Index index) throws IOException
Description copied from interface: Scorer
Returns a score for the current document of the last document iterator given to Scorer.wrap(DocumentIterator), but considering only a given index (optional operation).
- Specified by:
score in interface Scorer
- Parameters:
index - the only index to be considered.
- Returns:
- the score.
- Throws:
IOException
-
usesIntervals
public boolean usesIntervals()
Returns true.
- Specified by:
usesIntervals in interface Scorer
- Returns:
- true.
-
-