Class TfIdfScorer
- java.lang.Object
  - it.unimi.di.big.mg4j.search.score.AbstractScorer
    - it.unimi.di.big.mg4j.search.score.AbstractWeightedScorer
      - it.unimi.di.big.mg4j.search.score.TfIdfScorer
- All Implemented Interfaces: DelegatingScorer, Scorer, FlyweightPrototype<Scorer>
public class TfIdfScorer extends AbstractWeightedScorer implements DelegatingScorer
A scorer that implements the TF/IDF ranking formula.
There are a number of incarnations of the formula with small variations. Here, the weight assigned to a term that appears in f documents out of a collection of N documents, with respect to a document of length l in which the term appears c times, is
log(N / f) · c / l
This class uses a CounterCollectionVisitor and related classes to take into consideration only terms that are actually involved in the current document.
- Author:
- Sebastiano Vigna
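The weight formula in the class description can be illustrated with a small standalone sketch. It does not use MG4J and is not the library's implementation: the class name TfIdfWeightSketch and the method weight are hypothetical, and the natural logarithm is an assumption, since the description does not state the base.

```java
/** Illustrative sketch only; not the MG4J implementation. */
final class TfIdfWeightSketch {

    /**
     * Weight of a term with respect to one document: log(N / f) * c / l.
     *
     * @param n number of documents in the collection (N)
     * @param f number of documents in which the term appears (f)
     * @param c number of occurrences of the term in the document (c)
     * @param l length of the document (l)
     */
    static double weight(long n, long f, int c, int l) {
        if (n <= 0 || f <= 0 || f > n || c < 0 || l <= 0) {
            throw new IllegalArgumentException("invalid counts");
        }
        // Assumption: natural logarithm; the description does not state the base.
        return Math.log((double) n / f) * c / l;
    }

    public static void main(String[] args) {
        // A term found in 10 of 1000 documents, occurring 5 times in a document of length 100.
        System.out.println(weight(1_000, 10, 5, 100));
        // A term found in every document contributes nothing, because log(1) = 0.
        System.out.println(weight(1_000, 1_000, 5, 100));
    }
}
```

Under this formula, rarer terms (small f) and denser occurrences (large c relative to l) both raise the weight.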
Field Summary
- Fields inherited from class it.unimi.di.big.mg4j.search.score.AbstractWeightedScorer: index2Weight
- Fields inherited from class it.unimi.di.big.mg4j.search.score.AbstractScorer: documentIterator, indexIterator
Constructor Summary
Constructors
- TfIdfScorer()
Method Summary
- TfIdfScorer copy()
- double score(): Computes a score by calling Scorer.score(Index) for each index in the current document iterator, and adding the weighted results.
- double score(Index index): Returns a score for the current document of the last document iterator given to Scorer.wrap(DocumentIterator), but considering only a given index (optional operation).
- boolean usesIntervals(): Whether this scorer uses intervals.
- void wrap(DocumentIterator d): Wraps the given document iterator.
- Methods inherited from class it.unimi.di.big.mg4j.search.score.AbstractWeightedScorer: getWeights, setWeights
- Methods inherited from class it.unimi.di.big.mg4j.search.score.AbstractScorer: nextDocument
- Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
- Methods inherited from interface it.unimi.di.big.mg4j.search.score.Scorer: getWeights, nextDocument, setWeights
Method Detail
-
copy
public TfIdfScorer copy()
- Specified by: copy in interface FlyweightPrototype<Scorer>
- Specified by: copy in interface Scorer
-
score
public double score() throws IOException
Description copied from class: AbstractWeightedScorer
Computes a score by calling Scorer.score(Index) for each index in the current document iterator, and adding the weighted results.
- Specified by: score in interface Scorer
- Overrides: score in class AbstractWeightedScorer
- Returns: the combined weighted score.
- Throws: IOException
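The combination step described for score() (one score per index, multiplied by that index's weight and summed) can be sketched without MG4J as follows. This is only an illustration under stated assumptions: the class WeightedScoreSketch, the method combine, the use of string index names as map keys, and the default weight of 1.0 for an index without an explicit weight are all hypothetical and are not taken from the library.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative sketch only; not the MG4J implementation. */
final class WeightedScoreSketch {

    /**
     * Adds up per-index scores, each multiplied by the weight of its index.
     * Assumption: an index without an explicit weight counts with weight 1.0.
     */
    static double combine(Map<String, Double> scoreByIndex, Map<String, Double> weightByIndex) {
        double total = 0.0;
        for (Map.Entry<String, Double> entry : scoreByIndex.entrySet()) {
            double weight = weightByIndex.getOrDefault(entry.getKey(), 1.0);
            total += weight * entry.getValue();
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical per-index scores for one document.
        Map<String, Double> scores = new LinkedHashMap<>();
        scores.put("title", 0.5);
        scores.put("text", 0.25);

        // Hypothetical weights: matches in the title count twice as much.
        Map<String, Double> weights = new LinkedHashMap<>();
        weights.put("title", 2.0);
        weights.put("text", 1.0);

        System.out.println(combine(scores, weights)); // 2.0 * 0.5 + 1.0 * 0.25 = 1.25
    }
}
```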
-
score
public double score(Index index)
Description copied from interface: Scorer
Returns a score for the current document of the last document iterator given to Scorer.wrap(DocumentIterator), but considering only a given index (optional operation).
-
wrap
public void wrap(DocumentIterator d) throws IOException
Description copied from class: AbstractScorer
Wraps the given document iterator. This method records internally the provided iterator.
- Specified by: wrap in interface Scorer
- Overrides: wrap in class AbstractWeightedScorer
- Parameters: d - the document iterator that will be used in subsequent calls to Scorer.score() and Scorer.score(Index).
- Throws: IOException
-
usesIntervals
public boolean usesIntervals()
Description copied from interface: Scorer
Whether this scorer uses intervals. This method is essential when aggregating scorers, because if several scorers need intervals, a CachingDocumentIterator will be necessary.
- Specified by: usesIntervals in interface Scorer
- Returns: true if this scorer uses intervals.