Class TfIdfScorer

  • All Implemented Interfaces:
    DelegatingScorer, Scorer, FlyweightPrototype<Scorer>

    public class TfIdfScorer
    extends AbstractWeightedScorer
    implements DelegatingScorer
    A scorer that implements the TF/IDF ranking formula.

    TF/IDF exists in a number of incarnations that differ by small variations of the formula itself. Here, the weight assigned to a term that appears in f documents out of a collection of N documents, with respect to a document of length l in which the term appears c times, is

    log(N / f) · c / l.
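
    As an illustration only, the formula above can be sketched in plain Java as follows. The class TfIdfSketch and its methods weight and score are hypothetical names and are not part of this library. The sketch assumes a natural logarithm (the formula does not state a base) and assumes that a document's score is the sum of the weights of the terms that occur in it; neither assumption is confirmed by this page.

```java
/** Hypothetical illustration of the weight formula; not the library's implementation. */
public final class TfIdfSketch {
    private TfIdfSketch() {}

    /**
     * Weight of one term with respect to one document: log(N / f) * c / l.
     *
     * @param n number of documents in the collection (N)
     * @param f number of documents in which the term appears (f)
     * @param c number of occurrences of the term in the document (c)
     * @param l length of the document (l)
     */
    public static double weight(long n, long f, int c, int l) {
        if (n <= 0 || f <= 0 || f > n || c < 0 || l <= 0) {
            throw new IllegalArgumentException("Inconsistent collection or document statistics");
        }
        // Assumption: natural logarithm; the source formula leaves the base unspecified.
        return Math.log((double) n / f) * c / l;
    }

    /**
     * Assumption: the score of a document is the sum of the weights of the
     * terms that actually occur in it; terms with a zero count are skipped.
     */
    public static double score(long n, long[] f, int[] c, int l) {
        if (f.length != c.length) {
            throw new IllegalArgumentException("One frequency and one count per term are required");
        }
        double total = 0;
        for (int i = 0; i < f.length; i++) {
            if (c[i] > 0) {
                total += weight(n, f[i], c[i], l);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // A term found in 10 of 1000 documents, occurring 3 times in a document of length 150.
        System.out.println(weight(1000, 10, 3, 150));
    }
}
```

    Note that a term appearing in every document (f = N) receives weight zero, since log(1) = 0.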

    This class uses a CounterCollectionVisitor and related classes to take into consideration only terms that are actually involved in the current document.

    Author: Sebastiano Vigna