Class QueryEngine

  • All Implemented Interfaces:
    FlyweightPrototype<QueryEngine>

    public class QueryEngine
    extends Object
    implements FlyweightPrototype<QueryEngine>
    An engine that takes a query and returns results, using a programmable set of scorers and policies.

    This class embodies most of the work that must be done when answering a query. Basically, process(query,offset,length,results) takes query, parses it, turns it into a document iterator, scans the results, and deposits length results starting at offset into the list results.

    There are, however, several additional features available. First of all, either by separating several queries with commas, or by calling process(Query[], int, int, ObjectArrayList) directly, it is possible to resolve a series of queries with an “and-then” semantics: results are added from each query, provided they did not appear before.
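
    The “and-then” semantics can be illustrated with a minimal, self-contained sketch. It is not MG4J code: documents are modelled as plain integer identifiers, and the class and method names (AndThenSketch, andThen) are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Illustration only, not MG4J code: documents are plain integer ids. */
class AndThenSketch {
    /**
     * Concatenates the per-query result lists in order, skipping documents
     * that already appeared, and returns the window [offset, offset + length).
     */
    static List<Integer> andThen(List<List<Integer>> perQueryResults, int offset, int length) {
        // A LinkedHashSet keeps first-insertion order and silently drops repeats.
        Set<Integer> seen = new LinkedHashSet<>();
        for (List<Integer> results : perQueryResults) {
            seen.addAll(results);
        }
        List<Integer> merged = new ArrayList<>(seen);
        int from = Math.min(offset, merged.size());
        int to = (int) Math.min((long) from + length, merged.size());
        return new ArrayList<>(merged.subList(from, to));
    }
}
```

    In this sketch a document returned by an earlier query keeps its earlier position, and later queries only contribute documents not yet seen.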

    It is possible to score queries using one or more scorers with different weights (see it.unimi.di.big.mg4j.search.score), and also to set different weights for different indices (they will be passed to the scorers). The scorers influence the order when processing each query, but results from different “and-then” queries are simply concatenated.

    When using multiple scorers, equalisation can be used to avoid the problem associated with the potentially different value ranges of each scorer. Equalisation evaluates a settable number of sample documents and normalizes the scorers using the maximum value in the sample. See AbstractAggregator for some elaboration.
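
    As a rough illustration of the idea behind equalisation, the sketch below divides each scorer's value by the maximum that scorer reached on a sample. It is an assumption-laden sketch rather than the AbstractAggregator implementation: the class EqualisationSketch, its methods, and the handling of a zero maximum are all hypothetical.

```java
import java.util.Arrays;

/** Illustration only, not MG4J code. */
class EqualisationSketch {
    /** Returns, for each scorer s, the maximum of sampleScores[s] (one entry per sample document). */
    static double[] sampleMaxima(double[][] sampleScores) {
        double[] max = new double[sampleScores.length];
        for (int s = 0; s < sampleScores.length; s++) {
            max[s] = Arrays.stream(sampleScores[s]).max().orElse(0.0);
        }
        return max;
    }

    /**
     * Divides each scorer's value by its sample maximum.
     * Assumption of this sketch: a zero maximum leaves the score unchanged.
     */
    static double[] equalise(double[] scores, double[] max) {
        double[] out = new double[scores.length];
        for (int s = 0; s < scores.length; s++) {
            out[s] = max[s] == 0.0 ? scores[s] : scores[s] / max[s];
        }
        return out;
    }
}
```

    After this step, scorers with very different native ranges produce values on a comparable scale before they are combined.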

    Multiplexing transforms a query q into index0:q | index1:q. In other words, the query is multiplexed on all available indices. Note that if inside q there are selection operators that specify an index, the inner specification will override the external one, so that the semantics of the query is only amplified, but never contradicted.
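
    At the level of query text, the transformation can be sketched as follows. This is only a string-based illustration with hypothetical names (MultiplexSketch, multiplex); the parentheses around q are an assumption of the sketch, added so that the index prefix applies to the whole query.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Illustration only, not MG4J code: works on query text, not on parsed queries. */
class MultiplexSketch {
    /** Rewrites q as alias0:(q) | alias1:(q) | ... over the given index aliases. */
    static String multiplex(String query, List<String> indexAliases) {
        return indexAliases.stream()
                .map(alias -> alias + ":(" + query + ")")
                .collect(Collectors.joining(" | "));
    }
}
```

    For instance, multiplex("foo bar", List.of("title", "text")) builds the text title:(foo bar) | text:(foo bar).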

    The results returned are instances of DocumentScoreInfo. If an interval selector has been set, the info field will contain a map from indices to arrays of selected intervals satisfying the query (see it.unimi.di.big.mg4j.search for some elaboration on minimal-interval semantics support in MG4J).

    For examples of usage of this class, please look at Query and QueryServlet.

    Warning: This class is highly experimental. It has become definitely more decent in MG4J, but still needs some refactoring.

    Warning: This class is not thread safe, but it provides flyweight copies. The copy() method is strengthened so as to return an object implementing this interface.
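
    The usual way to work with a class that is not thread safe but offers flyweight copies is to give each thread its own copy. The sketch below shows that pattern with a hypothetical stand-in class (Engine); it does not use QueryEngine or FlyweightPrototype, and it assumes that a copy shares read-only data while keeping its own mutable state.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustration only, not MG4J code. */
class FlyweightCopySketch {
    /** Hypothetical engine: shared read-only data plus per-instance scratch state. */
    static final class Engine {
        private final int[] sharedData;                          // read-only, shared by all copies
        private final List<Integer> scratch = new ArrayList<>(); // mutable, private to each copy

        Engine(int[] sharedData) { this.sharedData = sharedData; }

        /** A cheap copy: shares the read-only data, gets fresh scratch state. */
        Engine copy() { return new Engine(sharedData); }

        int countAtLeast(int threshold) {
            scratch.clear();
            for (int v : sharedData) if (v >= threshold) scratch.add(v);
            return scratch.size();
        }
    }

    /** Runs one task per threshold, each on its own thread with its own copy. */
    static int[] runInParallel(Engine prototype, int[] thresholds) {
        int[] counts = new int[thresholds.length];
        Thread[] threads = new Thread[thresholds.length];
        for (int i = 0; i < thresholds.length; i++) {
            final int slot = i;
            final Engine own = prototype.copy(); // never share one instance across threads
            threads[i] = new Thread(() -> counts[slot] = own.countAtLeast(thresholds[slot]));
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counts;
    }
}
```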

    Since:
    1.0
    Author:
    Sebastiano Vigna, Paolo Boldi
    • Field Detail

      • queryParser

        public final QueryParser queryParser
        The parser used to parse queries.
      • numIndices

        public final int numIndices
        The number of indices used by queryParser.
      • multiplex

        public volatile boolean multiplex
        Whether multiplex is active.
      • intervalSelector

        public volatile IntervalSelector intervalSelector
        The current interval selector, if any.
    • Method Detail

      • equalize

        public void equalize(int samples)
        Activates equalisation with the given number of samples.
        Parameters:
        samples - the number of samples for equalisation, or 0 for no equalisation.
      • score

        public void score(Scorer[] scorer,
                          double[] weight)
        Sets the scorers for this query engine.

        If scorer has length zero, scoring is disabled. If it has length 1, the only scorer is used for scoring, and the only element of weight is discarded. Otherwise, a LinearAggregator is used to combine results from the given scorers, using the given weights.

        Parameters:
        scorer - a (possibly empty) array of scorers.
        weight - a parallel array of weights (not to be confused with index weights).
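
        The three cases described above (no scorer, one scorer, several scorers) can be sketched as follows. This is not the MG4J implementation: scorers are modelled as functions from a document id to a score, the names ScoreSetupSketch and combine are hypothetical, and the sketch assumes that a linear aggregator computes a weighted sum and that the two arrays have the same length.

```java
import java.util.function.IntToDoubleFunction;

/** Illustration only, not MG4J code: a scorer maps a document id to a score. */
class ScoreSetupSketch {
    /** Returns the combined scoring function, or null when scoring is disabled. */
    static IntToDoubleFunction combine(IntToDoubleFunction[] scorer, double[] weight) {
        // Assumption of this sketch: the arrays are parallel, so lengths must match.
        if (scorer.length != weight.length) {
            throw new IllegalArgumentException("scorer and weight must have the same length");
        }
        if (scorer.length == 0) return null;      // no scorers: scoring disabled
        if (scorer.length == 1) return scorer[0]; // single scorer: its weight is ignored
        // Several scorers: weighted sum of the individual scores.
        return doc -> {
            double sum = 0;
            for (int i = 0; i < scorer.length; i++) {
                sum += weight[i] * scorer[i].applyAsDouble(doc);
            }
            return sum;
        };
    }
}
```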
      • transformer

        public void transformer(QueryTransformer transformer)
        Sets the transformer for this engine, or disables query transformation.
        Parameters:
        transformer - a query transformer, or null to disable query transformation.
      • process

        public int process(Query query,
                           int offset,
                           int length,
                           ObjectArrayList<DocumentScoreInfo<Reference2ObjectMap<Index, SelectedInterval[]>>> results)
                    throws QueryBuilderVisitorException,
                           IOException
        Processes one pre-parsed query and deposits in a given array a segment of the results corresponding to the query, using the current settings of this query engine.

        Results are accumulated with an “and-then” semantics: results are added from each query in order, provided they did not appear before.

        Parameters:
        query - a query.
        offset - the index of the first result to be added to results.
        length - the number of results to be added to results.
        results - an array list that will hold all results.
        Returns:
        the number of documents scanned while filling results.
        Throws:
        QueryBuilderVisitorException
        IOException
      • process

        public int process(Query[] query,
                           int offset,
                           int length,
                           ObjectArrayList<DocumentScoreInfo<Reference2ObjectMap<Index, SelectedInterval[]>>> results)
                    throws QueryBuilderVisitorException,
                           IOException
        Processes one or more pre-parsed queries and deposits in a given array a segment of the results corresponding to the queries, using the current settings of this query engine.

        Results are accumulated with an “and-then” semantics: results are added from each query in order, provided they did not appear before.

        Parameters:
        query - an array of queries.
        offset - the index of the first result to be added to results.
        length - the number of results to be added to results.
        results - an array list that will hold all results.
        Returns:
        the number of documents scanned while filling results.
        Throws:
        QueryBuilderVisitorException
        IOException