it.unimi.di.mg4j.query
Class QueryEngine

java.lang.Object
  extended by it.unimi.di.mg4j.query.QueryEngine
All Implemented Interfaces:
FlyweightPrototype<QueryEngine>

public class QueryEngine
extends Object
implements FlyweightPrototype<QueryEngine>

An engine that takes a query and returns results, using a programmable set of scorers and policies.

This class embodies most of the work that must be done when answering a query. In short, process(query, offset, length, results) takes query, parses it, turns it into a document iterator, scans the results, and deposits length results starting at offset into the list results.
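The offset/length contract can be illustrated with a small, library-free sketch. The class ResultWindow and its method window are hypothetical names, not part of MG4J. The sketch assumes that fewer than length results are deposited when the ranked list is too short; the source does not state what the engine does in that case.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of the offset/length window; not MG4J code. */
final class ResultWindow {
    /** Returns at most length items of ranked, starting at position offset. */
    static <T> List<T> window(List<T> ranked, int offset, int length) {
        if (offset < 0 || length < 0) {
            throw new IllegalArgumentException("offset and length must be nonnegative");
        }
        // Clamp both ends to the list size so short lists yield short windows.
        int from = Math.min(offset, ranked.size());
        int to = (int) Math.min((long) offset + length, ranked.size());
        return new ArrayList<>(ranked.subList(from, to));
    }

    public static void main(String[] args) {
        List<String> ranked = List.of("a", "b", "c", "d", "e");
        System.out.println(window(ranked, 1, 3)); // prints [b, c, d]
    }
}
```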

There are, however, several additional features available. First of all, either by separating several queries with commas or by calling process(Query[], int, int, ObjectArrayList) directly, it is possible to resolve a series of queries with an “and-then” semantics: results are added from each query, provided they did not appear before.
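The “and-then” semantics amounts to an order-preserving concatenation with duplicates dropped. The following is a minimal sketch over plain document numbers; AndThenSketch and andThen are hypothetical names, and the real engine works on document iterators rather than precomputed lists.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical illustration of "and-then" merging; not MG4J code. */
final class AndThenSketch {
    /** Concatenates per-query result lists, skipping documents already seen. */
    static List<Integer> andThen(List<List<Integer>> perQueryResults) {
        // A LinkedHashSet keeps the first occurrence of each document
        // in insertion order and silently ignores later duplicates.
        Set<Integer> seen = new LinkedHashSet<>();
        for (List<Integer> results : perQueryResults) {
            seen.addAll(results);
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        List<List<Integer>> perQuery = List.of(List.of(3, 1), List.of(1, 7, 3), List.of(9));
        System.out.println(andThen(perQuery)); // prints [3, 1, 7, 9]
    }
}
```

Documents 1 and 3 are returned by the second query as well, but they keep the position assigned by the first one.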

It is possible to score queries using one or more scorers with different weights (see it.unimi.di.mg4j.search.score), and also to set different weights for different indices (they will be passed to the scorers). The scorers influence the order when processing each query, but results from different “and-then” queries are simply concatenated.

When using multiple scorers, equalisation can be used to avoid the problem associated with the potentially different value ranges of each scorer. Equalisation evaluates a settable number of sample documents and normalizes the scorers using the maximum value in the sample. See AbstractAggregator for some elaboration.
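As a rough illustration of the idea, the sketch below divides each scorer's values by the maximum it attains on the first samples documents. EqualisationSketch and its equalize method are hypothetical names; the exact procedure used by AbstractAggregator may differ, and leaving scores untouched when the sample maximum is zero is an assumption of this sketch.

```java
import java.util.Arrays;

/** Hypothetical illustration of score equalisation; not MG4J code. */
final class EqualisationSketch {
    /**
     * rawScores[s][d] is the score given by scorer s to the d-th document.
     * Each row is divided by its maximum over the first samples documents.
     */
    static double[][] equalize(double[][] rawScores, int samples) {
        double[][] out = new double[rawScores.length][];
        for (int s = 0; s < rawScores.length; s++) {
            double[] row = rawScores[s];
            int n = Math.min(samples, row.length);
            double max = 0;
            for (int d = 0; d < n; d++) max = Math.max(max, row[d]);
            out[s] = new double[row.length];
            for (int d = 0; d < row.length; d++) {
                // Assumption: with no usable maximum (e.g. samples == 0) keep the raw score.
                out[s][d] = max > 0 ? row[d] / max : row[d];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] raw = { { 10, 5, 20 }, { 0.2, 0.4, 0.1 } };
        System.out.println(Arrays.deepToString(equalize(raw, 2)));
    }
}
```

Note that documents outside the sample can still receive a normalized score above 1, as the third document does for the first scorer here.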

Multiplexing transforms a query q into index0:q | index1:q. In other words, the query is multiplexed on all available indices. Note that if inside q there are selection operators that specify an index, the inner specification will override the external one, so that the semantics of the query is only amplified, but never contradicted.
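The shape of the multiplexed query can be shown with a purely textual sketch. MultiplexSketch and multiplex are hypothetical names; the sketch treats q as a single atomic term and builds a string, whereas the engine presumably rewrites the parsed query rather than its text (an assumption, since the source does not say).

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical, string-level illustration of multiplexing; not MG4J code. */
final class MultiplexSketch {
    /** Builds "index0:q | index1:q | ..." for the given index names. */
    static String multiplex(String q, List<String> indexNames) {
        return indexNames.stream()
                .map(name -> name + ":" + q)
                .collect(Collectors.joining(" | "));
    }

    public static void main(String[] args) {
        System.out.println(multiplex("foo", List.of("title", "text"))); // prints title:foo | text:foo
    }
}
```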

The results returned are instances of DocumentScoreInfo. If an interval selector has been set, the info field will contain a map from indices to arrays of selected intervals satisfying the query (see it.unimi.di.mg4j.search for some elaboration on minimal-interval semantics support in MG4J).

For examples of usage of this class, please look at Query and QueryServlet.

Warning: This class is highly experimental. It has become definitely more decent in MG4J, but still needs some refactoring.

Warning: This class is not thread safe, but it provides flyweight copies. The copy() method is strengthened so as to return an object implementing this interface.
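The usual way to use a non-thread-safe class that offers flyweight copies is to give each thread its own copy. The sketch below shows the pattern with stand-in types: Prototype and Engine are hypothetical and only mimic the copy() contract, and the choice of which state is shared and which is duplicated is an assumption, not a description of QueryEngine.

```java
import java.util.Map;

/** Hypothetical illustration of per-thread flyweight copies; not MG4J code. */
final class FlyweightSketch {
    /** Stand-in for a flyweight-prototype interface with a single copy() method. */
    interface Prototype<T> {
        T copy();
    }

    static final class Engine implements Prototype<Engine> {
        final Map<String, Integer> indexMap; // immutable data, shared by all copies
        int samples;                         // mutable setting, private to each copy

        Engine(Map<String, Integer> indexMap) {
            this.indexMap = indexMap;
        }

        @Override
        public Engine copy() {
            Engine e = new Engine(indexMap); // share the heavy, read-only part
            e.samples = samples;             // duplicate the mutable part
            return e;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Engine prototype = new Engine(Map.of("title", 0, "text", 1));
        // Each thread lazily obtains its own copy of the prototype.
        ThreadLocal<Engine> perThread = ThreadLocal.withInitial(prototype::copy);
        Runnable task = () -> {
            Engine mine = perThread.get();
            mine.samples = 10; // does not affect the prototype or other threads
            System.out.println(Thread.currentThread().getName()
                    + " uses a private copy: " + (mine != prototype));
        };
        Thread a = new Thread(task, "worker-a");
        Thread b = new Thread(task, "worker-b");
        a.start();
        b.start();
        a.join();
        b.join();
    }
}
```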

Since:
1.0
Author:
Sebastiano Vigna, Paolo Boldi

Field Summary
protected  Reference2DoubleOpenHashMap<Index> index2Weight
          A map associating a weight with each index.
 Object2ReferenceMap<String,Index> indexMap
          A map from names to indices.
 IntervalSelector intervalSelector
          The current interval selector, if any.
 boolean multiplex
          Whether multiplex is active.
 int numIndices
          The number of indices used by queryParser.
 QueryParser queryParser
          The parser used to parse queries.
 
Constructor Summary
QueryEngine(QueryParser queryParser, QueryBuilderVisitor<DocumentIterator> builderVisitor, Object2ReferenceMap<String,Index> indexMap)
          Creates a new query engine.
 
Method Summary
 QueryEngine copy()
           
 void equalize(int samples)
          Activates equalisation with the given number of samples.
 int process(Query[] query, int offset, int length, ObjectArrayList<DocumentScoreInfo<Reference2ObjectMap<Index,SelectedInterval[]>>> results)
          Processes one or more pre-parsed queries and deposits in a given array a segment of the results corresponding to the queries, using the current settings of this query engine.
 int process(Query query, int offset, int length, ObjectArrayList<DocumentScoreInfo<Reference2ObjectMap<Index,SelectedInterval[]>>> results)
          Processes one pre-parsed query and deposits in a given array a segment of the results corresponding to the query, using the current settings of this query engine.
 int process(String queries, int offset, int length, ObjectArrayList<DocumentScoreInfo<Reference2ObjectMap<Index,SelectedInterval[]>>> results)
          Parses one or more comma-separated queries and deposits in a given array a segment of the results corresponding to the queries, using the current settings of this query engine.
 void score(Scorer scorer)
          Sets a scorer for this query engine.
 void score(Scorer[] scorer, double[] weight)
          Sets the scorers for this query engine.
 void setWeights(Reference2DoubleMap<Index> index2Weight)
          Sets the index weights.
 String toString()
           
 void transformer(QueryTransformer transformer)
          Sets the transformer for this engine, or disables query transformation.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

queryParser

public final QueryParser queryParser
The parser used to parse queries.


indexMap

public final Object2ReferenceMap<String,Index> indexMap
A map from names to indices.


numIndices

public final int numIndices
The number of indices used by queryParser.


multiplex

public volatile boolean multiplex
Whether multiplex is active.


intervalSelector

public volatile IntervalSelector intervalSelector
The current interval selector, if any.


index2Weight

protected final Reference2DoubleOpenHashMap<Index> index2Weight
A map associating a weight with each index.

Constructor Detail

QueryEngine

public QueryEngine(QueryParser queryParser,
                   QueryBuilderVisitor<DocumentIterator> builderVisitor,
                   Object2ReferenceMap<String,Index> indexMap)
Creates a new query engine.

Parameters:
queryParser - a query parser, or null if this query engine will just process pre-parsed queries.
builderVisitor - a builder visitor to transform queries into document iterators.
indexMap - a map from symbolic names to indices (used for multiplexing and default weight initialisation).
Method Detail

copy

public QueryEngine copy()
Specified by:
copy in interface FlyweightPrototype<QueryEngine>

equalize

public void equalize(int samples)
Activates equalisation with the given number of samples.

Parameters:
samples - the number of samples for equalisation, or 0 for no equalisation.

score

public void score(Scorer[] scorer,
                  double[] weight)
Sets the scorers for this query engine.

If scorer has length zero, scoring is disabled. If it has length 1, the only scorer is used for scoring, and the only element of weight is discarded. Otherwise, a LinearAggregator is used to combine results from the given scorers, using the given weights.

Parameters:
scorer - a (possibly empty) array of scorers.
weight - a parallel array of weights (not to be confused with index weights).
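
The three cases described for score(Scorer[], double[]) can be sketched on plain numbers. LinearAggregationSketch and combine are hypothetical names; the sketch assumes a plain weighted sum for the multi-scorer case, and whether LinearAggregator additionally normalizes weights or scores is not stated in this page.

```java
import java.util.OptionalDouble;

/** Hypothetical illustration of the scorer-combination rules; not MG4J code. */
final class LinearAggregationSketch {
    /**
     * scorerValues[i] is the value produced by the i-th scorer for one document.
     * Returns an empty result when scoring is disabled (no scorers).
     */
    static OptionalDouble combine(double[] scorerValues, double[] weights) {
        if (scorerValues.length == 0) {
            return OptionalDouble.empty(); // no scorers: scoring disabled
        }
        if (scorerValues.length == 1) {
            return OptionalDouble.of(scorerValues[0]); // single scorer: weight is discarded
        }
        if (weights.length != scorerValues.length) {
            throw new IllegalArgumentException("weights must be parallel to scorers");
        }
        double sum = 0;
        for (int i = 0; i < scorerValues.length; i++) {
            sum += weights[i] * scorerValues[i]; // assumed: plain weighted sum
        }
        return OptionalDouble.of(sum);
    }

    public static void main(String[] args) {
        System.out.println(combine(new double[] { 1.0, 2.0 }, new double[] { 0.5, 0.25 }));
    }
}
```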

score

public void score(Scorer scorer)
Sets a scorer for this query engine.

Parameters:
scorer - a scorer.
See Also:
score(Scorer[], double[])

transformer

public void transformer(QueryTransformer transformer)
Sets the transformer for this engine, or disables query transformation.

Parameters:
transformer - a query transformer, or null to disable query transformation.

setWeights

public void setWeights(Reference2DoubleMap<Index> index2Weight)
Sets the index weights.

This method just delegates to Scorer.setWeights(Reference2DoubleMap).

Parameters:
index2Weight - a map from indices to weights.

process

public int process(String queries,
                   int offset,
                   int length,
                   ObjectArrayList<DocumentScoreInfo<Reference2ObjectMap<Index,SelectedInterval[]>>> results)
            throws QueryParserException,
                   QueryBuilderVisitorException,
                   IOException
Parses one or more comma-separated queries and deposits in a given array a segment of the results corresponding to the queries, using the current settings of this query engine.

Results are accumulated with an “and-then” semantics: results are added from each query in order, provided they did not appear before.

Parameters:
queries - one or more queries separated by commas.
offset - the first result to be added to results.
length - the number of results to be added to results.
results - an array list that will hold all results.
Returns:
the number of relevant documents scanned while filling results.
Throws:
QueryParserException
QueryBuilderVisitorException
IOException

process

public int process(Query query,
                   int offset,
                   int length,
                   ObjectArrayList<DocumentScoreInfo<Reference2ObjectMap<Index,SelectedInterval[]>>> results)
            throws QueryBuilderVisitorException,
                   IOException
Processes one pre-parsed query and deposits in a given array a segment of the results corresponding to the query, using the current settings of this query engine.

Parameters:
query - a query.
offset - the first result to be added to results.
length - the number of results to be added to results.
results - an array list that will hold all results.
Returns:
the number of documents scanned while filling results.
Throws:
QueryBuilderVisitorException
IOException

process

public int process(Query[] query,
                   int offset,
                   int length,
                   ObjectArrayList<DocumentScoreInfo<Reference2ObjectMap<Index,SelectedInterval[]>>> results)
            throws QueryBuilderVisitorException,
                   IOException
Processes one or more pre-parsed queries and deposits in a given array a segment of the results corresponding to the queries, using the current settings of this query engine.

Results are accumulated with an “and-then” semantics: results are added from each query in order, provided they did not appear before.

Parameters:
query - an array of queries.
offset - the first result to be added to results.
length - the number of results to be added to results.
results - an array list that will hold all results.
Returns:
the number of documents scanned while filling results.
Throws:
QueryBuilderVisitorException
IOException

toString

public String toString()
Overrides:
toString in class Object