java.lang.Object
  it.unimi.di.mg4j.document.AbstractDocumentFactory
    it.unimi.di.mg4j.document.PropertyBasedDocumentFactory
      it.unimi.di.mg4j.document.DispatchingDocumentFactory
public class DispatchingDocumentFactory extends PropertyBasedDocumentFactory
A document factory that dispatches the task of building documents to various subfactories according to some strategy.
The strategy is specified as (an object embedding) a method that determines which factory should be used on the basis of the metadata provided to the getDocument(InputStream, Reference2ObjectMap) method. Since the strategy will usually have to resolve the names of metadata, it is also passed this factory, so that the correct PropertyBasedDocumentFactory.resolve(Enum,Reference2ObjectMap) method can be invoked.
Moreover, at construction one must specify, for each subfactory and for each field of this factory, which field of the subfactory should be used. Note that, to guarantee sequential access, the fields specified for each subfactory should appear in increasing order.
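The dispatching idea described above can be illustrated with a minimal, self-contained sketch. It does not use the MG4J classes: DispatchSketch, Key, Factory and Strategy are hypothetical stand-ins, and documents are reduced to plain strings. The sketch only shows the shape of the mechanism, namely a strategy that receives the dispatcher itself so that it can resolve metadata against the defaults before picking a subfactory.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical, simplified stand-ins: none of these types belong to MG4J.
final class DispatchSketch {
    /** Hypothetical metadata keys. */
    enum Key { MIME_TYPE }

    /** Stand-in for a subfactory: builds a "document" (here, a string) from raw content. */
    interface Factory {
        String build(String rawContent);
    }

    /** Stand-in for the strategy: it receives the dispatcher so that it can call resolve(). */
    interface Strategy {
        Factory factoryFor(DispatchSketch dispatcher, Map<Key, Object> metadata);
    }

    private final Map<Key, Object> defaultMetadata;
    private final Strategy strategy;

    DispatchSketch(Map<Key, Object> defaultMetadata, Strategy strategy) {
        this.defaultMetadata = defaultMetadata;
        this.strategy = strategy;
    }

    /** Looks a key up in the per-document metadata first, then in the defaults. */
    Object resolve(Key key, Map<Key, Object> metadata) {
        Object value = metadata.get(key);
        return value != null ? value : defaultMetadata.get(key);
    }

    /** Delegates construction to whichever subfactory the strategy selects. */
    String getDocument(String rawContent, Map<Key, Object> metadata) {
        Factory chosen = strategy.factoryFor(this, metadata);
        if (chosen == null) throw new IllegalStateException("No subfactory selected");
        return chosen.build(rawContent);
    }

    public static void main(String[] args) {
        Factory html = raw -> "html:" + raw;
        Factory text = raw -> "text:" + raw;
        Strategy byMimeType = (dispatcher, metadata) ->
            "text/html".equals(dispatcher.resolve(Key.MIME_TYPE, metadata)) ? html : text;

        Map<Key, Object> defaults = new EnumMap<>(Key.class);
        defaults.put(Key.MIME_TYPE, "text/plain");
        DispatchSketch dispatcher = new DispatchSketch(defaults, byMimeType);

        Map<Key, Object> htmlMetadata = new EnumMap<>(Key.class);
        htmlMetadata.put(Key.MIME_TYPE, "text/html");
        Map<Key, Object> noMetadata = new EnumMap<>(Key.class);

        System.out.println(dispatcher.getDocument("<p>hi</p>", htmlMetadata)); // html subfactory
        System.out.println(dispatcher.getDocument("hi", noMetadata));          // default MIME type applies
    }
}
```

Passing the dispatcher to the strategy keeps the defaults in one place: the strategy never needs its own copy of the default metadata.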
Nested Class Summary | |
---|---|
static interface | DispatchingDocumentFactory.DispatchingStrategy: A strategy that decides which factory is appropriate using the document metadata. |
static class | DispatchingDocumentFactory.MetadataKeys: Case-insensitive keys for metadata. |
static class | DispatchingDocumentFactory.StringBasedDispatchingStrategy: A strategy that is based on trying to match the value of the metadata with a given key with respect to a certain set of values. |
Nested classes/interfaces inherited from interface it.unimi.di.mg4j.document.DocumentFactory |
---|
DocumentFactory.FieldType |
Field Summary | |
---|---|
static String | OTHERWISE_IN_RULE: The value to be used in RULE to introduce the default factory. |
Fields inherited from class it.unimi.di.mg4j.document.PropertyBasedDocumentFactory |
---|
defaultMetadata |
Constructor Summary |
---|
DispatchingDocumentFactory() |
DispatchingDocumentFactory(DocumentFactory[] documentFactory, String[] fieldName, DocumentFactory.FieldType[] fieldType, int[][] rename, DispatchingDocumentFactory.DispatchingStrategy strategy): Creates a new dispatching factory. |
DispatchingDocumentFactory(Properties properties) |
DispatchingDocumentFactory(Reference2ObjectMap<Enum<?>,Object> defaultMetadata) |
DispatchingDocumentFactory(String[] property) |
Method Summary | |
---|---|
DispatchingDocumentFactory | copy() |
int | fieldIndex(String fieldName): Returns the index of a field, given its symbolic name. |
String | fieldName(int field): Returns the symbolic name of a field. |
DocumentFactory.FieldType | fieldType(int field): Returns the type of a field. |
Document | getDocument(InputStream rawContent, Reference2ObjectMap<Enum<?>,Object> metadata): Returns the document obtained by parsing the given byte stream. |
static void | main(String[] arg) |
int | numberOfFields(): Returns the number of fields present in the documents produced by this factory. |
protected boolean | parseProperty(String key, String[] values, Reference2ObjectMap<Enum<?>,Object> metadata): Parses a property with given key and value, adding it to the given map. |
Methods inherited from class it.unimi.di.mg4j.document.PropertyBasedDocumentFactory |
---|
ensureJustOne, getInstance, getInstance, getInstance, getInstance, parseProperties, parseProperties, resolve, resolve, resolveNotNull, sameKey |
Methods inherited from class it.unimi.di.mg4j.document.AbstractDocumentFactory |
---|
ensureFieldIndex, toString |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Field Detail |
---|
public static final String OTHERWISE_IN_RULE
The value to be used in RULE to introduce the default factory. Otherwise, no default factory is provided for documents that do not match.
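As an illustration of a rule with a default entry, here is a small sketch under stated assumptions. It is not the MG4J implementation: the rule is modelled as a plain map from metadata values to factory names, and the marker "<otherwise>" is a hypothetical placeholder, since the actual value of OTHERWISE_IN_RULE is not shown on this page.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of a value-to-factory rule with an optional default entry.
final class RuleSketch {
    /** Placeholder marker; the real value of OTHERWISE_IN_RULE is not given in this page. */
    static final String OTHERWISE = "<otherwise>";

    /**
     * Returns the factory name mapped to the metadata value, or the default entry
     * if the rule has one, or null when nothing matches and no default exists.
     */
    static String select(Map<String, String> rule, String metadataValue) {
        String match = metadataValue == null ? null : rule.get(metadataValue);
        return match != null ? match : rule.get(OTHERWISE);
    }

    public static void main(String[] args) {
        Map<String, String> rule = new HashMap<>();
        rule.put("text/html", "HtmlFactory");      // hypothetical factory names
        rule.put("application/pdf", "PdfFactory");
        rule.put(OTHERWISE, "PlainTextFactory");   // the default factory

        System.out.println(select(rule, "text/html"));  // matched entry
        System.out.println(select(rule, "image/png"));  // falls back to the default

        rule.remove(OTHERWISE);
        System.out.println(select(rule, "image/png"));  // no default: null
    }
}
```

Without the default entry, a non-matching document has no factory at all, which is the situation the field description warns about.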
Constructor Detail |
---|
public DispatchingDocumentFactory(DocumentFactory[] documentFactory, String[] fieldName, DocumentFactory.FieldType[] fieldType, int[][] rename, DispatchingDocumentFactory.DispatchingStrategy strategy)
Parameters:
documentFactory - the array of subfactories.
fieldName - the names of this factory's fields.
fieldType - the types of this factory's fields.
rename - the way fields of this class are mapped to fields of the subfactories.
strategy - the strategy to decide which factory should be used.
public DispatchingDocumentFactory(Properties properties) throws ConfigurationException
Throws:
ConfigurationException
public DispatchingDocumentFactory(String[] property) throws ConfigurationException
Throws:
ConfigurationException
public DispatchingDocumentFactory(Reference2ObjectMap<Enum<?>,Object> defaultMetadata)
public DispatchingDocumentFactory()
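The rename parameter and the increasing-order requirement from the class description can be made concrete with a small check. This is a sketch under assumptions, not code from MG4J: it reads rename[i][j] as the index, in subfactory i, of the field used for field j of the dispatching factory, takes "increasing" to mean strictly increasing, and assumes every entry is a valid index (the real class may accept other values). RenameSketch and isSequential are hypothetical names.

```java
// Hypothetical validation of a rename table; not part of MG4J.
final class RenameSketch {
    /**
     * Returns true if every row has exactly one entry per field of the dispatching
     * factory and the entries of each row are strictly increasing.
     */
    static boolean isSequential(int[][] rename, int numberOfFields) {
        for (int[] row : rename) {
            if (row.length != numberOfFields) return false;
            for (int j = 1; j < row.length; j++) {
                // A later field must map to a later subfactory field.
                if (row[j] <= row[j - 1]) return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Two subfactories, two fields in the dispatching factory.
        int[][] increasing = { { 0, 1 }, { 1, 3 } };
        int[][] outOfOrder = { { 1, 0 }, { 0, 1 } };

        System.out.println(isSequential(increasing, 2)); // true
        System.out.println(isSequential(outOfOrder, 2)); // false
    }
}
```

The ordering matters because a subfactory that exposes its fields sequentially cannot be asked for an earlier field after a later one has been read.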
Method Detail |
---|
public DispatchingDocumentFactory copy()
protected boolean parseProperty(String key, String[] values, Reference2ObjectMap<Enum<?>,Object> metadata) throws ConfigurationException
Description copied from class: PropertyBasedDocumentFactory
Parses a property with given key and value, adding it to the given map.
Currently this implementation just parses the PropertyBasedDocumentFactory.MetadataKeys.LOCALE property.
Subclasses should do their own parsing, returning true in case of success and returning super.parseProperty() otherwise.
Overrides:
parseProperty in class PropertyBasedDocumentFactory
Parameters:
key - the property key.
values - the property value; this is an array, because properties may have a list of comma-separated values.
metadata - the metadata map.
Throws:
ConfigurationException
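The override contract described for parseProperty (handle your own keys, return true on success, defer to the superclass otherwise) can be sketched as follows. The classes are hypothetical stand-ins rather than the MG4J ones: metadata is a plain string-keyed map, and the "separators" key is invented for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins illustrating the "parse or defer to super" pattern.
final class ParsePropertySketch {
    /** Stand-in for the base factory: it only understands the "locale" key. */
    static class Base {
        protected boolean parseProperty(String key, String[] values, Map<String, Object> metadata) {
            if ("locale".equalsIgnoreCase(key)) {
                metadata.put("locale", values[0]);
                return true;
            }
            return false; // unknown property
        }
    }

    /** Stand-in for a subclass that adds one key of its own. */
    static class Custom extends Base {
        @Override
        protected boolean parseProperty(String key, String[] values, Map<String, Object> metadata) {
            if ("separators".equalsIgnoreCase(key)) { // hypothetical key
                metadata.put("separators", values.clone());
                return true;
            }
            // Not ours: let the superclass try.
            return super.parseProperty(key, values, metadata);
        }
    }

    public static void main(String[] args) {
        Map<String, Object> metadata = new HashMap<>();
        Custom factory = new Custom();
        System.out.println(factory.parseProperty("separators", new String[] { ",", ";" }, metadata)); // true
        System.out.println(factory.parseProperty("locale", new String[] { "it" }, metadata));         // true, via super
        System.out.println(factory.parseProperty("unknown", new String[] { "x" }, metadata));         // false
    }
}
```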
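public int numberOfFields()
Description copied from interface: DocumentFactory
Returns the number of fields present in the documents produced by this factory.
public String fieldName(int field)
Description copied from interface: DocumentFactory
Returns the symbolic name of a field.
Parameters:
field - the index of a field (between 0 inclusive and DocumentFactory.numberOfFields() exclusive).
Returns:
the symbolic name of the field-th field.
public int fieldIndex(String fieldName)
Description copied from interface: DocumentFactory
Returns the index of a field, given its symbolic name.
Parameters:
fieldName - the name of a field of this factory.
Returns:
the index of the field named fieldName.
public DocumentFactory.FieldType fieldType(int field)
Description copied from interface: DocumentFactory
Returns the type of a field.
The possible types are defined in DocumentFactory.FieldType.
Parameters:
field - the index of a field (between 0 inclusive and DocumentFactory.numberOfFields() exclusive).
Returns:
the type of the field-th field.
public Document getDocument(InputStream rawContent, Reference2ObjectMap<Enum<?>,Object> metadata) throws IOException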
Description copied from interface: DocumentFactory
Returns the document obtained by parsing the given byte stream.
The parameter metadata makes up for the lack of a simple keyword-based parameter-passing system in Java. This method might take several different types of “suggestions” gathered by the collection: typically, the document title, a URI representing the document, its MIME type, its encoding and so on. Some of this information might be set by default (as happens, for instance, in a PropertyBasedDocumentFactory).
Implementations of this method must consult the metadata provided by the collection, possibly complete them with default factory metadata, and proceed to the document construction.
Parameters:
rawContent - the raw content from which the document should be extracted; it must not be closed, as resource management is a responsibility of the DocumentCollection.
metadata - a map from enums (e.g., keys taken in PropertyBasedDocumentFactory) to various kinds of objects.
Throws:
IOException
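Two points of this contract lend themselves to a short sketch: completing the supplied metadata with defaults, and reading from the stream without closing it. The code below is an illustration under assumptions, not the MG4J implementation: GetDocumentSketch, Key and TrackingStream are hypothetical, the "document" is just the decoded text, and UTF-8 is an arbitrary last-resort encoding chosen for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.EnumMap;
import java.util.Map;

// Hypothetical stand-in for a factory method that honours the getDocument contract.
final class GetDocumentSketch {
    enum Key { ENCODING }

    /** Decodes the stream using the supplied encoding, else the default one; never closes the stream. */
    static String getDocument(InputStream rawContent, Map<Key, Object> metadata,
                              Map<Key, Object> defaultMetadata) throws IOException {
        Object encoding = metadata.get(Key.ENCODING);
        if (encoding == null) encoding = defaultMetadata.get(Key.ENCODING);
        Charset charset = encoding == null ? StandardCharsets.UTF_8 : Charset.forName(encoding.toString());

        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int read;
        while ((read = rawContent.read(chunk)) != -1) buffer.write(chunk, 0, read);
        // No rawContent.close() here: the caller owns the stream.
        return new String(buffer.toByteArray(), charset);
    }

    /** Helper that records whether close() was ever called. */
    static final class TrackingStream extends ByteArrayInputStream {
        boolean closed;
        TrackingStream(byte[] bytes) { super(bytes); }
        @Override public void close() throws IOException { closed = true; super.close(); }
    }

    public static void main(String[] args) throws IOException {
        Map<Key, Object> defaults = new EnumMap<>(Key.class);
        defaults.put(Key.ENCODING, "ISO-8859-1");
        Map<Key, Object> metadata = new EnumMap<>(Key.class); // the collection supplied no encoding

        byte[] bytes = "caff\u00e8".getBytes(StandardCharsets.ISO_8859_1);
        // The caller (playing the role of the collection) manages the stream.
        try (TrackingStream in = new TrackingStream(bytes)) {
            String document = getDocument(in, metadata, defaults);
            System.out.println(document.length() + " characters, closed by factory: " + in.closed);
        }
    }
}
```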
public static void main(String[] arg) throws IOException, ConfigurationException
Throws:
IOException
ConfigurationException