java.lang.Object
  it.unimi.di.mg4j.document.AbstractDocumentFactory
    it.unimi.di.mg4j.document.ReplicatedDocumentFactory
public class ReplicatedDocumentFactory extends AbstractDocumentFactory
A factory that replicates a given factory several times; it is a special case of a composite factory.
Note that, in general, replicated factories support only sequential access to field content (although skipping items is allowed).
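The following is a minimal sketch of how a replicated field layout could be modelled, not the MG4J implementation. It assumes that the copies are laid out consecutively, so that a replicated field index maps back to the underlying factory by a modulo operation, and that an unknown field name yields -1. The class ReplicatedLayoutSketch, the method underlyingField() and the use of java.util.HashMap in place of Object2IntOpenHashMap are all hypothetical choices made to keep the example self-contained.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of a replicated field layout; NOT the MG4J implementation. */
public class ReplicatedLayoutSketch {
    private final int underlyingFields;
    private final String[] fieldName;
    private final Map<String, Integer> field2Index = new HashMap<>();

    public ReplicatedLayoutSketch(int underlyingFields, int numberOfCopies, String[] fieldName) {
        if (fieldName.length != underlyingFields * numberOfCopies)
            throw new IllegalArgumentException(
                "Expected " + (underlyingFields * numberOfCopies) + " field names");
        this.underlyingFields = underlyingFields;
        this.fieldName = fieldName.clone();
        // Build the name-to-index map once, so that lookups by name are cheap.
        for (int i = 0; i < fieldName.length; i++) field2Index.put(fieldName[i], i);
    }

    /** One field per underlying field and per copy. */
    public int numberOfFields() {
        return fieldName.length;
    }

    public String fieldName(int field) {
        return fieldName[field];
    }

    /** ASSUMPTION: -1 signals an unknown name in this sketch. */
    public int fieldIndex(String name) {
        return field2Index.getOrDefault(name, -1);
    }

    /** ASSUMPTION: copies are consecutive, so the underlying index is a remainder. */
    public int underlyingField(int field) {
        return field % underlyingFields;
    }

    public static void main(String[] args) {
        String[] names = {"title-0", "text-0", "title-1", "text-1", "title-2", "text-2"};
        ReplicatedLayoutSketch layout = new ReplicatedLayoutSketch(2, 3, names);
        System.out.println(layout.numberOfFields());
        System.out.println(layout.fieldIndex("text-1"));
        System.out.println(layout.underlyingField(3));
    }
}
```

Under these assumptions, three copies of a two-field factory expose six fields, and the type of each replicated field could be obtained by asking the underlying factory for the type of underlyingField(field).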
Nested Class Summary | |
---|---|
protected class | ReplicatedDocumentFactory.ReplicatedDocument: A document obtained by replication of the underlying-factory document. |
Nested classes/interfaces inherited from interface it.unimi.di.mg4j.document.DocumentFactory |
---|
DocumentFactory.FieldType |
Field Summary | |
---|---|
DocumentFactory | documentFactory: The document factory that will be replicated. |
int | numberOfCopies: The number of copies. |
Constructor Summary | |
---|---|
protected | ReplicatedDocumentFactory(DocumentFactory documentFactory, int numberOfCopies, String[] fieldName, Object2IntOpenHashMap<String> field2Index) |
Method Summary | |
---|---|
ReplicatedDocumentFactory | copy() |
int | fieldIndex(String fieldName): Returns the index of a field, given its symbolic name. |
String | fieldName(int field): Returns the symbolic name of a field. |
DocumentFactory.FieldType | fieldType(int field): Returns the type of a field. |
Document | getDocument(InputStream rawContent, Reference2ObjectMap<Enum<?>,Object> metadata): Returns the document obtained by parsing the given byte stream. |
static DocumentFactory | getFactory(DocumentFactory documentFactory, int numberOfCopies, String[] fieldName): Returns a document factory replicating the given factory. |
int | numberOfFields(): Returns the number of fields present in the documents produced by this factory. |
Methods inherited from class it.unimi.di.mg4j.document.AbstractDocumentFactory |
---|
ensureFieldIndex, toString |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Field Detail |
---|
public final DocumentFactory documentFactory
The document factory that will be replicated.
public final int numberOfCopies
The number of copies.
Constructor Detail |
---|
protected ReplicatedDocumentFactory(DocumentFactory documentFactory, int numberOfCopies, String[] fieldName, Object2IntOpenHashMap<String> field2Index)
Method Detail |
---|
public static DocumentFactory getFactory(DocumentFactory documentFactory, int numberOfCopies, String[] fieldName)
Returns a document factory replicating the given factory.
Parameters:
documentFactory - the factory that will be replicated.
numberOfCopies - the number of copies.
public ReplicatedDocumentFactory copy()
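public int numberOfFields()
Description copied from interface: DocumentFactory
Returns the number of fields present in the documents produced by this factory.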
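public String fieldName(int field)
Description copied from interface: DocumentFactory
Returns the symbolic name of a field.
Parameters:
field - the index of a field (between 0 inclusive and DocumentFactory.numberOfFields() exclusive).
Returns:
the symbolic name of the field-th field.
public int fieldIndex(String fieldName)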
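Description copied from interface: DocumentFactory
Returns the index of a field, given its symbolic name.
Parameters:
fieldName - the name of a field of this factory.
Returns:
the index of the field named fieldName.
public DocumentFactory.FieldType fieldType(int field)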
Description copied from interface: DocumentFactory
Returns the type of a field.
The possible types are defined in DocumentFactory.FieldType.
Parameters:
field - the index of a field (between 0 inclusive and DocumentFactory.numberOfFields() exclusive).
Returns:
the type of the field-th field.
public Document getDocument(InputStream rawContent, Reference2ObjectMap<Enum<?>,Object> metadata) throws IOException
Description copied from interface: DocumentFactory
Returns the document obtained by parsing the given byte stream.
The parameter metadata compensates for the lack of a simple keyword-based parameter-passing system in Java. This method might take several different types of “suggestions” gathered by the collection: typically, the document title, a URI representing the document, its MIME type, its encoding and so on. Some of this information might be set by default (as happens, for instance, in a PropertyBasedDocumentFactory).
Implementations of this method must consult the metadata provided by the collection, possibly complete them with default factory metadata, and proceed to the document construction.
Parameters:
rawContent - the raw content from which the document should be extracted; it must not be closed, as resource management is a responsibility of the DocumentCollection.
metadata - a map from enums (e.g., keys taken in PropertyBasedDocumentFactory) to various kinds of objects.
Throws:
IOException
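The way collection metadata might be completed with factory defaults can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the enum MetadataKey, the class MetadataDefaultsSketch and the method resolve() are hypothetical, and a plain java.util.Map stands in for Reference2ObjectMap<Enum<?>,Object> so that the example runs without external dependencies.

```java
import java.util.EnumMap;
import java.util.Map;

/** Hypothetical sketch of merging collection metadata with factory defaults. */
public class MetadataDefaultsSketch {
    /** Hypothetical keys, loosely modelled on the suggestions listed above. */
    public enum MetadataKey { TITLE, URI, MIMETYPE, ENCODING }

    /**
     * Returns a new map in which values supplied by the collection take
     * precedence, and factory defaults fill in whatever is missing.
     */
    public static Map<MetadataKey, Object> resolve(
            Map<MetadataKey, Object> fromCollection,
            Map<MetadataKey, Object> factoryDefaults) {
        Map<MetadataKey, Object> resolved = new EnumMap<>(MetadataKey.class);
        resolved.putAll(factoryDefaults); // start from the defaults
        resolved.putAll(fromCollection);  // collection values override them
        return resolved;
    }

    public static void main(String[] args) {
        Map<MetadataKey, Object> defaults = new EnumMap<>(MetadataKey.class);
        defaults.put(MetadataKey.ENCODING, "UTF-8");
        defaults.put(MetadataKey.MIMETYPE, "text/plain");

        Map<MetadataKey, Object> fromCollection = new EnumMap<>(MetadataKey.class);
        fromCollection.put(MetadataKey.TITLE, "A sample document");
        fromCollection.put(MetadataKey.MIMETYPE, "text/html");

        System.out.println(resolve(fromCollection, defaults));
    }
}
```

Neither input map is modified, so the same defaults can be reused for every document the collection hands to the factory.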