Class CompositeDocumentFactory

    • Constructor Detail

      • CompositeDocumentFactory

        protected CompositeDocumentFactory(DocumentFactory[] documentFactory,
                                           String[] fieldName)
        Creates a new composite document factory using the factories in a given array.
        Parameters:
        documentFactory - an array of document factories that will be composed.
        fieldName - an array of names for the resulting fields, or null.
    • Method Detail

      • getFactory

        public static DocumentFactory getFactory(DocumentFactory[] documentFactory,
                                                 String[] fieldName)
        Returns a document factory composing the given document factories.

        Passing the optional array of field names renames the fields of the component factories.

        Parameters:
        documentFactory - an array of document factories that will be composed.
        fieldName - an array of names for the resulting fields, or null.
        Returns:
        a composed document factory (the first element of the argument, for arguments of length 1).
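
        The composition and renaming described above can be illustrated with a minimal sketch. It does not use the library's classes: FieldSource, of and compose are hypothetical stand-ins that model only field naming. It also assumes that the composite exposes the fields of its components in order and that the renaming array has one entry per resulting field; neither assumption is stated in this page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CompositeSketch {
    /** Hypothetical stand-in for a document factory: only the field-naming part. */
    interface FieldSource {
        int numberOfFields();
        String fieldName(int field);
    }

    /** A trivial FieldSource backed by a fixed array of names. */
    static FieldSource of(String... names) {
        final String[] copy = names.clone();
        return new FieldSource() {
            public int numberOfFields() { return copy.length; }
            public String fieldName(int field) { return copy[field]; }
        };
    }

    /** Concatenates the fields of the sources; newNames, if non-null, renames them. */
    static FieldSource compose(FieldSource[] sources, String[] newNames) {
        // Mirrors the documented shortcut: a single factory is returned as is.
        if (sources.length == 1) return sources[0];
        List<String> names = new ArrayList<>();
        for (FieldSource s : sources)
            for (int i = 0; i < s.numberOfFields(); i++) names.add(s.fieldName(i));
        if (newNames != null) {
            // Assumption: one new name per resulting field.
            if (newNames.length != names.size())
                throw new IllegalArgumentException("Expected " + names.size() + " names");
            names = Arrays.asList(newNames);
        }
        return of(names.toArray(new String[0]));
    }

    public static void main(String[] args) {
        FieldSource a = of("title", "text");
        FieldSource b = of("anchor");
        FieldSource c = compose(new FieldSource[] { a, b }, new String[] { "t", "body", "link" });
        for (int i = 0; i < c.numberOfFields(); i++) System.out.println(i + ": " + c.fieldName(i));
    }
}
```

        With a null second argument the component names are kept unchanged, which corresponds to the variable-arity overload.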
      • getFactory

        public static DocumentFactory getFactory(DocumentFactory... documentFactory)
        Returns a document factory composing the given document factories.
        Parameters:
        documentFactory - document factories that will be composed.
        Returns:
        a composed document factory (the first element of the argument, for arguments of length 1).
      • numberOfFields

        public int numberOfFields()
        Description copied from interface: DocumentFactory
        Returns the number of fields present in the documents produced by this factory.
        Returns:
        the number of fields present in the documents produced by this factory.
      • fieldName

        public String fieldName(int field)
        Description copied from interface: DocumentFactory
        Returns the symbolic name of a field.
        Parameters:
        field - the index of a field (between 0 inclusive and DocumentFactory.numberOfFields() exclusive).
        Returns:
        the symbolic name of the field-th field.
      • fieldIndex

        public int fieldIndex(String fieldName)
        Description copied from interface: DocumentFactory
        Returns the index of a field, given its symbolic name.
        Parameters:
        fieldName - the name of a field of this factory.
        Returns:
        the corresponding index, or -1 if there is no field with name fieldName.
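
        The contract of numberOfFields, fieldName and fieldIndex can be summarised by a small sketch. FieldLookupSketch is a hypothetical class written for illustration, not the library's implementation; it only shows that the two lookups are inverse to each other and that an unknown name yields -1.

```java
import java.util.HashMap;
import java.util.Map;

public class FieldLookupSketch {
    private final String[] names;
    private final Map<String, Integer> indexOf = new HashMap<>();

    public FieldLookupSketch(String... names) {
        this.names = names.clone();
        for (int i = 0; i < names.length; i++) indexOf.put(names[i], i);
    }

    /** The number of fields known to this lookup. */
    public int numberOfFields() { return names.length; }

    /** The symbolic name of the field with the given index. */
    public String fieldName(int field) { return names[field]; }

    /** The index of the named field, or -1 if there is no such field. */
    public int fieldIndex(String fieldName) {
        return indexOf.getOrDefault(fieldName, -1);
    }

    public static void main(String[] args) {
        FieldLookupSketch f = new FieldLookupSketch("title", "text");
        System.out.println(f.fieldIndex(f.fieldName(1)));
        System.out.println(f.fieldIndex("missing"));
    }
}
```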
      • getDocument

        public Document getDocument(InputStream rawContent,
                                    Reference2ObjectMap<Enum<?>,Object> metadata)
                             throws IOException
        Description copied from interface: DocumentFactory
        Returns the document obtained by parsing the given byte stream.

        The metadata parameter makes up for the lack of a simple keyword-based parameter-passing mechanism in Java. This method may receive several different types of “suggestions” gathered by the collection: typically the document title, a URI representing the document, its MIME type, its encoding, and so on. Some of this information might be set by default (as happens, for instance, in a PropertyBasedDocumentFactory). Implementations of this method must consult the metadata provided by the collection, possibly complete it with default factory metadata, and then proceed to construct the document.

        Parameters:
        rawContent - the raw content from which the document should be extracted; it must not be closed, as resource management is a responsibility of the DocumentCollection.
        metadata - a map from enums (e.g., keys taken from PropertyBasedDocumentFactory) to various kinds of objects.
        Returns:
        the document obtained by parsing the given byte stream.
        Throws:
        IOException
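
        The rule that collection-supplied metadata take precedence and factory defaults fill the gaps might be sketched as follows. This is an illustration under assumptions: the Key enum and the resolve helper are hypothetical, and java.util.IdentityHashMap stands in for the reference-based map type in the signature above so that the example needs only the standard library.

```java
import java.util.IdentityHashMap;
import java.util.Map;

public class MetadataSketch {
    /** Hypothetical metadata keys; real factories define their own enums. */
    enum Key { TITLE, URI, ENCODING }

    /** Returns the collection-supplied value if present, else the factory default (possibly null). */
    static Object resolve(Enum<?> key, Map<Enum<?>, Object> fromCollection, Map<Enum<?>, Object> defaults) {
        final Object suggested = fromCollection.get(key);
        return suggested != null ? suggested : defaults.get(key);
    }

    public static void main(String[] args) {
        // Defaults held by the factory.
        Map<Enum<?>, Object> defaults = new IdentityHashMap<>();
        defaults.put(Key.ENCODING, "UTF-8");
        // Suggestions gathered by the collection for one document.
        Map<Enum<?>, Object> fromCollection = new IdentityHashMap<>();
        fromCollection.put(Key.TITLE, "A sample page");
        System.out.println(resolve(Key.TITLE, fromCollection, defaults));
        System.out.println(resolve(Key.ENCODING, fromCollection, defaults));
    }
}
```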