java.lang.Object
- it.unimi.di.big.mg4j.document.AbstractDocumentFactory
- - it.unimi.di.big.mg4j.document.TRECHeaderDocumentFactory

All Implemented Interfaces:

DocumentFactory, FlyweightPrototype<DocumentFactory>, Serializable
```
public class TRECHeaderDocumentFactory
extends AbstractDocumentFactory
```
A factory without fields that is used to interpret the header of a TREC GOV2 document. It is usually the first factory to interpret a document of a TRECDocumentCollection.
Presently, its only role is to parse the document URI and set a metadata item with key PropertyBasedDocumentFactory.MetadataKeys.URI.
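
As a rough illustration of this role, the sketch below reads the first line of a header and stores it under a URI key. It is not the MG4J implementation: the class `HeaderUriSketch`, the enum `Key`, and the assumption that the URI sits on the first line of the header are all hypothetical, and a plain `java.util.Map` stands in for the fastutil `Reference2ObjectMap`.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.EnumMap;
import java.util.Map;

class HeaderUriSketch {
    /** Hypothetical stand-in for PropertyBasedDocumentFactory.MetadataKeys. */
    enum Key { URI }

    /** Reads the first header line (assumed to hold the URI) and records it in the metadata map. */
    static void parseHeader(InputStream header, Map<Key, Object> metadata) throws IOException {
        // The stream is deliberately left open: the caller owns it.
        BufferedReader reader = new BufferedReader(
                new InputStreamReader(header, StandardCharsets.ISO_8859_1));
        String firstLine = reader.readLine();
        if (firstLine != null && !firstLine.trim().isEmpty()) {
            metadata.put(Key.URI, firstLine.trim());
        }
    }

    public static void main(String[] args) throws IOException {
        // Made-up header content, used only for illustration.
        byte[] header = "http://example.gov/index.html\nHTTP/1.1 200 OK\n"
                .getBytes(StandardCharsets.ISO_8859_1);
        Map<Key, Object> metadata = new EnumMap<>(Key.class);
        parseHeader(new ByteArrayInputStream(header), metadata);
        System.out.println(metadata.get(Key.URI));
    }
}
```

An empty or blank first line leaves the map untouched in this sketch; how the real factory treats such input is not stated in this documentation.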

Author:

Alessio Orlandi, Luca Natali

See Also:

Serialized Form

Nested Class Summary
- Nested classes/interfaces inherited from interface it.unimi.di.big.mg4j.document.DocumentFactory
  DocumentFactory.FieldType

Constructor Summary

Constructors
Constructor Description

TRECHeaderDocumentFactory()

Method Summary

Modifier and Type	Method	Description
`DocumentFactory`	`copy()`
`int`	`fieldIndex(String fieldName)`	Returns the index of a field, given its symbolic name.
`String`	`fieldName(int fieldIndex)`	Returns the symbolic name of a field.
`DocumentFactory.FieldType`	`fieldType(int fieldIndex)`	Returns the type of a field.
`Document`	`getDocument(InputStream rawContent, Reference2ObjectMap<Enum<?>,Object> metadata)`	Returns the document obtained by parsing the given byte stream.
`int`	`numberOfFields()`	Returns the number of fields present in the documents produced by this factory.
`protected static boolean`	`startsWith(byte[] a, int l, byte[] b)`
`protected static boolean`	`startsWithIgnoreCase(byte[] a, int l, char[] b)`

Methods inherited from class it.unimi.di.big.mg4j.document.AbstractDocumentFactory
ensureFieldIndex, toString

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait

- Constructor Detail
  - TRECHeaderDocumentFactory
```
public TRECHeaderDocumentFactory()
```
- Method Detail
  - numberOfFields
```
public int numberOfFields()
```
    Description copied from interface: DocumentFactory
    
    Returns the number of fields present in the documents produced by this factory.
    
    Returns:
    
    the number of fields present in the documents produced by this factory.
  - fieldName
```
public String fieldName(int fieldIndex)
```
    Description copied from interface: DocumentFactory
    
    Returns the symbolic name of a field.
    
    Parameters:
    
    fieldIndex - the index of a field (between 0 inclusive and DocumentFactory.numberOfFields() exclusive).
    
    Returns:
    
    the symbolic name of the fieldIndex-th field.
  - fieldIndex
```
public int fieldIndex(String fieldName)
```
    Description copied from interface: DocumentFactory
    
    Returns the index of a field, given its symbolic name.
    
    Parameters:
    
    fieldName - the name of a field of this factory.
    
    Returns:
    
    the corresponding index, or -1 if there is no field with name fieldName.
  - fieldType
```
public DocumentFactory.FieldType fieldType(int fieldIndex)
```
    Description copied from interface: DocumentFactory
    
    Returns the type of a field.
    The possible types are defined in DocumentFactory.FieldType.
    
    Parameters:
    
    fieldIndex - the index of a field (between 0 inclusive and DocumentFactory.numberOfFields() exclusive).
    
    Returns:
    
    the type of the fieldIndex-th field.
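
Because this factory is described as having no fields, the field-related methods above have little to report. The sketch below shows one consistent reading of those contracts for a field-less factory. The class `FieldlessFactorySketch` is hypothetical, and rejecting every index with an `IndexOutOfBoundsException` is an assumption, not documented behavior.

```java
class FieldlessFactorySketch {
    /** A factory without fields reports zero fields. */
    static int numberOfFields() {
        return 0;
    }

    /** No name can match a field, so the lookup always yields -1. */
    static int fieldIndex(String fieldName) {
        return -1;
    }

    /** Assumption: with zero fields every index is out of range and is rejected. */
    static String fieldName(int fieldIndex) {
        throw new IndexOutOfBoundsException("No field with index " + fieldIndex);
    }

    public static void main(String[] args) {
        System.out.println(numberOfFields());    // prints 0
        System.out.println(fieldIndex("title")); // prints -1
    }
}
```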
  - startsWith
```
protected static boolean startsWith(byte[] a,
                                    int l,
                                    byte[] b)
```
  - startsWithIgnoreCase
```
protected static boolean startsWithIgnoreCase(byte[] a,
                                              int l,
                                              char[] b)
```
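
Neither helper is described here. Judging only from the names and signatures, a plausible reading is that each checks whether the first `l` valid bytes of `a` begin with the pattern `b`. The sketch below implements that assumed semantics in a hypothetical class `PrefixSketch`; it also assumes `l <= a.length` and single-byte (Latin-1 style) characters for the case-insensitive variant.

```java
class PrefixSketch {
    /** Assumed semantics: true if the first l bytes of a begin with all of b. */
    static boolean startsWith(byte[] a, int l, byte[] b) {
        if (l < b.length) return false; // not enough valid bytes to hold the pattern
        for (int i = 0; i < b.length; i++) {
            if (a[i] != b[i]) return false;
        }
        return true;
    }

    /** Same check, comparing each byte to a pattern character regardless of case. */
    static boolean startsWithIgnoreCase(byte[] a, int l, char[] b) {
        if (l < b.length) return false;
        for (int i = 0; i < b.length; i++) {
            char fromBuffer = (char) (a[i] & 0xFF); // treat the byte as an unsigned code point
            if (Character.toLowerCase(fromBuffer) != Character.toLowerCase(b[i])) return false;
        }
        return true;
    }
}
```

Passing the valid length separately from the array suits a reusable read buffer, where only a prefix of the array holds data from the current line.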
  - getDocument
```
public Document getDocument(InputStream rawContent,
                            Reference2ObjectMap<Enum<?>,Object> metadata)
                     throws IOException
```
    Description copied from interface: DocumentFactory
    
    Returns the document obtained by parsing the given byte stream.
    The parameter metadata compensates for the lack of a simple keyword-based parameter-passing system in Java. This method might take several different types of “suggestions” that have been gathered by the collection: typically, the document title, a URI representing the document, its MIME type, its encoding and so on. Some of this information might be set by default (as happens, for instance, in a PropertyBasedDocumentFactory). Implementations of this method must consult the metadata provided by the collection, possibly complete them with default factory metadata, and then proceed to construct the document.
    
    Parameters:
    
    rawContent - the raw content from which the document should be extracted; it must not be closed, as resource management is a responsibility of the DocumentCollection.
    
    metadata - a map from enums (e.g., keys taken in PropertyBasedDocumentFactory) to various kinds of objects.
    
    Returns:
    
    the document obtained by parsing the given byte stream.
    
    Throws:
    
    IOException
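
The "consult the collection's metadata, then fall back to factory defaults" step described above can be sketched as follows. This is only an illustration: the class `MetadataDefaultsSketch`, the enum `Key`, and the method `resolve` are hypothetical, and `java.util.IdentityHashMap` stands in for the fastutil `Reference2ObjectMap` because both compare keys by reference.

```java
import java.util.IdentityHashMap;
import java.util.Map;

class MetadataDefaultsSketch {
    /** Hypothetical metadata keys, in the spirit of PropertyBasedDocumentFactory.MetadataKeys. */
    enum Key { TITLE, URI, ENCODING }

    /** Returns the value supplied by the collection, or the factory default if none was given. */
    static Object resolve(Enum<?> key,
                          Map<Enum<?>, Object> fromCollection,
                          Map<Enum<?>, Object> factoryDefaults) {
        Object value = fromCollection.get(key);
        return value != null ? value : factoryDefaults.get(key);
    }

    public static void main(String[] args) {
        Map<Enum<?>, Object> defaults = new IdentityHashMap<>();
        defaults.put(Key.ENCODING, "UTF-8");

        Map<Enum<?>, Object> metadata = new IdentityHashMap<>();
        metadata.put(Key.URI, "http://example.gov/");

        System.out.println(resolve(Key.ENCODING, metadata, defaults)); // prints UTF-8
        System.out.println(resolve(Key.URI, metadata, defaults));      // prints http://example.gov/
    }
}
```

Enum constants are singletons, so reference-based lookup is safe for such keys and avoids calling `hashCode()` or `equals()` on them.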
  - copy
```
public DocumentFactory copy()
```
