java.lang.Object
  it.unimi.di.mg4j.document.SimpleCompressedDocumentCollectionBuilder
public class SimpleCompressedDocumentCollectionBuilder
A builder for simple compressed document collections.
Constructor Summary |
---|
SimpleCompressedDocumentCollectionBuilder(IOFactory ioFactory, String basename, DocumentFactory documentFactory, boolean exact) |
SimpleCompressedDocumentCollectionBuilder(String basename, DocumentFactory documentFactory, boolean exact) |
Method Summary | | |
---|---|---|
void | add(MutableString word, MutableString nonWord) | Adds a word and a nonword to the current text field, provided that a text field has started but not yet ended; otherwise, does nothing. |
String | basename() | Returns the basename of this builder. |
void | build(DocumentSequence inputSequence) | |
void | close() | Terminates the construction of the collection. |
void | endDocument() | Ends a document entry. |
void | endTextField() | Ends the current text field. |
void | nonTextField(Object o) | Adds a non-text field. |
void | open(CharSequence suffix) | Opens a new collection. |
void | startDocument(CharSequence title, CharSequence uri) | Starts a document entry. |
void | startTextField() | Starts a new text field. |
void | virtualField(ObjectList<Scan.VirtualDocumentFragment> fragments) | Adds a virtual field. |
static int | writeSelfDelimitedUtf8String(OutputBitStream obs, CharSequence s) | |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public SimpleCompressedDocumentCollectionBuilder(String basename, DocumentFactory documentFactory, boolean exact)
public SimpleCompressedDocumentCollectionBuilder(IOFactory ioFactory, String basename, DocumentFactory documentFactory, boolean exact)
Method Detail |
---|
public String basename()
Description copied from interface: DocumentCollectionBuilder
Returns the basename of this builder.
Specified by: basename in interface DocumentCollectionBuilder
public void open(CharSequence suffix) throws IOException
Description copied from interface: DocumentCollectionBuilder
Opens a new collection.
Specified by: open in interface DocumentCollectionBuilder
Parameters:
suffix - a suffix that will be added to the basename provided at construction time.
Throws:
IOException
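public void add(MutableString word, MutableString nonWord) throws IOException
Description copied from interface: DocumentCollectionBuilder
Adds a word and a nonword to the current text field, provided that a text field has started but not yet ended; otherwise, does nothing.
Usually, word and nonWord are just the result of a call to WordReader.next(MutableString, MutableString).
Specified by: add in interface DocumentCollectionBuilder
Parameters:
word - a word.
nonWord - a nonword.
Throws:
IOException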
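The rule that add() has an effect only between startTextField() and endTextField() can be illustrated with a small self-contained sketch. SketchBuilder below is a hypothetical in-memory stand-in written for this example, not the MG4J class: it mirrors only the method names and the documented "ignore add() outside a text field" behaviour, uses CharSequence in place of MutableString, and records calls in a list instead of writing a compressed collection. The call order shown (document, then text field, then words) is an assumption based on the method descriptions.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in (NOT the MG4J builder): records calls in memory. */
final class SketchBuilder {
    private final List<String> events = new ArrayList<>();
    private boolean inTextField; // true between startTextField() and endTextField()

    void startDocument(CharSequence title, CharSequence uri) {
        events.add("doc:" + title);
    }

    void startTextField() {
        inTextField = true;
        events.add("field");
    }

    void add(CharSequence word, CharSequence nonWord) {
        // Documented rule: outside a started-but-not-ended text field, do nothing.
        if (!inTextField) return;
        events.add("w:" + word);
    }

    void endTextField() {
        inTextField = false;
        events.add("/field");
    }

    void endDocument() {
        events.add("/doc");
    }

    List<String> events() {
        return events;
    }
}

final class LifecycleSketch {
    /** Runs one document through the stand-in and returns the recorded events. */
    static List<String> run() {
        SketchBuilder b = new SketchBuilder();
        b.add("ignored", " ");                      // no text field yet: no effect
        b.startDocument("Title", "http://example.org/");
        b.startTextField();
        b.add("hello", " ");
        b.add("world", "");
        b.endTextField();
        b.add("ignored", "");                       // text field already ended: no effect
        b.endDocument();
        return b.events();
    }

    public static void main(String[] args) {
        // prints [doc:Title, field, w:hello, w:world, /field, /doc]
        System.out.println(run());
    }
}
```

With the real builder, the same sequence would presumably be bracketed by open(suffix) and close(), both of which can throw IOException.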
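public void close() throws IOException
Description copied from interface: DocumentCollectionBuilder
Terminates the construction of the collection.
Specified by: close in interface DocumentCollectionBuilder
Throws:
IOException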
public void endDocument() throws IOException
Description copied from interface: DocumentCollectionBuilder
Ends a document entry.
Specified by: endDocument in interface DocumentCollectionBuilder
Throws:
IOException
public void endTextField() throws IOException
Description copied from interface: DocumentCollectionBuilder
Ends the current text field.
Specified by: endTextField in interface DocumentCollectionBuilder
Throws:
IOException
public void nonTextField(Object o) throws IOException
Description copied from interface: DocumentCollectionBuilder
Adds a non-text field.
Specified by: nonTextField in interface DocumentCollectionBuilder
Parameters:
o - the content of the non-text field.
Throws:
IOException
public static int writeSelfDelimitedUtf8String(OutputBitStream obs, CharSequence s) throws IOException
Throws:
IOException
public void startDocument(CharSequence title, CharSequence uri) throws IOException
Description copied from interface: DocumentCollectionBuilder
Starts a document entry.
Specified by: startDocument in interface DocumentCollectionBuilder
Parameters:
title - the document title (usually, the result of Document.title()).
uri - the document URI (usually, the result of Document.uri()).
Throws:
IOException
public void startTextField()
Description copied from interface: DocumentCollectionBuilder
Starts a new text field.
Specified by: startTextField in interface DocumentCollectionBuilder
public void virtualField(ObjectList<Scan.VirtualDocumentFragment> fragments) throws IOException
Description copied from interface: DocumentCollectionBuilder
Adds a virtual field.
Specified by: virtualField in interface DocumentCollectionBuilder
Parameters:
fragments - the virtual fragments to be added.
Throws:
IOException
public void build(DocumentSequence inputSequence) throws IOException
Throws:
IOException