java.lang.Object it.unimi.di.mg4j.document.ZipDocumentCollectionBuilder
public class ZipDocumentCollectionBuilder
A builder for zipped document collections.
Constructor Summary | |
---|---|
ZipDocumentCollectionBuilder(String basename,
DocumentFactory factory,
boolean exact)
Creates a new zipped collection builder. |
Method Summary | |
---|---|
void |
add(MutableString word,
MutableString nonWord)
Adds a word and a nonword to the current text field, provided that a text field has started but not yet ended; otherwise, doesn't do anything. |
String |
basename()
Returns the basename of this builder. |
void |
build(DocumentSequence inputSequence)
|
void |
close()
Terminates the construction of the collection. |
void |
endDocument()
Ends a document entry. |
void |
endTextField()
Ends the current text field. |
static void |
main(String[] arg)
|
void |
nonTextField(Object o)
Adds a non-text field. |
void |
open(CharSequence suffix)
Opens a new collection. |
void |
startDocument(CharSequence title,
CharSequence uri)
Starts a document entry. |
void |
startTextField()
Starts a new text field. |
void |
virtualField(ObjectList<Scan.VirtualDocumentFragment> fragments)
Adds a virtual field. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public ZipDocumentCollectionBuilder(String basename, DocumentFactory factory, boolean exact)
Creates a new zipped collection builder.
Parameters:
factory - the factory of the base document sequence.
exact - true iff also non-words should be preserved.
Method Detail |
---|
public void open(CharSequence suffix) throws FileNotFoundException
Description copied from interface: DocumentCollectionBuilder
Opens a new collection.
Specified by: open in interface DocumentCollectionBuilder
Parameters:
suffix - a suffix that will be added to the basename provided at construction time.
Throws: FileNotFoundException
public String basename()
Description copied from interface: DocumentCollectionBuilder
Returns the basename of this builder.
Specified by: basename in interface DocumentCollectionBuilder
public void startDocument(CharSequence title, CharSequence uri) throws IOException
Description copied from interface: DocumentCollectionBuilder
Starts a document entry.
Specified by: startDocument in interface DocumentCollectionBuilder
Parameters:
title - the document title (usually, the result of Document.title()).
uri - the document URI (usually, the result of Document.uri()).
Throws: IOException
public void endDocument() throws IOException
Description copied from interface: DocumentCollectionBuilder
Ends a document entry.
Specified by: endDocument in interface DocumentCollectionBuilder
Throws: IOException
public void startTextField()
Description copied from interface: DocumentCollectionBuilder
Starts a new text field.
Specified by: startTextField in interface DocumentCollectionBuilder
public void nonTextField(Object o) throws IOException
Description copied from interface: DocumentCollectionBuilder
Adds a non-text field.
Specified by: nonTextField in interface DocumentCollectionBuilder
Parameters:
o - the content of the non-text field.
Throws: IOException
public void virtualField(ObjectList<Scan.VirtualDocumentFragment> fragments) throws IOException
Description copied from interface: DocumentCollectionBuilder
Adds a virtual field.
Specified by: virtualField in interface DocumentCollectionBuilder
Parameters:
fragments - the virtual fragments to be added.
Throws: IOException
public void endTextField() throws IOException
Description copied from interface: DocumentCollectionBuilder
Ends the current text field.
Specified by: endTextField in interface DocumentCollectionBuilder
Throws: IOException
public void add(MutableString word, MutableString nonWord) throws IOException
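Description copied from interface: DocumentCollectionBuilder
Adds a word and a nonword to the current text field, provided that a text field has started but not yet ended; otherwise, does nothing.
Usually, word and nonWord are just the result of a call to WordReader.next(MutableString, MutableString).
Specified by: add in interface DocumentCollectionBuilder
Parameters:
word - a word.
nonWord - a nonword.
Throws: IOException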
public void close() throws IOException
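Description copied from interface: DocumentCollectionBuilder
Terminates the construction of the collection.
Specified by: close in interface DocumentCollectionBuilder
Throws: IOException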
public void build(DocumentSequence inputSequence) throws IOException
Throws: IOException
public static void main(String[] arg) throws com.martiansoftware.jsap.JSAPException, IOException, ClassNotFoundException, InvocationTargetException, NoSuchMethodException, IllegalAccessException, InstantiationException, IllegalArgumentException, SecurityException
Throws:
com.martiansoftware.jsap.JSAPException
IOException
ClassNotFoundException
InvocationTargetException
NoSuchMethodException
IllegalAccessException
InstantiationException
IllegalArgumentException
SecurityException
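The methods documented above suggest a call order: open, then for each document startDocument, one or more fields, endDocument, and finally close. The following is a minimal sketch of that order, not MG4J code. SketchBuilder, RecordingBuilder and BuilderProtocolSketch are hypothetical stand-ins written for illustration; they use CharSequence where the real add method takes MutableString, declare no checked exceptions, and write no zip file. The only behavior modeled is the documented rule that add does nothing unless a text field has started and not yet ended.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in mirroring a subset of the builder methods listed above. */
interface SketchBuilder {
    void open(CharSequence suffix);
    void startDocument(CharSequence title, CharSequence uri);
    void startTextField();
    void add(CharSequence word, CharSequence nonWord);
    void endTextField();
    void endDocument();
    void close();
}

/** Records each call that takes effect, so the call order can be inspected. */
class RecordingBuilder implements SketchBuilder {
    final List<String> log = new ArrayList<>();
    private boolean inTextField = false;

    public void open(CharSequence suffix) { log.add("open(" + suffix + ")"); }

    public void startDocument(CharSequence title, CharSequence uri) {
        log.add("startDocument(" + title + ", " + uri + ")");
    }

    public void startTextField() {
        inTextField = true;
        log.add("startTextField()");
    }

    public void add(CharSequence word, CharSequence nonWord) {
        // Mirrors the documented rule: outside a text field, add does nothing.
        if (!inTextField) return;
        log.add("add(" + word + "|" + nonWord + ")");
    }

    public void endTextField() {
        inTextField = false;
        log.add("endTextField()");
    }

    public void endDocument() { log.add("endDocument()"); }

    public void close() { log.add("close()"); }
}

class BuilderProtocolSketch {
    /** Drives the stand-in through one document with a single text field. */
    static List<String> run() {
        RecordingBuilder builder = new RecordingBuilder();
        builder.open("-demo");                        // suffix appended to the basename
        builder.startDocument("Title", "http://example.com/doc");
        builder.add("ignored", " ");                  // no text field started: no effect
        builder.startTextField();
        builder.add("hello", " ");                    // word plus the nonword that follows it
        builder.add("world", "");
        builder.endTextField();
        builder.endDocument();
        builder.close();
        return builder.log;
    }

    public static void main(String[] args) {
        for (String entry : run()) System.out.println(entry);
    }
}
```

With the real class, the same sequence would be issued on a ZipDocumentCollectionBuilder created through the constructor documented above, with the checked exceptions listed in each method's Throws section handled by the caller.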