Class AbstractDocumentCollection
- java.lang.Object
  - it.unimi.di.big.mg4j.document.AbstractDocumentSequence
    - it.unimi.di.big.mg4j.document.AbstractDocumentCollection
- All Implemented Interfaces:
DocumentCollection, DocumentSequence, SafelyCloseable, FlyweightPrototype<DocumentCollection>, Closeable, AutoCloseable
- Direct Known Subclasses:
ConcatenatedDocumentCollection, FileSetDocumentCollection, JavamailDocumentCollection, JdbcDocumentCollection, SimpleCompressedDocumentCollection, SubDocumentCollection, TRECDocumentCollection, WikipediaDocumentCollection, ZipDocumentCollection
public abstract class AbstractDocumentCollection extends AbstractDocumentSequence implements DocumentCollection, SafelyCloseable
An abstract, safely closeable implementation of a document collection.
This class provides a ready-made implementation of the iterator() method. Concrete subclasses are encouraged to provide optimised, reuse-oriented versions of the iterator() method. Note that since this implementation uses document() to implement the iterator, creating two iterators concurrently will usually lead to unpredictable results.
As a convenience, the ensureDocumentIndex(long) method can be called whenever it is necessary to check that a document index is not out of range.
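The interaction between an iterator built on document() and concurrent iteration can be sketched as follows. This is a minimal, self-contained illustration and not the MG4J source: IndexedCollection, SketchIterator and ReusingCollection are hypothetical stand-ins, and the reuse of a single buffer inside document() is an assumption chosen to show one way two iterators can interfere.

```java
class IteratorSketchDemo {
    public static void main(String[] args) {
        IndexedCollection collection = new ReusingCollection();
        SketchIterator a = collection.iterator();
        SketchIterator b = collection.iterator();

        CharSequence fromA = a.nextDocument(); // holds "alpha" at this point
        b.nextDocument();                      // b re-reads document 0
        b.nextDocument();                      // b moves on to document 1

        // fromA is the shared buffer, so it now shows b's latest document.
        System.out.println(fromA);
    }
}

// Hypothetical stand-in for an iterator over documents (not the MG4J type).
interface SketchIterator {
    /** Returns the next document, or null when there are no more documents. */
    CharSequence nextDocument();
}

// Hypothetical stand-in for a random-access collection (not the MG4J type).
abstract class IndexedCollection {
    abstract long size();

    abstract CharSequence document(long index);

    /** Generic iterator: walks indices 0..size()-1 and delegates to document(). */
    SketchIterator iterator() {
        return new SketchIterator() {
            private long next = 0;

            @Override
            public CharSequence nextDocument() {
                return next < size() ? document(next++) : null;
            }
        };
    }
}

// Assumption for illustration: document() reuses one mutable buffer per collection.
class ReusingCollection extends IndexedCollection {
    private final String[] texts = { "alpha", "beta", "gamma" };
    private final StringBuilder shared = new StringBuilder();

    @Override
    long size() {
        return texts.length;
    }

    @Override
    CharSequence document(long index) {
        shared.setLength(0);               // discard the previous document
        shared.append(texts[(int) index]); // load the requested one
        return shared;
    }
}
```

In this sketch the document handed out by the first iterator changes under the caller as soon as the second iterator advances. A subclass that needs concurrent iteration would have to avoid this kind of shared state, or iterate over separate copies of the collection.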
-
-
Nested Class Summary
Nested Classes
static class AbstractDocumentCollection.PropertyKeys
    Symbolic names for common properties of a DocumentCollection.
-
Field Summary
-
Fields inherited from interface it.unimi.di.big.mg4j.document.DocumentCollection
DEFAULT_EXTENSION
-
-
Constructor Summary
Constructors
AbstractDocumentCollection()
-
Method Summary
protected void ensureDocumentIndex(long index)
    Checks that the index is correct (between 0, inclusive, and DocumentCollection.size(), exclusive), and throws an IndexOutOfBoundsException if the index is not correct.
DocumentIterator iterator()
    Returns an iterator over the sequence of documents.
static void main(String[] arg)
static void printAllDocuments(DocumentSequence seq)
    Prints all documents in a given sequence.
String toString()
-
Methods inherited from class it.unimi.di.big.mg4j.document.AbstractDocumentSequence
close, filename, finalize, load
-
Methods inherited from class java.lang.Object
clone, equals, getClass, hashCode, notify, notifyAll, wait, wait, wait
-
Methods inherited from interface it.unimi.di.big.mg4j.document.DocumentCollection
copy, document, metadata, size, stream
-
Methods inherited from interface it.unimi.di.big.mg4j.document.DocumentSequence
close, factory, filename
-
-
-
-
Method Detail
-
ensureDocumentIndex
protected void ensureDocumentIndex(long index)
Checks that the index is correct (between 0, inclusive, and DocumentCollection.size(), exclusive), and throws an IndexOutOfBoundsException if the index is not correct.
- Parameters:
index - the index to be checked.
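The range check described above can be sketched as follows. This is an illustration under stated assumptions rather than the library's code: RangeCheckedCollection and FixedSizeCollection are hypothetical stand-ins, and the exception message is invented for the example.

```java
class RangeCheckDemo {
    public static void main(String[] args) {
        FixedSizeCollection collection = new FixedSizeCollection(3);
        System.out.println(collection.document(2)); // valid: 0 <= 2 < 3
        try {
            collection.document(3);                 // invalid: size() is exclusive
        } catch (IndexOutOfBoundsException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}

// Hypothetical stand-in for the abstract class (not the MG4J type).
abstract class RangeCheckedCollection {
    /** Number of documents in the collection. */
    abstract long size();

    /** Throws IndexOutOfBoundsException unless 0 <= index < size(). */
    protected void ensureDocumentIndex(long index) {
        if (index < 0 || index >= size()) {
            throw new IndexOutOfBoundsException(
                "Index " + index + " is not in [0, " + size() + ")");
        }
    }
}

// Hypothetical concrete subclass that validates the index before doing any work.
class FixedSizeCollection extends RangeCheckedCollection {
    private final long size;

    FixedSizeCollection(long size) {
        this.size = size;
    }

    @Override
    long size() {
        return size;
    }

    String document(long index) {
        ensureDocumentIndex(index);
        return "document #" + index;
    }
}
```

Calling the check as the first statement of each index-based accessor keeps the failure mode uniform: an invalid index surfaces as an IndexOutOfBoundsException before any I/O is attempted.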
-
iterator
public DocumentIterator iterator() throws IOException
Description copied from interface: DocumentSequence
Returns an iterator over the sequence of documents.
Warning: this method can be safely called just one time. For instance, implementations based on standard input will usually throw an exception if this method is called twice.
Implementations may decide to override this restriction (in particular, if they implement DocumentCollection). Usually, however, it is not possible to obtain two iterators at the same time on a collection.
- Specified by:
iterator in interface DocumentSequence
- Returns:
an iterator over the sequence of documents.
- Throws:
IOException
- See Also:
DocumentCollection
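The single-call restriction on stream-backed sequences can be sketched as follows. This is a hedged illustration only: LineSequence and LineIterator are hypothetical stand-ins, one line of input plays the role of one document, and the choice of IllegalStateException is an assumption, since the text above says only that an exception is usually thrown.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class SingleUseDemo {
    public static void main(String[] args) throws IOException {
        LineSequence sequence = new LineSequence(new StringReader("first\nsecond\n"));
        LineIterator iterator = sequence.iterator();
        for (String doc; (doc = iterator.nextDocument()) != null; ) {
            System.out.println(doc);
        }
        try {
            sequence.iterator(); // the underlying stream cannot be rewound
        } catch (IllegalStateException e) {
            System.out.println("second call rejected: " + e.getMessage());
        }
    }
}

// Hypothetical stand-in for a document iterator (not the MG4J type).
interface LineIterator {
    /** Returns the next document, or null when the input is exhausted. */
    String nextDocument() throws IOException;
}

// Hypothetical sequence over a one-shot character stream, such as standard input.
class LineSequence {
    private final BufferedReader reader;
    private boolean iteratorReturned;

    LineSequence(Reader source) {
        reader = new BufferedReader(source);
    }

    LineIterator iterator() {
        if (iteratorReturned) {
            throw new IllegalStateException("iterator() has already been called");
        }
        iteratorReturned = true;
        return () -> reader.readLine(); // readLine() returns null at end of input
    }
}
```

A random-access collection is not bound by the stream in this way, which is why implementations of DocumentCollection may lift the restriction.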
-
printAllDocuments
public static void printAllDocuments(DocumentSequence seq) throws IOException
Prints all documents in a given sequence.
- Parameters:
seq - the sequence.
- Throws:
IOException
-
-