it.unimi.di.mg4j.document
Class AbstractDocumentCollection

java.lang.Object
  extended by it.unimi.di.mg4j.document.AbstractDocumentSequence
      extended by it.unimi.di.mg4j.document.AbstractDocumentCollection
All Implemented Interfaces:
DocumentCollection, DocumentSequence, SafelyCloseable, FlyweightPrototype<DocumentCollection>, Closeable
Direct Known Subclasses:
ConcatenatedDocumentCollection, FileSetDocumentCollection, JavamailDocumentCollection, JdbcDocumentCollection, SimpleCompressedDocumentCollection, SubDocumentCollection, TRECDocumentCollection, WikipediaDocumentCollection, ZipDocumentCollection

public abstract class AbstractDocumentCollection
extends AbstractDocumentSequence
implements DocumentCollection, SafelyCloseable

An abstract, safely closeable implementation of a document collection.

This class provides a ready-made implementation of the iterator() method. Concrete subclasses are encouraged to override it with an optimised, reuse-oriented version. Note that because this implementation uses document() to implement the iterator, using two iterators concurrently will usually lead to unpredictable results.
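
The idea of an iterator built on top of document() can be pictured as a cursor that asks for successive indices. The following is a minimal sketch of that idea, not the actual MG4J code: SketchCollection, SketchIterator and SketchIteratorDemo are hypothetical stand-ins, and documents are plain strings so that the example compiles without the library.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a document collection; documents are plain strings here.
class SketchCollection {
    private final String[] docs;

    SketchCollection(String... docs) { this.docs = docs; }

    int size() { return docs.length; }

    // Random access by index, analogous in spirit to document(int).
    String document(int index) { return docs[index]; }

    // Default-style iterator: a cursor that delegates to document().
    SketchIterator iterator() { return new SketchIterator(this); }
}

// Hypothetical iterator that returns null once the collection is exhausted.
class SketchIterator {
    private final SketchCollection collection;
    private int next = 0;

    SketchIterator(SketchCollection collection) { this.collection = collection; }

    String nextDocument() {
        return next < collection.size() ? collection.document(next++) : null;
    }
}

class SketchIteratorDemo {
    // Drains a fresh iterator into a list, in iteration order.
    static List<String> drain(SketchCollection c) {
        List<String> out = new ArrayList<>();
        SketchIterator it = c.iterator();
        for (String d = it.nextDocument(); d != null; d = it.nextDocument()) {
            out.add(d);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(drain(new SketchCollection("a", "b", "c")));
    }
}
```

In this toy version document() is stateless, so two iterators would not interfere. In a real collection document() may rely on shared state such as an open stream, which is presumably why concurrent iterators are unsafe with the default implementation.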

As a convenience, the ensureDocumentIndex(int) method can be called whenever it is necessary to check that a document index is in range.


Nested Class Summary
static class AbstractDocumentCollection.PropertyKeys
          Symbolic names for common properties of a DocumentCollection.
 
Field Summary
 
Fields inherited from interface it.unimi.di.mg4j.document.DocumentCollection
DEFAULT_EXTENSION
 
Constructor Summary
AbstractDocumentCollection()
           
 
Method Summary
protected  void ensureDocumentIndex(int index)
          Checks that the index is valid (between 0, inclusive, and DocumentCollection.size(), exclusive) and throws an IndexOutOfBoundsException if it is not.
 DocumentIterator iterator()
          Returns an iterator over the sequence of documents.
static void main(String[] arg)
           
static void printAllDocuments(DocumentSequence seq)
          Prints all documents in a given sequence.
 String toString()
           
 
Methods inherited from class it.unimi.di.mg4j.document.AbstractDocumentSequence
close, filename, finalize, load
 
Methods inherited from class java.lang.Object
clone, equals, getClass, hashCode, notify, notifyAll, wait, wait, wait
 
Methods inherited from interface it.unimi.di.mg4j.document.DocumentCollection
copy, document, metadata, size, stream
 
Methods inherited from interface it.unimi.di.mg4j.document.DocumentSequence
close, factory, filename
 

Constructor Detail

AbstractDocumentCollection

public AbstractDocumentCollection()
Method Detail

ensureDocumentIndex

protected void ensureDocumentIndex(int index)
Checks that the index is valid (between 0, inclusive, and DocumentCollection.size(), exclusive) and throws an IndexOutOfBoundsException if it is not.

Parameters:
index - the index to be checked.
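
A subclass would typically call this check at the top of its index-based accessors. The following is a minimal sketch of the documented range check, under the assumption that validation happens before the access; RangeCheckedCollection, its string documents and the exception message are hypothetical and not taken from MG4J.

```java
// Hypothetical stand-in collection showing a range check in the style of ensureDocumentIndex(int).
class RangeCheckedCollection {
    private final String[] docs;

    RangeCheckedCollection(String... docs) { this.docs = docs; }

    int size() { return docs.length; }

    // Valid indices run from 0 (inclusive) to size() (exclusive); anything else is rejected.
    protected void ensureDocumentIndex(int index) {
        if (index < 0 || index >= size()) {
            throw new IndexOutOfBoundsException("Index " + index + " is not in [0.." + size() + ")");
        }
    }

    // Validate first, then access, so callers get a descriptive exception.
    String document(int index) {
        ensureDocumentIndex(index);
        return docs[index];
    }

    public static void main(String[] args) {
        RangeCheckedCollection c = new RangeCheckedCollection("first", "second");
        System.out.println(c.document(1));
        try {
            c.document(2);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```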

iterator

public DocumentIterator iterator()
                          throws IOException
Description copied from interface: DocumentSequence
Returns an iterator over the sequence of documents.

Warning: this method can be safely called only once. For instance, implementations based on standard input will usually throw an exception if this method is called twice.

Implementations may choose to relax this restriction (in particular, if they implement DocumentCollection). Usually, however, it is not possible to hold two iterators on a collection at the same time.

Specified by:
iterator in interface DocumentSequence
Returns:
an iterator over the sequence of documents.
Throws:
IOException
See Also:
DocumentCollection
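
The single-call restriction described in the warning can be illustrated with a one-shot sequence. This is only a sketch: OneShotSequence is a hypothetical class, it uses java.util.Iterator rather than DocumentIterator, and the choice of IllegalStateException is an assumption, since the text above says only that an exception is usually thrown.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical one-shot sequence: a second call to iterator() is rejected.
class OneShotSequence {
    private final List<String> docs;
    private boolean iteratorRequested = false;

    OneShotSequence(String... docs) { this.docs = Arrays.asList(docs); }

    Iterator<String> iterator() {
        if (iteratorRequested) {
            // Assumed exception type; real implementations may throw something else.
            throw new IllegalStateException("iterator() was already called on this sequence");
        }
        iteratorRequested = true;
        return docs.iterator();
    }

    public static void main(String[] args) {
        OneShotSequence seq = new OneShotSequence("a", "b");
        Iterator<String> it = seq.iterator();
        while (it.hasNext()) {
            System.out.println(it.next());
        }
        try {
            seq.iterator();
        } catch (IllegalStateException e) {
            System.out.println("second call rejected");
        }
    }
}
```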

toString

public String toString()
Overrides:
toString in class Object

printAllDocuments

public static void printAllDocuments(DocumentSequence seq)
                              throws IOException
Prints all documents in a given sequence.

Parameters:
seq - the sequence.
Throws:
IOException

main

public static void main(String[] arg)
                 throws Exception
Throws:
Exception