java.lang.Object
  it.unimi.di.mg4j.document.AbstractDocumentSequence
    it.unimi.di.mg4j.document.AbstractDocumentCollection
      it.unimi.di.mg4j.document.JdbcDocumentCollection
public class JdbcDocumentCollection
A DocumentCollection corresponding to the result of a query in a relational database.
An instance of this class is based on a query. The query should produce two fixed columns: the first, named id, must be an increasing integer that acts as an identifier (i.e., as a key); the second, named title, must be a text field and will be used as a title. The remaining columns will be indexed, and the name of each corresponding field will be the name of its column (use AS judiciously).
In complex queries, the specification id for the first column could be ambiguous; in that case, you can provide an alternate (and hopefully more precise) specification.
At construction time, the query is executed, obtaining a bijection between the values of the identifier and document indices. The bijection is exposed by the methods id2doc(int) and doc2id(int). The class tolerates additions to the database (and they will be skipped), but deletions will cause errors.
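The bijection described above can be pictured with a small, self-contained sketch. It is not MG4J code: the class name `IdDocBijection`, the use of `java.util.HashMap` in place of `Int2IntMap`, and the convention of returning -1 for an unknown identifier are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the id/document bijection; not part of MG4J. */
public class IdDocBijection {
    /** Document index -> database identifier. */
    private final int[] doc2id;
    /** Database identifier -> document index. */
    private final Map<Integer, Integer> id2doc = new HashMap<>();

    /** Builds both directions from the identifiers in the order the query returns them. */
    public IdDocBijection(int[] idsInQueryOrder) {
        doc2id = idsInQueryOrder.clone();
        for (int doc = 0; doc < doc2id.length; doc++) {
            // The id column is expected to be increasing, so each id maps to one document.
            if (doc > 0 && doc2id[doc] <= doc2id[doc - 1])
                throw new IllegalArgumentException("Identifiers must be strictly increasing");
            id2doc.put(doc2id[doc], doc);
        }
    }

    public int doc2id(int doc) {
        return doc2id[doc];
    }

    /** Returns the document index for an identifier, or -1 if it is unknown (assumed convention). */
    public int id2doc(int id) {
        return id2doc.getOrDefault(id, -1);
    }

    public int size() {
        return doc2id.length;
    }
}
```

In this sketch, a row added to the database after construction has an identifier that is simply absent from the map, which is one way to read "additions will be skipped". A deleted row still has an entry pointing at data that no longer exists, which is consistent with deletions causing errors.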
This class provides a main method with a flexible syntax that serialises a query into a document collection.
Nested Class Summary | |
---|---|
protected class | JdbcDocumentCollection.JdbcDocumentIterator: An iterator over the whole collection that performs a single DBMS transaction. |
Nested classes/interfaces inherited from class it.unimi.di.mg4j.document.AbstractDocumentCollection |
---|
AbstractDocumentCollection.PropertyKeys |
Field Summary | |
---|---|
protected Connection | connection: The currently open connection, if any. |
protected String | dbUri: The URI pointing at the database. |
protected int[] | doc2id: The map (as an array) from documents to database identifiers. |
protected DocumentFactory | factory: The factory to be used by this collection. |
protected Int2IntMap | id2doc: The map from database identifiers to documents. |
protected String | idSpec: The spec for the id field; by default it is id, but in complex queries it could be ambiguous. |
protected String | select: The query generating the collection (without the SELECT keyword). |
protected String | where: The WHERE part of the query generating the collection (without the WHERE keyword), or null. |
Fields inherited from interface it.unimi.di.mg4j.document.DocumentCollection |
---|
DEFAULT_EXTENSION |
Constructor Summary | |
---|---|
JdbcDocumentCollection(String dbUri, String jdbcDriverName, String select, String where, DocumentFactory factory) | Creates a document collection based on the result set of an SQL query using id as id specifier. |
JdbcDocumentCollection(String dbUri, String jdbcDriverName, String select, String idSpec, String where, DocumentFactory factory) | Creates a document collection based on the result set of an SQL query. |
Method Summary | |
---|---|
void | close(): Closes this document sequence, releasing all resources. |
JdbcDocumentCollection | copy() |
int | doc2id(int doc): Returns the database identifier associated with a given document. |
Document | document(int index): Returns the document given its index. |
protected void | ensureConnection() |
DocumentFactory | factory(): Returns the factory used by this sequence. |
int | id2doc(int id): Returns the document associated with a given database identifier. |
DocumentIterator | iterator(): Returns an iterator over the sequence of documents. |
static void | main(String[] arg) |
Reference2ObjectMap<Enum<?>,Object> | metadata(int index): Returns the metadata map for a document. |
protected Reference2ObjectMap<Enum<?>,Object> | metadata(int index, CharSequence title): Creates metadata with the given title; if the title is not available, it is fetched from the database. |
int | size(): Returns the number of documents in this collection. |
InputStream | stream(int index): Returns an input stream for the raw content of a document. |
Methods inherited from class it.unimi.di.mg4j.document.AbstractDocumentCollection |
---|
ensureDocumentIndex, printAllDocuments, toString |
Methods inherited from class it.unimi.di.mg4j.document.AbstractDocumentSequence |
---|
filename, finalize, load |
Methods inherited from class java.lang.Object |
---|
clone, equals, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Methods inherited from interface it.unimi.di.mg4j.document.DocumentSequence |
---|
filename |
Field Detail |
---|
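protected final Int2IntMap id2doc
The map from database identifiers to documents.
protected final int[] doc2id
The map (as an array) from documents to database identifiers.
protected final String dbUri
The URI pointing at the database.
protected final DocumentFactory factory
The factory to be used by this collection.
protected final String select
The query generating the collection (without the SELECT keyword).
protected final String idSpec
The spec for the id field; by default it is id, but in complex queries it could be ambiguous.
protected final String where
The WHERE part of the query generating the collection (without the WHERE keyword), or null.
protected transient Connection connection
The currently open connection, if any.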
Constructor Detail |
---|
public JdbcDocumentCollection(String dbUri, String jdbcDriverName, String select, String where, DocumentFactory factory) throws SQLException, ClassNotFoundException
Creates a document collection based on the result set of an SQL query using id as id specifier.
Beware. This class is not guaranteed to work if the database is deleted or modified after creation!
dbUri - a JDBC URI pointing at the database.
jdbcDriverName - the name of a JDBC driver, or null if you do not want to load a driver.
select - the SQL query generating the collection (without the SELECT keyword), except for the WHERE part.
where - the WHERE part (without the WHERE keyword) of the SQL query generating the collection, or null.
factory - the factory that will be used to create documents.
SQLException
ClassNotFoundException
public JdbcDocumentCollection(String dbUri, String jdbcDriverName, String select, String idSpec, String where, DocumentFactory factory) throws SQLException, ClassNotFoundException
Creates a document collection based on the result set of an SQL query.
Beware. This class is not guaranteed to work if the database is deleted or modified after creation!
dbUri - a JDBC URI pointing at the database.
jdbcDriverName - the name of a JDBC driver, or null if you do not want to load a driver.
select - the SQL query generating the collection (without the SELECT keyword), except for the WHERE part.
idSpec - the complete SQL spec for the id (necessary for complex queries with multiple tables).
where - the WHERE part (without the WHERE keyword) of the SQL query generating the collection, or null.
factory - the factory that will be used to create documents.
SQLException
ClassNotFoundException
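To make the split between the select and where parameters concrete, the following sketch joins the two parts back into one statement. It only illustrates the parameter descriptions above; the class `QuerySketch`, the method `assemble`, and the table and column names in the usage note are hypothetical, and the exact SQL that the class sends to the database is not documented on this page.

```java
/** Hypothetical helper showing how the select and where parts could fit together; not part of MG4J. */
public class QuerySketch {
    /**
     * Joins a query given without the SELECT keyword and an optional
     * condition given without the WHERE keyword into one SQL string.
     */
    public static String assemble(String select, String where) {
        StringBuilder sql = new StringBuilder("SELECT ").append(select);
        // A null where part means that the query has no condition.
        if (where != null) sql.append(" WHERE ").append(where);
        return sql.toString();
    }
}
```

For instance, `assemble("id, title, body FROM articles", "lang = 'en'")` yields `SELECT id, title, body FROM articles WHERE lang = 'en'`. In a join over several tables the bare name id may be ambiguous, which is the case where a qualified idSpec such as `a.id` would be passed to the six-argument constructor.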
Method Detail |
---|
protected void ensureConnection() throws SQLException
SQLException
public void close() throws IOException
Description copied from interface: DocumentSequence
Closes this document sequence, releasing all resources.
You should always call this method after having finished with this document sequence. Implementations are invited to call this method in a finaliser as a safety net (even better, implement SafelyCloseable), but since there is no guarantee as to when finalisers are invoked, you should not depend on this behaviour.
close in interface DocumentSequence
close in interface Closeable
close in class AbstractDocumentSequence
IOException
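Since close() is specified by Closeable, one way to follow the advice above without relying on finalisers is a try-with-resources block. The sketch below uses a hypothetical `TrackedSequence` class as a stand-in for a document sequence so that it runs without MG4J or a database; only the `java.io.Closeable` interface is real.

```java
import java.io.Closeable;

/** Sketch of deterministic closing with try-with-resources; TrackedSequence is hypothetical. */
public class CloseSketch {
    /** Stand-in for a document sequence that records whether it has been closed. */
    public static final class TrackedSequence implements Closeable {
        private boolean closed;

        @Override
        public void close() {
            closed = true;
        }

        public boolean isClosed() {
            return closed;
        }
    }

    /** Uses the resource, then returns it so that callers can check that it was closed. */
    public static TrackedSequence useAndClose() {
        TrackedSequence sequence = new TrackedSequence();
        try (TrackedSequence s = sequence) {
            // Documents would be read from s here; it is still open at this point.
            if (s.isClosed()) throw new IllegalStateException("Closed too early");
        }
        // close() has been invoked on leaving the try block, even if the body had thrown.
        return sequence;
    }
}
```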
public JdbcDocumentCollection copy()
copy in interface DocumentCollection
copy in interface FlyweightPrototype<DocumentCollection>
public DocumentFactory factory()
Description copied from interface: DocumentSequence
Returns the factory used by this sequence. Every document sequence is based on a document factory that transforms raw bytes into a sequence of characters. The factory contains useful information such as the number of fields.
factory in interface DocumentSequence
public int size()
Description copied from interface: DocumentCollection
Returns the number of documents in this collection.
size in interface DocumentCollection
public Document document(int index) throws IOException
Description copied from interface: DocumentCollection
Returns the document given its index.
document in interface DocumentCollection
index - an index between 0 (inclusive) and DocumentCollection.size() (exclusive).
Returns: the index-th document.
IOException
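The index contract above (0 inclusive, size() exclusive) amounts to a simple range check. The sketch below shows one way such a check could look; the class `IndexCheckSketch`, the method `checkDocumentIndex`, and the choice of `IndexOutOfBoundsException` are assumptions, not the behaviour of the inherited ensureDocumentIndex method, which is not described on this page.

```java
/** Hypothetical range check for document indices; not part of MG4J. */
public class IndexCheckSketch {
    /** Throws if index is not in the range from 0 (inclusive) to size (exclusive). */
    public static void checkDocumentIndex(int index, int size) {
        if (index < 0 || index >= size)
            throw new IndexOutOfBoundsException(
                "Document index " + index + " is not between 0 (inclusive) and " + size + " (exclusive)");
    }
}
```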
public int id2doc(int id)
Returns the document associated with a given database identifier.
id - a database identifier.
public int doc2id(int doc)
Returns the database identifier associated with a given document.
doc - a document index.
protected Reference2ObjectMap<Enum<?>,Object> metadata(int index, CharSequence title)
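Creates metadata with the given title; if the title is not available, it is fetched from the database.
index - a document index.
title - a suggested title, or null.
index.
public Reference2ObjectMap<Enum<?>,Object> metadata(int index)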
Description copied from interface: DocumentCollection
Returns the metadata map for a document.
metadata in interface DocumentCollection
index - an index between 0 (inclusive) and DocumentCollection.size() (exclusive).
public InputStream stream(int index) throws IOException
Description copied from interface: DocumentCollection
Returns an input stream for the raw content of a document.
stream in interface DocumentCollection
index - an index between 0 (inclusive) and DocumentCollection.size() (exclusive).
IOException
public DocumentIterator iterator() throws IOException
Description copied from interface: DocumentSequence
Returns an iterator over the sequence of documents.
Warning: this method can be safely called just one time. For instance, implementations based on standard input will usually throw an exception if this method is called twice.
Implementations may decide to override this restriction (in particular, if they implement DocumentCollection). Usually, however, it is not possible to obtain two iterators at the same time on a collection.
iterator in interface DocumentSequence
iterator in class AbstractDocumentCollection
IOException
DocumentCollection
public static void main(String[] arg) throws com.martiansoftware.jsap.JSAPException, InvocationTargetException, NoSuchMethodException, IllegalAccessException, IOException, SQLException, ClassNotFoundException, InstantiationException
com.martiansoftware.jsap.JSAPException
InvocationTargetException
NoSuchMethodException
IllegalAccessException
IOException
SQLException
ClassNotFoundException
InstantiationException