5.4.3 -> 5.4.4

- New template parameter for WikipediaDocumentSequence.

5.4.2 -> 5.4.3

- Removed computation of the title list in Scan. It proved to be more
  harmful than useful.

- Made dsi -> di mapping in packages a bit less aggressive.

- New delimiter feature in HtmlDocumentFactory makes it possible to find
  exactly the text of an anchor.

- Fixed old bug in SimpleParser.copy(): term processors were not
  copy()'d.
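
  The following is a minimal sketch of this class of bug, not the actual
  SimpleParser code. ToyParser and CountingProcessor are hypothetical
  stand-ins invented for the illustration: when copy() shares a stateful
  component instead of copy()'ing it, the copy and the original interfere.

```java
public class CopyDemo {
    /** Hypothetical stateful component, standing in for a term processor. */
    static final class CountingProcessor {
        int processed;

        String process(String term) {
            processed++;
            return term.toLowerCase();
        }

        CountingProcessor copy() {
            return new CountingProcessor();
        }
    }

    /** Hypothetical parser owning one processor. */
    static final class ToyParser {
        final CountingProcessor processor;

        ToyParser(CountingProcessor processor) {
            this.processor = processor;
        }

        /** Buggy variant: the copy shares the processor with the original. */
        ToyParser shallowCopy() {
            return new ToyParser(processor);
        }

        /** Fixed variant: the processor is copy()'d as well. */
        ToyParser copy() {
            return new ToyParser(processor.copy());
        }
    }

    /** Uses a copy once and returns how many terms the ORIGINAL's processor saw. */
    static int seenByOriginal(boolean deepCopy) {
        ToyParser original = new ToyParser(new CountingProcessor());
        ToyParser copy = deepCopy ? original.copy() : original.shallowCopy();
        copy.processor.process("Foo");
        return original.processor.processed;
    }

    public static void main(String[] args) {
        System.out.println("shared processor:  " + seenByOriginal(false));
        System.out.println("copy()'d processor: " + seenByOriginal(true));
    }
}
```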

5.4.1 -> 5.4.2

- Adapted to Sux4J 4.0.0.

- New delimiter feature in HtmlDocumentFactory and AnchorExtractor.

5.4 -> 5.4.1

- New WarcDocumentSequence (based on BUbiNG's WARC classes).

- TITLE metadata now overrides the HTML title.

5.3 -> 5.4

- Now Scan always generates a .titles file.

- Fixed incredible logic-inversion bug in QueryEngine.

5.2.1 -> 5.3

- New WikipediaDocumentSequence that can be used to index the
  standard Wikipedia dump (and build the entity graph, too!).

- Now ConcatenatedDocumentCollection has a convenience main method.

- URLMPHVirtualDocumentResolver no longer invokes URI.normalize() on the
  context or on the resolved URLs. It was probably a bad idea in the first
  place (we expect quite normalized URLs from the collection), and it was
  crashing on some Wikipedia URLs such as http://en.wikipedia.org/wiki//b/
  (which are, in any case, broken and not RFC-compliant).
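
  As a small illustration using only the JDK's java.net.URI (no MG4J
  classes), the sketch below checks whether normalize() rewrites a URL. It
  does not try to reproduce the crash; it only shows that a path with an
  empty segment is not left untouched, so the normalized string no longer
  matches the URL as it appears in the collection. The class and method
  names are made up for the example.

```java
import java.net.URI;

public class NormalizeDemo {
    /** Returns the string form of the URL after URI.normalize(). */
    static String normalized(String url) {
        return URI.create(url).normalize().toString();
    }

    /** True if normalization produced a string different from the input. */
    static boolean changedByNormalization(String url) {
        return !normalized(url).equals(url);
    }

    public static void main(String[] args) {
        String[] urls = {
            "http://en.wikipedia.org/wiki//b/", // empty path segment
            "http://en.wikipedia.org/wiki/Java" // already normal
        };
        for (String url : urls) {
            System.out.println(url + " -> " + normalized(url)
                + " (changed: " + changedByNormalization(url) + ")");
        }
    }
}
```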

5.2 -> 5.2.1

- Made protected a number of fields in TRECDocumentCollection and
  HtmlDocumentFactory following a request by Roi Blanco.

5.0.1 -> 5.2

- With this release, this version (big) of MG4J becomes the official
  release. Subsequent development will happen only on this version, which
  is presently in line with version 5.2 of the standard version.

- A small revolution is taking place in MG4J: now most classes handling
  indices have an IOFactory parameter that makes it possible to open files
  in alternative filesystems, such as HDFS. Beware--the feature is very
  pervasive and there might be missing spots. Thanks to Tim Potter for
  useful discussions and for testing this new feature.

- InputStreamDocumentSequence was not behaving correctly in case of
  keyboard input (two EOFs were necessary).

- Fixed a very subtle bug in documents returned from HtmlDocumentFactory.
  Unparsed documents coming from streaming sources would access the data
  source during finalization, because toString() returned the document
  title. This was causing random errors when reading, say, from WARC
  streams, if a document was not closed properly. Added a blurb to
  AbstractDocument that warns about this issue.

- Fixed a bug in dynamic class naming ("Payload " was used instead of
  "Payload"). Thanks to Dmitri Portnov for fixing this bug.

- The Maven artifacts did not contain the Velocity templates. Thanks
  to Andrew MacKinlay for reporting this issue.

- Switched to SLF4J for logging.

5.0.1

- Because of a copy-and-paste error, END_OF_LIST was an int set to
  Integer.MAX_VALUE instead of a long set to Long.MAX_VALUE. Thanks to
  Valentin Tablan for reporting this bug.
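
  A sketch of why the type of the sentinel matters when document pointers
  are longs. The constant INT_END_OF_LIST and both helper methods are
  hypothetical names for this example, and the symptom shown is one
  plausible consequence, not necessarily the one that was reported.

```java
public class EndOfListDemo {
    /** The erroneous definition: an int sentinel (hypothetical name). */
    static final int INT_END_OF_LIST = Integer.MAX_VALUE;

    /** The corrected definition: a long sentinel. */
    static final long END_OF_LIST = Long.MAX_VALUE;

    /** Exhaustion test against the int sentinel (widened to long by the comparison). */
    static boolean exhaustedWithIntSentinel(long pointer) {
        return pointer == INT_END_OF_LIST;
    }

    /** Exhaustion test against the long sentinel. */
    static boolean exhausted(long pointer) {
        return pointer == END_OF_LIST;
    }

    public static void main(String[] args) {
        // A legitimate document pointer in a collection with at least 2^31 documents.
        long legitimate = Integer.MAX_VALUE;
        System.out.println("int sentinel says exhausted:  " + exhaustedWithIntSentinel(legitimate));
        System.out.println("long sentinel says exhausted: " + exhausted(legitimate));
    }
}
```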

5.0

- WARNING: this release has source and binary incompatibilities with
  previous releases. Watch out. 

- nextDocument() now returns DocumentIterator.END_OF_LIST instead of -1 to
  denote list exhaustion. To avoid confusion and ease the transition, the
  package prefix of MG4J is now it.unimi.di.*, following the change of
  name of our department.

- The plethora of methods that accessed the positions of a term in an
  IndexIterator have been replaced by the single lazy nextPosition() call,
  which returns IndexIterator.END_OF_POSITIONS when the positions are
  exhausted. Some static methods in IndexIterators should help with the
  transition.

- MG4J is no longer based on gap-based indices. Classical interleaved
  indices are used for incremental index construction, and
  high-performance indices are still supported for historical reasons,
  but all new indices are by default built using the new quasi-succinct
  format.

- DiskBasedIndex.getInstance() now returns an Index instead of a
  BitStreamIndex. Old code should check with a reflective call whether the
  result is a BitStreamIndex and act accordingly, as now it might be a
  QuasiSuccinctIndex, too.

- Fixed SimpleParser.parse(MutableString), which was throwing a
  NullPointerException.

- Fixed some mismatches between interrelated implementations of
  next()/hasNext()/nextInterval() in some interval iterators. Thanks to
  Dmitri Portnov for reporting this bug.

- Compatibility with previous versions (standard or big) should be
  complete, even at the level of term/prefix maps.
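
  The sentinel-based iteration style described in this release
  (nextDocument() until END_OF_LIST, nextPosition() until
  END_OF_POSITIONS) can be sketched as follows. ToyIterator is a
  hypothetical in-memory stand-in, not an MG4J class, and the values
  chosen for the two constants are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

public class SentinelIterationDemo {
    /** Stand-in sentinels; the values are assumptions of this sketch. */
    static final long END_OF_LIST = Long.MAX_VALUE;
    static final int END_OF_POSITIONS = Integer.MAX_VALUE;

    /** Hypothetical posting list: documents[i] contains the term at positions[i]. */
    static final class ToyIterator {
        private final long[] documents;
        private final int[][] positions;
        private int doc = -1; // index of the current document, -1 before the first call
        private int pos;      // index of the next position to return

        ToyIterator(long[] documents, int[][] positions) {
            this.documents = documents;
            this.positions = positions;
        }

        /** Advances to the next document, or returns END_OF_LIST on exhaustion. */
        long nextDocument() {
            pos = 0;
            if (doc < documents.length) doc++;
            return doc < documents.length ? documents[doc] : END_OF_LIST;
        }

        /** Lazily returns the next position in the current document. */
        int nextPosition() {
            if (doc < 0 || doc >= documents.length) throw new IllegalStateException();
            return pos < positions[doc].length ? positions[doc][pos++] : END_OF_POSITIONS;
        }
    }

    /** Collects "document:position" pairs using only the two sentinels. */
    static List<String> dump(ToyIterator it) {
        List<String> out = new ArrayList<>();
        for (long d; (d = it.nextDocument()) != END_OF_LIST; )
            for (int p; (p = it.nextPosition()) != END_OF_POSITIONS; )
                out.add(d + ":" + p);
        return out;
    }

    public static void main(String[] args) {
        ToyIterator it = new ToyIterator(new long[] { 3, 7 }, new int[][] { { 0, 5 }, { 2 } });
        System.out.println(dump(it));
    }
}
```

  Note that no hasNext()-style call is needed: the loop condition is the
  comparison with the sentinel.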

4.0.2

- We now force the number of documents of a virtual index to be equal
  to that specified by the resolver. Collections in which the last few
  documents were not referred to would have generated virtual indices
  with fewer documents than the standard ones.

- Fixed a small bug in the equals method of Term.

- Fixed bug in the equals and hashCode methods of Select (before, only 
  Index was taken into account, and not the actual subquery).

- Fixed several small inconsistencies in the Scorer hierarchy.

- Added the SubsetDocumentSequence class to extract a subset of documents
  from a given sequence.

- Fixed the DEFAULT_TEMPLATE of QueryServlet.

- The default target for skipping structures is now 1%.

4.0.1

- Fixed the names of big lists by adding "Big" where it was necessary.
  This should cause no problem.

4.0

- First release of big MG4J.

- it.unimi.dsi.big.mg4j.search.DocumentIterator is now strictly lazy;
  in particular, it does not implement java.util.Iterator.