More options

All the tools and classes used so far have a large number of options that make them highly configurable. For instance, a factory has other properties that can be specified; please have a look at the Javadoc of the document factory you are using. A common property is wordreader, which makes it possible to specify a different instance of WordReader, the class that is used to segment text into words and non-words. The standard WordReader (FastBufferedReader) considers just letters and digits as part of a word, but you can choose your own variant, and even specify it directly on the command line: for instance, -pwordreader=FastBufferedReader\(_\) specifies that underscores should be considered part of a word (the backslashes keep the shell from interpreting the parentheses). More generally, you can specify an expression that follows dsiutils's ObjectParser conventions and that will be used to instantiate a WordReader.
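To make the effect of such a choice concrete, here is a minimal, self-contained sketch of the segmentation rule described above. It does not use MG4J or dsiutils: the class WordSplitSketch and its method words are hypothetical names introduced only for illustration, and the real FastBufferedReader may differ in details. The sketch only assumes that a word is a maximal run of letters, digits, and any extra characters you declare.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of word segmentation; this is NOT MG4J's FastBufferedReader. */
public class WordSplitSketch {

    /**
     * Returns the words of text, where a word is a maximal run of letters,
     * digits, or characters listed in extraWordChars. Everything else is
     * treated as a non-word separator and discarded.
     */
    public static List<String> words(String text, String extraWordChars) {
        List<String> result = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            boolean wordChar = Character.isLetterOrDigit(c) || extraWordChars.indexOf(c) >= 0;
            if (wordChar) {
                current.append(c);
            } else if (current.length() > 0) {
                // A non-word character closes the word collected so far.
                result.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) result.add(current.toString());
        return result;
    }

    public static void main(String[] args) {
        String text = "max_value = foo42;";
        System.out.println(words(text, ""));  // prints [max, value, foo42]
        System.out.println(words(text, "_")); // prints [max_value, foo42]
    }
}
```

With the default rule the identifier max_value is indexed as two separate words, whereas adding the underscore keeps it as one. This is the kind of difference to keep in mind when choosing a word reader for source code or similar text.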

All MG4J tools implement the standard --help option, which will display a detailed help text.