postprocessing
OGER offers a simple mechanism to apply arbitrary operations to the processed documents.
Through the postfilter parameter, one or more Python functions are specified, which can modify the documents in memory.
The specified function must accept a single parameter.
It is called with an oger.doc.document.Article or .Collection object, depending on the iter-mode.
If multiple postfilters are given, they are called in sequence, in the order specified.
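As a rough illustration of the calling convention described above, the following sketch chains two single-parameter functions that modify a document in place. The FakeDocument class, its entities attribute, the lowercase filter and the apply_postfilters helper are all hypothetical stand-ins invented for this example; they do not reflect the actual attributes of OGER's Article or Collection classes, nor how OGER invokes the filters internally.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDocument:
    """Hypothetical stand-in for the object passed to a postfilter."""
    entities: list = field(default_factory=list)


def exclude_short(document):
    """Drop annotations shorter than three characters, in place."""
    document.entities[:] = [e for e in document.entities if len(e) >= 3]


def lowercase(document):
    """Lower-case the remaining annotations, in place."""
    document.entities[:] = [e.lower() for e in document.entities]


def apply_postfilters(document, postfilters):
    """Call each postfilter in the order given; each one mutates the document."""
    for postfilter in postfilters:
        postfilter(document)


doc = FakeDocument(entities=["p53", "IL", "BRCA1"])
apply_postfilters(doc, [exclude_short, lowercase])
print(doc.entities)
```

Because each function works on the same in-memory object, a later filter sees the changes made by the earlier ones, which is why the order matters.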
The parameter value is interpreted as the path to a module, optionally followed by a colon and a function name, e.g. "path/to/module.py:exclude_short". If no function name is given, it defaults to "postfilter".
If the special name "builtin" is used as the path portion, the built-in postfilters can be activated (e.g. "builtin:remove_submatches").
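The interpretation of a postfilter value can be sketched roughly as follows. This is an illustration of the rules stated above, not OGER's own parsing code; the function name split_postfilter_spec is made up for the example, and splitting on the last colon is an assumption (it would mis-handle a bare Windows path with a drive letter and no function name).

```python
def split_postfilter_spec(spec, default_name="postfilter"):
    """Split 'path:function' into (path, function_name, is_builtin)."""
    path, sep, name = spec.rpartition(":")
    if not sep:
        # No colon at all: the whole value is the path, use the default name.
        path, name = spec, default_name
    elif not name:
        # Trailing colon without a name: fall back to the default as well.
        name = default_name
    # The special path "builtin" selects a built-in postfilter.
    return path, name, path == "builtin"


print(split_postfilter_spec("path/to/module.py:exclude_short"))
print(split_postfilter_spec("path/to/module.py"))
print(split_postfilter_spec("builtin:remove_submatches"))
```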
Multiple filters can be given as a space-separated list or as a JSON array. When called from the command line, care needs to be taken to properly escape special characters according to the shell used.
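The two list notations, and the quoting needed for a POSIX shell, might look like this in practice. The helper parse_postfilter_value is hypothetical and only mirrors the description above (a value starting with "[" is assumed to be a JSON array, anything else is split on whitespace); shlex.quote is from the Python standard library and targets POSIX shells only, so other shells need their own escaping.

```python
import json
import shlex


def parse_postfilter_value(value):
    """Return the list of filter specs from either notation (assumed rules)."""
    value = value.strip()
    if value.startswith("["):
        # JSON array notation.
        return json.loads(value)
    # Space-separated notation.
    return value.split()


space_value = "builtin:remove_submatches path/to/module.py:exclude_short"
json_value = '["builtin:remove_submatches", "path/to/module.py:exclude_short"]'

# Both notations yield the same list of filter specs.
print(parse_postfilter_value(space_value))
print(parse_postfilter_value(json_value))

# Quote the value so a POSIX shell passes it on as a single argument.
quoted = shlex.quote(json_value)
print(quoted)
```

The JSON form contains brackets, double quotes and spaces, all of which a shell would otherwise interpret, so wrapping the whole value in single quotes is the simplest way to pass it through unchanged.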