...

DO NOT use tika-pipes if your tika-server is exposed to the internet or if you do not carefully restrict access to tika-server.

Consider enabling two-way TLS between your client and server. A beta version of this feature is available in 2.4.0; see TikaServer#SSL(Beta).

Overview

The tika-pipes modules enable fetching data from various sources, parsing it, and then emitting the output to various destinations. These modules are built around the RecursiveParserWrapper output model (the -J option in tika-app and the /rmeta endpoint in tika-server-standard). Users can specify the content format (text/html/body) and set limits (number of embedded files, maximum content length) via FetchEmitTuples. Users can also add Metadata Filters to select and modify the metadata extracted during the parse before the output is emitted.
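The fetch, parse, filter, emit flow described above can be pictured with a small, self-contained sketch. Everything in it is hypothetical and is not Tika's API: the Fetcher and Emitter interfaces, the parse method, the metadata key names, and the "internal:" prefix are invented stand-ins, and the in-memory maps only imitate a real source and destination. It is meant to show the shape of the flow, including a content-length limit and a metadata filter applied before emitting, not how tika-pipes is implemented.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

public class PipeSketch {

    // Hypothetical stand-ins for a fetcher and an emitter; NOT Tika interfaces.
    interface Fetcher { byte[] fetch(String fetchKey); }
    interface Emitter { void emit(String emitKey, Map<String, String> metadata); }

    // Toy "parse": decodes the bytes as UTF-8 text and truncates the content
    // to maxContentLength characters. The key names are invented for this sketch.
    static Map<String, String> parse(byte[] data, int maxContentLength) {
        String text = new String(data, StandardCharsets.UTF_8);
        if (text.length() > maxContentLength) {
            text = text.substring(0, maxContentLength);
        }
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("content", text);
        metadata.put("source-bytes", String.valueOf(data.length));
        metadata.put("internal:debug", "parsed by toy parser");
        return metadata;
    }

    // One fetch -> parse -> filter -> emit pass for a single fetch/emit key pair.
    static void process(String fetchKey, String emitKey, Fetcher fetcher,
                        UnaryOperator<Map<String, String>> metadataFilter,
                        Emitter emitter, int maxContentLength) {
        byte[] raw = fetcher.fetch(fetchKey);
        Map<String, String> parsed = parse(raw, maxContentLength);
        emitter.emit(emitKey, metadataFilter.apply(parsed));
    }

    // Runs the flow against in-memory maps and returns what was "emitted".
    public static Map<String, Map<String, String>> demo() {
        Map<String, byte[]> source = new LinkedHashMap<>();
        source.put("doc1", "hello tika pipes".getBytes(StandardCharsets.UTF_8));
        Map<String, Map<String, String>> sink = new LinkedHashMap<>();

        // Metadata filter: drop every key that starts with "internal:".
        UnaryOperator<Map<String, String>> dropInternal = in -> {
            Map<String, String> out = new LinkedHashMap<>(in);
            out.keySet().removeIf(key -> key.startsWith("internal:"));
            return out;
        };

        process("doc1", "out/doc1.json", source::get, dropInternal, sink::put, 10);
        return sink;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this sketch the limit and the filter are passed alongside the fetch and emit keys for each document, which loosely mirrors the idea of carrying per-document settings with each fetch/emit request; the real configuration mechanism is the one described in the tika-pipes documentation.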

...