...

Remember to replace x.x with the version you have built.

Running the Tika Server as a Jar file

The Tika Server binary is a standalone runnable jar. Download the latest stable release binary from the Apache Tika downloads page, via your favourite local mirror. You want the tika-server-1.x.jar file, e.g. tika-server-1.23.jar.

...

Using prebuilt Docker image

There is an unofficial image for Tika that has been available for years. You can download and start it with:

docker pull logicalspark/docker-tikaserver # only on initial download/update
docker run --rm -p 9998:9998 logicalspark/docker-tikaserver

...

-jar tika-server-x.x.jar --host=intranet.local --port=12345 

With the --rm option, the container is deleted as soon as it stops. The Dockerfile can be found on GitHub.

There is also an in-progress effort to publish an official Tika Docker image.  That code can be found at https://github.com/apache/tika-docker and will eventually replace the version produced by LogicalSpark.

Running Tika Server as Unix Service

Tika 1.24 ships a new service installation script that lets you install Tika as a service on Linux. This script was heavily influenced by the Apache Solr project's script, so read that project's documentation if you want to customize the script.

Currently the script only supports the CentOS, Debian, Red Hat, SUSE and Ubuntu Linux distributions. Before running the script, you need to determine a few parameters about your setup. Specifically, you need to decide where to install Tika and which system user should own the Tika files and process.

To run the script, you'll need the 1.24 (or later) Tika distribution. It will have a -bin suffix, e.g. tika-server-1.24-SNAPSHOT-bin.tgz. Extract the installation script from the distribution via:


tar xzf tika-server-1.24-bin.tgz tika-server-1.24-bin/bin/install_tika_service.sh --strip-components=2

This will extract the install_tika_service.sh script from the archive into the current directory. If installing on Red Hat, please make sure lsof is installed before running the Tika installation script (sudo yum install lsof). The installation script must be run as root:

sudo bash ./install_tika_service.sh tika-server-1.24-bin.tgz

By default, the script extracts the distribution archive into /opt/tika, configures Tika to write files into /var/tika, and runs Tika as the tika user on the default port. Consequently, the following command produces the same result as the previous command:

sudo bash ./install_tika_service.sh tika-server-1.24-bin.tgz -i /opt -d /var/tika -u tika -s tika -p 9998

You can customize the service name, installation directories, port, and owner using options passed to the installation script. To see the available options, run:


sudo bash ./install_tika_service.sh -help

Once the script completes, Tika will be installed as a service and running in the background on your server (on port 9998). To verify, run:


sudo service tika status
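
Beyond the service status, one way to confirm that the server is actually answering requests is an HTTP GET against it. The sketch below assumes a local install on the default port 9998 and that the /tika resource answers GET requests; tika_is_up is a hypothetical helper name, not something shipped with Tika.

```shell
# Hypothetical helper: succeed if a Tika Server answers HTTP on the given URL.
# The default URL assumes a local install on port 9998.
tika_is_up() {
    url="${1:-http://localhost:9998/tika}"
    # --fail makes curl exit non-zero on HTTP errors (4xx/5xx);
    # --max-time bounds the wait if nothing is listening on the port.
    curl --silent --fail --max-time 5 --output /dev/null "$url"
}
```

For example, tika_is_up && echo "Tika is responding" prints the message only when the request succeeds.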

Your specific customizations to the Tika setup are stored in the /etc/init.d/tika file.


Services

All services that take files use HTTP "PUT" requests. When "PUT" is used, the original file must be sent in the request body without any additional encoding (do not use multipart/form-data or other containers).
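
As a minimal sketch of the rule above, the following shell function sends a file as the raw request body using curl. The function name tika_put and the TIKA_URL variable are illustrative and not part of Tika; the default URL assumes a local server on port 9998, and the default endpoint name tika is an assumption about which resource you want to call.

```shell
# Assumed variable: base URL of the server (defaults to a local install).
TIKA_URL="${TIKA_URL:-http://localhost:9998}"

# Hypothetical helper: PUT a file to a Tika Server endpoint as the raw body.
#   $1 = path to the file, $2 = endpoint name (defaults to "tika")
tika_put() {
    file="$1"
    endpoint="${2:-tika}"
    # --upload-file makes curl issue an HTTP PUT with the file bytes as the
    # request body, with no multipart/form-data wrapping.
    curl --silent --show-error --upload-file "$file" "$TIKA_URL/$endpoint"
}
```

For example, tika_put report.pdf would send report.pdf (a placeholder file name) to the server and print whatever the server returns.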

...