When planning new releases of Apache Tika, or looking to get feedback on proposed API changes, it's helpful to know who is "downstream of us" so we know who we might want to reach out to.
It is possible to find out who uses Tika as a Java library by querying Maven Central, e.g.
https://mvnrepository.com/artifact/org.apache.tika/tika-core/usages
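As a minimal sketch, the same lookup can be repeated for other Tika artifacts by building the URL from the Maven coordinates. This assumes the usages page follows the same URL pattern as the tika-core link above for every artifact. The helper name `usages_url` is hypothetical, and the second artifact id is only an example.

```python
from urllib.parse import quote

# Base URL taken from the tika-core link above
MVNREPOSITORY_BASE = "https://mvnrepository.com/artifact"


def usages_url(group_id: str, artifact_id: str) -> str:
    """Build the 'usages' page URL for a Maven artifact.

    Assumption: every artifact follows the same
    /artifact/<groupId>/<artifactId>/usages pattern as tika-core.
    """
    group = quote(group_id, safe="")
    artifact = quote(artifact_id, safe="")
    return f"{MVNREPOSITORY_BASE}/{group}/{artifact}/usages"


# tika-core comes from the text above; the second id is an illustrative example
TIKA_ARTIFACTS = ["tika-core", "tika-parsers-standard-package"]

for artifact in TIKA_ARTIFACTS:
    print(usages_url("org.apache.tika", artifact))
```

The printed links are meant to be opened in a browser. The sketch only constructs URLs and does not fetch or parse the pages.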
Some other uses, such as the Tika Server and Docker images, are harder to spot automatically. To that end, we've put together this list of some of the bigger ASF and external projects that depend on us. It isn't meant to be an exhaustive list!
Search
- Apache Solr - https://solr.apache.org/guide/solr/latest/indexing-guide/indexing-with-tika.html
- Elasticsearch - https://www.elastic.co/guide/en/elasticsearch/reference/current/attachment.html
- OpenSearch - https://opensearch.org/docs/latest/install-and-configure/additional-plugins/ingest-attachment-plugin/
- Apache Nutch - parse-tika plugin activated by default
Content Management
- Drupal
- Alfresco
Data Pipelines
- Apache NiFi - ExtractMediaMetadata + ExtractDocumentText
- Kestra - https://kestra.io/plugins/plugin-tika
Big Data
TODO Document