With the explosion of image models and embedding options for multimedia (including PDFs!), we should keep track of the various tools and strategies for two reasons. First, so that we can show users how they might wrap some of these resources as Tika parsers and get the benefits of the Tika framework (especially on embedded documents). Second, so that if a few leaders or techniques emerge, we can consider adding wrappers to the main Tika project.


General Info Blog Posts

"How to" Blog Posts


Conference Talks

Conferences


Open Source Tools (no particular order/not an endorsement)

Select Open Source Tools that haven't been updated in the last year (no particular order/not an endorsement)

Commercial Tools (no particular order/not an endorsement)


Non-FAANGM Companies Focusing on Advanced Document Processing (no particular order/not an endorsement)

Cloud Services (no particular order/not an endorsement)

Research Papers

  • Good luck. There are too many... unless you have a department of grad students at the ready.