Comment: link to READMEs

...

  • HTTP/1.1 or HTTP/2 (property http.useHttp2)
  • use of proxy servers
  • efficient connection reuse through a configurable connection pool (NUTCH-2896 and PR#697)
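
As a rough illustration of the options listed above, a nutch-site.xml fragment might look like the following. This is a sketch under assumptions, not a recommended setup: http.useHttp2 is the property named above, http.proxy.host and http.proxy.port are the general Nutch proxy properties, and the host name and port are placeholders. Connection pool settings are deliberately left out; NUTCH-2896 describes them.

```xml
<!-- Sketch only: all values are placeholders, not recommended settings. -->
<property>
  <name>http.useHttp2</name>
  <value>true</value>
</property>
<!-- Hypothetical proxy host and port, for illustration. -->
<property>
  <name>http.proxy.host</name>
  <value>proxy.example.org</value>
</property>
<property>
  <name>http.proxy.port</name>
  <value>8080</value>
</property>
```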

...

Nutch provides a couple of protocol plugins that do not fetch content directly but through an intermediate web browser controlled via the Selenium browser automation library.

protocol-selenium

See README.
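
To route fetching through protocol-selenium, the plugin has to be listed in plugin.includes in place of the default HTTP protocol plugin. The fragment below is a minimal sketch, assuming an otherwise typical plugin list. The remaining plugin names and the selenium.driver value are illustrative assumptions, and the README remains the authoritative source for the supported settings.

```xml
<!-- Sketch only: the plugin list besides protocol-selenium is illustrative. -->
<property>
  <name>plugin.includes</name>
  <value>protocol-selenium|urlfilter-regex|parse-(html|tika)|index-(basic|anchor)|urlnormalizer-(pass|regex|basic)|scoring-opic</value>
</property>
<!-- Assumed driver setting; check the README for the values your version supports. -->
<property>
  <name>selenium.driver</name>
  <value>firefox</value>
</property>
```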

protocol-interactiveselenium

See README.

protocol-htmlunit




file:// access – protocol-file

...