Remote Python Debugging

Prerequisites:  A code editor or IDE with the Impala shell Python project set up.  The machine running the editor/IDE must be able to reach the machine where Impala is running over the network, either directly or through SSH port forwarding.

  1. Set up the Python remote debugger to connect to the machine where Impala is running.  Use port 5678 and configure the debugger to attach (instead of launch).
  2. On the Impala machine, run the Impala shell in remote debug mode:

    IMPALA_SHELL_DEBUG_PORT=5678 ./bin/impala-shell.sh
  3. Wait until this message is displayed: "impala python shell waiting for remote debugging connection on port 5678"
  4. In the editor/IDE, start the Python remote debugger.  You can now set breakpoints, pause execution, and view/modify variables.
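The following is a minimal sketch of how a shell can wait for a debugger when a port is supplied through an environment variable, which is why the debugger must attach rather than launch.  It assumes the debugpy package is installed and is not taken from the Impala shell source; the function names debug_port_from_env and maybe_wait_for_debugger are hypothetical.

```python
import os


def debug_port_from_env(env=os.environ, name="IMPALA_SHELL_DEBUG_PORT"):
    """Return the debug port from the environment, or None if it is unset."""
    raw = env.get(name)
    if not raw:
        return None
    port = int(raw)
    if not 0 < port < 65536:
        raise ValueError("debug port out of range: %d" % port)
    return port


def maybe_wait_for_debugger(env=os.environ):
    """Block until a debugger attaches if a debug port is configured.

    Returns True if the process waited for a debugger, False otherwise.
    """
    port = debug_port_from_env(env)
    if port is None:
        return False
    # Assumption: debugpy is the debugger backend. It is imported lazily so
    # that normal runs do not need the package installed.
    import debugpy

    # Listen on all interfaces so a debugger on another machine can attach.
    debugpy.listen(("0.0.0.0", port))
    print("waiting for remote debugging connection on port %d" % port)
    debugpy.wait_for_client()  # blocks until the editor/IDE attaches
    return True
```

Because the process only listens and then blocks, the editor/IDE side must be configured to connect to an already running process on that port.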

Inspecting HTTP Traffic

These instructions explain how to view HTTP traffic as it flows between the Impala shell and the Impala daemon in a local development environment when using the hs2-http (HiveServer2 over HTTP) protocol.  The same approach also applies to HTTP traffic originating from other Impala clients.

  1. Install mitmproxy
    1. On Ubuntu, use apt:

      sudo apt install mitmproxy
  2. Edit the ~/.impalarc file and insert the following configuration.  If port 28007 is already in use, select any other unused port.

    [impala]
    protocol=hs2-http
    impalad=localhost:28007
  3. Run mitmproxy.  If a different port was selected in step 2, pass that port to the --listen-port command line parameter.  The command below assumes the Impala daemon uses port 28000 as its hs2-http port.  If the Impala daemon uses a different port, change the port in the --mode command line parameter to match.

    mitmdump --mode reverse:http://127.0.0.1:28000 --listen-port 28007
  4. Run the Impala shell.  All communication with the local Impala daemon's hs2-http port is written to the terminal where mitmdump is running.
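If port 28007 is taken, the matching ~/.impalarc entry and mitmdump command can be generated together so the two ports stay in sync.  The following is a minimal sketch using only the Python standard library; the helper names find_free_port and mitmdump_command are hypothetical, and the default upstream port 28000 is the same assumption made in step 3.

```python
import socket
from typing import List


def find_free_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for a currently unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        return sock.getsockname()[1]


def mitmdump_command(listen_port: int, upstream_port: int = 28000) -> List[str]:
    """Build the reverse-proxy command line used in step 3."""
    return [
        "mitmdump",
        "--mode", "reverse:http://127.0.0.1:%d" % upstream_port,
        "--listen-port", str(listen_port),
    ]


if __name__ == "__main__":
    port = find_free_port()
    # Line to place under [impala] in ~/.impalarc
    print("impalad=localhost:%d" % port)
    # Command to run in a separate terminal
    print(" ".join(mitmdump_command(port)))
```

The port is released before mitmdump starts, so another process could claim it in between; if mitmdump reports that the address is in use, run the script again.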

