JIRA: KNOX-1623

Introduction

KnoxShell Kerberos support should be available in Apache Knox 1.3.0. KnoxShell is an Apache Knox module that provides scripting support for talking to Apache Knox; more details on setting up KnoxShell can be found in this blog post. With Kerberos support, KnoxShell can use cached tickets or keytabs to authenticate with a secure (Kerberos-enabled) topology in Apache Knox. This post shows examples of how to do that.

Prerequisites

  1. Download and set up KnoxShell (see the setup instructions).
  2. Configure Apache Knox to use Hadoop Auth (see the setup instructions).

Before going further, verify the setup by sending a curl request through Knox:

curl -k -i --negotiate -u : "https://{knoxhost}:{knoxport}/gateway/sandbox/webhdfs/v1/tmp?op=LISTSTATUS"
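
If the check needs to be scripted, the status code alone is usually enough to tell what went wrong. The following is a minimal sketch, assuming a curl build with SPNEGO support; the function names knox_status and knox_status_hint are hypothetical, and the hints are general HTTP interpretations rather than messages produced by Knox.

```shell
#!/bin/sh
# Hypothetical helper: print only the final HTTP status code for a URL,
# authenticating with SPNEGO from the current Kerberos ticket.
knox_status() {
  curl -k -s -o /dev/null -w '%{http_code}' --negotiate -u : "$1"
}

# Hypothetical helper: translate a status code into a troubleshooting hint.
knox_status_hint() {
  case "$1" in
    200) echo "OK: Kerberos authentication through Knox works" ;;
    401) echo "Unauthorized: no valid ticket (run kinit) or SPNEGO was not negotiated" ;;
    403) echo "Forbidden: authenticated, but not authorized for this resource" ;;
    404) echo "Not found: check the topology name and service path in the URL" ;;
    *)   echo "Unexpected status: $1" ;;
  esac
}
```

For example: knox_status_hint "$(knox_status "https://{knoxhost}:{knoxport}/gateway/sandbox/webhdfs/v1/tmp?op=LISTSTATUS")"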


Kerberos Authentication

The following methods can be used to initialize a session in KnoxShell:

session = KnoxSession.kerberosLogin(url, jaasConfig, krb5Conf, debug)

Where:

  • url is the gateway URL
  • jaasConfig is the JAAS configuration (optional)
  • krb5Conf is the krb5 configuration file (optional)
  • debug turns on debug statements (optional)

or

session = KnoxSession.kerberosLogin(url)

where:

  • url is the gateway URL
  • the default jaasConfig is used, which looks for a cached ticket at an OS-specific path
  • the krb5 configuration file is looked up at its default location
  • debug is false
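
For the four-argument form, a custom JAAS configuration has to be prepared first. The following is a minimal sketch of generating one that reads a file-based ticket cache, assuming jaasConfig is given as the path to such a file. The function name write_jaas_conf is hypothetical, and the entry name com.sun.security.jgss.initiate is an assumption; check the JAAS configuration shipped with your KnoxShell version. The Krb5LoginModule options themselves are standard JDK options.

```shell
#!/bin/sh
# Hypothetical helper: write a JAAS login configuration that authenticates
# from an existing ticket cache instead of prompting for a password.
#   $1: path of the JAAS file to create
#   $2: path of the ticket cache file (without the FILE: prefix)
write_jaas_conf() {
  cat > "$1" <<EOF
com.sun.security.jgss.initiate {
  com.sun.security.auth.module.Krb5LoginModule required
    useTicketCache=true
    ticketCache="$2"
    doNotPrompt=true
    useKeyTab=false;
};
EOF
}
```

For example, write_jaas_conf /tmp/knoxshell-jaas.conf /tmp/krb5cc_502 creates a file whose path could then be passed as jaasConfig. For keytab-based login, the same module accepts useKeyTab=true together with keyTab and principal.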

Example

The following example Groovy script talks to a Kerberos-enabled cluster fronted by Apache Knox using Hadoop Auth:

    import groovy.json.JsonSlurper
    import org.apache.knox.gateway.shell.KnoxSession
    import org.apache.knox.gateway.shell.hdfs.Hdfs

    // Gateway URL of the Kerberos-enabled "secure" topology
    gateway = "https://gateway-site:8443/gateway/secure"

    // Log in with the default JAAS configuration and the cached Kerberos ticket
    session = KnoxSession.kerberosLogin(gateway)

    // List the HDFS root directory and print the name of each entry
    text = Hdfs.ls( session ).dir( "/" ).now().string
    json = (new JsonSlurper()).parseText( text )
    println json.FileStatuses.FileStatus.pathSuffix
    session.shutdown()

The relevant parts of the "secure" topology look like this:

       <provider>
          <role>authentication</role>
          <name>HadoopAuth</name>
          <enabled>true</enabled>
          <param>
            <name>config.prefix</name>
            <value>hadoop.auth.config</value>
          </param>
          <param>
            <name>hadoop.auth.config.signature.secret</name>
            <value>some-secret</value>
          </param>
          <param>
            <name>hadoop.auth.config.type</name>
            <value>kerberos</value>
          </param>
          <param>
            <name>hadoop.auth.config.simple.anonymous.allowed</name>
            <value>false</value>
          </param>
          <param>
            <name>hadoop.auth.config.token.validity</name>
            <value>1800</value>
          </param>
          <param>
            <name>hadoop.auth.config.cookie.domain</name>
            <!-- Cookie domain for your site -->
            <value>your.site</value>
          </param>
          <param>
            <name>hadoop.auth.config.cookie.path</name>
            <!-- Topology path -->
            <value>gateway/secure</value>
          </param>
          <param>
            <name>hadoop.auth.config.kerberos.principal</name>
            <value>HTTP/your.site@EXAMPLE.COM</value>
          </param>
          <param>
            <name>hadoop.auth.config.kerberos.keytab</name>
            <value>/etc/security/keytabs/spnego.service.keytab</value>
          </param>
          <param>
            <name>hadoop.auth.config.kerberos.name.rules</name>
            <value>DEFAULT</value>
          </param>
       </provider>

Now run kinit and then the Groovy script.

Note on credential cache location: on macOS the default credential cache is in-memory, which means the credentials are held in memory and not written to disk. KnoxShell unfortunately does not have access to the in-memory cache, so pass the -c FILE:<cache location> option to kinit to use a file-based cache instead.

The following ticket cache location is specific to my machine; it may differ on yours.

kinit -c FILE:/tmp/krb5cc_502 admin/your.site@EXAMPLE.COM
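
To avoid hard-coding the cache path, it can be derived from the user id. This is a sketch that assumes the common /tmp/krb5cc_<uid> naming convention seen in the command above; the function name ccache_for_uid is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: build a file-based credential cache name for a user id.
# Defaults to the current user's id when no argument is given.
ccache_for_uid() {
  echo "FILE:/tmp/krb5cc_${1:-$(id -u)}"
}
```

It can then be used as kinit -c "$(ccache_for_uid)" admin/your.site@EXAMPLE.COM, and klist -c "$(ccache_for_uid)" should show the ticket that was obtained.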

Next, invoke the Groovy script using KnoxShell:

bin/knoxshell.sh samples/ExampleWebHdfsLs.groovy

If everything is set up properly, you should see the HDFS directory listing.
