JIRA: KNOX-1623
Introduction
KnoxShell Kerberos support should be available in Apache Knox 1.3.0. KnoxShell is an Apache Knox module that provides scripting support for talking to Apache Knox; more details on setting up KnoxShell can be found in this blog post. With Kerberos support we can now use cached tickets or keytabs to authenticate against a secure (Kerberos enabled) topology in Apache Knox. This blog demonstrates examples of how this can be achieved.
Prerequisites
- To get started, download and set up KnoxShell (setup instructions).
- Configure Apache Knox to use Hadoop Auth (setup instructions).
Make sure to test by sending a curl request through Knox:
curl -k -i --negotiate -u : "https://{knoxhost}:{knoxport}/gateway/sandbox/webhdfs/v1/tmp?op=LISTSTATUS"
Kerberos Authentication
Following are the methods that can be used to initialize a session in KnoxShell.
session = KnoxSession.kerberosLogin(url, jaasConfig, krb5Conf, debug)
Where:
- url is the gateway URL
- jaasConfig is the JAAS configuration file (optional)
- krb5Conf is the krb5 configuration file (optional)
- debug turns on debug statements (optional)
or
session = KnoxSession.kerberosLogin(url)
Where:
- url is the gateway URL
- The default jaasConfig is used, which looks for a cached ticket at an OS-specific path
- The krb5 conf file is looked up at its default location
- debug is false
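For instance, a minimal sketch of a session created with the explicit-argument form might look like the following. The file paths here are placeholders for illustration, not defaults shipped with KnoxShell; adjust them to wherever your JAAS and krb5 configuration files actually live.
// Hypothetical config file locations -- substitute your own paths
jaasConfig = "/etc/knoxshell/jaas.conf"
krb5Conf = "/etc/krb5.conf"
session = KnoxSession.kerberosLogin("https://gateway-site:8443/gateway/secure", jaasConfig, krb5Conf, true)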
Example
Following is an example Groovy script that talks to a Kerberos-enabled cluster fronted by Apache Knox using Hadoop Auth.
import groovy.json.JsonSlurper
import org.apache.knox.gateway.shell.KnoxSession
import org.apache.knox.gateway.shell.hdfs.Hdfs
import org.apache.knox.gateway.shell.Credentials

gateway = "https://gateway-site:8443/gateway/secure"

session = KnoxSession.kerberosLogin(gateway)
text = Hdfs.ls( session ).dir( "/" ).now().string
json = (new JsonSlurper()).parseText( text )
println json.FileStatuses.FileStatus.pathSuffix
session.shutdown()
Following is an example of the relevant parts of the "secure" topology.
<provider>
    <role>authentication</role>
    <name>HadoopAuth</name>
    <enabled>true</enabled>
    <param>
        <name>config.prefix</name>
        <value>hadoop.auth.config</value>
    </param>
    <param>
        <name>hadoop.auth.config.signature.secret</name>
        <value>some-secret</value>
    </param>
    <param>
        <name>hadoop.auth.config.type</name>
        <value>kerberos</value>
    </param>
    <param>
        <name>hadoop.auth.config.simple.anonymous.allowed</name>
        <value>false</value>
    </param>
    <param>
        <name>hadoop.auth.config.token.validity</name>
        <value>1800</value>
    </param>
    <param>
        <name>hadoop.auth.config.cookie.domain</name>
        <!-- Cookie domain for your site -->
        <value>your.site</value>
    </param>
    <param>
        <name>hadoop.auth.config.cookie.path</name>
        <!-- Topology path -->
        <value>gateway/secure</value>
    </param>
    <param>
        <name>hadoop.auth.config.kerberos.principal</name>
        <value>HTTP/your.site@EXAMPLE.COM</value>
    </param>
    <param>
        <name>hadoop.auth.config.kerberos.keytab</name>
        <value>/etc/security/keytabs/spnego.service.keytab</value>
    </param>
    <param>
        <name>hadoop.auth.config.kerberos.name.rules</name>
        <value>DEFAULT</value>
    </param>
</provider>
Now we kinit and then run the Groovy script.
Note on credential cache location: On macOS the credential cache is in-memory, which means the credentials are held in memory and not written to disk. KnoxShell unfortunately does not have access to the in-memory cache, so the -c FILE:<cache location> option should be used when doing a kinit.
The following ticket cache location is specific to my machine; it may or may not be the same in your case.
kinit -c FILE:/tmp/krb5cc_502 admin/your.site@EXAMPLE.COM
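If you want to point KnoxShell explicitly at this file-based cache instead of relying on the default jaasConfig, you can pass your own JAAS configuration file as the jaasConfig argument. A sketch of such a file is shown below; the entry name com.sun.security.jgss.initiate is the standard JGSS initiator entry and is an assumption here, and the ticketCache path simply matches the kinit command above.
com.sun.security.jgss.initiate {
    com.sun.security.auth.module.Krb5LoginModule required
        useTicketCache=true
        ticketCache="/tmp/krb5cc_502"
        doNotPrompt=true;
};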
Next we invoke the Groovy script using KnoxShell:
bin/knoxshell.sh samples/ExampleWebHdfsLs.groovy
If everything is set up properly you should see the HDFS ls output.
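As mentioned in the introduction, a keytab can be used instead of a cached ticket. A sketch of a JAAS configuration for keytab login is shown below; it again assumes the com.sun.security.jgss.initiate entry name, and the principal and keytab path are placeholders you would replace with your own.
com.sun.security.jgss.initiate {
    com.sun.security.auth.module.Krb5LoginModule required
        useKeyTab=true
        keyTab="/etc/security/keytabs/admin.headless.keytab"
        principal="admin/your.site@EXAMPLE.COM"
        storeKey=true
        doNotPrompt=true;
};
Pass the path to this file as the jaasConfig argument to KnoxSession.kerberosLogin and run the same Groovy script; no kinit should be needed in that case.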