SAML v2 for Existing Hadoop Web Applications

Introduction

Apache Knox with KnoxSSO and the pac4j provider enables a number of new authentication and SSO solutions for accessing and developing KnoxSSO-enabled applications, including Ambari, Ranger, Hadoop UIs, and custom-built applications that use REST APIs through Knox. These capabilities will be available in the Knox 0.8.0 release.


This paper illustrates the integration of the Okta identity service by leveraging the SAML 2 capabilities of the pac4j provider in Apache Knox. The same flow described below would be available for Ranger, Hadoop UIs, or any other application participating in KnoxSSO.

Okta

From their site (https://www.okta.com/):

Authentication you can trust

Behind the scenes, Okta enables SSO in one of two ways:

SAML

Security Assertion Markup Language (SAML) is a trusted format for exchanging authentication data. Okta enables single sign-on into SAML-enabled apps by brokering the information transfer between users and service providers.

Tutorial

This paper illustrates the use of Okta’s SSO with SAML option for an existing Hadoop ecosystem application called Apache Ambari. The following figure shows how Ambari only ever needs to be aware of KnoxSSO. The underlying authentication mechanism is isolated from the participating application. In this case, KnoxSSO negotiates an authentication with Okta via the SAML 2 protocol.


Note that the above figure includes 3.b for Hadoop UIs. This option is not covered in this paper and requires each of the UIs to be configured properly for SSO as well. Again, each of them only needs to be configured for KnoxSSO, and the underlying authentication mechanism is isolated from them.


Prerequisites

  1. Follow the Ambari Vagrant Quick Start guide (https://cwiki.apache.org/confluence/display/AMBARI/Quick+Start+Guide) to create a three-node cluster with CentOS 6.4 using Ambari 2.2 or greater

  2. Unzip the Apache Knox v0.8.0 release candidate to the {AMBARI_VAGRANT_HOME}/centos-6.4 directory, which is a shared volume available inside the Vagrant machine as /vagrant

  3. vagrant ssh into c6401

  4. Stop the Apache Knox instance that is already running (if there is one)

  5. Deploy the knoxsso.xml topology file from this document into your local Knox instance by copying the contents of the sample knoxsso.xml into a new {GATEWAY_HOME}/conf/topologies/knoxsso.xml file.

  6. Change the knoxsso.cookie.secure.only param in knoxsso.xml to false. Ambari does not have SSL enabled by default and if we set the cookie to secure only it will not be presented to Ambari by the browser.  NOTE: THIS IS INSECURE AND ONLY USED FOR TESTING

  7. Start your v0.8.0 version of Knox via:  /usr/jdk64/jdk1.8.0_60/bin/java -jar bin/gateway.jar

  8. Configure Ambari for SSO with KnoxSSO through the SSO Wizard via the ambari-server CLI

    1. Get the gateway-identity public key from Apache Knox {GATEWAY_HOME}/data/security/keystores/gateway.jks via keytool or portecle (see Extracting Knox Public Key for SAML section for details)

    2. Get the SSO provider URL for the KnoxSSO websso endpoint

    3. su to root {pw: vagrant}

    4. Start the SSO wizard:

[root@c6401 knox-0.8.0]# ambari-server setup-sso
Using python  /usr/bin/python2
Setting up SSO authentication properties...
Do you want to configure SSO authentication [y/n] (y)?y
 
Provider URL [URL]: https://c6401.ambari.apache.org:8443/gateway/knoxsso/api/v1/websso
 
Public Certificate pem (stored) (empty line to finish input):
MIICOjCCAaOgAwIBAgIJANjgCshp4cP2MA0GCSqGSIb3DQEBBQUAMF8xCzAJBgNV
BAYTAlVTMQ0wCwYDVQQIEwRUZXN0MQ0wCwYDVQQHEwRUZXN0MQ8wDQYDVQQKEwZI
YWRvb3AxDTALBgNVBAsTBFRlc3QxEjAQBgNVBAMTCWxvY2FsaG9zdDAeFw0xNjAx
MzAxNDUwMTNaFw0xNzAxMjkxNDUwMTNaMF8xCzAJBgNVBAYTAlVTMQ0wCwYDVQQI
EwRUZXN0MQ0wCwYDVQQHEwRUZXN0MQ8wDQYDVQQKEwZIYWRvb3AxDTALBgNVBAsT
BFRlc3QxEjAQBgNVBAMTCWxvY2FsaG9zdDCBnzANBgkqhkiG9w0BAQEFAAOBjQAw
gYkCgYEAicKDXg/OXk1B1ttylj0PMvNpZc4cMagX6P2OEryzLQpMMagYVbpbL0zU
D3B3M86gEFIFmgebJ95v8EydB5g4CIbF3CmDORK77Fy265xLXv06bhVqjvU1Q+zg
gwu8YeH9ZQcgfCaDKG3Corb8mu3W9JBNYkdqrAxMsIeLvu4ASVUCAwEAATANBgkq
hkiG9w0BAQUFAAOBgQA0xEgGW1Ho6DXMYVJuyxQYyN/0/NWKZ2Ysrx7MwZd3pJBT
XVN7K9jnr4qFrw4ok2yHWQsziUjPHpXdZqGZcKf6+ATjoUIXV+UiX+Nj0R2ov/JQ
fDfhTSQIXakBKK3z/3q5/iQSFdsqIgL/Ce7iM9GsFesMhJnyk9aXb8ZROx4hvQ==
 
Do you want to configure advanced properties [y/n] (n) ?n
Ambari Server 'setup-sso' completed successfully.
  9. Restart the Ambari server:

[root@c6401 knox-0.8.0]# ambari-server restart

Extracting Knox Public Key for SAML IdP Configuration

There are multiple ways that you can do this.

The following will use keytool to extract the gateway-identity cert to PEM encoding:

keytool -exportcert -alias gateway-identity -keystore data/security/keystores/gateway.jks -storepass {knoxsecret} -rfc -file gateway.pem

 

For the Ambari SSO wizard the content between

-----BEGIN CERTIFICATE----- and -----END CERTIFICATE-----

must be provided when requested. This certificate is used by the Ambari KnoxSSO integration point to verify the SSO tokens issued by KnoxSSO.
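Pulling the certificate body out of the exported file by hand is error-prone, so a small script can help. The following is a minimal sketch in Python, not part of Knox or Ambari; the function name pem_body and the tiny sample payload are made up for illustration. It returns only the base64 lines between the BEGIN and END markers, which is what the wizard asks for.

```python
import base64

BEGIN = "-----BEGIN CERTIFICATE-----"
END = "-----END CERTIFICATE-----"


def pem_body(pem_text: str) -> str:
    """Return the base64 lines between the first BEGIN/END CERTIFICATE markers."""
    start = pem_text.index(BEGIN) + len(BEGIN)
    end = pem_text.index(END, start)
    lines = [line.strip() for line in pem_text[start:end].splitlines()]
    body = "\n".join(line for line in lines if line)
    # Fail early if the extracted text is not valid base64.
    base64.b64decode("".join(body.split()), validate=True)
    return body


# Illustrative input only: the payload below is NOT a real certificate.
sample = BEGIN + "\nQUJD\nREVG\n" + END + "\n"
print(pem_body(sample))
```

To use it with the file produced by the keytool command, read gateway.pem into a string and pass it to pem_body, then paste the result into the wizard prompt.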

Portecle

The free Portecle tool is great for extracting PEM encoded certs when the process doesn’t need to be automated. It is also useful when keytool refuses to accept a password shorter than 6 characters, even though the keystore already exists with such a password.


 

Apache Ambari

To demonstrate the integration between KnoxSSO and Okta for existing KnoxSSO aware Hadoop applications, Ambari will be used. This demonstrates Ambari’s ability to acquire and validate KnoxSSO tokens/cookies as a means to authenticate to its management capabilities and custom views.


Once logged in through KnoxSSO the resulting hadoop-jwt cookie is used to create an Ambari session. Apache Ambari only knows that it is relying on KnoxSSO and nothing about the underlying SSO provider (in this case Okta).
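To see what a participating application receives, the claims inside a JWT can be inspected without any Knox tooling. The sketch below assumes the hadoop-jwt cookie value is a compact JWT of the form header.payload.signature; it only decodes the payload and does not verify the signature, which is the job of the application using the Knox public key. The sample token and its claims are fabricated for illustration and do not show the exact claims Knox issues.

```python
import base64
import json


def decode_jwt_payload(token: str) -> dict:
    """Decode (without verifying) the claims segment of a compact JWT."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected header.payload.signature")
    payload = parts[1]
    # base64url segments in a JWT omit padding; restore it before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def _b64url(obj: dict) -> str:
    """Helper for building the fabricated sample token below."""
    raw = json.dumps(obj).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


# Fabricated token for illustration; the signature segment is a placeholder.
sample = ".".join(
    [_b64url({"alg": "RS256"}), _b64url({"sub": "guest"}), "signature-placeholder"]
)
print(decode_jwt_payload(sample))
```

Never treat decoded claims as trusted on their own; this is only a debugging aid for looking at a cookie copied from the browser.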

Test Integration with Okta


1. Open Apache Ambari in a browser at http://c6401.ambari.apache.org:8080 - you will initially be presented with the Ambari login page but will quickly be redirected to the Okta login.



2. When presented with a login form, fill it out with these credentials (guest/Gu3stp@assword) and submit it to the Okta server. This will result in a SAML protocol POST to the callback URL for KnoxSSO.  The SAML assertion will be processed via the pac4j provider and the authenticated identity normalized into a Java Subject. The successful authentication continues the processing through the provider's chain and the identity assertion provider must use appropriate principal mapping to establish the effective username. The effective username is what the KnoxSSO service will put into the JWT token to be presented as a Cookie to all participating applications.

3. After a brief signing-in page you should be redirected back to Ambari. If you are interested, you can find the hadoop-jwt cookie using Chrome’s developer tools. It should be a session cookie set as HttpOnly and (normally) Secure; in this test setup it is not Secure, because the cookie is configured not to be secure only. The service parameter knoxsso.cookie.secure.only for the KnoxSSO service in the knoxsso.xml topology controls the secure only setting of the cookie.
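The principal mapping mentioned in step 2 can be pictured as a simple lookup. The following is a simplified sketch of how a value such as guest@example.com=guest; could be interpreted, assuming semicolon-separated name=value entries and assuming that unmapped principals pass through unchanged. It is not the Knox implementation, and both function names are hypothetical.

```python
def parse_principal_mapping(value: str) -> dict:
    """Parse 'source=target;' entries into a dict (simplified sketch)."""
    mapping = {}
    for entry in value.split(";"):
        entry = entry.strip()
        if not entry:
            continue  # tolerate the trailing semicolon
        source, sep, target = entry.partition("=")
        if not sep:
            raise ValueError(f"malformed mapping entry: {entry!r}")
        mapping[source.strip()] = target.strip()
    return mapping


def effective_username(principal: str, mapping: dict) -> str:
    """Return the mapped name, or the principal itself when no entry matches.

    The pass-through behaviour is an assumption of this sketch.
    """
    return mapping.get(principal, principal)


mapping = parse_principal_mapping("guest@example.com=guest;")
print(effective_username("guest@example.com", mapping))
```

With the mapping from the sample topology, the identity asserted by Okta for guest@example.com becomes guest, which is the username that ends up in the token presented to Ambari.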



Note how Ambari accepts a successful authentication even for users who do not already exist in Ambari. The user is added to the Ambari database and assigned minimum privileges. As you can see above, the authentication of guest was successful and the user has been granted rights to their custom views, of which there are none.


An existing user with normal privileges would now have access to all of the Ambari capabilities and views for which they are permitted.

Topologies

The contents of these topology files can be copied into your {GATEWAY_HOME}/conf/topologies directory.

knoxsso.xml

The knoxsso.xml topology describes the manner in which a client acquires a KnoxSSO websso cookie/token. The pac4j federation provider allows the integration of a number of authentication solutions. In this case, the SAML 2 capability is being leveraged to integrate the cloud-based Okta identity service.


<topology>
    <gateway>
      <provider>
          <role>federation</role>
          <name>pac4j</name>
          <enabled>true</enabled>
          <param>
            <name>pac4j.callbackUrl</name>
            <value>https://c6401.ambari.apache.org:8443/gateway/knoxsso/api/v1/websso</value>
          </param>
 
          <param>
            <name>clientName</name>
            <value>SAML2Client</value>
          </param>
 
          <param>
            <name>saml.identityProviderMetadataPath</name>
            <value>https://dev-122415.oktapreview.com/app/exk5quib9pnb5hW5S0h7/sso/saml/metadata</value>
          </param>
 
          <param>
            <name>saml.serviceProviderMetadataPath</name>
            <value>/tmp/sp-metadata.xml</value>
          </param>
 
          <param>
            <name>saml.serviceProviderEntityId</name>
            <value>https://c6401.ambari.apache.org:8443/gateway/knoxsso/api/v1/websso?pac4jCallback=true&amp;client_name=SAML2Client</value>
          </param>
      </provider>
      <provider>
          <role>identity-assertion</role>
          <name>Default</name>
          <enabled>true</enabled>
          <param>
            <name>principal.mapping</name>
            <value>guest@example.com=guest;</value>
          </param>
      </provider>
    </gateway>
 
    <service>
        <role>KNOXSSO</role>
        <param>
          <name>knoxsso.cookie.secure.only</name>
          <value>false</value>
        </param>
        <param>
          <name>knoxsso.token.ttl</name>
          <value>100000</value>
        </param>
        <param>
          <name>knoxsso.redirect.whitelist.regex</name>
          <value>^https?:\/\/(c6401\.ambari\.apache\.org|localhost|127\.0\.0\.1|0:0:0:0:0:0:0:1|::1):[0-9].*$</value>
        </param>
    </service>
</topology>
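The knoxsso.redirect.whitelist.regex value is easy to get subtly wrong, so it can be worth trying candidate redirect URLs against it before deploying the topology. The sketch below uses Python's re module as a stand-in; Knox itself is a Java application, and the assumption here is that this particular pattern behaves the same in both regex dialects. The function name is_redirect_allowed is hypothetical.

```python
import re

# Pattern copied verbatim from the knoxsso.xml topology above.
WHITELIST = re.compile(
    r"^https?:\/\/(c6401\.ambari\.apache\.org|localhost|127\.0\.0\.1"
    r"|0:0:0:0:0:0:0:1|::1):[0-9].*$"
)


def is_redirect_allowed(original_url: str) -> bool:
    """Return True when the URL matches the whitelist pattern."""
    return WHITELIST.match(original_url) is not None


for url in (
    "http://c6401.ambari.apache.org:8080/",
    "https://localhost:8443/gateway",
    "http://c6401.ambari.apache.org/",  # no explicit port
    "http://evil.example.com:8080/",
):
    print(url, is_redirect_allowed(url))
```

Note that the pattern requires an explicit port after the host, so a URL for the whitelisted host without a port does not match.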