(This article is work in progress)
Apache Knox has always supported LDAP-based authentication through the Apache Shiro authentication provider, which makes configuration easier and more flexible. However, the KnoxLdapRealm has a number of limitations (KNOX-536). For instance, only a single Organizational Unit (OU) is currently supported: group lookup will not return groups that are defined within the tree structure below that single OU. Also, group memberships that are defined indirectly, through membership in a group that is itself a member of another group, are not resolved. Apache Knox 0.10.0 introduced the ability to leverage the Linux PAM authentication mechanism: KNOX-537 added a KnoxPAMRealm to the Shiro provider for PAM support. This blog post discusses how to set up LDAP authentication using the new PAM support in Knox together with the Linux SSSD daemon, and covers some of the advantages and key features of SSSD.
Some of the advantages of this approach are:
- Support for nested OUs and nested groups
- Support for more complex LDAP queries
- Reduced load on the LDAP/AD server (caching by SSSD)
Three scenarios were tested:
- Nested groups
- Nested OUs
- Using Multiple Search Bases
The following diagram represents the nested group structure used for testing.
In the above diagram, OU=data contains groups nested two levels deep. The user 'jerry' is an explicit member of only the innermost group, datascience-b, but implicitly belongs to all the groups that nest it (i.e. datascience-a and datascience).
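To make the idea of implicit membership concrete, here is a minimal sketch of how nested memberships can be resolved by walking "member of" links. It only illustrates the concept and is not how SSSD is implemented. The `DIRECT_MEMBERSHIPS` table, the `resolve_groups` function and the `max_nesting` parameter are hypothetical names invented for this example, and the nesting order is an assumption based on the description above.

```python
from collections import deque

# Hypothetical "member of" links mirroring the test structure described above.
# Each key is a user or group; each value lists the groups it belongs to directly.
DIRECT_MEMBERSHIPS = {
    "jerry": ["datascience-b"],
    "datascience-b": ["datascience-a"],
    "datascience-a": ["datascience"],
}


def resolve_groups(member, memberships, max_nesting=5):
    """Return all groups for `member`, following at most `max_nesting` nesting levels."""
    found = []
    seen = set()
    # Direct groups sit at nesting level 0.
    queue = deque((group, 0) for group in memberships.get(member, []))
    while queue:
        group, level = queue.popleft()
        if group in seen:
            continue  # protects against cycles and duplicate paths
        seen.add(group)
        found.append(group)
        if level < max_nesting:
            for parent in memberships.get(group, []):
                queue.append((parent, level + 1))
    return found


print(resolve_groups("jerry", DIRECT_MEMBERSHIPS))
print(resolve_groups("jerry", DIRECT_MEMBERSHIPS, max_nesting=0))
```

With nesting followed, jerry resolves to all three groups. With `max_nesting=0` only the explicit group datascience-b is returned, which is the behaviour you get from a lookup that does not resolve nested groups.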
When SSSD is properly configured (as explained later in the post) we get the following results
When we access a resource secured by Knox as the user jerry, all the groups that jerry belongs to are logged in gateway-audit.log (part of Knox logging).
Following diagram shows the nested OU structure used for testing
In this example we can see that the user kim is part of group 'processors' which is part of OU processing which is part of OU data which in turn is part of OU groups.
Following is the output of the 'id' command. Here we can see that our user kim and the group that user belongs to are retrieved correctly.
Similarly, when we try to access a resource secured by Knox using the user kim we get the following entry in gateway-audit.log (part of Knox logging)
This demonstrates that Knox can authenticate and retrieve groups against nested OUs.
Using Multiple Search Bases
Following diagram shows nested parallel OUs (processing and processing-2)
In this test we will configure two different search bases
sssd.conf settings (relevant) for this test are as follows:
To check whether SSSD correctly picks up our users we use the id command
Similarly, when we access a resource secured by Knox as the users kim and jon, we get the following entries in gateway-audit.log (part of Knox logging)
Also, if you remove the 'processing2' service from the sssd.conf file and restart sssd, user 'jon' will no longer be found, but 'kim' can still be found:
Thanks to Eric Yang for pointing out this scenario.
Following diagram shows a high level set-up of the components involved.
Following are the component versions for this test
- OpenLDAP - 2.4.40
- SSSD - 1.14.1
- Apache Knox - 0.10.0
To support nested groups, LDAP needs to support the RFC 2307bis schema. SSSD also requires a secure connection to LDAP. Acquire a copy of the public CA certificate for the certificate authority used to sign the LDAP server certificate. You can test the certificate using the following openssl test command:
SSSD is stricter than pam_ldap. In order to perform an authentication, SSSD requires that the communication channel be encrypted. This means that if sssd.conf has ldap_uri = ldap://<server>, it will attempt to encrypt the communication channel with TLS (Transport Layer Security). If sssd.conf has ldap_uri = ldaps://<server>, then SSL will be used instead of TLS. This requires that the LDAP server:
- Supports TLS or SSL
- Has TLS access enabled on the standard LDAP port (389) (or an alternate port, if specified in the ldap_uri), or has SSL access enabled on the standard LDAPS port (636) (or an alternate port).
- Has a valid certificate trust (can be relaxed by using ldap_tls_reqcert = never, but it is a security risk and should ONLY be done for development and demos)
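As a rough way to check these requirements from the machine that will run SSSD, the sketch below opens a verified TLS connection to the LDAPS port using only the Python standard library. It covers the ldaps:// case only and does not exercise StartTLS on the plain LDAP port. The function names are made up for this example, and the host name and CA file shown in the comment are placeholders, not values from the test setup.

```python
import socket
import ssl


def build_ldaps_context(ca_file=None):
    """Create a TLS context that requires a trusted, hostname-matching certificate."""
    # With ca_file=None the system trust store is used instead of a specific CA file.
    context = ssl.create_default_context(cafile=ca_file)
    # create_default_context already enables both checks; restated here for clarity.
    context.check_hostname = True
    context.verify_mode = ssl.CERT_REQUIRED
    return context


def check_ldaps(host, port=636, ca_file=None, timeout=5.0):
    """Connect to the LDAPS port and return the negotiated TLS version string."""
    context = build_ldaps_context(ca_file)
    with socket.create_connection((host, port), timeout=timeout) as raw_sock:
        with context.wrap_socket(raw_sock, server_hostname=host) as tls_sock:
            return tls_sock.version()


# Example call (placeholder host and CA file, adjust to your environment):
# print(check_ldaps("ldap.example.com", ca_file="/etc/openldap/certs/ca.pem"))
```

If the certificate is not trusted or does not match the host name, `wrap_socket` raises `ssl.SSLCertVerificationError`, which points to the same kind of trust problem that would cause SSSD to refuse the connection.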
Copy the public CA certs needed to talk to LDAP to /etc/openldap/certs.
To configure SSSD you can use the following 'authconfig' command:
After the command executes you can see that the sssd.conf file has been updated.
An example of the sssd.conf file:
The important settings to note are:
- ldap_schema = rfc2307bis - Needed if all groups are to be returned when using nested groups or primary/secondary groups.
- ldap_tls_cacertdir = /etc/openldap/certs - certs to talk to LDAP server
- ldap_id_use_start_tls = True - Secure communication with LDAP
- ldap_group_nesting_level = 5 - Enable group nesting up-to 5 levels
NOTE: You might need to add or change some options in the sssd.conf file to suit your needs, such as the debug level. After updating, restart the service and the changes should take effect.
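The important settings listed above can be checked mechanically. The sketch below parses an sssd.conf-style text with Python's `configparser` and reports any of the four settings that are missing or different. The sample text is a hypothetical excerpt: the domain name `default` and the server name `ldap.example.com` are assumptions, and a real file will contain more options (search base, bind credentials and so on).

```python
import configparser

# Hypothetical excerpt of an sssd.conf; domain and server names are assumptions.
SAMPLE_SSSD_CONF = """
[sssd]
services = nss, pam
domains = default

[domain/default]
id_provider = ldap
auth_provider = ldap
ldap_uri = ldap://ldap.example.com
ldap_schema = rfc2307bis
ldap_tls_cacertdir = /etc/openldap/certs
ldap_id_use_start_tls = True
ldap_group_nesting_level = 5
"""

# The settings called out as important in this post.
EXPECTED = {
    "ldap_schema": "rfc2307bis",
    "ldap_tls_cacertdir": "/etc/openldap/certs",
    "ldap_id_use_start_tls": "True",
    "ldap_group_nesting_level": "5",
}


def find_mismatches(conf_text, section="domain/default", expected=EXPECTED):
    """Return {option: actual_value} for options that are missing or differ."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    mismatches = {}
    for key, wanted in expected.items():
        # fallback=None also covers the case where the section itself is absent.
        actual = parser.get(section, key, fallback=None)
        if actual is None or actual.strip().lower() != wanted.lower():
            mismatches[key] = actual
    return mismatches


print(find_mismatches(SAMPLE_SSSD_CONF))
```

An empty dictionary means all four settings are present with the values used in this post. To check a real file, read its contents (the file is normally readable only by root) and pass the text and your own domain section name to `find_mismatches`.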
Some additional settings that can be used to control caching of credentials by SSSD are:
| Setting | Type | Description |
|---|---|---|
| cache_credentials | Boolean | Optional. Specifies whether to store user credentials in the local SSSD domain database cache. The default value for this parameter is |
| entry_cache_timeout | integer | Optional. Specifies how long, in seconds, SSSD should cache positive cache hits. A positive cache hit is a successful query. |
Test SSSD configuration
To check whether SSSD is configured correctly you can use the standard 'getent' or 'id' commands:
Using the above commands you should be able to see all the groups that <ldap_user> belongs to. If you do not see the secondary groups check the 'ldap_group_nesting_level = 5' option and adjust it accordingly.
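The same lookup can be scripted. Python's `pwd`, `grp` and `os.getgrouplist` go through the system's name service, so on a machine where SSSD is wired into NSS they should return the same users and groups that 'id' reports. This is a small sketch rather than a replacement for the commands above, and `groups_for` is a helper name made up for this example.

```python
import grp
import os
import pwd


def groups_for(username):
    """Return the sorted group names the system resolves for `username`."""
    user = pwd.getpwnam(username)  # raises KeyError if the user cannot be resolved
    names = []
    for gid in os.getgrouplist(username, user.pw_gid):
        try:
            names.append(grp.getgrgid(gid).gr_name)
        except KeyError:
            names.append(str(gid))  # group id that has no resolvable name
    return sorted(names)


# "root" is used only because it exists on most systems; substitute your LDAP user.
print(groups_for("root"))
```

If a user that exists in LDAP raises `KeyError` here, the problem is in the SSSD/NSS configuration rather than in Knox.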
Setting up Knox is relatively easy: install Knox on the same machine as SSSD and update the topology to use PAM-based authentication.
For more information and explanation on setting up Knox, see the PAM Based Authentication section in the Knox User Guide.
For nested group membership SSSD and LDAP should use rfc2307bis schema
SSSD requires SSL/TLS to talk to LDAP
Apache KNOX provides a single gateway to many services in your Hadoop cluster. You can leverage the KNOX shell DSL interface to interact with services such as WebHDFS, WebHCat (Templeton), Oozie, HBase, etc. For example, using Groovy and the DSL, submitting Hive queries via WebHCat (Templeton) is as simple as:
submitSqoop Job API
As of Apache KNOX 0.10.0, you can write applications using the KNOX DSL for Apache SQOOP and easily submit SQOOP jobs. The WebHCat Job class in the DSL now supports submitSqoop() as follows:
The submitSqoop request takes the following arguments:
- command (String) - The Sqoop command string to execute.
- files (String) - Comma-separated files to be copied to the Templeton controller job.
- optionsfile (String) - The remote file containing the Sqoop command to run.
- libdir (String) - The remote directory containing the JDBC jar to include with the Sqoop lib.
- statusDir (String) - The remote directory to store status output.
The request returns the jobId in its response.
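For readers who want to see what such a request might look like at the HTTP level, the sketch below assembles the arguments above into a form-encoded body. It assumes that the DSL call ends up as a POST to the WebHCat `sqoop` resource behind the gateway and that the form fields are named as shown in the code; both are assumptions to verify against your Knox and WebHCat versions. `build_sqoop_form` is a hypothetical helper, not part of the Knox DSL, and the Sqoop command and status directory in the example call are placeholders.

```python
from urllib.parse import urlencode


def build_sqoop_form(command=None, optionsfile=None, files=None,
                     libdir=None, status_dir=None):
    """Build a form-encoded body from the submitSqoop arguments, skipping unset ones."""
    # Field names are assumed to match the WebHCat form parameters.
    fields = {
        "command": command,
        "optionsfile": optionsfile,
        "files": files,
        "libdir": libdir,
        "statusdir": status_dir,
    }
    return urlencode({key: value for key, value in fields.items() if value is not None})


# Placeholder command and status directory, for illustration only.
body = build_sqoop_form(command="import --table scBlastTab",
                        status_dir="/tmp/sqoop.status")
print(body)
```

The resulting string would be sent as the body of an authenticated POST with content type `application/x-www-form-urlencoded`. The DSL takes care of this for you, which is the main reason to prefer it over hand-written HTTP calls.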
In this example we will run a simple Sqoop job to extract the scBlastTab table to HDFS from the public genome database (MySQL) at UCSC.
First, import the following packages:
Next, establish a connection to the KNOX gateway with Hadoop.login:
Define your SQOOP job (assuming SQOOP is already configured with the MySQL driver):
You can now submit the sqoop_command to the cluster with submitSqoop:
You can then check job status and output as usual:
Here is sample output of the above example against a Hadoop cluster. You need a properly configured Hadoop cluster with the Apache KNOX gateway, Apache Sqoop and WebHCat (Templeton). The test was run against a BigInsights Hadoop cluster.
From the output above you can see the job output as well as the content of the table directory on HDFS, which contains 5 parts (5 map tasks were used). WebHCat (Templeton) job console output will go to stderr in this case.
When compiling or running your code, ensure you have the following dependency: org.apache.knox:gateway-shell:0.10.0.