Current state: In-progress
Discussion thread:
JIRA: KNOX-1006
There are a number of usability issues that are related to the current need to edit possibly large XML topology files within the Apache Knox Admin UI and within
Apache Ambari. Some of these issues are:
By providing a Service Discovery service in Knox, we can provide a more dynamic mechanism for generating, populating or realizing service details such as URL, HA enablement, etc.
By introducing a greatly simplified descriptor that can be combined with service discovery at deployment time, we can provide much simpler user experience in Ambari and the Knox Admin UI.
This KIP will propose an approach to achieving the above.
As some background information, in the early days of Apache Knox, we had the idea of a service registry as part of the gateway itself.
The idea was that as services spun up they would acquire a registrationCode and sign it for registration.
Service URLs could then be looked up for each "cluster" (topology) from this global registry.
There is an internal representation of this registry in Knox now and it is populated today by the deployment machinery - signed registration code and all.
The registry is serialized to the {GATEWAY_HOME}/data/security/registry file.
It is later deserialized on restart of the gateway in order to have the in-memory model needed by the gateway but since the deployment machinery doesn't run on restart for existing topologies we need to preserve the previous state.
There are a couple things that aren't great about this mechanism:
This whole thing may be able to be eliminated or replaced by Service Discovery or may need to continue to be supported with some improvements for deployments that do not have another registry like Ambari available.
We will need to make sure that we continue to meet the needs that this existing mechanism provides.
In order to simplify what management UI's need to manage, we should support a much simpler descriptor that contains what is needed to generate an actual topology file at deployment time (see item #1 from the Motivation section).
A simplified topology descriptor must allow the following details to be specified:
YAML offers more structure than a properties file, and is suitable for adminstrators/developers hand-editing simple descriptors.
# Discovery info source discovery-type: AMBARI discovery-address: http://sandbox.hortonworks.com:8080 discovery-user: maria_dev discovery-pwd-alias: ambari.discovery.password # Provider config reference, the contents of which will be # included in (or referenced from) the resulting topology descriptor. # The contents of this reference has a <gateway/> root, and # contains <provider/> configurations. provider-config-ref : sandbox-providers.xml # The cluster for which the service details should be discovered cluster: Sandbox # The services to declare in the resulting topology descriptor, # whose URLs will be discovered (unless a value is specified) services: - name: NAMENODE - name: JOBTRACKER - name: WEBHDFS - name: WEBHCAT - name: OOZIE - name: WEBHBASE - name: HIVE - name: RESOURCEMANAGER - name: KNOXSSO params: knoxsso.cookie.secure.only: true knoxsso.token.ttl: 100000 - name: AMBARI urls: - http://sandbox.hortonworks.com:8080 - name: AMBARIUI urls: - http://sandbox.hortonworks.com:8080 # UIs to be proxied through the resulting Knox topology (see KIP-9) #uis: # - name: AMBARIUI # url: http://sandbox.hortonworks.com:8080 |
While JSON is not really a format for configuration, it is certainly appropriate as a wire format, and will be used for API interactions.
{ "discovery-type":"AMBARI", "discovery-address":"http://sandbox.hortonworks.com:8080", "discovery-user":"maria_dev", "discovery-pwd-alias":"ambari.discovery.password", "provider-config-ref":"sandbox-providers.xml", "cluster":"Sandbox", "services":[ {"name":"NAMENODE"}, {"name":"JOBTRACKER"}, {"name":"WEBHDFS"}, {"name":"WEBHCAT"}, {"name":"OOZIE"}, {"name":"WEBHBASE"}, {"name":"HIVE"}, {"name":"RESOURCEMANAGER"}, {"name":"KNOXSSO", "params":{ "knoxsso.cookie.secure.only":"true", "knoxsso.token.ttl":"100000" } }, {"name":"AMBARI", "urls":["http://sandbox.hortonworks.com:8080"]} ], "uis":[ {"name":"AMBARIUI", "urls":["http://sandbox.hortonworks.com:8080"]} ] } |
Given that we will have a Service Discovery service that can integrate with Ambari as well as other sources of needed metadata, we should be able to start with a simplified topology descriptor.
Once the deployment machinery notices this descriptor, it can pull in the referenced provider configuration, iterate over each of the services, UIs, applications and lookup the details for each.
With the provider configuration and service details we can then generate a fully baked topology.
From the example descriptors in the Simplified Topology Descriptors section, the resulting topology should include the provider configuration from the reference, and each service and UI (another KIP) with its respective URLs:
<?xml version="1.0" encoding="UTF-8"?> <topology> <gateway> <provider> <role>authentication</role> <name>ShiroProvider</name> <enabled>true</enabled> <param> <!-- session timeout in minutes, this is really idle timeout, defaults to 30mins, if the property value is not defined,, current client authentication would expire if client idles contiuosly for more than this value --> <name>sessionTimeout</name> <value>30</value> </param> <param> <name>main.ldapRealm</name> <value>org.apache.hadoop.gateway.shirorealm.KnoxLdapRealm</value> </param> <param> <name>main.ldapContextFactory</name> <value>org.apache.hadoop.gateway.shirorealm.KnoxLdapContextFactory</value> </param> <param> <name>main.ldapRealm.contextFactory</name> <value>$ldapContextFactory</value> </param> <param> <name>main.ldapRealm.userDnTemplate</name> <value>uid={0},ou=people,dc=hadoop,dc=apache,dc=org</value> </param> <param> <name>main.ldapRealm.contextFactory.url</name> <value>ldap://localhost:33389</value> </param> <param> <name>main.ldapRealm.contextFactory.authenticationMechanism</name> <value>simple</value> </param> <param> <name>urls./**</name> <value>authcBasic</value> </param> </provider> <provider> <role>identity-assertion</role> <name>Default</name> <enabled>true</enabled> </provider> <!-- Defines rules for mapping host names internal to a Hadoop cluster to externally accessible host names. For example, a hadoop service running in AWS may return a response that includes URLs containing the some AWS internal host name. If the client needs to make a subsequent request to the host identified in those URLs they need to be mapped to external host names that the client Knox can use to connect. If the external hostname and internal host names are same turn of this provider by setting the value of enabled parameter as false. The name parameter specifies the external host names in a comma separated list. The value parameter specifies corresponding internal host names in a comma separated list. Note that when you are using Sandbox, the external hostname needs to be localhost, as seen in out of box sandbox.xml. This is because Sandbox uses port mapping to allow clients to connect to the Hadoop services using localhost. In real clusters, external host names would almost never be localhost. --> <provider> <role>hostmap</role> <name>static</name> <enabled>true</enabled> <param><name>localhost</name><value>sandbox,sandbox.hortonworks.com</value></param> </provider> </gateway> <service> <role>AMBARIUI</role> <url>http://c6401.ambari.apache.org:8080</url> </service> <service> <role>HIVE</role> <url>http://c6402.ambari.apache.org:10001/cliservice</url> </service> <service> <role>WEBHCAT</role> <url>http://c6402.ambari.apache.org:50111/templeton</url> </service> <service> <role>AMBARI</role> <url>http://c6401.ambari.apache.org:8080</url> </service> <service> <role>OOZIE</role> <url>http://c6402.ambari.apache.org:11000/oozie</url> </service> <service> <role>JOBTRACKER</role> <url>rpc://c6402.ambari.apache.org:8050</url> </service> <service> <role>NAMENODE</role> <url>hdfs://c6401.ambari.apache.org:8020</url> </service> <service> <role>WEBHBASE</role> <url>http://c6401.ambari.apache.org:60080</url> </service> <service> <role>WEBHDFS</role> <url>http://c6401.ambari.apache.org:50070/webhdfs</url> </service> <service> <role>RESOURCEMANAGER</role> <url>http://c6402.ambari.apache.org:8088/ws</url> </service> <service> <role>KNOXSSO</role> <param> <name>knoxsso.cookie.secure.only</name> <value>true</value> </param> <param> <name>knoxsso.token.ttl</name> <value>100000</value> </param> </service> </topology> |
We should also consider how we will discover simple descriptors and I think that we may want to have multiple ways.
Just as is currently done for topology files, Knox can monitor a local directory for new or changed descriptors, and trigger topology generation and deployment upon such events.
This is great for development and small cluster deployments.
The Knox Topology Service will monitor two additional directories:
3.1.2 Remote
For production and larger deployments, we need to be able to accommodate multiple instances of Knox better. One proposal for such cases is a ZooKeeper-based discovery mechanism.
All Knox instances will pick up the changes from ZK as the central source of truth, and perform the necessary generation and deployment of the corresponding topology.
The location of these descriptors and their dependencies (e.g., referenced provider config) in ZK must be defined.
It would also be helpful to provide a means (e.g., Ambari, Knox admin UI, CLI, etc...) by which these descriptors can be easily published to the correct location in a znode.
Item #2 from the Motivation section addresses the usability issues associated with having to manually edit a topology descriptor to specify the various service URLs in a cluster.
Beyond the obvious tedium associated with this task, there is real potential for human error, especially when multiple Knox instances are involved. The proposed solution to this
is automated service discovery, by which the URLs for the services to be exposed are discovered and added to a topology.
In order to transform a Simplified Topology Descriptor into a full topology, we need some source for the relevant details and metadata. Knox is often deployed in clusters
with Ambari and ZooKeeper also deployed, and both contain some level of the needed metadata about the service URLs wtihin the cluster.
Ambari provides everything we need to flesh out a full topology from a minimal description of what resources that we want exposed and how we want access to those resources protected.
The ZooKeeper service registry has promise, but we need to investigate the level of use from a registered services perspective and details available for each service.
These are some excerpts of responses from the Ambari REST API which can be employed for topology discovery:
(CLUSTER_NAME is a placeholder for an actual cluster name)
{ "href" : "http://c6401.ambari.apache.org:8080/api/v1/clusters/", "items" : [ { "href" : "http://c6401.ambari.apache.org:8080/api/v1/clusters/CLUSTER_NAME", "Clusters" : { "cluster_name" : "CLUSTER_NAME", "version" : "HDP-2.6" } } ] } |
“items” : [ { “href” : "http://AMBARI_ADDRESS/api/v1/CLUSTER_NAME/CLUSTER_NAME/services/HIVE", “components” : [ { “ServiceComponentInfo” : { }, “host_components” : [ { “HostRoles” : { “cluster_name” : “CLUSTER_NAME”, “component_name” : “HCAT”, “host_name” : “c6402.ambari.apache.org” } } ] }, { “ServiceComponentInfo” : { }, “host_components” : [ { “HostRoles” : { “cluster_name” : “CLUSTER_NAME”, “component_name” : “HIVE_SERVER”, “host_name” : “c6402.ambari.apache.org” } } ] } ] }, { “href” : "http://AMBARI_ADDRESS/api/v1/CLUSTER_NAME/CLUSTER_NAME/services/HDFS", “ServiceInfo” : {}, “components” : [ { “ServiceComponentInfo” : { }, “host_components” : [ { “HostRoles” : { “cluster_name” : “CLUSTER_NAME”, “component_name” : “NAMENODE”, “host_name” : “c6401.ambari.apache.org” } } ] } ] } ] |
"items" : [ { "href" : "http://AMBARI_ADDRESS/api/v1/clusters/CLUSTER_NAME/configurations/service_config_versions?service_name=HDFS&service_config_version=2", "cluster_name" : "CLUSTER_NAME", "configurations" : [ { "Config" : { "cluster_name" : "CLUSTER_NAME", "stack_id" : "HDP-2.6" }, "type" : "ssl-server", "properties" : { "ssl.server.keystore.location" : "/etc/security/serverKeys/keystore.jks", "ssl.server.keystore.password" : "SECRET:ssl-server:1:ssl.server.keystore.password", "ssl.server.keystore.type" : "jks", "ssl.server.truststore.location" : "/etc/security/serverKeys/all.jks", "ssl.server.truststore.password" : "SECRET:ssl-server:1:ssl.server.truststore.password" }, "properties_attributes" : { } }, { "Config" : { "cluster_name" : "CLUSTER_NAME", "stack_id" : "HDP-2.6" }, "type" : "hdfs-site", "tag" : "version1", "version" : 1, "properties" : { "dfs.cluster.administrators" : " hdfs", "dfs.encrypt.data.transfer.cipher.suites" : "AES/CTR/NoPadding", "dfs.hosts.exclude" : "/etc/hadoop/conf/dfs.exclude", "dfs.http.policy" : "HTTP_ONLY", "dfs.https.port" : "50470", "dfs.journalnode.http-address" : "0.0.0.0:8480", "dfs.journalnode.https-address" : "0.0.0.0:8481", "dfs.namenode.http-address" : "c6401.ambari.apache.org:50070", "dfs.namenode.https-address" : "c6401.ambari.apache.org:50470", "dfs.namenode.rpc-address" : "c6401.ambari.apache.org:8020", "dfs.namenode.secondary.http-address" : "c6402.ambari.apache.org:50090", "dfs.webhdfs.enabled" : "true" }, "properties_attributes" : { "final" : { "dfs.webhdfs.enabled" : "true", "dfs.namenode.http-address" : "true", "dfs.support.append" : "true", "dfs.namenode.name.dir" : "true", "dfs.datanode.failed.volumes.tolerated" : "true", "dfs.datanode.data.dir" : "true" } } } ] }, { "href" : "http://AMBARI_ADDRESS/api/v1/clusters/CLUSTER_NAME/configurations/service_config_versions?service_name=YARN&service_config_version=1", "cluster_name" : "CLUSTER_NAME", "configurations" : [ { "Config" : { "cluster_name" : "CLUSTER_NAME", "stack_id" : "HDP-2.6" }, "type" : "yarn-site", "properties" : { "yarn.http.policy" : "HTTP_ONLY", "yarn.log.server.url" : "http://c6402.ambari.apache.org:19888/jobhistory/logs", "yarn.log.server.web-service.url" : "http://c6402.ambari.apache.org:8188/ws/v1/applicationhistory", "yarn.nodemanager.address" : "0.0.0.0:45454", "yarn.resourcemanager.address" : "c6402.ambari.apache.org:8050", "yarn.resourcemanager.admin.address" : "c6402.ambari.apache.org:8141", "yarn.resourcemanager.ha.enabled" : "false", "yarn.resourcemanager.hostname" : "c6402.ambari.apache.org", "yarn.resourcemanager.webapp.address" : "c6402.ambari.apache.org:8088", "yarn.resourcemanager.webapp.delegation-token-auth-filter.enabled" : "false", "yarn.resourcemanager.webapp.https.address" : "c6402.ambari.apache.org:8090", }, "properties_attributes" : { } }, ] }, ] |
"items" : [ { "href" : "http://c6401.ambari.apache.org:8080/api/v1/clusters/CLUSTER_NAME/components/HCAT", "ServiceComponentInfo" : { "cluster_name" : "CLUSTER_NAME", "component_name" : "HCAT", "service_name" : "HIVE" } }, { "href" : "http://c6401.ambari.apache.org:8080/api/v1/clusters/CLUSTER_NAME/components/HIVE_SERVER", "ServiceComponentInfo" : { "cluster_name" : "CLUSTER_NAME", "component_name" : "HIVE_SERVER", "service_name" : "HIVE" } }, { "href" : "http://c6401.ambari.apache.org:8080/api/v1/clusters/CLUSTER_NAME/components/NAMENODE", "ServiceComponentInfo" : { "cluster_name" : "CLUSTER_NAME", "component_name" : "NAMENODE", "service_name" : "HDFS" } }, { "href" : "http://c6401.ambari.apache.org:8080/api/v1/clusters/CLUSTER_NAME/components/OOZIE_SERVER", "ServiceComponentInfo" : { "cluster_name" : "CLUSTER_NAME", "component_name" : "OOZIE_SERVER", "service_name" : "OOZIE" } } ] |
Since the service URLs for a cluster will be discovered, Knox has the opportunity to respond dynamically to subsequent topology changes. For a Knox topology that has been generated and deployed, it's possible that the URL for a given service could change at some point afterward.
The host name could change. The scheme and/or port could change (e.g., http --> https). The potential and frequency of such changes certainly varies among deployments.
We should consider providing the option for Knox to detect topology changes for a cluster, and respond by updating its corresponding topology.
For example, Ambari provides the ability to request the active configuration versions for all the service components in a cluster. There could be a thread that checks this set, notices one or more version changes, and initiates the re-generation/deployment of that topology.
Another associated benefit is the capability for Knox to interoperate with Ambari instances that are unaware of the Knox instance. Knox no longer MUST be managed by Ambari.
KIP-1 touched on an improvement for Centralized Gateway Configuration for LDAP/AD and the ability to import topology fragments into other topologies by referencing them by name within the topology.
While we haven't actually implemented anything along these lines yet, we are touching on something very similar in this proposal. We need to be able to author a set of Provider Configurations that can be referenced by name in the simple descriptor, that can be enumerated for inclusion in a dropdown for UIs that need to be able to select a Provider Configuration and a set of APIs added to the Admin API for management of the configuration with CRUD operations.
Whether we import the common Provider Configuration or pull in the contents of it into a generated topology can certainly be a design discussion. I think that importing at runtime would be best as we wouldn't have to regenerate all the topologies that included them if something is changed in the Provider Configuration. Though there would need to be some way to reimport the Provider Config at runtime if something changes.
The discovery service must be authenticated by at least some of the service registries (e.g., Ambari). We need to define the means by which credentials are configured for this service in Knox.
HTTP Basic authentication is supported by Ambari, so the Ambari service discovery implementation can leverage the Knox Alias service to obtain the necessary credentials at discovery time.
The username can be specified in a descriptor, using the discovery-user property. The default user name can also be mapped to the alias ambari.discovery.user.
Similarly, the password alias can be specified in a descriptor, using the discovery-pwd-alias property. By default, the Ambari service discovery will lookup the ambari.discovery.password alias to get the password.
So, there are two ways to specify the username for Ambari interaction:
And, two ways to specify the associated password: