
Status

Current state: In-progress

Discussion thread:

JIRA: KNOX-1006

Motivation

There are a number of usability issues related to the current need to edit potentially large XML topology files within the Apache Knox Admin UI and within Apache Ambari. Some of these issues are:

  1. The hierarchical structure of our provider configuration
  2. Having to specify the correct URLs of the services being proxied
  3. Redundantly configuring the same LDAP configuration across different topologies
  4. Redundantly configuring and picking up changes to cluster configuration, such as HA enablement for one or more proxied services
  5. Difficulty in providing topology management within Ambari, given Ambari's typical name/value-pair configuration style

By providing a Service Discovery service in Knox, we can provide a more dynamic mechanism for generating, populating, or realizing service details such as URLs, HA enablement, etc.

By introducing a greatly simplified descriptor that can be combined with service discovery at deployment time, we can provide a much simpler user experience in Ambari and the Knox Admin UI.

This KIP will propose an approach to achieving the above.

Service Discovery

1. Service Registry

As some background information, in the early days of Apache Knox, we had the idea of a service registry as part of the gateway itself.

The idea was that as services spun up they would acquire a registrationCode and sign it for registration.

Service URLs could then be looked up for each "cluster" (topology) from this global registry.

There is an internal representation of this registry in Knox now and it is populated today by the deployment machinery - signed registration code and all.

The registry is serialized to the {GATEWAY_HOME}/data/security/registry file.

It is later deserialized on restart of the gateway in order to rebuild the in-memory model needed by the gateway. Since the deployment machinery doesn't run on restart for existing topologies, we need to preserve the previous state in this file.

There are a couple things that aren't great about this mechanism:

  1. The fact that it is tied to a topology's initial deployment necessitates the serialized registry file. Removing this file would be a significant improvement, since it introduces backward compatibility challenges at times.
  2. The fact that this gateway service is a global namespace for topology/services makes management of the central registry more tedious than it would be if each topology had its own specific namespace that came and went with the topology itself.
  3. The whole signed registrationCode is not needed, since when Knox is acting as its own registry it doesn't really need authentication to be done for each registration.

This mechanism may be able to be eliminated or replaced by Service Discovery, or it may need to continue to be supported, with some improvements, for deployments that do not have another registry like Ambari available.

We will need to make sure that we continue to meet the needs that this existing mechanism provides.

2. Simplified Topology Descriptor

In order to simplify what management UIs need to manage, we should support a much simpler descriptor that contains what is needed to generate an actual topology file at deployment time (see item #1 from the Motivation section).

A simplified topology descriptor must allow the following details to be specified:

  • Service discovery type
    An identifier indicating which type of discovery to apply (e.g., Ambari)
  • Service discovery address
    The associated service registry address
  • A provider configuration reference (a unique name, filename, etc.)
    A unique name mapped to a set of provider configurations (see item #3 from the Motivation section)
  • A list of services to be exposed through Knox (with optional URL value)
  • A list of UIs to be proxied by Knox (per KIP-9)


YAML offers more structure than a properties file, and is suitable for administrators/developers hand-editing simple descriptors.

Proposed YAML
# Discovery info source
discovery-type: AMBARI
discovery-address: http://c6401.ambari.apache.org:8080
discovery-user: ambariuser
discovery-pwd-alias: ambari.discovery.password

# Provider config reference, the contents of which will be
# included in (or referenced from) the resulting topology descriptor.
# The contents of this reference has a <gateway/> root, and
# contains <provider/> configurations.
provider-config-ref: ambari-cluster-policy.xml

# The cluster for which the service details should be discovered
cluster: mycluster

# The services to declare in the resulting topology descriptor,
# whose URLs will be discovered (unless a value is specified)
services:
    - name: NAMENODE
    - name: JOBTRACKER
    - name: WEBHDFS
    - name: WEBHCAT
    - name: OOZIE
    - name: WEBHBASE
    - name: HIVE
    - name: RESOURCEMANAGER
    - name: AMBARI
      urls:
        - http://c6401.ambari.apache.org:8080
    - name: AMBARIUI
      urls:
        - http://c6401.ambari.apache.org:8080

# UIs to be proxied through the resulting Knox topology (see KIP-9)
#uis:
#   - name: AMBARIUI
#     url: http://c6401.ambari.apache.org:8080


While JSON is not really a format for configuration, it is certainly appropriate as a wire format, and will be used for API interactions.

Proposed JSON
{
  "discovery-type":"AMBARI",
  "discovery-address":"http://c6401.ambari.apache.org:8080",
  "discovery-user":"ambariuser",
  "discovery-pwd-alias":"ambari.discovery.password",
  "provider-config-ref":"ambari-cluster-policy.xml",
  "cluster":"mycluster",
  "services":[
     {"name":"NAMENODE"},
     {"name":"JOBTRACKER"},
     {"name":"WEBHDFS"},
     {"name":"WEBHCAT"},
     {"name":"OOZIE"},
     {"name":"WEBHBASE"},
     {"name":"HIVE"},
     {"name":"RESOURCEMANAGER"},
     {"name":"AMBARI", "urls":["http://c6401.ambari.apache.org:8080"]}
  ],
  "uis":[
     {"name":"AMBARIUI", "urls":["http://c6401.ambari.apache.org:8080"]}
  ]
} 


3. Topology Generation

Given that we will have a Service Discovery service that can integrate with Ambari as well as other sources of needed metadata, we should be able to start with a simplified topology descriptor.
Once the deployment machinery notices this descriptor, it can pull in the referenced provider configuration, iterate over each of the services, UIs, and applications, and look up the details for each.
With the provider configuration and service details we can then generate a fully baked topology.

Let's assume a properties file for now:

Listing 1. sample.properties
provider.config=ProductionAD
service=NAMENODE,JOBTRACKER,WEBHDFS,WEBHCAT,OOZIE,WEBHBASE,HIVE,RESOURCEMANAGER,AMBARI
ui=AMBARIUI

The above properties file should result in the provider config being pulled in, and each service and UI (another KIP) being added with its appropriate URLs and details, yielding something like the following:

Listing 2. sample.xml Sample Topology
<?xml version="1.0" encoding="UTF-8"?>
<topology>
  <name>sample</name>
  <gateway>
    <provider>
      <role>authentication</role>
      <name>ShiroProvider</name>
      <enabled>true</enabled>
      <param>
        <name>sessionTimeout</name>
        <value>30</value>
      </param>
      <param>
        <name>main.ldapRealm</name>
        <value>org.apache.hadoop.gateway.shirorealm.KnoxLdapRealm</value>
      </param>
      <param>
        <name>main.ldapContextFactory</name>
        <value>org.apache.hadoop.gateway.shirorealm.KnoxLdapContextFactory</value>
      </param>
      <param>
        <name>main.ldapRealm.contextFactory</name>
        <value>$ldapContextFactory</value>
      </param>
      <param>
        <name>main.ldapRealm.userDnTemplate</name>
        <value>uid={0},ou=people,dc=hadoop,dc=apache,dc=org</value>
      </param>
      <param>
        <name>main.ldapRealm.contextFactory.url</name>
        <value>ldap://localhost:33389</value>
      </param>
      <param>
        <name>main.ldapRealm.contextFactory.authenticationMechanism</name>
        <value>simple</value>
      </param>
      <param>
        <name>urls./**</name>
        <value>authcBasic</value>
      </param>
    </provider>
    <provider>
      <role>identity-assertion</role>
      <name>Default</name>
      <enabled>true</enabled>
    </provider>
    <provider>
      <role>hostmap</role>
      <name>static</name>
      <enabled>true</enabled>
    </provider>
  </gateway>
  <service>
    <role>NAMENODE</role>
    <url>hdfs://localhost:8020</url>
  </service>
  <service>
    <role>JOBTRACKER</role>
    <url>rpc://localhost:8050</url>
  </service>
  <service>
    <role>WEBHDFS</role>
    <url>http://c6401.ambari.apache.org:50070/webhdfs</url>
  </service>
  <service>
    <role>WEBHCAT</role>
    <url>http://c6401.ambari.apache.org:50111/templeton</url>
  </service>
  <service>
    <role>OOZIE</role>
    <url>http://localhost:11000/oozie</url>
  </service>
  <service>
    <role>WEBHBASE</role>
    <url>http://localhost:60080</url>
  </service>
  <service>
    <role>HIVE</role>
    <url>http://localhost:10001/cliservice</url>
  </service>
  <service>
    <role>RESOURCEMANAGER</role>
    <url>http://localhost:8088/ws</url>
  </service>
  <service>
    <role>AMBARI</role>
    <url>http://c6401.ambari.apache.org:8080</url>
  </service>
  <ui>
    <role>AMBARIUI</role>
    <url>http://c6401.ambari.apache.org:8080</url>
  </ui>
</topology>
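
To make the generation step concrete, here is a minimal sketch of the descriptor-to-topology transformation. All of the types here (ServiceDiscovery, ProviderConfigStore, etc.) are illustrative assumptions, not actual Knox classes; the real machinery would of course emit the full structure shown in Listing 2 and handle UIs, applications, and HA details.

Listing 3. TopologyGenerator.java (illustrative sketch)
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: these types stand in for the deployment
// machinery described above; they are not actual Knox classes.
public class TopologyGenerator {

    // Source of discovered cluster metadata (e.g., an Ambari-backed implementation)
    public interface ServiceDiscovery {
        Map<String, List<String>> discover(String address, String clusterName);
    }

    // Resolves a provider-config-ref to the gateway/provider content it names
    public interface ProviderConfigStore {
        String resolve(String providerConfigRef);
    }

    private final ServiceDiscovery discovery;
    private final ProviderConfigStore providerConfigs;

    public TopologyGenerator(ServiceDiscovery discovery, ProviderConfigStore providerConfigs) {
        this.discovery = discovery;
        this.providerConfigs = providerConfigs;
    }

    // Combine the simple descriptor's declared services with discovered
    // service URLs to produce a full topology document.
    public String generate(String address, String cluster, String providerConfigRef,
                           Map<String, List<String>> declaredServices) {
        Map<String, List<String>> discovered = discovery.discover(address, cluster);
        StringBuilder xml = new StringBuilder("<topology>\n");
        // Pull in the referenced provider configuration (e.g., ambari-cluster-policy.xml)
        xml.append(providerConfigs.resolve(providerConfigRef)).append('\n');
        declaredServices.forEach((name, urls) -> {
            // Prefer URLs declared in the descriptor; fall back to discovered ones
            List<String> resolved = urls.isEmpty()
                    ? discovered.getOrDefault(name, Collections.emptyList()) : urls;
            for (String url : resolved) {
                xml.append("  <service><role>").append(name)
                   .append("</role><url>").append(url).append("</url></service>\n");
            }
        });
        return xml.append("</topology>").toString();
    }
}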

3.1 Simple Descriptor Discovery

We should also consider how simple descriptors will be discovered, and I think we may want to support multiple mechanisms.

3.1.1 Local

Just as is currently done for topology files, Knox can monitor a local directory for new or changed descriptors, and trigger topology generation and deployment upon such events.
This is great for development and small cluster deployments.
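
As a rough illustration only (Knox's actual monitoring machinery may differ), a JDK WatchService loop over an assumed conf/descriptors directory might look like this:

Listing 4. DescriptorMonitor.java (illustrative sketch)
import java.io.IOException;
import java.nio.file.*;

// Minimal sketch: watch a descriptors directory and trigger generation
// on create/modify events. The directory location is an assumption.
public class DescriptorMonitor {

    public static void main(String[] args) throws IOException, InterruptedException {
        Path dir = Paths.get(args.length > 0 ? args[0] : "conf/descriptors");
        WatchService watcher = FileSystems.getDefault().newWatchService();
        dir.register(watcher, StandardWatchEventKinds.ENTRY_CREATE,
                              StandardWatchEventKinds.ENTRY_MODIFY);

        while (true) {
            WatchKey key = watcher.take();  // blocks until an event arrives
            for (WatchEvent<?> event : key.pollEvents()) {
                Path changed = dir.resolve((Path) event.context());
                if (changed.toString().endsWith(".yml") || changed.toString().endsWith(".json")) {
                    // Here the deployment machinery would regenerate and redeploy
                    // the topology corresponding to this descriptor.
                    System.out.println("Descriptor changed: " + changed);
                }
            }
            key.reset();
        }
    }
}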


3.1.2 Remote

For production and larger deployments, we need to better accommodate multiple instances of Knox. My proposal for such cases is a ZooKeeper-based discovery mechanism. All Knox instances will then pick up changes from ZooKeeper, as the central source of truth, and perform the necessary generation and deployment of the corresponding topology.

The location of these descriptors and their dependencies (e.g., referenced provider config) must be defined.

It would also be helpful to provide a means by which these descriptors can be easily published to the correct location in a znode.
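
A minimal sketch of such a mechanism using Apache Curator follows; the /knox/descriptors znode path and the ensemble addresses are assumptions rather than defined conventions:

Listing 5. RemoteDescriptorMonitor.java (illustrative sketch)
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.cache.PathChildrenCache;
import org.apache.curator.retry.ExponentialBackoffRetry;

// Sketch of one possible ZooKeeper-based descriptor discovery using Curator.
public class RemoteDescriptorMonitor {

    public static void main(String[] args) throws Exception {
        CuratorFramework client = CuratorFrameworkFactory.newClient(
                "zk1:2181,zk2:2181,zk3:2181",           // example ensemble
                new ExponentialBackoffRetry(1000, 3));
        client.start();

        // Cache the children of the descriptors znode and watch for changes.
        PathChildrenCache cache = new PathChildrenCache(client, "/knox/descriptors", true);
        cache.getListenable().addListener((c, event) -> {
            switch (event.getType()) {
                case CHILD_ADDED:
                case CHILD_UPDATED:
                    // Each Knox instance would regenerate/redeploy the topology
                    // from the descriptor content stored in the znode.
                    byte[] descriptor = event.getData().getData();
                    System.out.println("Descriptor changed: " + event.getData().getPath()
                            + " (" + descriptor.length + " bytes)");
                    break;
                case CHILD_REMOVED:
                    System.out.println("Descriptor removed: " + event.getData().getPath());
                    break;
                default:
                    break;
            }
        });
        cache.start();
        Thread.sleep(Long.MAX_VALUE);  // keep watching
    }
}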

4. Service Discovery

Item #2 from the Motivation section addresses the usability issues associated with having to manually edit a topology descriptor to specify the various service URLs in a cluster.
Beyond the obvious tedium associated with this task, there is real potential for human error, especially when multiple Knox instances are involved. The proposed solution to this
is automated service discovery, by which the URLs for the services to be exposed are discovered and added to a topology.

In order to transform a simplified topology descriptor into a full topology, we need some source for the relevant details and metadata. Knox is often deployed in clusters with Ambari and ZooKeeper also deployed, and both contain some level of the needed metadata about the service URLs within the cluster.

Ambari provides everything we need to flesh out a full topology from a minimal description of which resources we want exposed and how we want access to those resources protected.

The ZooKeeper service registry has promise, but we need to investigate the level of use from a registered-services perspective, as well as the details available for each service.
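
One plausible, purely hypothetical shape for a pluggable discovery abstraction is sketched below; the actual interface is a design decision, not settled here:

Listing 6. ServiceDiscovery.java (hypothetical shape)
import java.util.List;
import java.util.Map;

// Hypothetical shape for a pluggable service discovery abstraction.
public interface ServiceDiscovery {

    // Identifier matched against the descriptor's discovery-type (e.g., "AMBARI")
    String getType();

    // Query the discovery source at the given discovery-address for the named
    // cluster, returning the discovered URLs for each known service role
    // (e.g., "WEBHDFS" -> ["http://c6401.ambari.apache.org:50070/webhdfs"]).
    Map<String, List<String>> discoverServiceUrls(String address, String cluster);
}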


4.1 Apache Ambari API

These are some excerpts of responses from the Ambari REST API which can be employed for topology discovery:

(CLUSTER_NAME is a placeholder for an actual cluster name)

Identifying Clusters - /api/v1/clusters
{
  "href" : "http://c6401.ambari.apache.org:8080/api/v1/clusters/",
  "items" : [
    {
      "href" : "http://c6401.ambari.apache.org:8080/api/v1/clusters/CLUSTER_NAME",
      "Clusters" : {
        "cluster_name" : "CLUSTER_NAME",
        "version" : "HDP-2.6"
      }
    }
  ]
}
Service Component To Host Mapping - /api/v1/clusters/CLUSTER_NAME/services?fields=components/host_components/HostRoles
"items" : [
  {
    "href" : "http://AMBARI_ADDRESS/api/v1/clusters/CLUSTER_NAME/services/HIVE",
    "components" : [
      {
        "ServiceComponentInfo" : { },
        "host_components" : [
          {
            "HostRoles" : {
              "cluster_name" : "CLUSTER_NAME",
              "component_name" : "HCAT",
              "host_name" : "c6402.ambari.apache.org"
            }
          }
        ]
      },
      {
        "ServiceComponentInfo" : { },
        "host_components" : [
          {
            "HostRoles" : {
              "cluster_name" : "CLUSTER_NAME",
              "component_name" : "HIVE_SERVER",
              "host_name" : "c6402.ambari.apache.org"
            }
          }
        ]
      }
    ]
  },
  {
    "href" : "http://AMBARI_ADDRESS/api/v1/clusters/CLUSTER_NAME/services/HDFS",
    "ServiceInfo" : { },
    "components" : [
      {
        "ServiceComponentInfo" : { },
        "host_components" : [
          {
            "HostRoles" : {
              "cluster_name" : "CLUSTER_NAME",
              "component_name" : "NAMENODE",
              "host_name" : "c6401.ambari.apache.org"
            }
          }
        ]
      }
    ]
  }
]
Service Configuration Details (Active Versions) - /api/v1/clusters/CLUSTER_NAME/configurations/service_config_versions?is_current=true
"items" : [
  {
      "href" : "http://AMBARI_ADDRESS/api/v1/clusters/CLUSTER_NAME/configurations/service_config_versions?service_name=HDFS&service_config_version=2",
      "cluster_name" : "CLUSTER_NAME",
      "configurations" : [
        {
          "Config" : {
            "cluster_name" : "CLUSTER_NAME",
            "stack_id" : "HDP-2.6"
          },
          "type" : "ssl-server",
          "properties" : {
            "ssl.server.keystore.location" : "/etc/security/serverKeys/keystore.jks",
            "ssl.server.keystore.password" : "SECRET:ssl-server:1:ssl.server.keystore.password",
            "ssl.server.keystore.type" : "jks",
            "ssl.server.truststore.location" : "/etc/security/serverKeys/all.jks",
            "ssl.server.truststore.password" : "SECRET:ssl-server:1:ssl.server.truststore.password"
          },
          "properties_attributes" : { }
        },
        {
          "Config" : {
            "cluster_name" : "CLUSTER_NAME",
            "stack_id" : "HDP-2.6"
          },
          "type" : "hdfs-site",
          "tag" : "version1",
          "version" : 1,
          "properties" : {
            "dfs.cluster.administrators" : " hdfs",
            "dfs.encrypt.data.transfer.cipher.suites" : "AES/CTR/NoPadding",
            "dfs.hosts.exclude" : "/etc/hadoop/conf/dfs.exclude",
            "dfs.http.policy" : "HTTP_ONLY",
            "dfs.https.port" : "50470",
            "dfs.journalnode.http-address" : "0.0.0.0:8480",
            "dfs.journalnode.https-address" : "0.0.0.0:8481",
            "dfs.namenode.http-address" : "c6401.ambari.apache.org:50070",
            "dfs.namenode.https-address" : "c6401.ambari.apache.org:50470",
            "dfs.namenode.rpc-address" : "c6401.ambari.apache.org:8020",
            "dfs.namenode.secondary.http-address" : "c6402.ambari.apache.org:50090",
            "dfs.webhdfs.enabled" : "true"
          },
          "properties_attributes" : {
            "final" : {
              "dfs.webhdfs.enabled" : "true",
              "dfs.namenode.http-address" : "true",
              "dfs.support.append" : "true",
              "dfs.namenode.name.dir" : "true",
              "dfs.datanode.failed.volumes.tolerated" : "true",
              "dfs.datanode.data.dir" : "true"
            }
          }
        }
      ]
  },
  {
      "href" : "http://AMBARI_ADDRESS/api/v1/clusters/CLUSTER_NAME/configurations/service_config_versions?service_name=YARN&service_config_version=1",
      "cluster_name" : "CLUSTER_NAME",
      "configurations" : [
        {
          "Config" : {
            "cluster_name" : "CLUSTER_NAME",
            "stack_id" : "HDP-2.6"
          },
          "type" : "yarn-site",
          "properties" : {
            "yarn.http.policy" : "HTTP_ONLY",
            "yarn.log.server.url" : "http://c6402.ambari.apache.org:19888/jobhistory/logs",
            "yarn.log.server.web-service.url" : "http://c6402.ambari.apache.org:8188/ws/v1/applicationhistory",
            "yarn.nodemanager.address" : "0.0.0.0:45454",
            "yarn.resourcemanager.address" : "c6402.ambari.apache.org:8050",
            "yarn.resourcemanager.admin.address" : "c6402.ambari.apache.org:8141",
            "yarn.resourcemanager.ha.enabled" : "false",
            "yarn.resourcemanager.hostname" : "c6402.ambari.apache.org",
            "yarn.resourcemanager.webapp.address" : "c6402.ambari.apache.org:8088",
            "yarn.resourcemanager.webapp.delegation-token-auth-filter.enabled" : "false",
            "yarn.resourcemanager.webapp.https.address" : "c6402.ambari.apache.org:8090",
		  },
          "properties_attributes" : { }
        },
	  ]
	},
]
Cluster Service Components - /api/v1/clusters/CLUSTER_NAME/components
"items" : [
  {
    "href" : "http://c6401.ambari.apache.org:8080/api/v1/clusters/CLUSTER_NAME/components/HCAT",
    "ServiceComponentInfo" : {
      "cluster_name" : "CLUSTER_NAME",
      "component_name" : "HCAT",
      "service_name" : "HIVE"
    }
  },
  {
    "href" : "http://c6401.ambari.apache.org:8080/api/v1/clusters/CLUSTER_NAME/components/HIVE_SERVER",
    "ServiceComponentInfo" : {
      "cluster_name" : "CLUSTER_NAME",
      "component_name" : "HIVE_SERVER",
      "service_name" : "HIVE"
    }
  },
  {
    "href" : "http://c6401.ambari.apache.org:8080/api/v1/clusters/CLUSTER_NAME/components/NAMENODE",
    "ServiceComponentInfo" : {
      "cluster_name" : "CLUSTER_NAME",
      "component_name" : "NAMENODE",
      "service_name" : "HDFS"
    }
  },
  {
    "href" : "http://c6401.ambari.apache.org:8080/api/v1/clusters/CLUSTER_NAME/components/OOZIE_SERVER",
    "ServiceComponentInfo" : {
      "cluster_name" : "CLUSTER_NAME",
      "component_name" : "OOZIE_SERVER",
      "service_name" : "OOZIE"
    }
  }
]
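
As an example of how a discovery implementation might consume this API, the following sketch issues an authenticated request for the active service configurations. The host, cluster name, and credentials are placeholders; in practice the password would be resolved from the discovery-pwd-alias rather than hard-coded:

Listing 7. AmbariConfigFetcher.java (illustrative sketch)
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Base64;

// Sketch of fetching the active service configurations from Ambari, from
// which service URLs (e.g., WEBHDFS from dfs.namenode.http-address) can
// be derived.
public class AmbariConfigFetcher {

    public static void main(String[] args) throws Exception {
        String ambari = "http://c6401.ambari.apache.org:8080";
        String cluster = "mycluster";
        String auth = Base64.getEncoder()
                .encodeToString("ambariuser:password".getBytes());  // alias-resolved in practice

        HttpRequest request = HttpRequest.newBuilder(URI.create(
                ambari + "/api/v1/clusters/" + cluster
                       + "/configurations/service_config_versions?is_current=true"))
                .header("Authorization", "Basic " + auth)
                .GET()
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());

        // A real implementation would parse the JSON and map properties such as
        // hdfs-site's dfs.namenode.http-address to the WEBHDFS service URL,
        // e.g., http://c6401.ambari.apache.org:50070/webhdfs
        System.out.println(response.body());
    }
}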


4.2 ZooKeeper Service Registry

  • TODO: provide some sense of completeness, APIs and examples

4.3 Topology Change Discovery

Since the service URLs for a cluster will be discovered, Knox has the opportunity to respond dynamically to subsequent topology changes. For a Knox topology that has been generated and deployed, it's possible that the URL for a given service could change at some point afterward.

The host name could change. The scheme and/or port could change (e.g., http --> https). The potential and frequency of such changes certainly varies among deployments.

We should consider providing the option for Knox to detect topology changes for a cluster, and respond by updating its corresponding topology.

For example, Ambari provides the ability to request the active configuration versions for all the service components in a cluster. There could be a thread that checks this set, notices one or more version changes, and initiates the re-generation/deployment of that topology.
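
A minimal sketch of such a monitor follows; ConfigVersionSource is a hypothetical wrapper around the Ambari request shown earlier, and the 60-second interval is an arbitrary choice:

Listing 8. TopologyChangeMonitor.java (illustrative sketch)
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Sketch of a polling thread that compares active config versions between
// checks and triggers regeneration when they differ.
public class TopologyChangeMonitor {

    // Hypothetical wrapper around the Ambari service_config_versions request
    public interface ConfigVersionSource {
        // service name -> current service_config_version for the cluster
        Map<String, Integer> currentVersions(String cluster);
    }

    private final ConfigVersionSource source;
    private volatile Map<String, Integer> lastSeen;

    public TopologyChangeMonitor(ConfigVersionSource source) {
        this.source = source;
    }

    public void start(String cluster, Runnable regenerate) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> {
            Map<String, Integer> current = source.currentVersions(cluster);
            if (lastSeen != null && !Objects.equals(lastSeen, current)) {
                regenerate.run();  // re-generate and redeploy the affected topology
            }
            lastSeen = current;
        }, 0, 60, TimeUnit.SECONDS);  // polling interval is an assumption
    }
}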

5. Provider Configurations

KIP-1 touched on an improvement for Centralized Gateway Configuration for LDAP/AD and the ability to import topology fragments into other topologies by referencing them by name within the topology.

While we haven't actually implemented anything along those lines yet, we are touching on something very similar in this proposal. We need to be able to author a set of provider configurations that can be referenced by name in the simple descriptor, that can be enumerated for inclusion in a dropdown for UIs that need to select a provider configuration, and that can be managed through CRUD operations added to the Admin API.

Whether we import the common provider configuration at runtime or pull its contents into a generated topology is certainly open for design discussion. I think that importing at runtime would be best, as we wouldn't have to regenerate all the topologies that include a provider configuration when it changes; however, there would need to be some way to re-import the provider config at runtime when a change occurs.
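
As a sketch of what those Admin API additions might look like (the paths, media types, and resource class here are purely illustrative, not a committed design):

Listing 9. ProviderConfigResource.java (hypothetical Admin API shape)
import javax.ws.rs.*;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;

// Hypothetical shape of Admin API CRUD endpoints for named provider
// configurations.
@Path("/api/v1/providerconfig")
public class ProviderConfigResource {

    @GET
    @Produces(MediaType.APPLICATION_JSON)
    public Response list() {                       // enumerate names for UI dropdowns
        return Response.ok(/* names */).build();
    }

    @GET @Path("/{name}")
    @Produces(MediaType.APPLICATION_XML)
    public Response get(@PathParam("name") String name) {
        return Response.ok(/* provider configuration content */).build();
    }

    @PUT @Path("/{name}")
    @Consumes(MediaType.APPLICATION_XML)
    public Response createOrUpdate(@PathParam("name") String name, String gatewayXml) {
        return Response.noContent().build();
    }

    @DELETE @Path("/{name}")
    public Response delete(@PathParam("name") String name) {
        return Response.noContent().build();
    }
}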

6. Discovery Service Authentication

The discovery service will need to authenticate to at least some of the service registries (e.g., Ambari). We need to define the means by which credentials are configured for this service in Knox.
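
One natural approach is to reuse Knox's existing alias mechanism so that only an alias, such as the ambari.discovery.password alias from the example descriptor, ever appears in configuration. A minimal sketch, assuming the existing AliasService:

Listing 10. DiscoveryCredentials.java (illustrative sketch)
import org.apache.hadoop.gateway.services.security.AliasService;

// Sketch: resolve the discovery credentials via Knox's alias mechanism,
// so the password itself never appears in a descriptor.
public class DiscoveryCredentials {

    private final AliasService aliasService;

    public DiscoveryCredentials(AliasService aliasService) {
        this.aliasService = aliasService;
    }

    public char[] passwordFor(String pwdAlias) throws Exception {
        // Look the alias up in the gateway-level credential store
        return aliasService.getPasswordFromAliasForGateway(pwdAlias);
    }
}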

