Topology Policy Separation
Since the very beginning of the Knox design through the current release of 0.4.0, the topology file used to deploy cluster topologies has consisted of both policy enforcement "provider" definitions and service definitions.
There are a couple of problems with this approach.
- expected sources of topology information will not contain the information and configuration required for policy enforcement or provider selection
- the configuration of the providers within each topology is often redundant and can present a management issue when changes are required to deployed topologies
- the topology file ends up much more like a configuration file than a deployment descriptor
As Apache Knox matures, it needs to start providing management capabilities consisting of:
- Management APIs
- Console Applications/UIs and/or Ambari Views
- Centralized Policy Management
- Topology Discovery through Ambari, ZooKeeper or other registries
This document will discuss policy management details and how to separate policy from topology information and organize it within a policy store.
High-level Reusable Policy Files
Let's start with a highly readable policy file syntax that encompasses all of the pertinent semantics without requiring the low level details for enforcement.
JSON presents a good choice for this as it is very readable yet structured.
The following non-normative example demonstrates the:
- removal of the notion of "role" - role becomes the policy type
- removal of the notion of "enabled" - inclusion implies enabled
- removal of the low-level config details
- reference to the needed details
- ability to compose reusable policies with reusable config
Let's consider this the default topology policy:
```json
[
    { "name" : "shiro",     "config" : "basic-ldap-1" },
    { "name" : "kerberos",  "config" : "kdc-1" },
    { "name" : "AclsAuthz", "config" : "default-authz" },
    { "name" : "hostmap",   "config" : "sandbox" }
]
```
Low-level Reusable Configuration Files
The low-level details of LDAP bind, search-bind and acceptance of HTTP BASIC authentication are details that are required by the provider enforcing the declared policy and do not need to be seen or even understood by the topology policy author. All the author needs to know is which configuration to use for HTTP BASIC against a particular LDAP server instance.
We abstract these details away from policy authors by managing them as separate configuration files that a policy can reference by name.
The shiro configuration for BASIC authentication against LDAP with a simple bind is an example of one such config file.
One detail captured in such a file, for example, is the session timeout in minutes: this is really an idle timeout, it defaults to 30 minutes if the property value is not defined, and the current client authentication will expire if the client idles continuously for longer than this value.
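A sketch of what such a config file might contain, assuming a JSON form for consistency with the policy files; the parameter names are the Shiro parameters used by the current Knox ShiroProvider, while the file's wrapper structure and the `basic-ldap-1` naming are assumptions for illustration:

```json
{
    "name" : "basic-ldap-1",
    "provider" : "shiro",
    "params" : {
        "main.ldapRealm" : "org.apache.hadoop.gateway.shirorealm.KnoxLdapRealm",
        "main.ldapRealm.userDnTemplate" : "uid={0},ou=people,dc=hadoop,dc=apache,dc=org",
        "main.ldapRealm.contextFactory.url" : "ldap://localhost:33389",
        "main.ldapRealm.contextFactory.authenticationMechanism" : "simple",
        "urls./**" : "authcBasic",
        "sessionTimeout" : "30"
    }
}
```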
Policy Store Structure
The policy store can be simply structured as files within directories along the following lines:
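For example (the directory and file names here are illustrative, reusing the config names referenced by the default topology policy above):

```
policy-store/
    policies/
        default-topology.json
    configs/
        basic-ldap-1.json
        kdc-1.json
        default-authz.json
        sandbox.json
```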
The above illustrates the basic structure and ability to locate referenced configuration details from within a topology policy file.
Deployment Machinery Changes
In order to break apart the policy and service definitions, Knox will need to be able to bring them together at deployment time.
We can do a couple things here:
- introduce a new topology file that ends with .topo or some extension other than .xml
- upon discovering a new topo file the deployment machinery will resolve a referenced high level policy file or in the absence of a reference use "default-topology.json" as the policy
- it will then combine the two into the currently expected .xml file with providers and services in a single file
- everything will just work as it does now upon discovery of the .xml file
- change the topology parsing rules to dereference the referenced or implied policy and configuration files
- much more complex and adds obvious risk
- would essentially go right from a service definition file to gateway.xml as the enforceable policy
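To make the first approach concrete, a minimal sketch of the .xml file the deployment machinery would generate by merging the policy and configuration details with the service definitions; it follows the current Knox topology format, and the specific provider params and service URL are illustrative:

```xml
<topology>
    <gateway>
        <provider>
            <role>authentication</role>
            <name>ShiroProvider</name>
            <enabled>true</enabled>
            <param>
                <name>sessionTimeout</name>
                <value>30</value>
            </param>
        </provider>
    </gateway>
    <service>
        <role>WEBHDFS</role>
        <url>http://localhost:50070/webhdfs</url>
    </service>
</topology>
```

Note that `enabled` reappears here because the generated file must match the currently expected format, even though inclusion implies enabled in the high-level policy file.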
The ability to centrally manage these policy files for a cluster of Knox instances will require the use of ZooKeeper or some other synchronization across the instances.
Of course we could consider the use of an NFS mount or some other mechanism as well.
Changes to the policy or configuration files that are being used by deployed topologies will require redeployment of the topology file.
Keeping an index of which topologies use which policy and configuration files would allow redeployment to be automated when changes are made through the management APIs.
Manually changing the policy or configuration files will require manual restart of the topologies that are using them.