...

Move cluster-specific service HA configuration from the HaProvider configuration to the service declaration.
- Cluster-specific details belong logically with the service they describe.
- Moving cluster-specific details out of the provider configuration makes shared provider configurations applicable to more topologies (i.e., more reusable across clusters).
- The cluster-specific configuration could be discovered along with the service URL details, rather than having to be hard-coded in a provider configuration.
- The service-level configuration will be treated as an override, so the existing approach (i.e., complete configuration as HaProvider param values) continues to work, satisfying backward-compatibility requirements.
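
The override behavior described above could be sketched as follows. This is only an illustration, not Knox's implementation: the function names are hypothetical, and it assumes service-level params use the same key names as the components of the HaProvider param value (the name of the 'enabled' override is still an open item).

```python
def parse_ha_param(value):
    """Parse an HaProvider param value such as 'a=1;b=2' into a dict of strings."""
    result = {}
    for pair in value.split(";"):
        pair = pair.strip()
        if not pair:
            continue  # tolerate empty segments, e.g. a trailing ';'
        key, _, val = pair.partition("=")
        result[key.strip()] = val.strip()
    return result


def effective_ha_config(provider_value, service_params=None):
    """Treat provider-level values as defaults; service-level params override them."""
    config = parse_ha_param(provider_value)
    for key, val in (service_params or {}).items():
        config[key] = str(val)
    return config


# Hypothetical usage: the service declaration overrides one provider-level value.
provider_value = "maxFailoverAttempts=3;failoverSleep=1000;enabled=true"
print(effective_ha_config(provider_value, {"maxFailoverAttempts": 5}))
```

With no service-level params, the result is simply the parsed provider configuration, which is the backward-compatible case.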

Support a new enabled value, "auto", to allow service-specific configuration to determine whether HA treatment is enabled for that service.

Example HaProvider Configuration with enabled=auto

  <provider>
    <role>ha</role>
    <name>HaProvider</name>
    <enabled>true</enabled>
    <param>
      <name>WEBHDFS</name>
      <value>maxFailoverAttempts=3;failoverSleep=1000;maxRetryAttempts=300;retrySleep=1000;enabled=auto</value>
    </param>
  </provider>

In this case, the topology generation will reference the dfs.ha.automatic-failover.enabled property in the HDFS configuration to determine whether the HaProvider should be enabled for WEBHDFS.
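
A minimal sketch of how the "auto" value might be resolved during topology generation. The function name and signature are hypothetical, and treating a missing cluster property as "not enabled" is an assumption rather than something this proposal specifies.

```python
def resolve_enabled(enabled_value, cluster_properties, discovery_property):
    """Resolve an HaProvider 'enabled' value to a boolean.

    'true' and 'false' are taken literally; 'auto' defers to the named
    property in the discovered cluster configuration.
    """
    value = enabled_value.strip().lower()
    if value in ("true", "false"):
        return value == "true"
    if value == "auto":
        # Assumption: a missing property means HA is not enabled.
        discovered = cluster_properties.get(discovery_property, "false")
        return discovered.strip().lower() == "true"
    raise ValueError("Unsupported enabled value: " + enabled_value)


# Hypothetical usage for WEBHDFS, with the HDFS property named above.
hdfs_site = {"dfs.ha.automatic-failover.enabled": "true"}
print(resolve_enabled("auto", hdfs_site, "dfs.ha.automatic-failover.enabled"))
```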

Open Items

Choose a name for the service-level param that overrides the 'enabled' component of the service-specific HaProvider param. Candidates:
- 'enabled'
- 'haEnabled'
- 'ha.enabled'

Which services are currently purely topology-based (DefaultURLManager)?
- WEBHCAT
- OOZIE
- WEBHDFS
- ?
Which services are currently ZooKeeper-based (BaseZooKeeperURLManager)?
- HIVE (HS2ZooKeeperURLManager)
- HBASE (HBaseZooKeeperURLManager)
- KAFKA (KafkaZooKeeperURLManager)
- SOLR (SOLRZooKeeperURLManager)
- ATLAS (AtlasZookeeperURLManager)
Could ZooKeeper support be added for any services which do not currently support it?
- These have ZooKeeper-related configuration (see table):
  - WEBHDFS
  - OOZIE
  - YARN
  - WEBHCAT
  - RESOURCEMANAGER

For the ZooKeeper-based-HA services, determine if the ZooKeeper details are available from the service's configuration via Ambari.
Can "HA mode" be determined for every service type from the cluster configuration details? Can Knox dynamically identify HA-configured services, and generate the topology accordingly?
Determine how to leverage the cluster discovery data to generate the ZooKeeper HA configuration for the relevant declared topology services.
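
As one possible direction for the HIVE case, the sketch below derives service-level HA params from discovered hive-site properties. It assumes the discovery data exposes hive-site as a flat key/value map; the function name is hypothetical, and 'haEnabled' is only one of the candidate param names listed above. The property names used are standard HiveServer2 settings.

```python
def hive_zookeeper_ha_params(hive_site):
    """Derive ZooKeeper HA params for HIVE from discovered hive-site properties.

    Returns an empty dict when dynamic service discovery is off or the
    required ZooKeeper details are missing.
    """
    dynamic = hive_site.get("hive.server2.support.dynamic.service.discovery", "false")
    if dynamic.strip().lower() != "true":
        return {}
    ensemble = hive_site.get("hive.zookeeper.quorum")
    namespace = hive_site.get("hive.server2.zookeeper.namespace")
    if not ensemble or not namespace:
        return {}
    return {
        "haEnabled": "true",  # assumed name; see the open item on naming
        "zookeeperEnsemble": ensemble,
        "zookeeperNamespace": namespace,
    }


# Hypothetical discovered configuration.
hive_site = {
    "hive.server2.support.dynamic.service.discovery": "true",
    "hive.zookeeper.quorum": "host1:2181,host2:2181,host3:2181",
    "hive.server2.zookeeper.namespace": "hiveserver2",
}
print(hive_zookeeper_ha_params(hive_site))
```

Whether equivalent properties exist for the other ZooKeeper-based services is part of what these open items need to establish.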

...

Proposed Simple Descriptor Service Params - YAML

---
discovery-address: http://localhost:8080
discovery-user: maria_dev
provider-config-ref: sandbox-providers
cluster: Sandbox

services:
    - name: NAMENODE
    - name: JOBTRACKER
    - name: WEBHDFS
      params:
        maxFailoverAttempts: 5
        maxRetryAttempts: 5
        failoverSleep: 1001
    - name: WEBHBASE
    - name: HIVE
      params:
        haEnabled: true
        maxFailoverAttempts: 5
        maxRetryAttempts: 5
        failoverSleep: 1001
        zookeeperNamespace: hiveserver2                                           # Optionally, omit this, and Knox could discover it
        zookeeperEnsemble: host1:2181,host2:2181,host3:2181                       # Optionally, omit this, and Knox could discover it
    - name: RESOURCEMANAGER

...

Proposed Simple Descriptor Service Params - JSON

{
  "discovery-address":"http://localhost:8080",
  "discovery-user":"maria_dev",
  "provider-config-ref":"sandbox-providers",
  "cluster":"Sandbox",
  "services":[
    {"name":"NAMENODE"},
    {"name":"JOBTRACKER"},
    {"name":"WEBHDFS",
       "params": {
        "maxFailoverAttempts": "5",
        "maxRetryAttempts": "5",
        "failoverSleep": "1001"
       }
    },
    {"name":"WEBHBASE"},
    {"name":"HIVE",
       "params": {
        "haEnabled": "true",
        "maxFailoverAttempts": "4",
        "maxRetryAttempts": "6",
        "failoverSleep": "5000",
        "zookeeperNamespace": "hiveserver2",
        "zookeeperEnsemble": "host1:2181,host2:2181,host3:2181"
       }
    },
    {"name":"RESOURCEMANAGER"}
  ]
}

Proposed Generated Topology XML

...
  <provider>
    <role>ha</role>
    <name>HaProvider</name>
    <enabled>true</enabled>
    <param>
      <name>WEBHDFS</name>
      <!-- No cluster-specific details here -->
      <value>maxFailoverAttempts=3;failoverSleep=1000;maxRetryAttempts=300;retrySleep=1000;enabled=true</value>
    </param>
    <param>
      <name>HIVE</name>
      <!-- No cluster-specific details here -->
      <value>maxFailoverAttempts=3;failoverSleep=1000;enabled=true</value>
    </param>
  </provider>
</gateway>

<service>
  <role>WEBHDFS</role>
  <url>http://host1:50070/webhdfs</url>
  <url>http://host2:50070/webhdfs</url>
</service>

<service>
  <role>HIVE</role>
  <!-- Cluster-specific details here -->
  <param>
    <name>zookeeperEnsemble</name>
    <value>host1:2181,host2:2181,host3:2181</value>
  </param>
  <param>
    <name>zookeeperNamespace</name>
    <value>hiveserver2</value>
  </param>
  <param>
    <name>haEnabled</name>
    <value>true</value>
  </param>



</service>
...
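
To illustrate how descriptor service params could map onto the generated topology, here is a minimal sketch that builds a `<service>` element like the HIVE one above. The function name is hypothetical, and this is not Knox's topology generator; it only shows the proposed shape of the output.

```python
import xml.etree.ElementTree as ET


def service_element(role, urls=(), params=None):
    """Build a topology <service> element with optional URLs and service-level params."""
    service = ET.Element("service")
    ET.SubElement(service, "role").text = role
    for url in urls:
        ET.SubElement(service, "url").text = url
    for name, value in (params or {}).items():
        param = ET.SubElement(service, "param")
        ET.SubElement(param, "name").text = name
        ET.SubElement(param, "value").text = str(value)
    return service


# Hypothetical usage: cluster-specific HA details carried by the service declaration.
hive = service_element(
    "HIVE",
    params={
        "zookeeperEnsemble": "host1:2181,host2:2181,host3:2181",
        "zookeeperNamespace": "hiveserver2",
        "haEnabled": "true",
    },
)
print(ET.tostring(hive, encoding="unicode"))
```

Params are emitted in the order given, so the same input always produces the same topology output.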
