Enabling YARN ResourceManager High Availability
Note: The following steps are for development purposes only. Ambari 1.7.0 and above expose the ability to enable ResourceManager High Availability directly from the web client, and that feature should be used for real-world use cases.

Prerequisites

  • HDP 2.0+
  • YARN Installed
  • At least 2 hosts in the cluster, at least one of which does not already have ResourceManager installed. These steps assume there are 3 hosts in the cluster.

Assumptions

This table lists the placeholders in the various steps below.

 

Placeholder              Description                                              Example
ambari-server            The server running the Ambari web client                 c6401.ambari.apache.org:8080
cluster-name             The name of the cluster                                  cluster1
target-host              The host that will have the additional ResourceManager   c6403.ambari.apache.org
rm-host-active           The host that will run the active ResourceManager        c6402.ambari.apache.org
rm-host-standby          The host that will run the standby ResourceManager       c6403.ambari.apache.org
zk-host-1, zk-host-(n)   The ZooKeeper hosts configured on the cluster            c6401.ambari.apache.org:2181,
                                                                                  c6402.ambari.apache.org:2181,
                                                                                  c6403.ambari.apache.org:2181
yarn-cluster-name        The YARN cluster name                                    yarn_cluster
yarn-site-tag            The tag property of Ambari's yarn-site                   version1

Installing ResourceManager

  1. Stop all services except for HDFS. There are two methods that can be used to accomplish this:
    1. Using the Ambari web client to manually stop each service.
    2. Using the Ambari REST APIs directly. For this method, you can consult the documentation on starting and stopping services.

  2. Add a ResourceManager component to YARN on a host that does not already have ResourceManager installed. This will not actually install ResourceManager, but will set up the host component associations.

    curl -u admin:$PASSWORD -H "X-Requested-By: Ambari" -i -X POST -d '{"host_components" : [{"HostRoles":{"component_name":"RESOURCEMANAGER"}}] }' 'http://<ambari-server>/api/v1/clusters/<cluster-name>/hosts?Hosts/host_name=<target-host>'
  3. Install ResourceManager on the same <target-host> used in the previous step.

    curl -u admin:$PASSWORD -H "X-Requested-By: Ambari" -i -X PUT -d '{"RequestInfo":{"context":"Install ResourceManager","operation_level":{"level":"HOST_COMPONENT","cluster_name":"<cluster-name>","host_name":"<target-host>","service_name":"YARN"}},"Body":{"HostRoles":{"state":"INSTALLED"}}}' http://<ambari-server>/api/v1/clusters/<cluster-name>/hosts/<target-host>/host_components/RESOURCEMANAGER
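
Step 1 above points to the REST API for stopping services. As a rough sketch, assuming the usual Ambari convention that setting a service's desired state to INSTALLED stops it, the request could be wrapped as follows. The function names and the AMBARI_SERVER, CLUSTER_NAME, and PASSWORD environment variables are illustrative assumptions, not part of Ambari.

```shell
# Sketch only: stop a service by asking Ambari to move it to the INSTALLED state.
# AMBARI_SERVER, CLUSTER_NAME and PASSWORD are assumed environment variables
# standing in for <ambari-server>, <cluster-name> and the admin password.

# Print the URL of a service resource, e.g. service_url YARN
service_url() {
  printf 'http://%s/api/v1/clusters/%s/services/%s' \
    "$AMBARI_SERVER" "$CLUSTER_NAME" "$1"
}

# Print the JSON body for a state change: service_state_body <state> <context>
service_state_body() {
  printf '{"RequestInfo":{"context":"%s"},"Body":{"ServiceInfo":{"state":"%s"}}}' \
    "$2" "$1"
}

# Send the stop request for one service (needs a reachable Ambari server).
stop_service() {
  curl -u "admin:$PASSWORD" -H "X-Requested-By: Ambari" -i -X PUT \
    -d "$(service_state_body INSTALLED "Stop $1")" \
    "$(service_url "$1")"
}
```

For example, `stop_service YARN` would request a stop of YARN. Which services need stopping depends on what is installed in the cluster. Passing STARTED instead of INSTALLED to service_state_body gives the body for starting a service again.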

YARN Configuration

  1. The yarn-site configuration group (yarn-site.xml) must now be updated with the following properties to reflect the standby ResourceManager.

    Property                                                  Value
    yarn.resourcemanager.ha.enabled                           true
    yarn.resourcemanager.ha.rm-ids                            rm1,rm2
    yarn.resourcemanager.hostname.rm1                         <rm-host-active>
    yarn.resourcemanager.hostname.rm2                         <rm-host-standby>
    yarn.resourcemanager.recovery.enabled                     true
    yarn.resourcemanager.store.class                          org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore
    yarn.resourcemanager.zk-address                           <zk-host-1>, <zk-host-2>, <zk-host-3>
    yarn.resourcemanager.cluster-id                           <yarn-cluster-name>
    yarn.resourcemanager.ha.automatic-failover.zk-base-path   /yarn-leader-election

  2. There are two methods to update these configuration properties:
    1. /var/lib/ambari-server/resources/scripts/configs.sh
    2. REST API

  3. configs.sh

    $ ./configs.sh set <ambari-server> <cluster-name> yarn-site <param-name> <param-value>
     
    Example:
    $ ./configs.sh set c6401.ambari.apache.org cl1 yarn-site yarn.resourcemanager.ha.enabled true

    This command must be executed once for each property listed in the table above.
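
Because configs.sh sets one property per call, the nine calls can be driven from a loop. The following is a minimal sketch rather than a tested procedure: the default values are the examples from the placeholder table, and the variable names, the ha_properties and apply_ha_properties functions, and the DRY_RUN switch are assumptions introduced here for illustration.

```shell
# Sketch only: run configs.sh once per HA property from the table above.
# Defaults are the example values from the placeholder table; CONFIGS_SH and
# DRY_RUN are assumed helper variables introduced for this sketch.
AMBARI_HOST="${AMBARI_HOST:-c6401.ambari.apache.org}"
CLUSTER_NAME="${CLUSTER_NAME:-cluster1}"
RM_ACTIVE="${RM_ACTIVE:-c6402.ambari.apache.org}"
RM_STANDBY="${RM_STANDBY:-c6403.ambari.apache.org}"
ZK_HOSTS="${ZK_HOSTS:-c6401.ambari.apache.org:2181,c6402.ambari.apache.org:2181,c6403.ambari.apache.org:2181}"
YARN_CLUSTER="${YARN_CLUSTER:-yarn_cluster}"
CONFIGS_SH="${CONFIGS_SH:-/var/lib/ambari-server/resources/scripts/configs.sh}"

# Print one "name value" pair per line.
ha_properties() {
  cat <<EOF
yarn.resourcemanager.ha.enabled true
yarn.resourcemanager.ha.rm-ids rm1,rm2
yarn.resourcemanager.hostname.rm1 $RM_ACTIVE
yarn.resourcemanager.hostname.rm2 $RM_STANDBY
yarn.resourcemanager.recovery.enabled true
yarn.resourcemanager.store.class org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore
yarn.resourcemanager.zk-address $ZK_HOSTS
yarn.resourcemanager.cluster-id $YARN_CLUSTER
yarn.resourcemanager.ha.automatic-failover.zk-base-path /yarn-leader-election
EOF
}

# Apply every property; with DRY_RUN set, only print the commands.
apply_ha_properties() {
  ha_properties | while read -r name value; do
    ${DRY_RUN:+echo} "$CONFIGS_SH" set "$AMBARI_HOST" "$CLUSTER_NAME" \
      yarn-site "$name" "$value" </dev/null
  done
}
```

Running `DRY_RUN=1 apply_ha_properties` first prints the commands without executing them, which makes it easy to check the substituted host names before touching the cluster.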

     

  4. REST API

    1. Get the existing yarn-site configuration version and tag.

      GET http://<ambari-server>/api/v1/clusters/<cluster-name>?fields=Clusters/desired_configs
      ...
      {
        "href" : "http://<ambari-server>/api/v1/clusters/<cluster-name>?fields=Clusters/desired_configs",
        "Clusters" : {
          "cluster_name" : "c1",
          "version" : "HDP-2.1",
          "desired_configs" : {
            ...
            "yarn-site" : {
              "tag" : "version1",
              "user" : "_anonymous",
              "version" : 1
            },
            ...
          }
        }
      }
    2. Using the value from the tag property of yarn-site, request the current YARN configuration. In this example, <yarn-site-tag> would be "version1".

      GET http://<ambari-server>/api/v1/clusters/<cluster-name>/configurations?(type=yarn-site&tag=<yarn-site-tag>)
      ...
      {
        "href" : "http://<ambari-server>/api/v1/clusters/<cluster-name>/configurations?(type=yarn-site&tag=version1)",
        "items" : [
          {
            "href" : "http://<ambari-server>/api/v1/clusters/<cluster-name>/configurations?type=yarn-site&tag=version1",
            "tag" : "version1",
            "type" : "yarn-site",
            "version" : 1,
            "Config" : {
              "cluster_name" : "c1"
            },
            "properties" : {
              "yarn.acl.enable" : "false",
              "yarn.admin.acl" : "",
              ...
              "yarn.timeline-service.ttl-enable" : "true",
              "yarn.timeline-service.ttl-ms" : "2678400000",
              "yarn.timeline-service.webapp.address" : "c6402.ambari.apache.org:8188",
              "yarn.timeline-service.webapp.https.address" : "c6402.ambari.apache.org:8190"
            }
          }
        ]
      }
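    3. The properties property contains the current yarn-site configuration. Append the new HA properties to this structure, then update yarn-site by submitting the complete set under a new tag. Here <new-yarn-site-tag> is a tag of your choosing that differs from the current one (for example, version2).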

      PUT http://<ambari-server>/api/v1/clusters/<cluster-name>
      
      {  
         "Clusters":{  
            "desired_config":{  
               "type":"yarn-site",
               "tag":"<new-yarn-site-tag>",
               "properties":{
                  "yarn.acl.enable":"false",
                  "yarn.admin.acl":"",
                  ...
                  "yarn.timeline-service.ttl-enable":"true",
                  "yarn.timeline-service.ttl-ms":"2678400000",
                  "yarn.timeline-service.webapp.address":"c6402.ambari.apache.org:8188",
                  "yarn.timeline-service.webapp.https.address":"c6402.ambari.apache.org:8190",
                  ...
                  <yarn-ha-site-properties>
                  "yarn.resourcemanager.ha.enabled": "true",
                  "yarn.resourcemanager.ha.rm-ids": "rm1,rm2",
                  ...
               }
            }
         }
      }
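
The three REST steps above can be issued with curl in the same style as the earlier commands. The sketch below is only a rough outline: the function names, the environment variables, and the file holding the merged properties are assumptions made for illustration, and merging the HA properties into the existing property map is left to a JSON-aware tool of your choice.

```shell
# Sketch only: curl wrappers for the three REST steps.
# AMBARI_SERVER, CLUSTER_NAME and PASSWORD are assumed environment variables
# standing in for <ambari-server>, <cluster-name> and the admin password.

# Print the base URL of the cluster resource.
cluster_url() {
  printf 'http://%s/api/v1/clusters/%s' "$AMBARI_SERVER" "$CLUSTER_NAME"
}

# Step 1: list the desired configs, including the current yarn-site tag.
get_desired_configs() {
  curl -u "admin:$PASSWORD" -H "X-Requested-By: Ambari" \
    "$(cluster_url)?fields=Clusters/desired_configs"
}

# Step 2: fetch the yarn-site properties stored under tag $1.
get_yarn_site() {
  curl -u "admin:$PASSWORD" -H "X-Requested-By: Ambari" \
    "$(cluster_url)/configurations?type=yarn-site&tag=$1"
}

# Build the PUT body: $1 = new tag, $2 = JSON object with ALL properties
# (the existing ones plus the HA additions).
desired_config_body() {
  printf '{"Clusters":{"desired_config":{"type":"yarn-site","tag":"%s","properties":%s}}}' \
    "$1" "$2"
}

# Step 3: push the merged property map stored in file $2 under the new tag $1.
put_yarn_site() {
  curl -u "admin:$PASSWORD" -H "X-Requested-By: Ambari" -i -X PUT \
    -d "$(desired_config_body "$1" "$(cat "$2")")" "$(cluster_url)"
}
```

A call such as `put_yarn_site version2 merged-properties.json` would then submit the update, where merged-properties.json is a hypothetical file containing the complete property object from step 2 with the HA properties added.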
  5. Restart all stopped services.