Design details and discussion for KNOX-88 

 Definition 

Knox HA is a set of routines that lets Knox work transparently with a Hadoop service running in High Availability (HA) mode.

 Purpose of Knox HA service

  1. Automatic failover (for example, redirecting a request from an unresponsive name-node to the active name-node).
  2. Pluggable failover strategies.
  3. A daemon service that regularly pings the Hadoop service so that Knox keeps an up-to-date view of the service state (a performance optimization).
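
As an illustration of the third item, the following is a minimal sketch of a background monitor that periodically refreshes a cached "active node" value. The class name ActiveNodeMonitor, its methods, and the Supplier-based resolver are hypothetical and are not part of Knox; the sketch only assumes that some resolver can report the currently active node.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Hypothetical sketch, not Knox code: caches the last known active node and
// refreshes it on a daemon thread so requests do not have to probe on demand.
final class ActiveNodeMonitor implements AutoCloseable {

    private final AtomicReference<String> active = new AtomicReference<>();
    private final Supplier<String> resolver; // assumed: returns the active node URL
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "ha-monitor");
                t.setDaemon(true); // do not keep the JVM alive for monitoring
                return t;
            });

    ActiveNodeMonitor(Supplier<String> resolver) {
        this.resolver = resolver;
    }

    // Starts periodic refreshes; the first one runs immediately.
    void start(long periodMillis) {
        scheduler.scheduleWithFixedDelay(this::refresh, 0, periodMillis, TimeUnit.MILLISECONDS);
    }

    // Performs one refresh. A failing resolver leaves the last known value in
    // place; catching here also keeps the scheduled task from being cancelled.
    void refresh() {
        try {
            active.set(resolver.get());
        } catch (RuntimeException e) {
            // keep the previously cached value
        }
    }

    // Returns the last known active node, or null before the first refresh.
    String currentActive() {
        return active.get();
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

A caller would typically create one monitor per HA-enabled service, call start with a chosen period, and read currentActive when dispatching a request.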

Architecture

 

A new provider, derived from the ProviderDeploymentContributorBase class, will be added together with a set of filters. See Pic. #1 for the overall architecture.

Pic. #1 – Providers architecture

 

Definition.

Alias – a set of Hadoop name-nodes configured for High Availability mode.

 

Definition.

High Availability Strategy – a plan for determining the active name-node and for switching between the active and standby name-nodes. A strategy may take parameters such as retryCount and timeoutInterval. See Pic. #2 for the HA mode class diagram.
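
To make the definition more concrete, here is a minimal sketch of how such a strategy might pick the active name-node from an alias. The class BaseStrategySketch, the findActive method, and the probe callback are hypothetical names introduced for illustration only; the sketch also assumes that timeoutInterval is expressed in milliseconds and that a node is tried retryCount times before the next one is considered.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.BiPredicate;

// Hypothetical sketch, not Knox code: walks the alias in order and returns
// the first node that answers a probe within retryCount attempts.
final class BaseStrategySketch {

    private final int retryCount;
    private final int timeoutIntervalMillis; // assumption: milliseconds
    // Assumed probe: (nodeUrl, timeoutMillis) -> true if the node responded.
    private final BiPredicate<String, Integer> probe;

    BaseStrategySketch(int retryCount, int timeoutIntervalMillis,
                       BiPredicate<String, Integer> probe) {
        if (retryCount < 1) {
            throw new IllegalArgumentException("retryCount must be at least 1");
        }
        this.retryCount = retryCount;
        this.timeoutIntervalMillis = timeoutIntervalMillis;
        this.probe = probe;
    }

    // Returns the first responsive node, or empty if every node failed
    // all of its attempts.
    Optional<String> findActive(List<String> alias) {
        for (String node : alias) {
            for (int attempt = 0; attempt < retryCount; attempt++) {
                if (probe.test(node, timeoutIntervalMillis)) {
                    return Optional.of(node);
                }
            }
        }
        return Optional.empty();
    }
}
```

Passing the probe in as a callback keeps the selection logic independent of how a node is actually contacted, which is one way to keep strategies pluggable.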

 

Pic.#2 Class diagram for HA mode.

 

 

See Table #1 for class description.

Table #1. – HA mode new classes description.

# | Class name                     | Description
1 | HaUrlRewriteFunctionDescriptor | Describes function that resolves URLs in HA mode
2 | HaUrlRewriteFunctionProcessor  | Implements main logic of defining active or standby URL
3 | HaBaseStrategyHostMapper       | Implements base strategy for HA mode. Contains parameters: retryCount, timeoutInterval.

 

See Pic. #3 for the UML sequence diagram for UrlRewriteProcessor.


Pic #3 – UML sequence diagram for UrlRewriteProcessor.

Provider configuration example

 

The provider configuration enables or disables the HA provider and binds the strategy to the provider. An alias groups a list of Hadoop services (here, the active and standby name-nodes) into one entity.

 

Topology
<topology>
  <gateway>
    ...
    <provider>
      <role>ha</role>
      <name>HAProvider</name>
      <param>
        <name>webhdfs.ha</name>
        <value>failover_strategy=BaseStrategy;retryCount=3;timeoutInterval=5000;enabled=true</value>
      </param>
    </provider>
    ...
  </gateway>
  ...
  <service>
    <role>WEBHDFS</role>
    <url>machine1.example.com:50070</url>
    <url>machine2.example.com:50070</url>
  </service>
  ...
  <service>
    <role>NAMENODE</role>
    <url>machine1.example.com:50070</url>
    <url>machine2.example.com:50070</url>
  </service>
  ...
</topology>

Parameter descriptions:

  • failover_strategy – selects how the active service is determined and carries the strategy's configuration parameters. The default value is BaseStrategy, which has the following parameters:
      • retryCount – how many times Knox pings a name-node before deciding that the name-node is down.
      • timeoutInterval – the connection timeout interval.
  • enabled – whether HAProvider is active for the service.
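
The parameter value in the example is a semicolon-separated list of key=value pairs. A minimal sketch of reading it follows; the helper class HaParamParser and its methods are hypothetical and only illustrate the format shown in the topology example, not how Knox itself parses it.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not Knox code: parses a value such as
// "failover_strategy=BaseStrategy;retryCount=3;timeoutInterval=5000;enabled=true".
final class HaParamParser {

    private HaParamParser() {
    }

    // Splits the value on ';' and each entry on its first '='.
    static Map<String, String> parse(String value) {
        Map<String, String> result = new LinkedHashMap<>();
        if (value == null) {
            return result;
        }
        for (String entry : value.split(";")) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate a trailing ';'
            }
            int eq = trimmed.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("Expected key=value but got: " + trimmed);
            }
            result.put(trimmed.substring(0, eq).trim(), trimmed.substring(eq + 1).trim());
        }
        return result;
    }

    // Reads an integer setting, falling back to a caller-supplied default.
    static int intValue(Map<String, String> params, String key, int defaultValue) {
        String raw = params.get(key);
        return raw == null ? defaultValue : Integer.parseInt(raw);
    }

    // Returns the strategy name; BaseStrategy is the documented default.
    static String strategy(Map<String, String> params) {
        return params.getOrDefault("failover_strategy", "BaseStrategy");
    }
}
```

For the example value, parse yields four entries, and intValue(params, "retryCount", 1) reads the retry count as an integer.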

 

Example UML

Diagram Title DeploymentFactory(df)

Example Code Block

HaBaseStrategyHostMapper
public class HaBaseStrategyHostMapper implements HostMapper {

    @Override
    public String resolveInboundHostName(String inboundHost) {
        // TODO: implement host resolution here
        return null;
    }

    @Override
    public String resolveOutboundHostName(String outboundHost) {
        // TODO: implement host resolution here
        return null;
    }
}