Design details and discussion for KNOX-88
Definition
Knox HA is a set of routines that let Knox work transparently with a Hadoop service running in High Availability (HA) mode.
Purpose of the Knox HA service
- Automatic failover (for example, redirecting a request from an unresponsive name-node to the active name-node).
- Pluggable failover strategies.
- A daemon service that regularly pings the Hadoop service, so that Knox keeps an up-to-date view of the service state (a performance optimization).
Architecture
A new provider (a descendant of the ProviderDeploymentContributorBase class) will be added, together with a set of filters. See Pic. #1 for the overall architecture.
Pic. #1 – Providers architecture
Definition.
Alias – a set of Hadoop name-nodes configured for High Availability mode.
Definition.
High Availability Strategy – a plan for determining the active name-node and for switching between the active and standby name-nodes. A strategy may take parameters such as retryCount and timeoutInterval. See Pic. #2 for the HA mode class diagram.
Pic. #2 – Class diagram for HA mode.
See Table #1 for class descriptions.
Table #1 – Descriptions of the new HA mode classes.
# | Class name | Description |
---|---|---|
1 | HaUrlRewriteFunctionDescriptor | Describes the function that resolves URLs in HA mode |
2 | HaUrlRewriteFunctionProcessor | Implements the main logic for determining the active or standby URL |
3 | HaBaseStrategyHostMapper | Implements the base strategy for HA mode. Parameters: retryCount, timeoutInterval. |
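
The switching behaviour that a base strategy is meant to provide can be sketched as follows. This is only an illustration under stated assumptions, not the HaBaseStrategyHostMapper implementation: the class name FailoverState and its methods are hypothetical, and it assumes that retryCount counts consecutive failures and that failover moves round-robin to the next URL in the alias. The design above does not specify either point.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not a Knox class: tracks which URL of an alias is
// currently treated as active and fails over after repeated failures.
final class FailoverState {
    private final List<String> urls;   // URLs grouped under one alias
    private final int retryCount;      // failures tolerated before switching
    private int activeIndex = 0;       // index of the URL treated as active
    private int failures = 0;          // consecutive failures on the active URL

    FailoverState(List<String> urls, int retryCount) {
        if (urls.isEmpty()) {
            throw new IllegalArgumentException("alias needs at least one URL");
        }
        if (retryCount < 1) {
            throw new IllegalArgumentException("retryCount must be >= 1");
        }
        this.urls = new ArrayList<>(urls);
        this.retryCount = retryCount;
    }

    // URL that requests should currently be sent to.
    synchronized String activeUrl() {
        return urls.get(activeIndex);
    }

    // Record a failed request or ping. Assumption: after retryCount
    // consecutive failures the next URL in the alias becomes active.
    synchronized void markFailed() {
        failures++;
        if (failures >= retryCount) {
            activeIndex = (activeIndex + 1) % urls.size();
            failures = 0;
        }
    }

    // A successful request resets the consecutive-failure counter.
    synchronized void markSucceeded() {
        failures = 0;
    }
}
```

With two URLs and retryCount set to 3, three consecutive calls to markFailed() make activeUrl() return the second URL; a success in between resets the count.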
See Pic. #3 for the UML sequence diagram for UrlRewriteProcessor.
Pic. #3 – UML sequence diagram for UrlRewriteProcessor.
Provider configuration example
The provider configuration enables or disables the HA provider and binds a strategy to it. An alias groups a list of Hadoop service URLs (here the name-nodes, active and standby) into one entity.
<topology>
  <gateway>
    ...
    <provider>
      <role>ha</role>
      <name>HAProvider</name>
      <param>
        <name>webhdfs.ha</name>
        <value>failover_strategy=BaseStrategy;retryCount=3;timeoutInterval=5000;enabled=true</value>
      </param>
    </provider>
    ...
  </gateway>
  ...
  <service>
    <role>WEBHDFS</role>
    <url>machine1.example.com:50070</url>
    <url>machine2.example.com:50070</url>
  </service>
  ...
  <service>
    <role>NAMENODE</role>
    <url>machine1.example.com:50070</url>
    <url>machine2.example.com:50070</url>
  </service>
  ...
</topology>
Parameter descriptions:
- failover_strategy – specifies how the active service is determined and carries the strategy's configuration parameters. The default value is BaseStrategy, which has the following parameters:
  - retryCount – how many times Knox pings a name-node before deciding that the name-node is down.
  - timeoutInterval – the connection timeout interval.
- enabled – whether HAProvider is active for the service.
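
To make the format of the param value concrete, here is a minimal sketch of splitting it into key/value pairs. The helper name HaParamParser is hypothetical and not part of Knox. It assumes only what the example above shows: pairs separated by ";" and keys separated from values by "=".

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Knox: parses a value such as
// "failover_strategy=BaseStrategy;retryCount=3;timeoutInterval=5000;enabled=true".
final class HaParamParser {
    private HaParamParser() {}

    // Returns the parameters in the order in which they appear in the value.
    static Map<String, String> parse(String value) {
        Map<String, String> params = new LinkedHashMap<>();
        for (String pair : value.split(";")) {
            String trimmed = pair.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate a trailing or doubled ';'
            }
            int eq = trimmed.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("Expected key=value but got: " + trimmed);
            }
            String key = trimmed.substring(0, eq).trim();
            String val = trimmed.substring(eq + 1).trim();
            params.put(key, val);
        }
        return params;
    }
}
```

A caller could then read typed settings from the map, for example Integer.parseInt(params.get("retryCount")) or Boolean.parseBoolean(params.get("enabled")).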