To be Reviewed By: 14 August 2020
Authors: Bruce Schuchardt, Jake Barrett, Bill Burcham
Status: Draft | Discussion | Active | Dropped | Superseded
Superseded by: N/A
Related: Client side configuration for an ingress proxy
Problem
Users would like to be able to put Geode clusters behind a proxy. This would let users avoid allocating public IP addresses for all of the locators and servers, and would let them control and monitor access to those members.
Currently, this is not supported. Geode clusters discover servers using remote locators. It is possible to set the hostname-for-senders on all server GatewayReceivers to point to a proxy, but this is not ideal because the proxy cannot distinguish between the different Geode servers in the cluster. Thus, Locators and GatewaySenders need a way to tell the proxy the name of the server to which they wish to connect.
Anti-Goals
It is an anti-goal to deal with configuring the proxy itself.
It is an anti-goal to specify support for all different kinds of proxies.
We're not recommending any specific proxy.
This proposal applies to inter-cluster communications, not communications within a cluster itself.
Solution
We will add a way for the user to configure a proxy through which Locators and GatewaySenders connect, and to have them inform that proxy of the desired Locator or GatewayReceiver. When this is done, all connections from one cluster’s Locators and GatewaySenders to remote Locators or remote GatewayReceivers will use the specified proxies.
The locators and servers in a cluster will share the same proxy address so that configuration can be kept to a minimum. That is, servers and locators will not need to know what the proxy is for their cluster or individual servers/locators in the cluster. The locators in one cluster will only need to know the proxy to access another cluster and will use that for all locators and servers in that other cluster.
Locator Configuration
WAN discovery in Locators is currently implemented by specifying one or more remote-locators as a connection property:
remote-locators=L[port],M[port],N[port]
We will augment this to support a proxy
remote-locators=<proxy-protocol>://<proxy-address>:<proxy-port>/<locator-address>[(:<locator-port>|[<locator-port>])]?<proxy-configuration-options>
The locator-port portion is optional, since some proxies may not need it, and it may be written in either the old host[port] form or the more conventional host:port form.
For example
remote-locators=sniproxy://proxyhostname:10334/locatorhostname
remote-locators=socks5://sockshostname/locatorhostname:10334
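To make the proposed syntax concrete, the following is a minimal sketch of how one remote-locators value could be parsed. It is not Geode code: the RemoteLocator class, its field names, and both regular expressions are hypothetical. It assumes that the proxy port and the "?options" part may be omitted (the socks5 example above has no proxy port), that entries are separated by commas, and that host names contain no colons or brackets, so IPv6 literals are out of scope.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple


# Hypothetical holder for one parsed entry; not part of any Geode API.
@dataclass(frozen=True)
class RemoteLocator:
    host: str
    port: Optional[int] = None
    proxy_protocol: Optional[str] = None
    proxy_host: Optional[str] = None
    proxy_port: Optional[int] = None
    proxy_options: str = ""


# host, optionally followed by [port] (old form) or :port (conventional form)
_HOST_PORT = re.compile(
    r"^(?P<host>[^\[\]:]+)(?:\[(?P<bport>\d+)\]|:(?P<cport>\d+))?$")

# <protocol>://<proxy>/<locator>[?<options>]
_PROXIED = re.compile(
    r"^(?P<proto>[A-Za-z][A-Za-z0-9+.-]*)://(?P<proxy>[^/]+)/"
    r"(?P<target>[^?]+)(?:\?(?P<opts>.*))?$")


def _split_host_port(text: str) -> Tuple[str, Optional[int]]:
    """Split 'host', 'host[port]' or 'host:port' into (host, port or None)."""
    match = _HOST_PORT.match(text)
    if match is None:
        raise ValueError(f"cannot parse host and port from {text!r}")
    port = match.group("bport") or match.group("cport")
    return match.group("host"), int(port) if port else None


def parse_remote_locator(entry: str) -> RemoteLocator:
    """Parse a single entry, with or without a proxy prefix."""
    entry = entry.strip()
    match = _PROXIED.match(entry)
    if match is None:
        # No '<protocol>://' prefix: treat as the existing direct form.
        host, port = _split_host_port(entry)
        return RemoteLocator(host, port)
    proxy_host, proxy_port = _split_host_port(match.group("proxy"))
    host, port = _split_host_port(match.group("target"))
    return RemoteLocator(host, port, match.group("proto"),
                         proxy_host, proxy_port, match.group("opts") or "")


def parse_remote_locators(value: str) -> List[RemoteLocator]:
    """Parse a comma-separated remote-locators value.

    Assumes proxy options never contain commas.
    """
    return [parse_remote_locator(e) for e in value.split(",") if e.strip()]
```

Under these assumptions, parse_remote_locator("sniproxy://proxyhostname:10334/locatorhostname") yields a proxy of proxyhostname:10334 and a locator host with no port, while an existing value such as L[10334] parses as a direct locator with no proxy fields set.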
Server Configuration
Servers will need no extra configuration. When GatewaySenders initialize their connection pool they query locators to get the server addresses associated with their configured remote-distributed-system-id. We will modify the responses to include the proxy configuration for each remote cluster so that the servers can properly configure their GatewaySenders’ connection pools.
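As an illustration only, the flow described above could look like the sketch below. Every name in it (ProxyConfig, RemoteClusterInfo, locator_response, pool_endpoints) is hypothetical, and the actual locator response format is not defined by this proposal. The point it shows is that a server learns the proxy from its locator rather than from its own configuration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


# Hypothetical description of the proxy in front of a remote cluster.
@dataclass(frozen=True)
class ProxyConfig:
    protocol: str
    host: str
    port: Optional[int]


# Hypothetical locator answer for one remote-distributed-system-id.
@dataclass(frozen=True)
class RemoteClusterInfo:
    receivers: List[Tuple[str, int]]      # (host, port) of each GatewayReceiver
    proxy: Optional[ProxyConfig] = None   # None means connect directly


def locator_response(known: Dict[int, RemoteClusterInfo],
                     remote_ds_id: int) -> RemoteClusterInfo:
    """Return what the locator knows about the requested remote cluster."""
    return known[remote_ds_id]


def pool_endpoints(info: RemoteClusterInfo):
    """Return (address_to_connect_to, receiver_host_to_request) pairs.

    With a proxy, every pool connection goes to the proxy address and names
    the receiver it wants; without one, it connects to the receiver itself.
    """
    endpoints = []
    for host, port in info.receivers:
        if info.proxy is None:
            connect_to = (host, port)
        else:
            connect_to = (info.proxy.host, info.proxy.port)
        endpoints.append((connect_to, host))
    return endpoints
```

In this sketch a GatewaySender would call pool_endpoints on the locator's answer, so the only place the proxy address is configured is the locator's remote-locators property.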
Use Cases
The primary use case is replication of events from one cluster to another when direct communication between the nodes of one cluster to the other is only possible through a proxy. For instance, one might have two Kubernetes clusters using an ingress proxy (such as HAProxy or Envoy). Each proxy can resolve the host names of the locators and servers in its cluster but those host names are unresolvable outside of the cluster. In this situation one cluster communicates with its remote counterpart by connecting to the proxy and indicating the server/locator to which it wishes to connect.
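For the case where the proxy routes on the TLS SNI field, "indicating the server/locator" amounts to placing the target host name in the TLS ClientHello. The sketch below is an assumption-laden illustration using Python's standard ssl module rather than Geode code, and the function name client_hello_for is made up. It runs the start of a handshake against in-memory buffers, so no network or proxy is needed, and returns the bytes a client would send first.

```python
import ssl


def client_hello_for(target_host: str) -> bytes:
    """Return the first handshake bytes a TLS client sends for target_host."""
    # Client-side context; hostname checking is enabled by default.
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    incoming = ssl.MemoryBIO()   # bytes received from the peer (none here)
    outgoing = ssl.MemoryBIO()   # bytes the client wants to send
    # server_hostname sets the SNI value. On a real connection the TCP
    # socket would be opened to the proxy address, while server_hostname
    # would still name the locator or server behind the proxy.
    tls = ctx.wrap_bio(incoming, outgoing, server_hostname=target_host)
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        pass  # expected: the client is now waiting for the server's reply
    return outgoing.read()
```

The returned ClientHello carries the requested host name in clear text, which is what allows an SNI proxy to choose the backend without terminating TLS.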
Performance Impact
Connecting through a proxy will impact the performance of inter-cluster messaging, but it is up to the user to decide whether to use this feature. An implementation may also require the use of TLS (e.g., for an SNI proxy), which would also affect performance compared with non-TLS communications.
Documentation
Configuration of remote-locators will need to be revised.
Backwards Compatibility and Upgrade Path
TBD - depends on the proxy implementation.
Prior Art
There has been an effort to provide WAN access to a remote cluster without the use of an ingress proxy. In that effort all servers/locators pointed to the same service address using the hostname-for-clients and were connected to a locator or server without using the SNI server name option. See "Allow same host and port for all gateway receivers" under References. We feel that this is not the ideal solution because it does not guarantee communication with the intended server; instead, it connects you to a random server chosen by the service. Two connections could end up going to different servers, for instance.
FAQ
Is this platform specific?
No, this feature is intended to be platform agnostic.
Does this require a firewall?
This is an implementation detail that will be left to the user's discretion.
Does this support endpoint verification?
Yes, endpoint verification also uses the SNI extension to match against the Subject Alternative Name fields in a server’s certificate during the TLS handshake.
Errata
What are minor adjustments that had to be made to the proposal since it was approved?
References
https://cwiki.apache.org/confluence/display/GEODE/Allow+same+host+and+port+for+all+gateway+receivers
https://cwiki.apache.org/confluence/display/GEODE/Client+side+configuration+for+a+SNI+proxy
2 Comments
Alberto Bustamante Reyes
Does this RFC imply reverting the changes introduced by the previous work (Allow same host and port for all gateway receivers)?
Bruce J Schuchardt
Hi Alberto,
This work will not undo the changes you made.