This is a functional specification for the membership manager in Geode. The new membership manager replaces the JGroupMembershipManager that is in the incubating version of Geode.
The primary functions of the membership manager are to implement membership for the distributed system and handle all message sending/receiving. It has a plug-in for the DistributionManager so it can receive messages and it must map between whatever internal identifiers are used for membership/messaging and Geode's InternalDistributedMember identifiers.
The membership manager can forcefully shut down a Geode cache if it detects it is no longer a member of the distributed system.
Interfaces
There are a number of existing interfaces in Geode that must be implemented by the membership manager:
MembershipManager - provides membership and messaging functionality to the DistributionManager
NetMember - represents a member ID in the membership manager, plugs into InternalDistributedMember
MemberServices - factory for creating a NetMember
NetView - this is actually a class that represents a membership set. It must be created by the membership manager for use by DistributionManager
QuorumChecker - used during AutoReconnect to poll to see if a quorum of a NetView is reachable
Internally there are interfaces in the membership manager that provide separation of concern for each of its components. This should allow us to plug in different implementations for each component such as ring-based or phi-accrual based health monitoring.
Service - interface implemented by all internal components
void init(Services s)
void start(CancelCriterion c) - called after all services have been initialized with init() and all services are available via Services
void started() - called after all services have been started
void stop()
void stopped() - called after all services have been stopped
void installView(NetView v)
void beSick(), playDead(), beHealthy() - used for membership testing
ServiceFactory - used by the membership manager to instantiate its services
ServiceConfig create(Manager m, ServiceConfig sc)
Authenticator - authenticates a member
String authenticate(NetMember m) - authenticates the member and returns a rejection message if the member is refused
Object getCredentials()
HealthMonitor - monitors members and instigates removal of those deemed dead
void contactedBy(NetMember m) - tells the monitor that we've had contact with another member
void suspect(NetMember m) - tells the monitor that the member is suspected of being ill or dead
void checkSuspect(NetMember m) - requests a health check on another member. This should initiate removal of the member if it does not pass the test
JoinLeave - manages joining, leaving and removing members.
boolean join() - joins the distributed system and returns true if successful, false if not. Throws SystemConnectException and GemFireConfigException
void leave() - leaves the distributed system. Should be invoked before stop()
void remove(NetMember m) - forces another member out of the system
Locator - used by TcpServer to handle peer-location requests. Implements TcpHandler
Manager - internal interface for working with the membership manager. Extends MessageHandler
void send(DistributionMessage m)
InternalDistributedMember getMemberID(NetMember m)
void forceDisconnect(String reason)
MessageHandler - receives messages from a Messenger
void handle(DistributionMessage m)
Messenger - sends and receives messages. All messages are delivered to the Manager unless there is a handler installed for the message's class
void addHandler(Class c, MessageHandler h) - adds a handler for the given class/interface of messages
void send(DistributionMessage m) - sends an asynchronous message
NetMember getMemberID() - returns the endpoint ID for this member
Services - provides access to ServiceConfig and a directory of the membership manager's internal components
get/setAuthenticator
get/setConfig
get/setHealthMonitor
get/setJoinLeave
get/setLocator
get/setManager
ServiceConfig - provides configuration information for the manager and its components
DistributionConfig getDistributionConfig()
Properties getProperties()
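The lifecycle implied by the Service interface (init on every service, then start, then started, and the mirror-image stop/stopped) can be sketched as follows. This is only an illustration: LifecycleService, ServiceDirectory and LifecycleSketch are hypothetical stand-ins for Service and Services, and the CancelCriterion argument to start() is left out to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the spec's Service interface (CancelCriterion omitted). */
interface LifecycleService {
  void init(ServiceDirectory s);
  void start();
  void started();
  void stop();
  void stopped();
}

/** Hypothetical stand-in for Services: a directory of the internal components. */
class ServiceDirectory {
  final List<LifecycleService> all = new ArrayList<>();
}

class LifecycleSketch {
  /** Completes each phase for every service before beginning the next phase. */
  static void startAll(ServiceDirectory dir) {
    for (LifecycleService s : dir.all) s.init(dir);
    for (LifecycleService s : dir.all) s.start();
    for (LifecycleService s : dir.all) s.started();
  }

  static void stopAll(ServiceDirectory dir) {
    for (LifecycleService s : dir.all) s.stop();
    for (LifecycleService s : dir.all) s.stopped();
  }

  /** Test service that records which lifecycle methods were called, in order. */
  static class Recording implements LifecycleService {
    private final String name;
    private final List<String> log;

    Recording(String name, List<String> log) {
      this.name = name;
      this.log = log;
    }

    public void init(ServiceDirectory s) { log.add(name + ".init"); }
    public void start() { log.add(name + ".start"); }
    public void started() { log.add(name + ".started"); }
    public void stop() { log.add(name + ".stop"); }
    public void stopped() { log.add(name + ".stopped"); }
  }

  /** Runs two recording services through a full start/stop cycle. */
  static List<String> demo() {
    List<String> log = new ArrayList<>();
    ServiceDirectory dir = new ServiceDirectory();
    dir.all.add(new Recording("joinLeave", log));
    dir.all.add(new Recording("healthMonitor", log));
    startAll(dir);
    stopAll(dir);
    return log;
  }

  public static void main(String[] args) {
    demo().forEach(System.out::println);
  }
}
```

Running each phase across all services before moving on is what lets start() assume that every other component is already reachable through Services.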
Implementation Notes
In order to preserve as much of the current membership behavior as possible, and so foster adoption of Geode by the GemFire user base, the existing JGroupMembershipManager will be copied and most of its code will be preserved. It will continue to hold the DirectChannel but will now also hold a ServiceConfig that it will use in place of the JGroups channel.
The implementation of each of the other components will be in separate packages to keep the code clean and possibly allow for different implementations to be plugged in.
The Authenticator implementation will use Geode's authentication API to authenticate another member and to get credentials for JoinLeave to use in sending membership views and join requests.
The HealthMonitor implementation will initially use the NetView to form a look-to-the-right ring in which each member monitors the next one. HealthMonitor will keep a record of the last time a message was received from each member in the system (note - this must be done without clock probes, possibly following the pattern in EventTracker). If the member it is watching has not made contact in the last member-timeout milliseconds, it will request a heartbeat from the member and perform a timed attempt to connect to the member's DirectChannel port (if available) and request a health response. If the member does not respond within member-timeout milliseconds, HealthMonitor will remove it using the JoinLeave.remove() API. The implementation of remove() will forward the request to the current membership coordinator, which will perform its own health check on the member before removing it (by sending out a new NetView). When the ping request has been sent, HealthMonitor will go on to examine the next member in the view.
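A minimal sketch of the look-to-the-right selection and the member-timeout test is shown below. It assumes the view can be treated as an ordered list of members. RingSketch, neighborToWatch and needsCheck are hypothetical names, and the timestamps are passed in as parameters, so the sketch does not address how to record contact times without clock probes.

```java
import java.util.Arrays;
import java.util.List;

class RingSketch {
  /**
   * Returns the member to the right of self in view order, wrapping around
   * at the end of the list. Returns null if self is not in the view or is
   * the only member.
   */
  static <M> M neighborToWatch(List<M> view, M self) {
    int i = view.indexOf(self);
    if (i < 0 || view.size() < 2) {
      return null;
    }
    return view.get((i + 1) % view.size());
  }

  /** True when the last recorded contact is older than member-timeout. */
  static boolean needsCheck(long lastContactMillis, long nowMillis, long memberTimeoutMillis) {
    return nowMillis - lastContactMillis > memberTimeoutMillis;
  }

  public static void main(String[] args) {
    List<String> view = Arrays.asList("a", "b", "c");
    System.out.println(neighborToWatch(view, "a"));
    System.out.println(neighborToWatch(view, "c"));
    System.out.println(needsCheck(0L, 5001L, 5000L));
  }
}
```

Because every member derives its neighbor from the same NetView, each member is watched by exactly one other member as long as all members have installed the same view.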
TCPConduit will be modified to check for a health request and respond with its membership ID. The HealthMonitor will use this to ensure that the port hasn't been reused by another process.
The JoinLeave implementation will use Messenger, and possibly the membership manager, to communicate with other members. It will use TcpClient to contact Locators when joining in order to find the current membership coordinator. Once it knows the coordinator it will send it a Join message including authentication credentials. JoinLeave will also implement membership coordination functions (i.e., replace what we're doing with JGroups GMS). It will be responsible for detecting a network partition and invoking forceDisconnect() in the membership manager.
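One way JoinLeave might receive its protocol messages is through Messenger.addHandler(), with everything else falling through to the Manager as described in the Interfaces section. The sketch below illustrates that dispatch rule only. DispatchSketch, SketchHandler, JoinRequest and CacheOp are hypothetical, and plain Object stands in for DistributionMessage.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for MessageHandler, taking Object instead of DistributionMessage. */
interface SketchHandler {
  void handle(Object message);
}

class DispatchSketch {
  private final Map<Class<?>, SketchHandler> handlers = new LinkedHashMap<>();
  private final SketchHandler manager;

  DispatchSketch(SketchHandler manager) {
    this.manager = manager;
  }

  /** Registers a handler for a message class or interface. */
  void addHandler(Class<?> c, SketchHandler h) {
    handlers.put(c, h);
  }

  /** Delivers to the first matching handler, otherwise to the manager. */
  void deliver(Object message) {
    for (Map.Entry<Class<?>, SketchHandler> e : handlers.entrySet()) {
      if (e.getKey().isInstance(message)) {
        e.getValue().handle(message);
        return;
      }
    }
    manager.handle(message);
  }

  /** Hypothetical message types used only for the demonstration. */
  static class JoinRequest {}
  static class CacheOp {}

  static List<String> demo() {
    List<String> log = new ArrayList<>();
    DispatchSketch messenger =
        new DispatchSketch(m -> log.add("manager:" + m.getClass().getSimpleName()));
    messenger.addHandler(JoinRequest.class,
        m -> log.add("joinLeave:" + m.getClass().getSimpleName()));
    messenger.deliver(new JoinRequest()); // handled by the installed handler
    messenger.deliver(new CacheOp());     // no handler installed, goes to the manager
    return log;
  }

  public static void main(String[] args) {
    demo().forEach(System.out::println);
  }
}
```

Matching with isInstance rather than exact class equality is what allows a handler to be registered for an interface of messages as well as a concrete class.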
The Locator component will persist the current membership view and will respond to requests for the ID of the current membership coordinator. If there is no membership coordinator (meaning the Locator is booting up) then it will return its best guess of which member is the coordinator, based on the members that have contacted it. The name of the locator's state file will be changed to membershipView.dat.
The Manager API is what should be used by all components to interact with the membership manager.
The Messenger component will use a trimmed-down modern JGroups channel to perform UDP messaging. JGroups will no longer be forked for use in Geode but will be added as a dependency. Messenger will be responsible for installing the current NetView in its JGroups protocol stack as a native JGroups View so that UDP broadcast works and multicast message garbage-collection can be properly performed. Note that this switch to using off-the-shelf JGroups means we will start seeing more log messages from JGroups than in the past.
Also note that we may not be able to switch to a newer version of JGroups without risking rolling upgrade support. If the new version of JGroups is not on-wire compatible with the previous version, users will not be able to perform a rolling upgrade.
It will be Messenger's responsibility to install Geode's settings from the DistributionConfig (gemfire.properties) into its JGroups channel. The protocol stack should look something like this: <UDP> <BARRIER> <pbcast.NAKACK2> <UNICAST3> <pbcast.STABLE> <MFC> <UFC> <FRAG2>. Of course there will be lots of settings in each of these protocols to customize the stack. There is no requirement that the JGroups stack configuration be in an external file. It can be a string embedded in the Messenger implementation. XML will need to be used because the JGroups PlainConfigurator still uses a colon as a protocol separator and this is incompatible with IPv6 addresses.
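As a rough illustration of embedding the stack as a string, the sketch below builds a JGroups-style XML configuration from the protocol list above. StackConfigSketch and buildXml are hypothetical names, no protocol settings beyond a bind address are shown, and the namespace and the bind_addr attribute are assumed to match the JGroups version that is eventually chosen. The example only builds the string and does not create a channel.

```java
import java.util.Arrays;
import java.util.List;

class StackConfigSketch {
  /** Protocols in the order given in the specification. */
  static final List<String> PROTOCOLS = Arrays.asList(
      "UDP", "BARRIER", "pbcast.NAKACK2", "UNICAST3",
      "pbcast.STABLE", "MFC", "UFC", "FRAG2");

  /**
   * Builds an XML stack description with one empty element per protocol.
   * The bind address, if given, is written as an attribute of UDP. No XML
   * escaping is done, which is acceptable for plain IP address literals.
   */
  static String buildXml(String bindAddress) {
    StringBuilder sb = new StringBuilder("<config xmlns=\"urn:org:jgroups\">\n");
    for (String protocol : PROTOCOLS) {
      sb.append("  <").append(protocol);
      if (protocol.equals("UDP") && bindAddress != null) {
        sb.append(" bind_addr=\"").append(bindAddress).append('"');
      }
      sb.append("/>\n");
    }
    return sb.append("</config>\n").toString();
  }

  public static void main(String[] args) {
    // An IPv6 literal contains colons, which is harmless inside an XML attribute.
    System.out.print(buildXml("fe80::1"));
  }
}
```

In a real Messenger the values would come from DistributionConfig rather than a method argument, and each protocol element would carry its own settings.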
All of the JGroups statistics in DistributionStats need to be removed or replaced with corresponding stats based on the new implementation.
Testing
Since this is implementing an existing interface in Geode there are already a lot of tests that exercise it. These tests will need some attention if they are referring to any JGroups code. The use of interfaces in this version of the MembershipManager should allow us to create real unit tests, as opposed to integration tests, for each component to achieve a higher level of code coverage.
Jepsen testing should be performed to ensure that the membership manager behaves as expected during network failures, GC pauses, etc. All releases of Geode should require Jepsen testing.