v0.9.4 Node Classes
- FlumeNode
  - FlumeVMInfo
  - SystemInfo
  - LivenessManager (issues heartbeats)
  - MasterRPC (interface for all client-to-master RPCs)
  - LogicalNodeManager (manages the lifecycle of logical nodes)
  - ReportManager (gets node status/metrics reports)
  - MasterReportPusher (sends node status/metrics reports to the master)
  - field walMans (mapping from logical node name to WAL manager instance)
  - field dfoMans (mapping from logical node name to DFO manager instance)
  - ChokeManager (manages the experimental throttling mechanism)
- LivenessManager
  - MasterRPC (reference to FlumeNode's instance)
  - LogicalNodeManager (reference to FlumeNode's instance)
  - HeartbeatThread
  - CheckConfigThread
  - WALAckManager
  - WALCompletionNotifier
  - field fcdQ (flume config data queue)
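
The ownership outlined above can be sketched as a small Java skeleton. This is a hypothetical illustration, not the 0.9.4 source: the nested classes are empty stand-ins that borrow the names from the outline, the wrapper `NodeClassesSketch`, the field names `master`, `nodes`, and `liveness`, and the constructor shape are all assumptions. The one point it is meant to show is that LivenessManager holds references to the same MasterRPC and LogicalNodeManager instances that FlumeNode owns.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical skeleton; every class here is an empty stand-in, not Flume's real class.
class NodeClassesSketch {
    static class MasterRPC {}
    static class LogicalNodeManager {}
    static class WALManager {}
    static class DFOManager {}

    static class LivenessManager {
        final MasterRPC master;          // reference to FlumeNode's instance
        final LogicalNodeManager nodes;  // reference to FlumeNode's instance

        LivenessManager(MasterRPC master, LogicalNodeManager nodes) {
            this.master = master;
            this.nodes = nodes;
        }
    }

    static class FlumeNode {
        // Declared first so they are initialized before LivenessManager uses them.
        final MasterRPC master = new MasterRPC();
        final LogicalNodeManager nodes = new LogicalNodeManager();
        final LivenessManager liveness = new LivenessManager(master, nodes);
        // One WAL manager and one DFO manager per logical node name.
        final Map<String, WALManager> walMans = new HashMap<>();
        final Map<String, DFOManager> dfoMans = new HashMap<>();
    }
}
```

Passing the shared instances through the constructor keeps a single MasterRPC connection per physical node, which matches the "reference to FlumeNode's instance" notes in the outline.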
How a heartbeat works
- LivenessManager is instantiated.
- LivenessManager starts, which starts CheckConfigThread and HeartbeatThread.
- HeartbeatThread periodically performs heartbeat checks, asking the master:
  - Does this physical node have all of its logical nodes instantiated?
    - If a logical node is not instantiated, get its configuration and spawn it. (possible issue: same thread)
    - If it is instantiated, skip.
    - If a logical node exists locally but not on the master, decommission it.
  - Do the current logical nodes have up-to-date configs?
    - If so, skip.
    - If not, queue up the heartbeat info so CheckConfigThread can attempt to update atomically.
  - Are there any new e2e ack groups the node should check for?
    - Ask the master whether the acks that the WALManager has registered are safe.
    - Remove the ack from the pending queue and signal the WALManagers that the ack group is safe.
  - Are there any old e2e ack groups that the node should resend?
    - Pending ack groups have a timestamp. If the retransmit time has elapsed, signal the WALManagers that the ack group has expired.
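
The checks above can be sketched as one heartbeat pass in plain Java. This is a minimal, hypothetical sketch and not Flume's implementation: `HeartbeatSketch`, `Plan`, `reconcile`, `checkAcks`, and the use of numeric config versions are illustrative assumptions, and `configQueue` only stands in for fcdQ. It also assumes that an expired ack group leaves the pending set once it has been signalled.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch of one heartbeat pass; names are illustrative, not Flume's API.
class HeartbeatSketch {
    // Outcome of the logical-node check.
    static final class Plan {
        final List<String> spawn = new ArrayList<>();
        final List<String> decommission = new ArrayList<>();
    }

    // Stand-in for fcdQ: stale nodes handed to a CheckConfigThread-like consumer.
    final Queue<String> configQueue = new LinkedBlockingQueue<>();
    // Pending e2e ack groups: ack group id -> time it was registered (ms).
    final Map<String, Long> pendingAcks = new TreeMap<>();

    // masterVersions: logical node -> config version known to the master.
    // localVersions:  logical node -> config version running on this physical node.
    Plan reconcile(Map<String, Long> masterVersions, Map<String, Long> localVersions) {
        Plan plan = new Plan();
        // Sorted copies keep the example's output order deterministic.
        for (Map.Entry<String, Long> e : new TreeMap<>(masterVersions).entrySet()) {
            Long local = localVersions.get(e.getKey());
            if (local == null) {
                plan.spawn.add(e.getKey());        // not instantiated: spawn
            } else if (local < e.getValue()) {
                configQueue.add(e.getKey());       // stale config: queue for update
            }                                      // otherwise up to date: skip
        }
        for (String name : new TreeSet<>(localVersions.keySet())) {
            if (!masterVersions.containsKey(name)) {
                plan.decommission.add(name);       // unknown to master: decommission
            }
        }
        return plan;
    }

    // Returns the signal for each ack group that is resolved in this pass:
    // "safe" if the master confirms it, "expired" if the retransmit time elapsed.
    Map<String, String> checkAcks(Set<String> safeOnMaster, long nowMs, long retransmitMs) {
        Map<String, String> signals = new TreeMap<>();
        Iterator<Map.Entry<String, Long>> it = pendingAcks.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> e = it.next();
            if (safeOnMaster.contains(e.getKey())) {
                signals.put(e.getKey(), "safe");
                it.remove();                       // assumption: resolved groups leave pending
            } else if (nowMs - e.getValue() >= retransmitMs) {
                signals.put(e.getKey(), "expired");
                it.remove();
            }
        }
        return signals;
    }

    public static void main(String[] args) {
        HeartbeatSketch hb = new HeartbeatSketch();
        Map<String, Long> master = new TreeMap<>();
        master.put("agent", 1L);
        master.put("collector", 2L);
        master.put("tail", 1L);
        Map<String, Long> local = new TreeMap<>();
        local.put("agent", 1L);
        local.put("collector", 1L);
        local.put("old", 1L);
        Plan plan = hb.reconcile(master, local);
        System.out.println("spawn=" + plan.spawn + " decommission=" + plan.decommission
            + " queued=" + hb.configQueue);
    }
}
```

Handing stale configs to a queue instead of applying them in `reconcile` mirrors the split between HeartbeatThread and CheckConfigThread. Spawning, by contrast, is returned directly in the plan, which corresponds to the "same thread" concern noted in the list.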
How a logical node gets spawned