To do this, the proposal is to:

allow multiple invokers to run, but with only one active at a time

when the active invoker fails, an inactive one will become active

when an invoker becomes active, attempt to resurrect the free and prewarm pool members, so that existing containers remain usable. (This may apply only to ContainerFactory impls that maintain a cluster-wide view of containers.)
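The active/inactive behaviour described above can be sketched as a small simulation. This is only an illustration, not OpenWhisk or Akka code: the names `Invoker`, `InvokerCluster`, and `invoker-a` through `invoker-c` are hypothetical, and the rule "the oldest live member is active" is an assumption chosen to mirror how an Akka Cluster Singleton runs on the oldest cluster member.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Invoker:
    name: str
    alive: bool = True


@dataclass
class InvokerCluster:
    # Members are kept in join order; the oldest live member is the active one.
    members: List[Invoker] = field(default_factory=list)

    def join(self, invoker: Invoker) -> None:
        self.members.append(invoker)

    def active(self) -> Optional[Invoker]:
        # Exactly one invoker is active: the first live member in join order.
        for invoker in self.members:
            if invoker.alive:
                return invoker
        return None

    def fail(self, name: str) -> None:
        # Mark a member as failed; the next live member becomes active.
        for invoker in self.members:
            if invoker.name == name:
                invoker.alive = False


cluster = InvokerCluster()
for name in ("invoker-a", "invoker-b", "invoker-c"):
    cluster.join(Invoker(name))

first_active = cluster.active().name
cluster.fail("invoker-a")
second_active = cluster.active().name
print(first_active, "->", second_active)
```

In this sketch all three invokers are running, but only the one returned by `active()` would consume activations; failing it promotes the next member without any extra coordination step.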

Required Changes

To support running the Mesos framework in an HA mode, where recovery from a failure does not lose existing containers, the following OpenWhisk changes are proposed:

optionally allow invokers to join a cluster

optionally allow invokers to initialize with the same instance id (all invokers are id 0)

optionally allow activation feeds to operate as an Akka ClusterSingleton

optionally allow invokers to use Maps that replicate pool data to other invokers in the cluster, for use in failover scenarios

optionally allow ContainerFactory impls to create Container instances using an "attach()" function, for connecting to pre-existing containers.

Details

Background on required changes:

optionally allow invokers to join a cluster

needed to establish a cluster-wide singleton invoker and replicate data using Akka DistributedData

optionally allow invokers to initialize with the same instance id (all invokers are id 0)

needed to coax all invokers to consume from the same activation topic (e.g. invoker0)

optionally allow activation feeds to operate as an Akka ClusterSingleton

needed so that feed consumers do not initiate consumption on multiple invokers (the other invokers still start up and join the cluster, so that they can receive replicated data)

optionally allow invokers to use Maps that replicate pool data to other invokers in the cluster, for use in failover scenarios

needed to replicate prewarm and free pool data to the inactive invokers

optionally allow ContainerFactory impls to create Container instances using an "attach()" function, for connecting to pre-existing containers.

needed to create "ContainerProxy" actors from existing container metadata (ContainerFactory impls may opt out of this if desired).
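How pool replication and "attach()" could fit together on failover can be sketched as follows. Everything here is a hypothetical stand-in: `ReplicatedPoolMap` copies writes synchronously to every replica, whereas Akka DistributedData replicates asynchronously and is eventually consistent, and `SketchContainerFactory`, `ContainerMeta`, and `resurrect_pools` are illustrative names rather than OpenWhisk APIs.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class ContainerMeta:
    container_id: str
    kind: str  # runtime kind, e.g. "nodejs"
    pool: str  # "free" or "prewarm"


@dataclass
class Container:
    meta: ContainerMeta
    attached: bool  # True if connected to a pre-existing container


class ReplicatedPoolMap:
    """Stand-in for a replicated map: each write reaches every replica."""

    def __init__(self, replica_names: Iterable[str]) -> None:
        self.replicas: Dict[str, Dict[str, ContainerMeta]] = {
            name: {} for name in replica_names
        }

    def put(self, meta: ContainerMeta) -> None:
        for view in self.replicas.values():
            view[meta.container_id] = meta

    def remove(self, container_id: str) -> None:
        for view in self.replicas.values():
            view.pop(container_id, None)

    def view(self, replica_name: str) -> Dict[str, ContainerMeta]:
        # Return a copy of what one invoker currently sees.
        return dict(self.replicas[replica_name])


class SketchContainerFactory:
    def attach(self, meta: ContainerMeta) -> Container:
        # Connect to an existing container instead of launching a new one.
        return Container(meta=meta, attached=True)


def resurrect_pools(
    factory: SketchContainerFactory, pool_view: Dict[str, ContainerMeta]
) -> Dict[str, List[Container]]:
    # Rebuild the free and prewarm pools from replicated metadata.
    pools: Dict[str, List[Container]] = {"free": [], "prewarm": []}
    for meta in pool_view.values():
        pools[meta.pool].append(factory.attach(meta))
    return pools


pool_map = ReplicatedPoolMap(["invoker-a", "invoker-b"])
pool_map.put(ContainerMeta("c1", "nodejs", "prewarm"))
pool_map.put(ContainerMeta("c2", "python", "free"))
pool_map.put(ContainerMeta("c3", "python", "free"))
pool_map.remove("c3")  # c3 was destroyed before the failure

# invoker-a fails; invoker-b becomes active and rebuilds from its own replica.
recovered = resurrect_pools(SketchContainerFactory(), pool_map.view("invoker-b"))
```

The point of the sketch is that the newly active invoker needs only its local replica of the pool data plus an attach-style call; it never has to ask the failed invoker for state.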

The proposed changes exist specifically to cope with failure scenarios, so their effect is easiest to illustrate with a sequence diagram:

[Sequence diagram: singleton-invoker (singleton-invoker.gliffy)]
