The current workflow of queueing activations to per-invoker topics has several limitations:
- early routing: all invokers are treated the same, and once an invoker is selected for an activation, the activation is queued for that invoker, even if a different invoker ends up with capacity to process it sooner
- all actions are forced through a serial processing scheme: even if we want to allow an action to process multiple activations concurrently, we cannot, because the only tracking currently done is "how many activations have been sent to the invoker"
- more generally, the impact of executing an activation on the invoker/container is treated as fixed, but in reality different actions have different impact (they may or may not support concurrent activations, may or may not support longer-running activations, etc.)
Proposal
Provide a unified overflow topic, and route an overflowed activation to an invoker ONLY after an invoker has become available to process it.
This provides two benefits: queueing behavior becomes predictable (since invokers won't be subjected to loads they cannot handle), and assigning activations to a specific invoker becomes fairer (based on actual availability instead of guessing).
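The routing rule above can be sketched roughly as follows. This is a minimal in-memory illustration, not the actual controller code: the `Invoker` and `Scheduler` classes, the slot-count capacity model, and the least-loaded selection are all assumptions made for the example, and a `deque` stands in for the unified overflow topic.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Invoker:
    """Hypothetical invoker with a fixed number of execution slots."""
    name: str
    capacity: int
    in_flight: int = 0
    # Stands in for the per-invoker "waiting for execution" topic.
    queue: list = field(default_factory=list)

    def has_capacity(self) -> bool:
        return self.in_flight < self.capacity


class Scheduler:
    """Hypothetical controller-side scheduler with one shared overflow queue."""

    def __init__(self, invokers):
        self.invokers = invokers
        self.overflow = deque()  # stands in for the unified overflow topic

    def _dispatch(self, invoker, activation):
        invoker.in_flight += 1
        invoker.queue.append(activation)

    def publish(self, activation):
        """Route to an invoker with free capacity, else park in overflow."""
        free = [i for i in self.invokers if i.has_capacity()]
        if not free:
            self.overflow.append(activation)
            return None
        # Assumption for this sketch: prefer the least-loaded invoker.
        target = min(free, key=lambda i: i.in_flight)
        self._dispatch(target, activation)
        return target.name

    def complete(self, invoker):
        """An activation finished; hand the freed slot to the oldest overflowed one."""
        invoker.in_flight -= 1
        if self.overflow:
            self._dispatch(invoker, self.overflow.popleft())
```

In this sketch an overflowed activation is bound to an invoker only inside `complete`, that is, only once that invoker has actually freed a slot, which is the late-routing behavior the proposal describes.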
- phase 1
  - introduce a unified overflow topic: activations flow here when all invokers have reached execution capacity
  - existing topics become more specifically "waiting for execution", instead of "waiting for capacity" + "waiting for execution"
- phase 2
  - advertise "real" capacity and container state to the controller via health messages
  - allow the controller to use these values to evaluate invoker availability more precisely
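As a rough illustration of phase 2, the controller could keep the most recent health message per invoker and consult it when choosing a target. The message fields (`free_slots`, `warm_containers`) and the preference for invokers holding a warm container are hypothetical; the proposal does not define the health message format or the selection policy.

```python
from dataclasses import dataclass, field


@dataclass
class HealthMessage:
    """Hypothetical health message; field names are assumptions for this sketch."""
    invoker: str
    free_slots: int
    # Action name -> number of idle warm containers for that action.
    warm_containers: dict = field(default_factory=dict)


class Controller:
    """Hypothetical controller that tracks the latest advertised invoker state."""

    def __init__(self):
        self.latest = {}

    def on_health(self, msg: HealthMessage) -> None:
        # Newer messages replace older ones for the same invoker.
        self.latest[msg.invoker] = msg

    def choose_invoker(self, action: str):
        """Pick an invoker with free slots, preferring one with a warm container."""
        candidates = [m for m in self.latest.values() if m.free_slots > 0]
        if not candidates:
            return None  # caller would send the activation to the overflow topic
        best = max(
            candidates,
            key=lambda m: (m.warm_containers.get(action, 0) > 0, m.free_slots),
        )
        return best.invoker
```

Returning `None` when no invoker advertises free slots is where this sketch would hand off to the overflow topic introduced in phase 1.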
Diagrams
Before:
After: