General Questions
How can we easily track the large epics of work that need to be accomplished?
How can we track which items are currently being worked on?
Areas/epics of work, with prioritized actions for each:
Short Term Epics
Epics/issues to make components better deployable / manageable on Kube.
"Dockerization" of OpenWhisk Components
Goal: Minimize reliance on Ansible (for building in configurations)
Goal would be to Dockerize as much as possible
Have Kube YAML files as the primary source of configuration
Utilize ENVIRONMENT vars
“bake” other configs into Docker images
Use shared storage via Kubernetes volumes:
Third-party volumes are managed through YAML files (e.g., NFS volume mounts or any other persistent infrastructure mount option; see here)
Determine whether we need better health checking to decide when a component should start. A process should not fail to run.
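As a rough illustration of the environment-variable and health-checking goals above, a container entrypoint could read its dependency's address from the environment and wait for it to accept connections instead of exiting. This is only a sketch: the variable names (`KAFKA_HOST`, `KAFKA_PORT`) and the function names are hypothetical, not existing OpenWhisk settings.

```python
import os
import socket
import time


def dependency_from_env(prefix, environ=None):
    """Read <PREFIX>_HOST and <PREFIX>_PORT (hypothetical names) from the environment."""
    env = os.environ if environ is None else environ
    return env[f"{prefix}_HOST"], int(env[f"{prefix}_PORT"])


def wait_for_port(host, port, timeout_s=60.0, interval_s=1.0):
    """Return True once a TCP connection succeeds, or False after timeout_s."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # A successful connect only shows the port is open, not that
            # the service behind it is fully ready.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval_s)
```

An entrypoint might call `wait_for_port(*dependency_from_env("KAFKA"))` before starting the main process, with the values supplied through the `env` section of the Kube YAML.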
Action Items:
Build Nginx with all OpenWhisk specific requirements (wsk, blackbox) pre-built into the Docker image.
- helps generate certificates
- create a Kube ConfigMap or Secret resource from those certs and a static nginx.conf file, where the nginx.conf file is specific to an environment
- Have yaml file(s) for the Kube Deployment and Service which uses the generated ConfigMap
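One possible way to turn the generated certs and a static nginx.conf into Kube resources is to build the manifests programmatically. The sketch below is an assumption about how this could look, not an existing OpenWhisk tool; the resource names passed in are hypothetical. It emits JSON, which `kubectl` accepts alongside YAML.

```python
import base64
import json


def nginx_config_map(name, nginx_conf):
    """Build a ConfigMap manifest holding an environment-specific nginx.conf."""
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": name},
        "data": {"nginx.conf": nginx_conf},
    }


def tls_secret(name, cert_pem, key_pem):
    """Build a TLS Secret manifest; Secret data values must be base64-encoded."""
    def encode(raw):
        return base64.b64encode(raw).decode("ascii")

    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name},
        "type": "kubernetes.io/tls",
        "data": {"tls.crt": encode(cert_pem), "tls.key": encode(key_pem)},
    }


def to_json(manifest):
    """Serialize a manifest so it can be piped to `kubectl apply -f -`."""
    return json.dumps(manifest, indent=2)
```

The Nginx Deployment would then mount the ConfigMap and Secret as volumes, keeping the image itself free of environment-specific files.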
Provide the ability for the controller to receive updates that new invoker instances are able to be used.
This already happens by default. Currently, Kafka allows new topics to be created automatically and used.
Need to make sure we use stateful sets so controller has unique names.
On the initial startup, Kafka should register the “health” and “command” topics.
Ensure that Kafka is able to receive topic creation requests from Invoker instances
Have the Invoker register its Kafka topics by interacting with Kafka.
Have the Invoker register itself with the Controller:
The Invoker must register itself directly to the controller <or>
The Invoker registers all key-value pair information about itself into Consul
Deploy only one Invoker instance per Kube node, and ensure that no other Kube Pods run alongside it
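StatefulSet Pods get stable names of the form `<statefulset-name>-<ordinal>`, which is what makes the unique-name and topic-registration items above workable. A minimal sketch of deriving a per-instance Kafka topic from the Pod hostname follows; the `invoker<N>` topic naming and the function names are assumptions made for illustration.

```python
import re

# StatefulSet Pod names end in "-<ordinal>", e.g. "invoker-0".
_POD_NAME = re.compile(r"^(?P<name>.+)-(?P<ordinal>\d+)$")


def parse_stateful_pod_name(hostname):
    """Split a StatefulSet Pod hostname into (set name, ordinal)."""
    match = _POD_NAME.match(hostname)
    if match is None:
        raise ValueError(f"not a StatefulSet-style name: {hostname!r}")
    return match.group("name"), int(match.group("ordinal"))


def invoker_topic(hostname):
    """Derive a per-invoker topic name (assumed scheme: 'invoker<N>')."""
    _, ordinal = parse_stateful_pod_name(hostname)
    return f"invoker{ordinal}"
```

Inside a Pod the hostname is available from `socket.gethostname()`, so an instance can compute its own topic at startup without any injected configuration.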
Goal: Come up with a standardized way to set up and configure CouchDB as OW’s default document store.
This component is somewhat unique in the OpenWhisk deployment strategy, as it only has to be set up once and does not have rolling updates
How can I configure CouchDB with seed information for OpenWhisk?
Can we better leverage the public Docker image by wrapping it for our needs (config)?
How can I have the OpenWhisk components talk to CouchDB?
Over time we are working towards a “pluggable” document store approach, but this is beyond short-term scope. Despite this approach we still need to “Dockerize” our init/config as the “default”.
Use a prebuilt CouchDB image, then run an init script that sets unique authentication credentials and updates the entries within the database that use those credentials
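The credential step of such an init script might look like the sketch below. It relies on CouchDB's HTTP configuration API, where an admin is created by a PUT of a JSON string to the `admins` section. The config path differs between CouchDB versions (`_config` in 1.x, `_node/<node-name>/_config` in 2.x), so it is left as a parameter; the default username and the function names are hypothetical.

```python
import json
import secrets
import urllib.request


def generate_credentials(username="whisk_admin"):
    """Return (username, random password); the default username is hypothetical."""
    return username, secrets.token_urlsafe(24)


def admin_request(base_url, config_path, username, password):
    """Build (but do not send) the PUT request that creates a CouchDB admin."""
    url = f"{base_url.rstrip('/')}/{config_path.strip('/')}/admins/{username}"
    # The config API expects the new value as a JSON-encoded string.
    body = json.dumps(password).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )
```

The request would be sent with `urllib.request.urlopen(...)` once CouchDB is reachable, and the generated credentials stored in a Kube Secret so the other components can read them.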
CI and load testing of OpenWhisk on Kubernetes
- Goal: stand up testing resources at Apache and use them for public CI and performance testing of OpenWhisk on Kubernetes
- Needed resources:
- A 5-worker-node Kubernetes cluster. Each worker node can be fairly modest (2-4 virtual cores; 4-8GB of memory)
- 2 nodes for control plane: controller, kafka, nginx
- 1 node for couchdb
- 2 nodes for invokers
Medium Term Epics
E.g., Work items / issues (by component) that improve component’s Clustering or HA enablement
E.g., Work items that allow for “pluggability”
Kube Deployment Variations
Goal: Ensure that various Kubernetes configurations are supported.
Make sure there are Kubernetes environment-specific docs to help set up and use the following infrastructures:
Ensure that all commands are copy-pastable to get an initial OpenWhisk deployment onto Kubernetes
Document a common set of problems that can be seen from any Kubernetes environment and their solutions
Tests and Integration
CIs will need to be created for each Kubernetes environment so that we ensure multiple configuration and deployment strategies work
These Integration tests should catch regressions where OpenWhisk could require an update to Kubernetes infrastructure
E.g., the Invoker mounts the host's Docker socket, so there are minimum requirements or deployment configurations that must be met.
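An integration test could check a requirement like the Docker socket mount with a small preflight step. The sketch below assumes the socket is mounted at the conventional `/var/run/docker.sock` path inside the Invoker container; the function name is hypothetical.

```python
import os
import stat

DEFAULT_DOCKER_SOCKET = "/var/run/docker.sock"  # conventional Docker location


def is_unix_socket(path=DEFAULT_DOCKER_SOCKET):
    """Return True if path exists and is a Unix domain socket."""
    try:
        mode = os.stat(path).st_mode
    except OSError:
        # Missing path or no permission: treat as "requirement not met".
        return False
    return stat.S_ISSOCK(mode)
```

A check like this only confirms that the mount is present; it does not verify that the Docker daemon behind it is responsive.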
Kube Configuration Options
Goal: Make components more adaptable to running on a cloud platform
Action Items:
Enable the Invoker to use all available system resources
Long-Term Epics
More generalized goals / more investigation needed (per-component and system-wide basis)
Investigations for making components HA (e.g., Kafka)
HA Epic:
Goal: All “core” OpenWhisk components should have HA considerations that work on a Kube deployment.
Fixed vs. Pluggable components:
Over time, some components that are provided as “defaults” may very well be replaced with services that are already “HA-enabled”.
Providers may choose to use replacements based upon their available services or preferences. Pluggability is key.
Some components that fall under this category include:
CouchDB (document store)
NGINX (edge server)
Understanding this affects what we prioritize (i.e., prioritize components that are viewed as a “fixed” part of the architecture vs. pluggable).
Assumption: unless otherwise noted, HA (cluster) enabling components will result in scalable components.
Can my (Docker) component be scaled (by Kube) based upon resource usage, i.e., memory/disk usage (not process/CPU)?
Questions (to ask of each component):
Clustering components
Can the component be started without any (Ansible-driven) configuration injection or dependencies on other components (boot order)?
I.e., in many cases this asks how we can remove any Ansible-specific setup and include those configurations in the Docker image.
Or do components have to register themselves with the components they wish to interact with?
Intra-component Communication: Can there be multiple instances of that one component, and can they communicate with each other?
Inter-component Communication: How is DNS (routing, spraying) registered/handled?
Message Queueing: How are queue topics registered/handled (if they are used)?
Soft Kill (tested)
Rolling update (primary use case): Can the process handle rolling updates?
Example: Kubernetes (etcd) version update:
e.g., Kube 1.5 to 1.6 update
Matching CLI kubectl (should not matter)
Example: Hypervisor / VM host update
What happens when soft killing a process? Can it recover?
I.e., is it able to recover any requests/messages (data “in motion”) upon restart?
Hard Kill (tested)
What happens when hard killing a process? Can it recover?
I.e., is it able to recover any requests/messages (data “in motion”) upon restart?
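For the soft-kill case, Kubernetes sends SIGTERM to the container and follows up with SIGKILL only after the Pod's termination grace period. A component can use that window to finish work already in flight. The sketch below shows the general pattern under simplifying assumptions (a single in-memory queue, hypothetical class and function names); it is not how any OpenWhisk component is currently implemented.

```python
import queue
import signal
import threading


class DrainingWorker:
    """Processes queued messages and drains the queue before stopping."""

    def __init__(self):
        self.stopping = threading.Event()
        self.inbox = queue.Queue()
        self.processed = []

    def request_stop(self, signum=None, frame=None):
        # Signature matches what signal.signal() passes to a handler.
        self.stopping.set()

    def run(self):
        # Keep going until a stop was requested AND nothing is left queued.
        while not (self.stopping.is_set() and self.inbox.empty()):
            try:
                message = self.inbox.get(timeout=0.1)
            except queue.Empty:
                continue
            self.processed.append(message)


def install_sigterm_handler(worker):
    """Route SIGTERM (the soft kill) to the worker; call from the main thread."""
    signal.signal(signal.SIGTERM, worker.request_stop)
```

On a hard kill (SIGKILL) no handler runs, so recovery there depends on messages remaining available outside the process, for example in Kafka, so they can be picked up again after restart.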
Benjamin M. Browning
Another Medium or Long-Term Epic I'd like to explore is using the Kubernetes API from the Invoker instead of Docker directly. This would give Kubernetes-level visibility into the function containers, let Kubernetes handle scheduling of those containers to nodes instead of only running on the Invoker nodes, and reduce the need to scale-out invokers beyond what's needed for HA. There may be some performance tradeoffs that we'd have to consider.
Daniel Lavine
Would you want the Invoker to communicate with the Kubernetes API, or the OpenWhisk Controller to communicate with the Kubernetes API? I would think we want the Controller to talk with the Kube API directly. That way we do not waste time trying to schedule actions once in the Controller, and then again in Kubernetes. I'm not super familiar with all of the internals for each component, but what would be the reason for having the Invoker talk with the Kube API rather than the Controller?