General Questions

  • How can we easily track the large epics of work that need to be accomplished?

    • How can we track which items are currently being worked on?


Areas/epics of work, with prioritized actions for each:

Short Term Epics

  • Epics/issues to make components more easily deployable/manageable on Kube.

"Dockerization" of OpenWhisk Components

  • Goal: Minimize reliance on Ansible (for building in configurations)

    • Goal would be to Dockerize as much as possible

  • Considerations:

    • Have Kube YAML files as the primary source of configuration

      • Utilize ENVIRONMENT vars

      • “bake” other configs into Docker images

      • Use shared storage via Kubernetes volumes:

        • Third-party volumes are managed through YAML files (e.g., NFS volume mounts or other persistent infrastructure mount options; see the Kubernetes documentation on volumes, and the sketch after this list)

      • Determine whether we need better health checking to decide when a component should start; do not simply fail to run a process.
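
    A rough sketch of what this could look like for a single component. Everything here is a placeholder (names, image, values) rather than an agreed configuration; the point is that the Kube YAML carries the per-environment settings via environment variables and volumes, while everything else is baked into the image.

      apiVersion: extensions/v1beta1        # Deployment API group at the Kube 1.5/1.6 level
      kind: Deployment
      metadata:
        name: example-component             # hypothetical component
      spec:
        replicas: 1
        template:
          metadata:
            labels:
              name: example-component
          spec:
            containers:
            - name: example-component
              image: openwhisk/example-component   # assumed image with config "baked" in
              env:
              - name: DB_HOST                      # per-environment config via environment vars
                value: couchdb.openwhisk
              - name: DB_PORT
                value: "5984"
              volumeMounts:
              - name: shared-data
                mountPath: /data
            volumes:
            - name: shared-data
              nfs:                                 # third-party/NFS storage declared in the YAML
                server: nfs.example.com            # placeholder server and export path
                path: /exports/openwhisk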

  • Action Items:

    • Nginx:

      • Build Nginx with all OpenWhisk-specific requirements (wsk, blackbox) pre-built into the Docker image.

      • Help generate certificates.

      • Create a Kube ConfigMap or Secret resource from those certs and a static nginx.conf file, where the nginx.conf is specific to an environment.

      • Have YAML file(s) for the Kube Deployment and Service which use the generated ConfigMap.
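
    A possible shape for these items, sketched with hypothetical names: the generated certs and the environment-specific nginx.conf become a ConfigMap (certs/keys could equally go into a Secret), and the Deployment mounts the conf file into the stock image.

      # Hypothetical: build the ConfigMap from the generated files, e.g.
      #   kubectl create configmap nginx-conf --from-file=nginx.conf --from-file=openwhisk-cert.pem
      apiVersion: extensions/v1beta1
      kind: Deployment
      metadata:
        name: nginx
      spec:
        replicas: 1
        template:
          metadata:
            labels:
              name: nginx
          spec:
            containers:
            - name: nginx
              image: nginx                  # or an OpenWhisk-specific image with wsk/blackbox pre-built in
              ports:
              - containerPort: 443
              volumeMounts:
              - name: nginx-conf
                mountPath: /etc/nginx/nginx.conf
                subPath: nginx.conf         # mount only the conf file from the ConfigMap
            volumes:
            - name: nginx-conf
              configMap:
                name: nginx-conf
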
    • Controller:

      • Provide the ability for the Controller to be notified when new Invoker instances become available for use.

        • This already happens by default; Kafka currently allows new topics to be created automatically and used.

      • Need to make sure we use StatefulSets so Controller instances have unique names.
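
    A minimal, hypothetical sketch of the StatefulSet form: each replica gets a stable ordinal name (controller-0, controller-1, ...) that can be fed to the process via the downward API.

      apiVersion: apps/v1beta1               # StatefulSet API group at the Kube 1.5/1.6 level
      kind: StatefulSet
      metadata:
        name: controller
      spec:
        serviceName: controller              # headless Service providing stable per-pod DNS names
        replicas: 2
        template:
          metadata:
            labels:
              name: controller
          spec:
            containers:
            - name: controller
              image: openwhisk/controller    # assumed image name
              ports:
              - containerPort: 8080
              env:
              - name: COMPONENT_NAME         # hypothetical variable: unique name derived from the pod name
                valueFrom:
                  fieldRef:
                    fieldPath: metadata.name # resolves to controller-0, controller-1, ...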

    • Kafka:

      • On the initial startup, Kafka should register the “health” and “command” topics.

      • Ensure that Kafka is able to receive topic creation requests from Invoker instances
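
    One option (a sketch only; the image, host names, and whether this belongs in a Job, an init container, or the Invoker itself are all open) is a one-shot Kubernetes Job that creates the topics once the broker is reachable.

      apiVersion: batch/v1
      kind: Job
      metadata:
        name: kafka-init-topics              # hypothetical one-shot topic setup
      spec:
        template:
          metadata:
            labels:
              name: kafka-init-topics
          spec:
            restartPolicy: OnFailure
            containers:
            - name: init-topics
              image: wurstmeister/kafka      # assumed image that provides kafka-topics.sh on the PATH
              command: ["sh", "-c"]
              args:
              - >
                kafka-topics.sh --create --topic health
                --partitions 1 --replication-factor 1
                --zookeeper zookeeper.openwhisk:2181 &&
                kafka-topics.sh --create --topic command
                --partitions 1 --replication-factor 1
                --zookeeper zookeeper.openwhisk:2181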

    • Zookeeper:

      • None?

    • Invoker:

      • Have the Invoker register its Kafka topics by interacting with Kafka.

      • Have the Invoker register itself with the Controller:

        • The Invoker must register itself directly with the Controller, <or>

        • The Invoker registers all key-value pair information about itself into Consul

      • Have only one Invoker instance deployed per Kube node, and ensure that no other Kube Pods run alongside it.
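
    A hedged sketch of how the one-Invoker-per-node requirement could be expressed: label and taint the Invoker nodes, run the Invoker as a DaemonSet restricted to those nodes, and rely on the taint to keep other Pods off them. The label/taint names and image are placeholders.

      # Hypothetical node preparation:
      #   kubectl label nodes <node-name> openwhisk-role=invoker
      #   kubectl taint nodes <node-name> dedicated=invoker:NoSchedule
      apiVersion: extensions/v1beta1
      kind: DaemonSet
      metadata:
        name: invoker
      spec:
        template:
          metadata:
            labels:
              name: invoker
          spec:
            nodeSelector:
              openwhisk-role: invoker        # run only on nodes labeled for Invokers
            tolerations:                     # first-class field as of Kube 1.6
            - key: dedicated
              value: invoker
              effect: NoSchedule
            containers:
            - name: invoker
              image: openwhisk/invoker       # assumed image name
              volumeMounts:
              - name: docker-sock
                mountPath: /var/run/docker.sock   # Invoker drives the node's Docker daemon directly
            volumes:
            - name: docker-sock
              hostPath:
                path: /var/run/docker.sock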

    • CouchDB:

      • Goal: Come up with a standardized way to setup and configure CouchDB as OW’s default document store.

      • Considerations:

        • This component is somewhat unique in the OpenWhisk deployment strategy, as its setup only has to be done once and it does not receive rolling updates

      • Questions:

        • How can I configure CouchDB with seed information for OpenWhisk?

        • Can we better leverage the public Docker image by wrapping it for our needs (config)?

        • How can I have the OpenWhisk components talk to CouchDB?

      • Assumptions:

        • Over time we are working towards a “pluggable” document store approach, but this is beyond the short-term scope. Even with that approach, we still need to “Dockerize” our init/config as the “default”.

      • Implementation:

        • Use the prebuilt CouchDB image, then run an init script that edits the authentication to use unique credentials and also edits the entries within the database with those unique credentials.
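
    A hedged sketch of the wrapping-the-public-image approach. It assumes the stock couchdb image's convention of taking admin credentials from COUCHDB_USER/COUCHDB_PASSWORD via a (hypothetical) Secret; the seed/init script from the item above would then run against this instance.

      apiVersion: extensions/v1beta1
      kind: Deployment
      metadata:
        name: couchdb
      spec:
        replicas: 1                          # single instance; no rolling updates expected
        template:
          metadata:
            labels:
              name: couchdb
          spec:
            containers:
            - name: couchdb
              image: couchdb                 # public Docker image, wrapped by configuration only
              ports:
              - containerPort: 5984
              env:
              - name: COUCHDB_USER           # unique admin credentials injected from a Secret
                valueFrom:
                  secretKeyRef:
                    name: couchdb-auth       # hypothetical Secret holding the unique credentials
                    key: username
              - name: COUCHDB_PASSWORD
                valueFrom:
                  secretKeyRef:
                    name: couchdb-auth
                    key: password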

CI and load testing of OpenWhisk on Kubernetes

  • Goal: stand up testing resources at Apache and use them for public CI and performance testing of OpenWhisk on Kubernetes
  • Needed resources:
    • A 5-worker-node Kubernetes cluster.  Each worker node can be fairly modest (2-4 virtual cores; 4-8 GB of memory)
      • 2 nodes for control plane: controller, kafka, nginx
      • 1 node for couchdb
      • 2 nodes for invokers

Medium Term Epics

  • E.g., work items / issues (by component) that improve a component’s clustering or HA enablement

  • E.g., Work items that allow for “pluggability”

 

Kube Deployment Variations

  • Goal:  Ensure that various Kubernetes configurations are supported.

  • Documentation

    • Make sure there are Kubernetes environment specific docs to help setup and use the following infrastructures:

      • Minikube

      • Kubeadm

      • Local-Up-Cluster

    • Ensure that all commands are copy-pastable to get an initial OpenWhisk deployment onto Kubernetes

  • Troubleshooting

    • Document a common set of problems that can be seen from any Kubernetes environment and their solutions

  • Tests and Integration

    • CI jobs will need to be created for each Kubernetes environment so that we ensure multiple configuration and deployment strategies work

    • These Integration tests should catch regressions where OpenWhisk could require an update to Kubernetes infrastructure

      • E.g., the Invoker mounts the host’s Docker socket, so there are minimum requirements or deployment configurations that must be met.

 

Kube Configuration Options

  • Goal: Make components more adaptable to running on a cloud platform

  • Action Items:

    • Invoker

      • Enable the Invoker to use all available system resources
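
    If "all available system resources" means the Invoker should not be artificially capped, the relevant knob is the container's resource requests/limits. A hypothetical fragment of the Invoker container spec (values are placeholders):

      containers:
      - name: invoker
        image: openwhisk/invoker      # assumed image name
        resources:
          requests:
            cpu: "1"                  # reserve a baseline so the scheduler places the pod sensibly
            memory: 2Gi
          # No limits set: the Invoker may use whatever the node has free. Note that the
          # action containers it spawns through the host's Docker socket are outside the
          # Pod's accounting in this deployment model anyway.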

 

Long-Term Epics  

  • More generalized goals / more investigation needed (per-component and system-wide basis)

  • Investigations for making components HA (e.g., Kafka)

HA Epic:

  • Goal: All “core” OpenWhisk components should have HA considerations that work on a Kube deployment.  

  • Considerations:

    • Fixed vs. Pluggable components:

      • Over-time, some components that are provided as “defaults” may very well be replaced with services that are already “HA-enabled”.

      • Providers may choose to use replacements based upon their available services or preferences.  Pluggability is key.

      • Some components that fall under this category include:

        • CouchDB (document store)

        • NGINX (edge server)

      • Understanding this affects what we prioritize (i.e., prioritize components that are viewed as a “fixed” part of the architecture vs. pluggable).

    • Scaling

      • Assumption: unless otherwise noted, HA (cluster) enabling components will result in scalable components.

      • Can my (Docker) component be scaled by Kube based upon resource usage, i.e., memory/disk usage (not process/CPU)?
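
    The mechanism to evaluate here is the HorizontalPodAutoscaler. A hedged sketch of a memory-based form: this assumes the autoscaling/v2alpha1 API (Kube 1.6) plus a working metrics pipeline (e.g., Heapster), and all names/values are placeholders; out of the box the older autoscaling/v1 API only scales on CPU.

      apiVersion: autoscaling/v2alpha1
      kind: HorizontalPodAutoscaler
      metadata:
        name: controller                   # hypothetical target component
      spec:
        scaleTargetRef:
          apiVersion: extensions/v1beta1
          kind: Deployment
          name: controller
        minReplicas: 1
        maxReplicas: 5
        metrics:
        - type: Resource
          resource:
            name: memory
            targetAverageValue: 1Gi        # scale out when average memory use per pod exceeds this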

  • Questions (to ask of each component):

  • Clustering components

    • Can the component be started without any (Ansible-driven) configuration injection or dependencies on other components (boot order)?

      • I.e., in many cases this is asking how we can remove any Ansible-specific setup and include those configurations in the Docker image.

      • Or do components have to register themselves with the components they wish to interact with?

    • Intra-component Communication: Can there be multiple instances of a single component, and can those instances communicate with each other?

    • Inter-component Communication: How is DNS (routing, spraying) registered/handled? (See the Service sketch after this list.)

    • Message Queueing: How are queue topics registered/handled (if they are used)?
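
    For inter-component DNS, the usual answer is to put each component behind a Kubernetes Service, which gives it a stable cluster DNS name (e.g., controller.<namespace>.svc.cluster.local) and spreads traffic across all matching Pods. A minimal, hypothetical example:

      apiVersion: v1
      kind: Service
      metadata:
        name: controller
        namespace: openwhisk        # hypothetical namespace
      spec:
        selector:
          name: controller          # routes ("sprays") across all Pods with this label
        ports:
        - port: 8080
          targetPort: 8080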

  • Soft Kill (tested)

    • Rolling update (primary use case): Can the process handle rolling updates? (See the fragment after this list.)

      • Example: Kubernetes (etcd) version update:

        • e.g., Kube 1.5 to 1.6 update

        • Matching kubectl CLI version (should not matter)

      • Example: Hypervisor / VM host update

    • What happens when soft killing a process? Can it recover?

        • I.e., is it able to recover any requests/messages (data “in motion”) upon restart?
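
    The rolling-update and soft-kill behaviors map to a few Deployment fields worth testing explicitly: the update strategy, the termination grace period, and (optionally) a preStop hook for draining in-flight work. A hypothetical fragment (names and values are placeholders):

      spec:
        strategy:
          type: RollingUpdate
          rollingUpdate:
            maxUnavailable: 1              # take down at most one replica at a time
            maxSurge: 1
        template:
          spec:
            terminationGracePeriodSeconds: 60    # time allowed to drain data "in motion"
            containers:
            - name: component
              image: openwhisk/component         # hypothetical image
              lifecycle:
                preStop:
                  exec:
                    command: ["sh", "-c", "sleep 30"]   # placeholder drain step; runs before SIGTERM is sent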

  • Hard Kill (tested)

    • What happens when hard killing a process? Can it recover?

        • I.e., is it able to recover any requests/messages (data “in motion”) upon restart?

2 Comments

  1. Another Medium or Long-Term Epic I'd like to explore is using the Kubernetes API from the Invoker instead of Docker directly. This would give Kubernetes-level visibility into the function containers, let Kubernetes handle scheduling of those containers to nodes instead of only running on the Invoker nodes, and reduce the need to scale-out invokers beyond what's needed for HA. There may be some performance tradeoffs that we'd have to consider.

    1. Would you want the Invoker to communicate with the Kubernetes API, or the OpenWhisk Controller to communicate with the Kubernetes API? I would think we want the Controller to talk with the Kube API directly. That way we do not waste time trying to schedule actions once in the Controller, and then again in Kubernetes. I'm not super familiar with all of the internals of each component, but what would be the reason for having the Invoker talk with the Kube API over the Controller?