...
Main/core OpenWhisk (Carlos/Markus/Tyson, etc.)
https://github.com/apache/incubator-openwhisk/pulls?utf8=%E2%9C%93&q=is%3Apr+is%3Amerged
PR Review:
Markus (to Tyson): containers will not be thrown away when application errors (called “developer errors”) are detected; only deleted on system errors.
Markus: discussed on the dev list, Tyson implemented: close and throw away the entire container pool on pause, recreate on unpause. Helps with weird connection errors.
Runtimes updates:
No updates
Recent topics
Concurrency PR discussion (Tyson) (PR 2795)
Update from what has happened since June
Now includes per-action limits, for concurrency (default 1)
default max of 1 as well; operators would have to raise this for users to access higher concurrency
again, this only affects the core repo; the CLI would also need to support these limits as part of the action limits schema
NOTE: Will require a CLI change once merged (Issue/PR to be created)
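A minimal sketch of how the per-action concurrency limit described above might interact with the operator-set maximum. Function and field names here are illustrative assumptions, not confirmed against PR 2795:

```python
# Hedged sketch: "concurrency" as an extra field in the existing action
# limits schema. The default is 1 (one activation per container at a time),
# and the operator-configured maximum also defaults to 1, so users cannot
# raise concurrency until the operator enables a higher max.
DEFAULT_CONCURRENCY = 1

def effective_concurrency(limits: dict, operator_max: int = 1) -> int:
    """Clamp a user-requested per-action concurrency limit to the operator max."""
    requested = limits.get("concurrency", DEFAULT_CONCURRENCY)
    return min(requested, operator_max)

# With the default operator max of 1, a user request for 50 is clamped:
assert effective_concurrency({"concurrency": 50}) == 1
# Once the operator raises the max, the user's limit takes effect:
assert effective_concurrency({"concurrency": 50}, operator_max=200) == 50
# Actions without a concurrency limit fall back to the default of 1:
assert effective_concurrency({}) == 1
```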
NodeJS runtime added support for this… DONE!
Allows concurrent activations
Concurrent log collection requires a custom approach; example shown...
Does not “work out-of-the-box”; operators would have to have a runtime whose log collection works with concurrent activations
what this means: a runtime that allows logs to be “interleaved”
what we did: added the activation ID and timestamp, as well as the log message from the action…
Not a great solution that makes everyone happy….
currently recommend operators customize the runtime to better support logging
Perhaps “core” repo. needs to better support this need/feature (logging for concurrency)
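The “interleaved” log idea described above (each line tagged with activation ID and timestamp so a collector can attribute concurrently produced lines) can be sketched as follows. Field names are illustrative, not the NodeJS runtime's actual format:

```python
import json
import time

def tag_log_line(activation_id: str, message: str) -> str:
    """Tag a log line with its activation ID and a timestamp so lines from
    concurrent activations sharing one container can be told apart."""
    return json.dumps({
        "activationId": activation_id,
        "time": time.time(),
        "message": message,
    })

def demux(lines):
    """Group tagged lines back per activation, as a log collector would."""
    by_activation = {}
    for line in lines:
        record = json.loads(line)
        by_activation.setdefault(record["activationId"], []).append(record["message"])
    return by_activation

# Two activations logging concurrently produce interleaved lines,
# which the collector can still attribute correctly:
lines = [
    tag_log_line("a1", "start"),
    tag_log_line("a2", "start"),
    tag_log_line("a1", "done"),
]
assert demux(lines) == {"a1": ["start", "done"], "a2": ["start"]}
```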
Controller/Invoker progress:
Built support into current LoadBalancer impl.
Concurrency support for an action avoids additional “slot” allocations
Synchronized vs Semaphore lock for container starts?
Scheduling becomes much more complicated
accounting of activations against slots becomes more complicated. Please give feedback and opinions on this.
Invoker: "concurrent-peek-factor" allows increased message peeking.
additional messages pulled off of Kafka; discussed in previous Technical Interchange calls...
2 controversial pieces
scheduling piece in controller
advertising runtime support of concurrency (lots of comments in PR and discussion)
Runtime manifest should somehow indicate concurrency > 1
affects how we report failures
Runtime support
Future: “black box” containers… should we allow them to support concurrency
Enhancing the init protocol to indicate back to OpenWhisk what types of actions the container supports
e.g., limits, special resources, features, etc.
introduces another set of problems… these values would only be available when the action is invoked, so they are not available at publication time and do not match the action lifecycle/needs
option 1: runtime manifest indicates concurrency support per kind
option 2: container startup enhancement...
option 3: ? others
Markus: on runtime support, “do nothing for now” is fine for now
further, it might be fine not to do anything at all… leave non-supporting runtimes at single concurrency
it is more likely that all runtimes (at a single provider) have concurrency support
single containers can report busy/failure
still need to resolve the logging issue
Need to track # of requests sent plus memory used...
can these parts be made into separate PRs ?
tracking containers in LoadBalancer seems to be a largish change
Tyson: do not know how to break up the PR...
only do tracking when concurrency enabled
Markus: how consistent is the tracking?
Tyson: not strictly consistent
one change in the PR: refactored the sharding in the load balancer to allow testing the workflow from publish to completing an activation
added a test there to attempt saturation with different batches of requests
no containers, with concurrent slots, without, etc.
batch sizes up to those that overload the entire system
relies on sync. of memory slots as well as allocation of conc. slots
Markus: so if you invoke an action on an invoker, you look up in a map whether you have a container for that action; if not, create one (semaphore X)?
Tyson: yes
Markus: before you ask if you have free slots (memory)
a resizable semaphore? yes...
Tyson: once conc. slots used up, a new container is created
owe everyone a diagram...
There is a “double check” there
Markus: how is deletion handled?
are both algo. synced up?
Tyson: only works on the completion
Once conc. slots are released back to a value equal to that container’s max, the container is released as a whole...
assumes that completion process is reliably called for each action (as it is for actions today)
Markus: from the LB point of view, if conc. slots drop to zero, then allocate a new container?
This is why i asked for breaking of PR...
what you describe... is that you do not really track the state of the invoker; if concurrency falls to a well-known value, you throw the state away… (not real tracking, but simplified)
Tyson: simplified making assumptions of container as we do today.
Dave: worried about pushing Container state back into LB, that is counter to what we are trying to do to achieve a more scalable system...
Tyson: exact precise state of invokers is not possible to be tracked at controller level
more interesting (future) is to better track diff. types of resources...
Better tracking of diff invokers/containers and the resources they have. That better track health state of warm containers (and allow better use/reuse)
Dave: with Kube hat on… even tracking memory in Container, is counter to this model...
Tyson: ignoring that for now (an invoker “having memory” is a bad metaphor)
should be “is the invoker capable of handling this action with its resources?”
Markus: thinking again, this could fit our future models better...
agree we cannot break up PR...
Dave: back to runtime support… likes OPTION 2… return a dictionary of runtime caps… does not “block” moving forward
Chetan: useful to have better support for broadcasting capabilities
wants to leverage the capability of a container “pulling” action code directly from storage rather than having it pushed/stored by the system
Tyson: draws live diagram
1) Check Concurrent slots
if fails...
create new container (which allocates memory slots)
check concurrent slots (again)
2) attempt create container (against allocated memory slots)
allocate concurrent slots...
Dave: only create 1 now?
Tyson: yes. that brings us to other options?
burst of 100 requests… could end up with 100 containers…
leaning towards “pay up front” for sync. costs
first batch of 100 will have a latency penalty, but next 100 benefits
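The allocation/release flow from Tyson's diagram and the surrounding discussion can be expressed as a minimal sketch. All names here are illustrative; the real implementation lives in the Scala LoadBalancer and is considerably more involved:

```python
import threading

class Container:
    """One action container with a fixed max number of concurrent slots."""
    def __init__(self, max_concurrency: int):
        self.max_concurrency = max_concurrency
        self.in_flight = 0  # concurrent slots currently in use

class ActionPool:
    """Sketch of the flow: 1) check concurrent slots on existing
    containers; if that fails, 2) allocate memory slots, double check,
    and only then create a new container."""
    def __init__(self, max_concurrency: int, memory_slots: int):
        self.lock = threading.Lock()
        self.max_concurrency = max_concurrency
        self.memory = threading.BoundedSemaphore(memory_slots)
        self.containers = []

    def acquire(self) -> Container:
        # 1) check concurrent slots on existing containers
        with self.lock:
            for c in self.containers:
                if c.in_flight < c.max_concurrency:
                    c.in_flight += 1
                    return c
        # 2) no free slot: allocate memory (blocks if none free), then
        # "double check" under the lock before creating a new container,
        # since a slot may have freed up in the meantime
        self.memory.acquire()
        with self.lock:
            for c in self.containers:
                if c.in_flight < c.max_concurrency:
                    c.in_flight += 1
                    self.memory.release()  # give back the unneeded memory slot
                    return c
            c = Container(self.max_concurrency)
            c.in_flight = 1
            self.containers.append(c)
            return c

    def release(self, container: Container):
        # Slots are only returned on completion; once all of a container's
        # slots are free again, it is released as a whole (a real pool
        # would keep warm containers around; removed here for brevity).
        with self.lock:
            container.in_flight -= 1
            if container.in_flight == 0:
                self.containers.remove(container)
                self.memory.release()

# Two activations share one container; a third spills into a new one:
pool = ActionPool(max_concurrency=2, memory_slots=2)
c1, c2, c3 = pool.acquire(), pool.acquire(), pool.acquire()
assert c1 is c2 and c3 is not c1 and len(pool.containers) == 2
```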
Markus: asking myself if a concurrent data structure can help
a concurrent map has a means to atomically check if something is there and create it if not… but now we also have a semaphore that needs to be checked...
Release process: (Vincent) / Roadmap (Ben)
Matt: Vincent is moving forward with the Runtimes “component” release.
wskdeploy completed and at Incubator stage/vote
Next-gen architecture (Markus):
No update
Mesos/Compose/Splunk update: (Dragos/Tyson)
No update
OpenShift update: (Brendan/Ben)
No update
Kubernetes: (Dave Grove/Daisy)
Dave: upgraded to Kube 1.9 as the minimum supported level for the Helm charts; testing on 1.10 has started as well.
Provider charts are a nice feature
API Gateway (Matt Hamann/Dragos)
No update
Catalog/Packages/Samples (anyone)
No update
Tooling/Utilities (Carlos (CLI), Priti/Matt (wskdeploy))
https://github.com/apache/incubator-openwhisk-devtools/pulls
wskdeploy fixed an issue with JSON parameters and nested maps that had interpolated strings (values pulled from env. vars.); interpolation now works recursively on the JSON
Confirm moderator for next call
- Dragos will volunteer, for Sep 26th meeting
- adjourn 11:00 AM US Central