I encountered several issues when implementing the Stream Oscilloscope (StreamScope) runtime facility. Those issues are captured here so we can discuss them and reach consensus on a course of action.

Stream Oscilloscope feature / use summary:

Just to set some context...

Notes for the Console:

Functions can't really use runtime services / ControlService

I started by developing a StreamScope Consumer<T> for use as a seeker oplet function.  This function has all of the smarts of the tuple capture code.

The function needs to register at runtime with a “streamId” which, as a result of provider-wide services, needs to include the jobId.  It also needs the opletId.  These values aren’t known until later in the submission processing (later than when the graph is augmented with StreamScope seekers).

We don’t have a way for functions to access runtime services (e.g., to get the StreamScopeRegistry), nor job/oplet context.

<dan> Topology.getRuntimeServiceSupplier() provides access to runtime services.

Hence I was forced to create a StreamScope subtype of Peek.  It’s initialize() has access to the necessary info.

Is it appropriate to require any such use/needs to subclass an oplet to gain access to this sort of info?  Note, ControlService is doc’d as being expected to be usable by applications.

I’ve run across this need in other situations as well.  e.g., a function that wanted to create a thread pool / Executor but couldn’t get access to the ThreadFactoryService.

It feels like we want analogous Function level support.  e.g., an Initializable interface (with initialize(FunctionContext)) that functions that need this info to gain access to it.  Agreed?

ControlService Unregistration woes

Related: JobMXBeans aren't getting unregistered with the net effect of being a memory leak.  Right now, StreamScopeRegistry has a similar affliction. TStream.poll() PeriodMXBean also isn't getting unregistered. <dlaboss: looks like PR-153 is addressing the Job bean issue>

Poll also has a bug in that it's not registering the control with a provider-wide control alias.  Hence there can be collisions among multiple topologies/jobs in a provider.  I don't think it's reasonable to expect TStream.poll().alias(alias) clients to come up with provider-wide alias values. 

At a high level I wasn't initially aware that Topology.getRuntimeSourceSupplier() returned a supplier for provider-wide services - i.e. that the ControlService service wasn't topology/job-specific.

Provider-wide services are good for some things (e.g., AppService bean, maybe Job beans) but not particularly scalable / helpful for per-job controls.  Maybe just something to keep in mind.

When I started to write some code to unregister the StreamScopeRegistryMBean, I had to store away the controlId returned by ControlService.registerControl() because there’s no ControlService.unregister(type,alias).  <dlaboss: looks like PR-153 is adding getControlId()>

Similarly my StreamScopeRegistryBean has to store away the controlId for StreamScopeMXBeans that it registers so that it can unregister them (ultimately via a ServiceContainer.addCleaner() hook.  Of course it would have to store something regardless as the cleaner hook only provides <jobId, opletId> and a oplet could have multiple StreamScope instances (one per oport) to unregister.  Still, the omission of unregister(type,alias) means that even more info must be stored.

While there’s a ServiceContainer.addCleaner() hook, it’s not at all apparent to ControlService clients that that exists.

IotProvider can't support Console

The Console is hardcoded to use JMX APIs.  The IotProvider uses JsonControlService, which doesn’t publish to JMX.  Note, Android doesn’t support JMX.  Probably other “edge device” environments don’t either.

There are a couple of ways to look at this.  One way…

Agreed?

JSON based control service request handling

IMO json encoded control service request handling is a separate concern from control service mbean registration & lookup.  Hence seems like:

Agreed?

JsonControlService.controlRequest() is too restrictive wrt legal MBean constraints

The StreamScopeRegistryMXBean has a lookup() method that returns a StreamScopeMXBean.  JsonControlService doesn’t support that.

org.apache.edgent.execution.services.Controls.isControlServiceMBean(), used by JsonControlService but thankfully not JMXControlService, restricts mbean method parameter and return types to primitive types.

It, like JMX, should support Arrays,List/Set Collections, Maps, and MXBeans.  Agreed?

Instrumentation controls can’t register with the id for it’s “origin stream”

I think we want to make instrumentation control services (StreamScope, Metrics?) more transparent to the user in the Console.  e.g., the user should be able to hover on an “application stream” connecting two application oplets, even when there are one or more instrumentation oplets between them, and the Console generated hover should provide something like a “Show tuples” link and metrics info for the application stream.

It’s may not be required that that instrumentation oplet controls register themselves with the “streamId” of the “origin stream” they were created for but it strikes me that it might be easier for the Console if the controls were able to register that way.

At instrumentation oplet graph insertion time (via Graph.peekAll()), the instrumentation oplets don’t really have the info they need.

It’s easy enough to add a peekAllFn(BiFunction<Vertex,Integer/*oport*/, Peek>) so the fn creating the instrumentation oplet has access to the “origin stream”. One can get the oplet from the Vertex (getInstance() - rename to getOplet?). However, Oplet doesn’t know its opletId.  It’s present in its OpletContext, given to it later in Oplet.initialize(OpletContext).

I’m uneasy with the idea of an instrumentation oplet having to walk back through its source stream to find the “origin stream”.  Rather I think it would be better if the instrumentation oplet could, for example, retain a reference to the “origin stream” oplet and then in its start() (initialize() is too soon?), be able to do something like originOplet.getContext() to be able to get the opletId.  I can imagine that might imply too many constraints on a provider impl but you get the idea.

What’s the best way to be able to make instrumentation oplet controls register themselves with the origin stream’s id?   Or… ???

streamId, ContrtolService registration, mbean object viewing / hierarchy

When I was looking at registrations via JConsole, noticed that StreamScope from multi-jobs are all in a single folder.  JMXControlService can’t form a more hierarchical ObjectName jobId=J, opletId=O, oport=n for streams, etc given the limited registerControl(type,alias,bean).  Well… not unless there’s a some additional wk structure to aliases that it can parse (e.g.,   j[jobId].c[controlId]   j[jobId].op[opletId].c[controlId]  j[jobId].op[opletId].o[oportIndex].c[controlId])

Related, my StreamScopeRegistry has its own mkStreamId(jobId, opletId, oportIndex) to form keys to register a StreamScope with the StreamScopeRegistry.  (this was definitely a necessity pre-StreamScope oplet where it might be able to leverage OpletContext.uniquify()).  Sort of feels like we might benefit from some higher level utility for generating various common ids?