Edgent provides a runtime etiao to execute Edgent applications. Etiao executes a data flow graph consisting of oplets connected by streams. A stream is an endless sequence of tuples or data items.
Each oplet in a data flow graph is a stream processor, with four common types:
- Source oplet : Sources (originates) streams typically creating tuples from sensors etc.
- Processing oplet : Processes tuples on its input(s) and submit tuples to it output(s) based upon its processing.
- Sink oplet : Sinks (terminates) a stream by processing its tuples and sending them to an external system, such as an IoT scale message hub for further analytics by back-end systems.
- Peek oplet : Oplet that peeks at an input tuple and always sends the tuple unmodified to its output port.
Here's a simple example of a pipeline of four oplets, a source, S, say reading data from a sensor. S feeds a filter F that discards tuples of no interest. F is connected to M, a map operation that maps (transforms) each input tuple from F to a different tuple that is then consumed by the sink (or termination) oplet T.
Note: Other runtimes may be contributed in the future, there is no requirement for there to be a single runtime for Edgent.
An oplet is a stream processor that can have 0-N input ports and 0-M output ports. Tuples on streams connected to an oplet's input port are delivered to the oplet for processing. The oplet submits tuples to its output ports which results in the tuples being present on the connected streams.
As far as the runtime is concerned an oplet is generally a black box.
Etiao is a micro-kernel runtime with the premise that EveryThing Is An Oplet.
Thus the philosophy is that the runtime remains very simple, and any new functionality is implemented by new oplets and addition of new or existing oplets to the graph. So the generic oplet programming model itself is used to provide additional functionality.
The Etiao runtime performs this fundamental functionality:
- Support execution of a connected graph of oplets including start, stop and in the future pause, resume.
- Support a direct single connection between a oplet output port and another oplet's input port. Namely fan-out and fan-in is not supported by the runtime itself, but instead by oplets.
- Support submission of a tuple to an oplet's output port is processed by the connected downstream oplet as an input tuple, on the same thread. Again threading models are implemented by oplets inserted into the graph, not the runtime.
- Support unconnected input and output ports for an oplet.
- Support exception handling - probably more work needed here.
Additional functionality is added through the addition of oplets to the graph, in one of two ways.
- A higher level api may implement a single logical function by inserting one or more oplets into the graph. Basically the function results from a combination of existing oplets, a sub-graph. This means while say a developer using the topology api calls a single method to include a function, the resultant graph has a number of oplets, rather a one-to-one correspondence.
- A transformation of the graph. For example tuple flow metrics can be inserted into the graph by inserting a peek operator into each connection, or a sub-set of connections.
A Edgent data flow implements a logical application, before it is finally executed it may be transformed into a different physical graph (collection of connected oplets) that implement the same logical application. Examples of possible transforms are:
- Addition of metrics gathering
- Addition of tuple view capabilities (to allow developers to see tuples on a stream in a console)
- Optimization such as collapsing a pipeline into a single oplet
- Modification of the threading model, for example insertion of queue oplets to a stream.
As Edgent evolves it is anticipated there will be more stream transformations and utilities provided.
The threading model for execution is actually driven by operators. The core functionality executes immediate downstream processing on the same thread. Source operators are the starting point for a flow, and so the source operator defines its threading model. For example, again with this pipeline:
- If S was an instance of ProcessSource then the single thread S creates would execute the complete pipeline, executing S to create the tuple, then processing it in order of F, M & T.
- If was an instance of PeriodicSource then S,F,M & T are executed as a single task in an executor. So S,F,M & T are executed by the same thread for a given source tuple, but potentially on different threads for each tuple if the executor has a thread pool.
Additional threading currently can be manually inserted using the topology api, say for example by using the isolate() utility. If this was executed after F then the graph would look like:
In this case S,F would always be executed on the same thread, but M,T are executed on a different thread, as the I oplet performs the hand off from one thread to another.
In the future standard transformations might exist that could:
- Simply ensure each oplet was executed independently
- Intelligently modify the threading to optimize performance.