ID | IEP-119 |
Author | |
Sponsor | |
Created | |
Status | COMPLETED |
It is proposed to implement the EventLog for Apache Ignite 3. This will allow requests such as "log this set of events into this file path" to be supported.
There are many examples of how this feature can be used. One of them is an authentication audit: configure all authentication events to be logged into a file.
Event – a piece of information about something that happened in the system.
User – the actor who caused an event. For events where the username can be identified, it should be set on the event; where it cannot, the SYSTEM username should be used.
Channel – groups a series of events into a named set. Also, the channel defines the semantics of the event listening process.
Sink – the endpoint for a channel. This could be a variety of destinations, including the file system, remote S3 storage, a webhook, or a Kafka topic. The relationship is one-to-many: a sink is attached to exactly one channel, while a channel can have many sinks.
Events will include information about the user who caused them, if one can be identified. For "system events" the user will be "SYSTEM". File logging should be configurable (which events, where to log, how to rotate files). We plan to utilize the existing Java logging infrastructure for file logging. Events are delivered to the sink synchronously; it is the sink's responsibility to write them immediately or to do so asynchronously.
User-defined events are not supported; all events are provided out of the box.
As an extension, we might add the following types of sinks: webhook sink, table sink, Kafka sink, etc.
No performance degradation is expected for a no-op sink; the EventLog itself must not introduce operational overhead.
The "fire and forget" principle applies to every producer. Any piece of code that acts as a source of events must not filter or prepare an event in a custom manner; everything should be done on the EventLog component side. This is similar to logging heavy payloads at debug level: lazy calculation done through a lambda.
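As an illustration, the producer-facing API could look like the following sketch, where the event is produced lazily through a Supplier. The interface name and method shape are assumptions, not the final API:

    import java.util.function.Supplier;

    /** Hypothetical producer-facing facade; the name and method shape are illustrative. */
    public interface EventLog {
        /**
         * Fire and forget: the supplier is evaluated only if at least one channel is
         * configured for the resulting event type, so producers pay nothing when the
         * feature is disabled (the same idea as lazy debug logging through a lambda).
         * Event is the data object described in the "Event" section below.
         */
        void log(Supplier<Event> eventSupplier);
    }

A producer would then call something like eventLog.log(() -> connectionEstablishedEvent(nodeId, connectionInfo)) (the factory method is purely illustrative), leaving all filtering and routing to the EventLog component.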
An event is a record of an occurrence within the system. It contains:
Field | Type | Description |
---|---|---|
Type | EventType | Event type enum. |
User | User | The subject of the event. If there is no user available in the context of the producer, SYSTEM is used. |
Timestamp | Timestamp | Nanoseconds since the Unix epoch. |
ProductVersion | String | Ignite version (can help with migrations). |
Fields | Map<String, Object> | Data object containing information specific to the EventType. |
EventType is an enum defining the nature of the event. The complete set of EventTypes is beyond the scope of this proposal; it should be an extendable type.
Examples of EventType: TABLE_CREATED, TOPOLOGY_CHANGED, CLUSTER_CONFIG_UPDATED, NODE_CONFIG_UPDATED.
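For illustration only, the enum could start with the example types above and be extended over time:

    /** Illustrative subset; the complete, extendable set of types is out of scope here. */
    public enum EventType {
        CONNECTION_ESTABLISHED,
        CONNECTION_CLOSED,
        TABLE_CREATED,
        TOPOLOGY_CHANGED,
        CLUSTER_CONFIG_UPDATED,
        NODE_CONFIG_UPDATED
    }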
Event Type | Related data fields | Description |
---|---|---|
CONNECTION_ESTABLISHED | nodeId, connection info | Emitted every time a connection is established with a client. Connection info contains information about the client. |
CONNECTION_CLOSED | nodeId, connection info | Emitted every time a connection is closed. Same data as in CONNECTION_ESTABLISHED. |
This refers to the initiator of an event. It can contain the following information: username, authProvider, IP, etc.
Field | Type | Description |
---|---|---|
username | String | Plain username that is the same as in the cluster configuration. If it is a "system event" (there is no user in the context), the SYSTEM user is used. |
authenticationProvider | String | The name of the authentication provider that was used. |
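Putting the two tables together, the data model could be sketched roughly as follows; field names mirror the tables above, while the exact Java shape is an assumption:

    import java.util.Map;

    /** Sketch of the event data model; mirrors the "Event" table above. */
    interface Event {
        EventType type();             // event type enum
        User user();                  // initiator, or SYSTEM for system events
        long timestamp();             // nanoseconds since the Unix epoch
        String productVersion();      // Ignite version, e.g. "3.0.0"
        Map<String, Object> fields(); // data specific to the EventType
    }

    /** Sketch of the initiator description; mirrors the "User" table above. */
    interface User {
        String username();               // "SYSTEM" for system events
        String authenticationProvider(); // e.g. "basic"
    }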
{ "type": "CONNECTION_ESTABLISHED", "user": { "username": "apakhomov", "authenticationProvider": "basic" }, "time": 12324325, "verion": "3.0.0", "data": { "nodeId": "sidf-124f0-asdf-1wedf", "connectionInfo": { "clientIp": "123.123.123.123", // more fields here } } }
A Channel is a configurable set of Events. Channels are configured via cluster configuration.
From the configuration standpoint, it should be possible to define at least the following channel properties (see the configuration example below): whether the channel is enabled and the set of event types it captures.
This is a user-configurable entity that is also configured via cluster configuration. A Sink defines where events from Channel(s) should be sent. Examples: log sink, Kafka sink, webhook sink, table sink. A Sink has one method that can be called by the channel: write(Event). Write does not guarantee that the Event is actually written and will not be lost, but a user can configure a sink that flushes on every write (for example, log4j with a synchronous appender and immediateFlush = true).
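A sink's contract could then be as small as the following sketch; a trivial no-op implementation also illustrates why a disabled EventLog should add no measurable overhead (names are illustrative, not a final API):

    /** Sketch of the sink contract: a single method called by the channel. */
    interface Sink {
        /**
         * Hands the event over to the sink. Delivery from the channel is synchronous,
         * but write() does not guarantee that the event is durably stored; the sink
         * may buffer and flush asynchronously.
         */
        void write(Event event);
    }

    /** No-op sink: events are dropped, so producers incur no measurable cost. */
    class NoOpSink implements Sink {
        @Override public void write(Event event) {
            // intentionally empty
        }
    }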
Setting eventlog.sinks.<name>.type = "log" defines a Log Sink, where <name> is a unique cluster-wide name. A NamedList should be used to define sinks and their names. The Log Sink-specific properties (see the configuration example below) include the logger category, the log level, and the channel the sink is attached to.
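Because the proposal plans to reuse the existing Java logging infrastructure, a Log Sink could simply delegate to a java.util.logging logger whose category and level come from the sink configuration. A minimal sketch (event serialization is left out, and the constructor shape is an assumption):

    import java.util.logging.Level;
    import java.util.logging.Logger;

    /** Sketch of a Log Sink delegating to java.util.logging; category and level come from configuration. */
    class LogSink implements Sink {
        private final Logger logger;
        private final Level level;

        LogSink(String category, String level) {
            this.logger = Logger.getLogger(category); // e.g. "authLogCategory"
            this.level = Level.parse(level);          // e.g. "INFO"
        }

        @Override public void write(Event event) {
            // Serialize the event (e.g. to the JSON form shown above) and log it lazily.
            logger.log(level, () -> event.toString());
        }
    }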
We should not use the EventLog to implement internal functionality. That might look useful at first glance: events are well defined, and a piece of Java code could be imported and used as a "source of events in the system". However, this is a user-configurable feature: users can modify or delete channel/sink configurations, and the feature itself is disabled by default. To limit the usage of the EventLog in the Ignite 3 codebase, we should use well-defined modules with a minimal public API exposed. ArchUnit tests should be written in order to prevent misuse.
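A hedged example of such an ArchUnit rule, assuming the EventLog implementation lives in an internal ..eventlog.impl.. package (package names here are purely illustrative):

    import com.tngtech.archunit.core.domain.JavaClasses;
    import com.tngtech.archunit.core.importer.ClassFileImporter;
    import com.tngtech.archunit.lang.ArchRule;

    import static com.tngtech.archunit.lang.syntax.ArchRuleDefinition.noClasses;

    class EventLogUsageTest {
        // Would typically be run as a unit test (e.g. annotated with @ArchTest).
        void eventLogInternalsAreNotUsedOutsideTheEventLogModule() {
            JavaClasses classes = new ClassFileImporter().importPackages("org.apache.ignite");

            ArchRule rule = noClasses()
                    .that().resideOutsideOfPackage("..eventlog..")
                    .should().dependOnClassesThat().resideInAPackage("..eventlog.impl..");

            rule.check(classes);
        }
    }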
For now we allow only cluster-wide configuration of the EventLog. This means that each node will have the channels and sinks defined in the cluster configuration. As a future improvement, we might want to allow configuring specific sinks on specific nodes only.
Configuration example:

    eventlog: {
        channels.authenticationChannel: {
            enabled: true,
            events: [CONNECTION_ESTABLISHED, CONNECTION_CLOSED]
        },
        sinks.localLogSink: {
            type: "log",
            category: "authLogCategory",
            level: "INFO",
            channel: "authenticationChannel"
        }
    }