DUE TO SPAM, SIGN-UP IS DISABLED. Goto Selfserve wiki signup and request an account.
Status
Current state: Under Discussion
Discussion thread: https://lists.apache.org/thread/5vw6b6w6lctd2p7vfo9w9fcpzs7834fj
JIRA: here (<- link to https://issues.apache.org/jira/browse/FLINK-XXXX)
Released: <Flink Version>
Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
Motivation
Current Flink autoscaling approach FLIP-271: Autoscaling follows a reactive model. As a result, scaling adjustments may lag behind sudden workload changes, leading to temporary inefficiencies. This challenge initially led to exploring ways to incorporate predicted input rate into autoscaling decision to improve responsiveness. However, a more flexible and extensible solution is to provide an interface for plugging in custom evaluation logic for scaling metrics. This allows users to tailor autoscaling decisions to their specific workloads, enabling business-aware and proactive scaling strategies.
Proposed Changes
Draft PR: https://github.com/apache/flink-kubernetes-operator/pull/953
Goals
- Allow plugging in custom evaluation logic in-to Autoscaler.
- Allow job-level configuration for using a specific Custom Evaluator and provide evaluator specific parameters.
- Integrate custom evaluation plugin into scaling metrics evaluator.
Non-Goals (for now)
- Provide plugins for some commonly used predictive algorithms, such as scheduled, forecasted, using REST call etc. → to be done as part of different PR.
- Allow overriding of internally collected scaling metrics.
- Allow plugging custom scaling summary computation algorithm.
Public Interfaces
Scaling Metric Custom Evaluator plugin
A Scaling Metric Custom Evaluator is a plugin that allows integrating custom logic to Autoscaler for evaluating scaling metrics. This plugin provides flexibility in how scaling decisions are made, enabling workload-specific optimisation's beyond the default model.
- Custom Evaluator loaded as a plugin.
- Custom Evaluator is invoked at end of evaluateMetrics for vertices.
- The Custom Evaluator to return Map<ScalingMetric, EvaluatedScalingMetric> which would be merged with the current evaluated metrics.
- Pass evaluation context when Custom Evaluator is called.
- We maintain the topological order of evaluation for the vertices.
Job Level Config For Custom Evaluator
As part of this FLIP, we propose the following job-level Autoscaler options to control which Custom Evaluator is used, if any, along with evaluator-specific configurations. All named Custom Evaluator configurations will be available to the evaluator.
| Key | Default | Description |
| job.autoscaler.metrics.custom-evaluator.name | Specifies a name of the custom evaluator to be used. | |
| job.autoscaler.metrics.custom-evaluator.<evaluator name>.class | Specifies the class name of the custom evaluator to be used. Required to be passed if job.autoscaler.metrics.custom-evaluator.name is set. | |
| job.autoscaler. metrics.custom-evaluator.<evaluator name>.<other evaluator specific config> | Additional configuration parameters specific to the selected custom evaluator. These settings allow fine-tuning of the custom evaluator behavior based on the model used. |
Information To Be Provided To Custom Evaluator
Custom Evaluator Invocation Signature
Map<ScalingMetric, EvaluatedScalingMetric> evaluateVertexMetrics(
JobVertexID vertex,
Map<ScalingMetric, EvaluatedScalingMetric> evaluatedMetrics,
Context evaluationContext)
The evaluationContext encapsulates the necessary details (as below) required for the evaluation process.
Context {
Configuration jobConf;
SortedMap<Instant, CollectedMetrics> metricsHistory;
Map<JobVertexID, Map<ScalingMetric, EvaluatedScalingMetric>> evaluatedVertexMetrics;
JobTopology topology;
boolean processingBacklog;
Duration restartTime;
Configuration customEvaluatorConf;
} Note:
We propose making evaluatedMetrics, jobConf, metricsHistory, and evaluatedVertexMetrics unmodifiable views when passed to the evaluator. They must not be modified by the Custom Evaluator. Any attempt to modify them will result in an UnsupportedOperationException.
However, this design choice is optional and can be reconsidered if it is unnecessary.
Sample Implementation
Compatibility, Deprecation, and Migration Plan
New feature, no compatibility issues
Test Plan
We will provide unit and integration tests to validate and verify the proposed changes.
Rejected Alternatives
If there are alternative ways of accomplishing the same thing, what were they? The purpose of this section is to motivate why the design is the way it is and not some other way.