...

  1.     Create a data-structure on the Job-/TaskManager containing a metrics snapshot
  2.     Transfer this snapshot to the WebInterface back-end
  3.     Store the snapshot in the WebRuntimeMonitor in an easily accessible way
  4.     Expose the stored metrics to the WebInterface via REST API

...

The MetricRegistry will contain a MetricDumper, which acts like an unscheduled reporter.
The Dumper can be used by the Job-/TaskManager to create a Key-Value representation of the entire metric space when queried.
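
As a rough illustration of the dumper idea, the following sketch collects the current value of every registered metric into a map when asked. The class name MetricDumperSketch, its methods, and the use of Supplier as a stand-in for a metric are assumptions made for this example; they are not taken from the Flink code base.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical stand-in for the MetricDumper; not Flink's actual class. */
class MetricDumperSketch {
    // Registered metrics, keyed by their scoped name.
    // A Supplier stands in for a Gauge or Counter in this sketch.
    private final Map<String, Supplier<Object>> metrics = new LinkedHashMap<>();

    void register(String scopedName, Supplier<Object> metric) {
        metrics.put(scopedName, metric);
    }

    /** Reads every registered metric once and returns the snapshot as key-value pairs. */
    Map<String, Object> dump() {
        Map<String, Object> snapshot = new LinkedHashMap<>();
        for (Map.Entry<String, Supplier<Object>> entry : metrics.entrySet()) {
            snapshot.put(entry.getKey(), entry.getValue().get());
        }
        return snapshot;
    }
}
```

Unlike a scheduled reporter, nothing happens here until dump() is called, which matches the "unscheduled" behaviour described above.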

The keys represent the name of the metric, formatted according to the following scope format strings:

metrics.scope.jm            0:<user_scope>.<name>
metrics.scope.tm            1:<tm_id>:<user_scope>.<name>
metrics.scope.jm.job        2:<job_id>:<user_scope>.<name>
metrics.scope.tm.job        2:<job_id>:<user_scope>.<name>
metrics.scope.tm.task       3:<job_id>:<task_id>:<subtask_index>:<user_scope>.<name>
metrics.scope.tm.operator   4:<job_id>:<task_id>:<subtask_index>:<operator_name>:<user_scope>.<name>

The initial number serves as a category for the WebInterface, and allows for faster handling as we don't have to parse the entire string before deciding what category it belongs to.
  0 = JobManager
  1 = TaskManager
  2 = Job
  3 = Task
  4 = Operator
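
The category lookup can be sketched as follows. The enum MetricCategory and the method ofKey are hypothetical names chosen for this example; the only thing taken from the text is that the first character of the key is the category number.

```java
/** Hypothetical enum mirroring the category numbers 0 to 4 listed above. */
enum MetricCategory {
    JOB_MANAGER, TASK_MANAGER, JOB, TASK, OPERATOR;

    /** Decides the category from the first character only, without parsing the rest of the key. */
    static MetricCategory ofKey(String key) {
        if (key.isEmpty()) {
            throw new IllegalArgumentException("empty metric key");
        }
        int index = key.charAt(0) - '0';
        MetricCategory[] all = values();
        if (index < 0 || index >= all.length) {
            throw new IllegalArgumentException("unknown category in key: " + key);
        }
        return all[index];
    }
}
```

Only one character is inspected, so the remaining identifiers need to be split out only once the key has reached the code that handles its category.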

For this to work we need to be able to use a format different from the one set in the configuration, and also cache the resulting strings.
For now we can hard-code a separate scopeString field in the AbstractMetricGroup; a more general solution would be to allow separate ScopeFormat configurations for each reporter, which is a natural follow-up to FLINK-4246.

The scope generation will be hard-coded into the separate metric groups, as ScopeFormats are a bit overkill for this. The created scopes are cached to avoid frequent re-computation.
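
A minimal sketch of a hard-coded, cached scope for a task-level group might look like this. The class TaskScopeSketch and its method names are made up for illustration, and the key layout assumes the colon-separated form of the task format (category 3 followed by job ID, task ID and subtask index).

```java
/** Hypothetical task-level metric group; class and method names are illustrative only. */
class TaskScopeSketch {
    private final String jobId;
    private final String taskId;
    private final int subtaskIndex;

    // Built on first use and then reused, so the string is not re-computed per metric.
    // Note: this lazy initialisation is not thread-safe; a real group would need to handle that.
    private String cachedScope;

    TaskScopeSketch(String jobId, String taskId, int subtaskIndex) {
        this.jobId = jobId;
        this.taskId = taskId;
        this.subtaskIndex = subtaskIndex;
    }

    /** Returns the hard-coded scope prefix for category 3 (Task), computing it at most once. */
    String queryScope() {
        if (cachedScope == null) {
            cachedScope = "3:" + jobId + ':' + taskId + ':' + subtaskIndex;
        }
        return cachedScope;
    }

    /** Appends the user scope and metric name to the cached prefix. */
    String metricKey(String userScopeAndName) {
        return queryScope() + ':' + userScopeAndName;
    }
}
```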

The Value is the value returned by the metric, or by a method of the given metric (as Histograms expose multiple methods).
Whether the value is stringified is TBD. Using strings would solve the serialization problem for Gauge metrics, but would require the generation of many short-lived objects on the JM/TM and additional parsing if we want to aggregate metrics in the WebInterface.

The Key-Value pairs can be stored in a simple list-like data structure such as an Object array.
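
One possible layout for such an array is to alternate keys and values. This layout and the class FlatDumpSketch are assumptions for the sake of the example; the proposal does not fix the exact arrangement.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that flattens key-value pairs into a single Object array. */
class FlatDumpSketch {
    /** Packs the pairs as [key0, value0, key1, value1, ...]. */
    static Object[] pack(Map<String, Object> metrics) {
        Object[] flat = new Object[metrics.size() * 2];
        int i = 0;
        for (Map.Entry<String, Object> entry : metrics.entrySet()) {
            flat[i++] = entry.getKey();
            flat[i++] = entry.getValue();
        }
        return flat;
    }

    /** Rebuilds the map on the receiving side, preserving the order of the array. */
    static Map<String, Object> unpack(Object[] flat) {
        Map<String, Object> metrics = new LinkedHashMap<>();
        for (int i = 0; i + 1 < flat.length; i += 2) {
            metrics.put((String) flat[i], flat[i + 1]);
        }
        return metrics;
    }
}
```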

Transfer to the WebInterface

...

This will be done in a separate Thread inside the WebRuntimeMonitor, which is also responsible for merging the returned dumps.
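
A rough sketch of such a fetch-and-merge thread is shown below. Everything in it is an assumption for illustration: the class DumpMergerSketch, the Supplier used to stand in for the actual transfer mechanism, and the fixed polling period are not specified by this proposal.

```java
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical merger that folds fetched dumps into one shared view. */
class DumpMergerSketch {
    // ConcurrentHashMap lets request handlers read while the fetch thread writes.
    // It does not accept null keys or values, so dumps must not contain them.
    private final Map<String, Object> merged = new ConcurrentHashMap<>();

    /** Newer values overwrite older ones for the same key. */
    void merge(Map<String, Object> dump) {
        merged.putAll(dump);
    }

    Map<String, Object> view() {
        return Collections.unmodifiableMap(merged);
    }

    /** Starts a single background thread; the caller is expected to shut the executor down. */
    ScheduledExecutorService start(Supplier<Map<String, Object>> fetchDump, long periodMillis) {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        executor.scheduleWithFixedDelay(
                () -> merge(fetchDump.get()), 0, periodMillis, TimeUnit.MILLISECONDS);
        return executor;
    }
}
```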

...

Storage in the WebRuntimeMonitor

My (rough) proposal for a data-structure is the following:

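import java.util.Map;

class MetricStore {
	// Entry point: stores the value in the matching map below,
	// based on the category and IDs encoded in the metric name.
	void addMetric(String name, Object value) {
		// routing logic left out of this rough proposal
	}

	// Metrics of the single JobManager
	JobManager jobManager;

	class JobManager {
		Map<String, Object> metrics;
	}

	// TaskManagers, keyed by TaskManager ID
	Map<String, TaskManager> taskmanagers;

	class TaskManager {
		Map<String, Object> metrics;
	}

	// Jobs, keyed by job ID
	Map<String, Job> jobs;

	class Job {
		Map<String, Object> metrics;
		Map<String, Task> tasks;
	}

	class Task {
		Map<String, Object> metrics;
		Map<String, Subtask> subtasks;
	}

	class Subtask {
		Map<String, Object> metrics;
	}
}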
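
To illustrate how addMetric could route a key into this structure, here is a simplified sketch covering only the JobManager, TaskManager and Job categories. It assumes colon-separated keys with the category number first and uses plain nested maps instead of the classes above; MetricStoreSketch is a hypothetical name, not the proposed class itself.

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified, hypothetical store; tasks and subtasks would follow the same pattern. */
class MetricStoreSketch {
    final Map<String, Object> jobManager = new HashMap<>();
    final Map<String, Map<String, Object>> taskManagers = new HashMap<>();
    final Map<String, Map<String, Object>> jobs = new HashMap<>();

    void addMetric(String key, Object value) {
        switch (key.charAt(0)) {
            case '0': { // 0:<user_scope>.<name>
                jobManager.put(key.substring(2), value);
                break;
            }
            case '1': { // 1:<tm_id>:<user_scope>.<name>
                String[] parts = key.split(":", 3);
                taskManagers.computeIfAbsent(parts[1], id -> new HashMap<>()).put(parts[2], value);
                break;
            }
            case '2': { // 2:<job_id>:<user_scope>.<name>
                String[] parts = key.split(":", 3);
                jobs.computeIfAbsent(parts[1], id -> new HashMap<>()).put(parts[2], value);
                break;
            }
            default:
                throw new UnsupportedOperationException("category not covered by this sketch: " + key);
        }
    }
}
```

The split limit of 3 keeps any further colons inside the metric name intact, so only the category and the first identifier are separated from the rest of the key.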

...