The top priority is to separate the job handler layer from the scheduler layer.

Since each company has its own scheduler preference, we need to consider how Apache Griffin can leverage existing scheduler systems.
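
One way to picture this separation is a job that exposes a single `run()` method and knows nothing about who calls it. The sketch below is only an illustration: the `Job` protocol and `RecordingJob` class are hypothetical names, and the standard-library `sched` module stands in for whatever scheduler a company already runs.

```python
import sched
import time
from typing import List, Protocol


class Job(Protocol):
    """Hypothetical job-handler contract: the scheduler only needs run()."""

    def run(self) -> None: ...


class RecordingJob:
    """Toy job that records each execution in a shared log."""

    def __init__(self, log: List[str]) -> None:
        self.log = log

    def run(self) -> None:
        self.log.append("ran")


log: List[str] = []
job: Job = RecordingJob(log)

# The job can be invoked directly, with no scheduler involved ...
job.run()

# ... or handed to an existing scheduler; `sched` is only a stand-in here.
scheduler = sched.scheduler(time.monotonic, time.sleep)
scheduler.enter(0, 1, job.run)
scheduler.run()
```

Because the job never imports the scheduler, swapping in a different scheduler should only require a thin adapter that calls `run()`.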


# job layer core models

## Metric:
 - name
 - labels
 - number
 - timestamp

## Metrics
metric_count_$table
 - implemented by SQL
metric_distinct_count_$table
 - implemented by SQL
...
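
The two metrics named above might be driven by SQL templates keyed on the metric name pattern. This is a sketch under assumptions: the `METRIC_SQL` mapping, the `compute_metrics` helper, and the `$column` placeholder for the distinct count are hypothetical, and an in-memory SQLite database stands in for the real query engine.

```python
import sqlite3
from string import Template
from typing import Dict

# Hypothetical mapping from metric name pattern to SQL template.
# $table comes from the note; $column is an assumed extra parameter.
METRIC_SQL = {
    "metric_count_$table": "SELECT COUNT(1) FROM $table",
    "metric_distinct_count_$table": "SELECT COUNT(DISTINCT $column) FROM $table",
}


def compute_metrics(conn: sqlite3.Connection, table: str, column: str) -> Dict[str, int]:
    """Render each template for one table and run it, returning name -> value."""
    results = {}
    for name_tpl, sql_tpl in METRIC_SQL.items():
        name = Template(name_tpl).substitute(table=table)
        sql = Template(sql_tpl).substitute(table=table, column=column)
        results[name] = conn.execute(sql).fetchone()[0]
    return results


# Toy data: three rows, two distinct user ids.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (user_id TEXT)")
conn.executemany("INSERT INTO orders VALUES (?)", [("a",), ("b",), ("a",)])
metrics = compute_metrics(conn, "orders", "user_id")
```

Table and column names are substituted into the SQL text directly, so in this sketch they must come from trusted configuration, not from user input.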


## business layer 
CompareJobRequirement
    - src table
    - src table partition [log_date, log_hour, other partition dimensions]
    - src table metrics (defined by recording SQL), e.g. select count(1) from $table a where $log_date=xxx and $log_hour=yyy and other_filter_condition
    - target table
    - target table partition
    - target table metrics (defined by recording SQL)
    - rule: some predicate, e.g. $src.metrics > $target.metrics
    - alert: send alert through alertmanager
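
A requirement like the one listed above could be captured as plain data plus a predicate. The sketch below is illustrative only: `TableMetricSpec`, the field names, and the example tables are hypothetical, and it assumes the rule describes the condition that should trigger an alert.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class TableMetricSpec:
    """Hypothetical grouping of one side (src or target) of the comparison."""

    table: str
    partition: Dict[str, str]  # e.g. {"log_date": "...", "log_hour": "..."}
    recording_sql: str         # SQL that produces the metric value


@dataclass
class CompareJobRequirement:
    src: TableMetricSpec
    target: TableMetricSpec
    # Predicate over (src metric, target metric); assumed to be True
    # when an alert should be sent.
    rule: Callable[[float, float], bool]
    alert_channel: str = "alertmanager"


# Example values are made up for illustration.
partition = {"log_date": "2024-01-01", "log_hour": "10"}
req = CompareJobRequirement(
    src=TableMetricSpec("src_events", partition, "select count(1) from src_events"),
    target=TableMetricSpec("tgt_events", partition, "select count(1) from tgt_events"),
    rule=lambda src, target: src > target,
)
```

Keeping the requirement as data means a scheduler-agnostic job can read it and decide what to record and what to check.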


## implementation
 RecordingMetricsJob(sql, MetricWriter(table=srctable, metrics=count, timestamp=xxx))

 RecordingMetricsJob(sql, MetricWriter(table=targettable, metrics=count, timestamp=xxx))

--> 
 CheckingAndAlertJob(rule, MetricReader(metric_name, timestamp))
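
The flow above (two recording jobs, then one checking job) can be sketched end to end. Everything beyond the class names in the note is an assumption: `MetricStore` is a hypothetical in-memory stand-in for metric storage, SQLite stands in for the query engine, the checking job takes one reader per side, the rule is treated as the alert condition, and alerts are appended to a list instead of being sent to Alertmanager.

```python
import sqlite3


class MetricStore:
    """Hypothetical in-memory metric storage keyed by (name, timestamp)."""

    def __init__(self):
        self._data = {}

    def put(self, name, timestamp, number):
        self._data[(name, timestamp)] = number

    def get(self, name, timestamp):
        return self._data[(name, timestamp)]


class MetricWriter:
    def __init__(self, store, table, metrics, timestamp):
        self.store, self.timestamp = store, timestamp
        self.name = f"metric_{metrics}_{table}"  # follows metric_count_$table

    def write(self, number):
        self.store.put(self.name, self.timestamp, number)


class MetricReader:
    def __init__(self, store, metric_name, timestamp):
        self.store, self.metric_name, self.timestamp = store, metric_name, timestamp

    def read(self):
        return self.store.get(self.metric_name, self.timestamp)


class RecordingMetricsJob:
    def __init__(self, conn, sql, writer):
        self.conn, self.sql, self.writer = conn, sql, writer

    def run(self):
        # Run the recording SQL and persist the single resulting number.
        self.writer.write(self.conn.execute(self.sql).fetchone()[0])


class CheckingAndAlertJob:
    def __init__(self, rule, src_reader, target_reader, alerts):
        self.rule, self.alerts = rule, alerts
        self.src_reader, self.target_reader = src_reader, target_reader

    def run(self):
        src, target = self.src_reader.read(), self.target_reader.read()
        if self.rule(src, target):  # rule is True -> raise an alert
            self.alerts.append(f"rule violated: src={src}, target={target}")


# Toy data: the source has 3 rows, the target only 2.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE src (id INTEGER)")
conn.execute("CREATE TABLE tgt (id INTEGER)")
conn.executemany("INSERT INTO src VALUES (?)", [(1,), (2,), (3,)])
conn.executemany("INSERT INTO tgt VALUES (?)", [(1,), (2,)])

store, alerts, ts = MetricStore(), [], "2024-01-01T10"
RecordingMetricsJob(conn, "SELECT COUNT(1) FROM src", MetricWriter(store, "src", "count", ts)).run()
RecordingMetricsJob(conn, "SELECT COUNT(1) FROM tgt", MetricWriter(store, "tgt", "count", ts)).run()
CheckingAndAlertJob(
    lambda s, t: s > t,
    MetricReader(store, "metric_count_src", ts),
    MetricReader(store, "metric_count_tgt", ts),
    alerts,
).run()
```

The recording and checking jobs only communicate through stored metrics, so they can be scheduled independently as long as the checking job runs after both recordings for the same timestamp.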
