Steps to Add a Meter

  • Clarify the Purpose of the Meter
  • Define the Measurement
  • Design the Meter
  • Evaluate Sources of Information
  • Instrument the Code

Each step is described more fully in the sections below.

Clarify the Purpose of the Meter

Clarify the purpose for adding this meter to Geode.

  • Who is the audience for the measurement?
  • What goals will the measurement help them achieve?
  • How does the measurement help them achieve these goals?

Rationale. Clarifying the purpose of the meter will help you:

  • Name and describe the meter.
  • Determine what tags to add to the meter to allow the audience to filter and sort measurements.
  • Identify and evaluate potential instrumentation sites.
  • Ensure that measurements are relevant to your audience's needs, rather than merely easy to measure.

Define the Measurement

Describe as precisely as possible the attribute measured by the meter.

Key questions:

  • What attribute does the meter measure?
  • What operations, events, or conditions cause the attribute to change?
  • In what scope (see below) does the meter measure this attribute?
  • What selection criteria (see below) govern whether to measure the attribute and how to report the measurements?

Audience focus. Define the measurement entirely in terms that the audience understands. Define the measurement in such a way that the audience can easily understand which measurements relate most directly to their current goals, questions, and challenges.

Rationale. Answering these questions will help you:

  • Name and describe the meter.
  • Define the units in which the meter reports measurements.
  • Identify tags to add to the meter.
  • Identify and evaluate potential instrumentation sites.
  • Ensure that measurements are accurate and meaningful, rather than merely easy to measure.

Define the Scope of the Measurement

When you define a measurement, clearly identify the key scope or scopes in which the measurement is made. Look for two common kinds of scopes:

  • The entity about which the measurement is made.
  • The boundaries within which the measurement is made.

The entity. Each measurement is about some entity. Each meter measures some attribute of that entity, on behalf of that entity, or in relation to that entity.

Example: Each geode.cache.entries gauge reports the number of entries in a particular region. The measurement is about that region. The gauge includes a region tag that identifies which region it measures.

Boundaries. Each meter measures within one or more boundaries of interest to your audience.

Example: Each geode.cache.entries gauge measures within several boundaries:

  • The region that holds the entries counted by the meter.
  • The cache server in which the region holds the entries counted by the meter.
  • The host on which the server is running.
  • The cluster in which the server is a member.

As the example shows, there are several kinds of boundaries to consider:

  • The region is an example of an entity boundary.
  • The cache server is an example of a process boundary.
  • The host is an example of a hardware or virtual machine boundary.
  • The cluster is an example of a conceptual boundary.

The example also shows that:

  • Boundaries may be nested. A given host may encompass several cache servers.
  • Boundaries may overlap. A region holds entries across numerous servers, and a server may hold entries for numerous regions.

Audience focus. Identify scopes of interest to your audience—those scopes that your audience may wish to use to select and sort measurements for display and analysis.

Rationale. Defining the scope of the measurement will help you:

  • Name and describe the meter.
  • Identify tags to add to the meter.
  • Identify and evaluate potential instrumentation sites.

Define the Selection Criteria for the Measurement

You may wish to report measurements selectively, either by reporting a measurement only in certain circumstances, or by reporting a given measurement differently in different circumstances.

Key questions:

  • Under what conditions do you want to measure the attribute?
  • Under what conditions do you want to report the measurement?
  • What criteria govern which meter to use to report the measurement?

Selecting whether to measure. You may wish to measure the attribute, or to report a measurement, only under certain conditions.

Example: As we initially defined the geode.function.executions timer, we intended to report only executions of user-defined functions, and not functions defined internally by Geode. Though we have not implemented this distinction, it is an example of the kind of distinction we considered.

Selecting among meters. You may wish to create multiple meters for the same attribute, and select among them to record measurements in different circumstances.

Example: Geode defines two geode.cache.gets timers for each region. One timer reports cache hits, and one reports cache misses.

Example: Geode defines two geode.function.executions timers for each function. One timer reports successful executions, and one reports failed executions.
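
The idea of selecting among meters can be sketched with JDK types alone. This sketch does not use Micrometer and is not Geode's code. It models a meter's identity as its name plus its tags, which is why two geode.cache.gets meters can exist for one region. The tag key result, its values hit and miss, the region name orders, and the class GetsMeterSketch are assumptions made for illustration. For brevity it records only counts, whereas a real timer would also record durations.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** JDK-only stand-in for a meter registry: one count per (name, tags) identity. */
final class GetsMeterSketch {
  private final Map<List<String>, LongAdder> counts = new ConcurrentHashMap<>();

  /** Builds a meter identity from the meter name and its tag keys and values. */
  private static List<String> id(String region, String result) {
    // "result" is an assumed tag key, chosen for illustration.
    return List.of("geode.cache.gets", "region", region, "result", result);
  }

  /** Selects the hit meter or the miss meter for the region, then records one get. */
  void recordGet(String region, boolean hit) {
    List<String> meterId = id(region, hit ? "hit" : "miss");
    counts.computeIfAbsent(meterId, key -> new LongAdder()).increment();
  }

  /** Returns the count recorded by the meter with the given tag values. */
  long count(String region, String result) {
    LongAdder adder = counts.get(id(region, result));
    return adder == null ? 0 : adder.sum();
  }

  /** Returns how many distinct meters have been created so far. */
  int meterCount() {
    return counts.size();
  }

  public static void main(String[] args) {
    GetsMeterSketch meters = new GetsMeterSketch();
    meters.recordGet("orders", true);
    meters.recordGet("orders", true);
    meters.recordGet("orders", false);
    System.out.println("hits=" + meters.count("orders", "hit"));
    System.out.println("misses=" + meters.count("orders", "miss"));
    System.out.println("meters=" + meters.meterCount());
  }
}
```

The instrumentation site makes one call and supplies the circumstance (hit or miss). The selection among meters happens in one place rather than at every caller.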

Rationale. Defining the selection criteria for the measurement will help you:

  • Name and describe the meter.
  • Identify tags to add to the meter.
  • Identify and evaluate potential instrumentation sites.

Design the Meter

  • Select the type of meter to use to record and report measurements
  • Name the meter
  • Describe the meter
  • Identify the unit of measure reported by the meter
  • Define tags that identify the scope, circumstances, and other details of the meter's measurements

Select the Type of Meter

Select the general type of meter you want to use to report measurements:

  • A gauge reports a quantity that can go up or down. Example: The number of entries in a region.
  • A counter reports a quantity that can only go up. Example: The number of gateway events received by a gateway receiver.
  • A timer reports the number and durations of completed tasks, operations, and other events. Example: The number and durations of get operations processed by a server.
  • A long task timer reports the number and durations of in-progress tasks. Example (not implemented): The number and durations of get operations in progress on a server.

Select the type of meter that best suits the nature of the measurement.

The Micrometer library defines Java interfaces and classes that represent several variations of these categories. For details, see Instrument the Code, below.

Name the Meter

Identify the attribute. Name the meter in a way that clearly identifies the attribute it measures.

Example: jvm.memory.used identifies that the gauge reports some amount of JVM memory used.

Example: geode.function.executions identifies that the timer reports the number and durations of function executions.

Example: geode.cache.entries identifies that the gauge reports a number of entries.

Consider (with caution) identifying the entity type. You may include the entity type in the name, though it is usually better to omit it.

Example: geode.function.executions identifies that the meter reports executions of a function. Executions is the attribute being reported. Function is the type of entity whose executions are being reported.

Before including the entity type in the meter name, consider:

  • You will also likely want a tag that identifies the particular entity.
  • The tag's key will likely be exactly the same word or words (e.g. region) that you would include in the meter name.
  • If that tag makes the scope of the measurement sufficiently clear, then including the entity type in the meter name would be redundant.

Example: We considered (and rejected) geode.cache.region.entries, which would identify that the meter reports not on the cache as a whole, but on a particular region. In the end, we decided that the region tag sufficed to identify the kind of entity whose entry count the meter reports.

Style. After reviewing the naming conventions of meters packaged with Micrometer, we have adopted these style guidelines for naming meters:

  • Brevity. Name the meter using as few words as possible without sacrificing clarity.
  • Prefix. Start the meter's name with the prefix geode to indicate that the meter reports a Geode-specific attribute.
  • Multiple words. Separate words with dots.
  • Capitalization. Spell each word using only lower case letters.

Describe the Meter

Concisely describe the meter, including all key details of your definition.

Example (geode.cache.gets): "Total time and count for GET requests from Java or native clients."

Note how this description identifies an important boundary of measurement: It measures only those GET requests from Java clients and native clients. Including such details in your description helps your audience understand what is included in the measurement and what is excluded.

Identify the Unit of Measure

If the unit of measure is not obvious from the meter name, you may wish to identify the unit of measure.

Define Tags

General advice (details TBD):

  • Add a tag to identify each key scope.
  • Add a tag to represent each key measurement condition. (e.g. hit/miss, failed/succeeded)
  • Take care when adding tags. Each combination of tag values results in a separate meter.
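
To see why tags call for care, it helps to estimate how many meters a set of tags can produce. The sketch below computes an upper bound for one meter name as the product of the number of values each tag can take. The class TagCardinality, the tag keys, and the tag values are assumptions for illustration only.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical helper that estimates how many meters a set of tags can create. */
final class TagCardinality {
  /** Upper bound on distinct meters for one name: the product of per-tag value counts. */
  static long maxMeters(Map<String, List<String>> tagValues) {
    long product = 1;
    for (List<String> values : tagValues.values()) {
      // multiplyExact throws ArithmeticException instead of silently overflowing.
      product = Math.multiplyExact(product, (long) values.size());
    }
    return product;
  }

  public static void main(String[] args) {
    Map<String, List<String>> tags = Map.of(
        "region", List.of("orders", "customers", "inventory"),
        "result", List.of("hit", "miss"));
    // 3 regions x 2 results = 6 meters on each member.
    System.out.println(maxMeters(tags));
  }
}
```

A tag whose values are unbounded, such as a key or a request ID, makes this product grow without limit, so such tags are best avoided.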

Pre-defined tags. Geode's metrics framework automatically adds several tags to each meter:

  • member: The name of the member in which the meter is registered.
  • host: The name of the host on which the member is running.
  • cluster: The ID of the cluster that includes the member.

Rationale. TBD.

Evaluate Sources of Information

General advice (details TBD):

  • Before looking for instrumentation sites:
    • Define the purpose of the meter as well as you can. Without a clear purpose to guide instrumentation, it is distressingly easy to select instrumentation sites that are incomplete, incorrect, or otherwise inappropriate.
    • Define the measurement as well as you can. Without a clear definition of the measurement—and especially a clear definition of the scope of the measurement—it is distressingly easy to select instrumentation sites that are incomplete, incorrect, or otherwise inappropriate.
    • Identify candidate sources of information.
  • To identify candidate sources of information:
    • Identify each class that forms part or all of a boundary, entity, or other scope that you identified in your definition of the measurement.
    • Identify each class that participates in the kind of event or operation that you want to measure.
  • Use extreme caution when considering existing stats classes as candidate sources of information. Existing stats classes:
    • Are never primary sources of information.
    • Are often surprisingly unreliable sources of information.
    • Can be useful starting points for identifying potential sources of information. If a stats class appears to report some or all of the desired measurement:
      • Identify the stats class methods that update the stat.
      • Identify the Geode code that calls those methods.
      • Consider each caller of those methods (and not the stats class itself) a candidate source of information.
  • Ask of each candidate source of information:
    • Does this source already compute exactly the quantity you want to measure?
    • Does this source know all of the information required to make and report a measurement?
    • Does this source already apply the desired selection criteria to decide whether and how to report a measurement?
    • Does this source observe all of the events you want to measure? If not, you will need to identify sites that observe the remaining events.
    • Does this source observe only the events you want to measure? If not, does the site have sufficient information to decide whether it is the kind of event you want to measure?
  • The challenge is to find a set of instrumentation sites that, together, observe all and only those events you want to measure, with sufficient information to select whether and how to report the measurement as you have defined it, for the purpose you have described.

Rationale. TBD.

Instrument the Code

General advice (details TBD):

  • Prefer constructing and interacting with meters only in stats classes.
    • You may need to create a new stats class to implement your measurements.
    • You may need to add a meter registry parameter to an existing stats class's constructor.
    • You may need to add new instrumentation methods to an existing stats class.
    • You may need to add new parameters to existing stats instrumentation methods.
    • Edit or add the stats class's close() method to remove all meters from the registry and close them.
    • Ensure the owner of the stats calls stats.close() when the subject of the stats is destroyed, removed, or closed.
  • Micrometer Counters and Timers record measurements in response to method calls.
  • Micrometer Gauges, FunctionCounters, and FunctionTimers do not record measurements, but instead fetch them from some source on demand.
    • If a single source already computes the desired quantity (or can be appropriately made to do so), use a Gauge, FunctionCounter, or FunctionTimer that fetches its measurements from that source.
  • If a counter or timer reports exactly the same value as a stat, use a LegacyStatCounter or LegacyStatTimer to link the meter and the stat.
  • If a gauge reports exactly the same value as a stat, construct the gauge with a supplier that reads the stat's value.
  • Do not compute rates. Let the user's monitoring system do that.
  • Do not compute aggregates. Let the user's monitoring system do that.
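
The difference between meters that record measurements and meters that fetch them on demand can be sketched with JDK types alone. In this sketch a LongAdder stands in for a Micrometer Counter and a LongSupplier stands in for the function behind a Gauge. The class RegionStatsSketch and its methods are hypothetical. This is not Geode's stats code and it does not call Micrometer.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.LongSupplier;

/** Hypothetical stats holder contrasting a recorded meter with an on-demand meter. */
final class RegionStatsSketch {
  // The source of truth for the entry count.
  private final Map<String, String> entries = new ConcurrentHashMap<>();

  // Recorded meter: its value changes only when instrumented code calls put().
  private final LongAdder puts = new LongAdder();

  // On-demand meter: stores no value and asks the source each time it is read.
  private final LongSupplier entryCount = () -> entries.size();

  void put(String key, String value) {
    entries.put(key, value);
    puts.increment(); // the instrumentation call that records the event
  }

  void remove(String key) {
    entries.remove(key); // no instrumentation needed for the on-demand meter
  }

  long putsRecorded() {
    return puts.sum();
  }

  long entriesNow() {
    return entryCount.getAsLong();
  }

  public static void main(String[] args) {
    RegionStatsSketch stats = new RegionStatsSketch();
    stats.put("a", "1");
    stats.put("b", "2");
    stats.put("a", "3"); // overwrites "a": another put, but no new entry
    stats.remove("b");
    System.out.println("puts=" + stats.putsRecorded());
    System.out.println("entries=" + stats.entriesNow());
  }
}
```

Because the entry count already exists in one source, fetching it on demand avoids a second, separately maintained number that could drift from the truth. The put count has no such source, so it must be recorded at the site that observes each put.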

Rationale. TBD.
