Change metric pingdom.response_time from Gauge to Distribution to allow percentiles (p90, p95)

Hello team,

We use the Pingdom <-> Datadog integration. In our context, Pingdom's role is to provide external uptime and latency information.

To fully cover our requirements, we need percentile-based targets such as 'p75/p90/p95 latency should be less than 500 ms'.

With the current implementation (pingdom.response_time metric type = Gauge):

1) We can monitor only the average over a given period of time.

2) We cannot build a latency SLO at all.

To resolve both points, the pingdom.response_time metric type needs to be changed from Gauge to Distribution. The Datadog documentation describes the metric types here: https://docs.datadoghq.com/metrics/types/?tab=distribution#metric-types

It's important to use percentiles instead of averages. This is pretty much the standard these days (e.g. https://sre.google/workbook/implementing-slos/ ).
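
To illustrate why the average is not enough, here is a minimal sketch in plain Python. The sample response times, the 500 ms threshold check, and the `percentile` helper (a simple nearest-rank calculation) are assumptions made up for this example; this is not how Datadog or Pingdom compute percentiles internally.

```python
from statistics import mean


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it.
    p is an integer between 1 and 100."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    # Integer ceiling of p * n / 100, avoiding floating-point rounding.
    rank = max(1, (p * len(ordered) + 99) // 100)
    return ordered[rank - 1]


# Hypothetical check results in milliseconds: 18 fast responses, 2 slow ones.
response_times_ms = [120] * 18 + [2500] * 2

THRESHOLD_MS = 500  # target taken from the 'less than 500 ms' example

avg = mean(response_times_ms)
p90 = percentile(response_times_ms, 90)
p95 = percentile(response_times_ms, 95)

print(f"avg={avg} ms, within target: {avg < THRESHOLD_MS}")
print(f"p90={p90} ms, within target: {p90 < THRESHOLD_MS}")
print(f"p95={p95} ms, within target: {p95 < THRESHOLD_MS}")
```

With these made-up samples the average is 358 ms and looks healthy, while p95 is 2500 ms and clearly violates the target. That tail is exactly what we cannot see with a Gauge.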

We would appreciate it if this could be implemented in the near future.
