Hello team,
We use the Pingdom <-> Datadog integration. Pingdom's role (in our context) is to provide external uptime and latency information.
To fully cover our requirements, we need a percentile-based approach, such as 'p75/p90/p95 latency should be less than 500 ms'.
With the current implementation (pingdom.response_time metric type = Gauge):
1) We can monitor only the average over some period of time.
2) We cannot build a latency SLO at all.
To resolve these two points, the pingdom.response_time metric type needs to be converted from Gauge to Distribution. The Datadog documentation has some details on this: https://docs.datadoghq.com/metrics/types/?tab=distribution#metric-types
It's important to use percentiles instead of the average, because an average can hide a slow tail of requests. This is standard practice these days (e.g. https://sre.google/workbook/implementing-slos/ ).
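To illustrate why the average is not enough, here is a minimal Python sketch. The sample values are made up for illustration (they are not real Pingdom data), and the `percentile` helper is a hypothetical nearest-rank implementation, not how Datadog computes percentiles for Distribution metrics.

```python
import math

# Hypothetical response times in ms: 90 fast checks and 10 slow ones.
samples_ms = [200] * 90 + [2000] * 10

def percentile(values, p):
    """Nearest-rank percentile: the smallest value such that at least
    p percent of the samples are less than or equal to it."""
    ordered = sorted(values)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]

mean_ms = sum(samples_ms) / len(samples_ms)

print("mean:", mean_ms)                    # mean: 380.0
print("p75:", percentile(samples_ms, 75))  # p75: 200
print("p90:", percentile(samples_ms, 90))  # p90: 200
print("p95:", percentile(samples_ms, 95))  # p95: 2000
```

In this made-up data set the average (380 ms) is under the 500 ms target, while p95 (2000 ms) clearly breaks it. A Gauge that only exposes the average would report this as healthy.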
We would appreciate it if this could be implemented in the near future.