Telemetry: Add meta-metric support

Description

Add meta-metric support to Telemetry using the Golang client library.

How to test

To test telemetry, run PMM Server in a Docker container with the following environment variables:

--env PMM_DEBUG=1
--env PERCONA_TEST_TELEMETRY_DISABLE_SEND=0
--env PERCONA_TEST_TELEMETRY_DISABLE_START_DELAY=0
--env PERCONA_TEST_TELEMETRY_INTERVAL=10s

This causes telemetry to be sent on server start and every 10 seconds after that. Check the PMM ManageD logs to see which metrics and which values were sent.
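As a rough sketch, the settings above can be assembled into a `docker run` argument list. The helper name `dockerRunArgs` and the `<pmm-server-image>` placeholder are assumptions for illustration; the ticket does not name an image or tag, and port and volume options are left out.

```go
package main

import (
	"fmt"
	"strings"
)

// telemetryEnv holds the test settings listed in this ticket.
var telemetryEnv = []string{
	"PMM_DEBUG=1",
	"PERCONA_TEST_TELEMETRY_DISABLE_SEND=0",
	"PERCONA_TEST_TELEMETRY_DISABLE_START_DELAY=0",
	"PERCONA_TEST_TELEMETRY_INTERVAL=10s",
}

// dockerRunArgs (hypothetical helper) builds the arguments for `docker run`.
// The image is supplied by the caller; no specific image or tag is implied.
func dockerRunArgs(image string, env []string) []string {
	args := []string{"run", "--detach"}
	for _, e := range env {
		args = append(args, "--env", e)
	}
	return append(args, image)
}

func main() {
	// "<pmm-server-image>" is a placeholder, not a real image name.
	args := dockerRunArgs("<pmm-server-image>", telemetryEnv)
	fmt.Println("docker " + strings.Join(args, " "))
}
```

The printed command can be copied into a shell once the placeholder is replaced with the image under test.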

1. Run a PMM instance with the telemetry environment variables above.
2. Run a MongoDB instance and add it to monitoring (use the --enable-all-collectors flag so that the exporter collects metrics from all collectors).
3. Wait about 1 minute.
4. Check the PMM ManageD logs to confirm that the following metrics are sent:

  • mongodb_collector_scrape_time_general

  • mongodb_collector_scrape_time_diagnostic_data

  • mongodb_collector_scrape_time_collstats

  • mongodb_collector_scrape_time_dbstats

  • mongodb_collector_scrape_time_indexstats

  • mongodb_collector_scrape_time_top

  • mongodb_collector_scrape_time_replset_status

How to document

None

Activity

Ihor Cherkasov October 20, 2022 at 9:13 AM

Verified on FB – https://github.com/Percona-Lab/pmm-submodules/pull/2793#issuecomment-1275945709

Now I see in the PMM ManageD logs that all the mentioned metrics are sent.

shashank.sinha September 21, 2022 at 1:14 PM

The existing telemetry implementation has two shortcomings when passing meta-metric information:

  1. Telemetry configuration doesn't support any form of templating. This causes two issues. First, it forces users to copy and paste configuration data for every new metric, so any future change requires updating multiple configuration entries instead of a single one. Second, a new collector added to an exporter is not picked up by telemetry automatically, so developers must update the telemetry configuration whenever a collector is added.

  2. Telemetry data exists as simple key-value pairs. It is not possible to add more dimensions to a data point (i.e. more context about the data), which makes the data difficult to use for any sort of analysis. For example, say a collector takes 2s for a scrape run. With this information alone, it is difficult to assert that the collector is running slow: 2s may be acceptable if the collector interacts with 100+ tables during the run but unacceptable if only 1 table is involved. Prometheus provides this dimension through labels, but our telemetry implementation lacks such a capability.
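To make the two shortcomings concrete, here is a minimal sketch of what templating and labels could look like. It is not the actual PMM implementation: the `Sample` type, the `{collector}` placeholder syntax, the `expandTemplate` helper and the `tables` label are all hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// Sample is a hypothetical telemetry data point: a value plus labels
// that give it context, similar to a Prometheus sample.
type Sample struct {
	Name   string
	Value  float64
	Labels map[string]string
}

// expandTemplate produces one metric name per collector from a single
// template entry, so a new collector needs no extra configuration entry.
func expandTemplate(template string, collectors []string) []string {
	names := make([]string, 0, len(collectors))
	for _, c := range collectors {
		names = append(names, strings.ReplaceAll(template, "{collector}", c))
	}
	return names
}

func main() {
	// One template entry instead of one copied entry per collector.
	names := expandTemplate("mongodb_collector_scrape_time_{collector}",
		[]string{"general", "collstats", "dbstats"})
	fmt.Println(names)

	// A 2s scrape is easier to judge once the sample carries context.
	s := Sample{
		Name:   "mongodb_collector_scrape_time",
		Value:  2.0,
		Labels: map[string]string{"collector": "collstats", "tables": "120"},
	}
	fmt.Println(s.Name, s.Labels, s.Value)
}
```

With labels attached, the same 2s value can be read differently for a collector that touched 120 tables than for one that touched a single table.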

Done

Details

Needs QA: Yes

Needs Doc: No

Created August 22, 2022 at 12:32 PM
Updated August 8, 2024 at 5:00 AM
Resolved November 9, 2022 at 8:11 AM