Done
Details
Assignee
Dmytro Liakhov
Reporter
C W
Priority
Components
Labels
Needs QA
Yes
Planned Version/s
Fix versions
Story Points
1
Created November 15, 2019 at 4:37 PM
Updated March 6, 2024 at 5:12 AM
Resolved August 29, 2022 at 12:17 PM
User story
As a PMM user, I want to see status information when pmm-agent can't connect to pmm-server, instead of an error message. Additionally, I want to see the connection uptime between the agent and the server during a specified time window (by default it will be 24 hours).
Current behavior
When the user runs the command
pmm-admin status --json
while the agent can't connect to the server, the response looks like this:
Acceptance criteria:
When pmm-agent isn't connected to pmm-server and the command
pmm-admin status --json
is executed, the response should be in JSON format and contain information about the connection uptime. For example:
where the `connection_uptime` field is the percentage of time the agent was connected to the server during a predefined time window (by default it will be 1 hour).
When the agent can't connect to the server, `node_id`, `server_clock_drift`, `server_latency` and the list of available agents will be empty, because the agent can't get this information from the server.
When the connection between the agent and the server is established, the response from
pmm-admin status --json
will have the `up_connected_time` field as well. For example:
Algorithm for calculating connection uptime:
We will store in memory a set of events recording when the connection status changed, like this
For example:
Then we can calculate the connected time as the interval between connected and disconnected events.
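The event storage described above could be sketched roughly as follows. This is only an illustration in Python, not the pmm-agent implementation: the `ConnectionLog` class, its `record` and `connected_intervals` methods, and the `(timestamp, connected)` tuple layout are all hypothetical names chosen for the sketch.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class ConnectionLog:
    """In-memory list of connection status changes (hypothetical helper)."""
    events: list = field(default_factory=list)  # (timestamp, connected) tuples

    def record(self, at: datetime, connected: bool) -> None:
        # Store an event only when the status actually changes.
        if self.events and self.events[-1][1] == connected:
            return
        self.events.append((at, connected))

    def connected_intervals(self, now: datetime) -> list:
        """Return (start, end) pairs during which the connection was up."""
        intervals = []
        start = None
        for at, connected in self.events:
            if connected and start is None:
                start = at
            elif not connected and start is not None:
                intervals.append((start, at))
                start = None
        if start is not None:
            # Still connected: the open interval runs until now.
            intervals.append((start, now))
        return intervals


t0 = datetime(2019, 11, 15, 10, 0)
log = ConnectionLog()
log.record(t0, False)                          # connection failed
log.record(t0 + timedelta(minutes=15), True)   # connection established
log.record(t0 + timedelta(minutes=20), True)   # ignored: status unchanged
log.record(t0 + timedelta(minutes=45), False)  # connection failed again
intervals = log.connected_intervals(now=t0 + timedelta(hours=1))
print(intervals)
```

Recording only status changes keeps the list short: each connected interval is bounded by one "connected" event and the next "disconnected" event, or by the current moment if the connection is still up.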
Here is an example of how it works.
Suppose the connection set contains the events `f1 s1 f2`
where f1 - first event of failed connection
s1 - first event of successful connection
f2 - second event of failed connection
we can calculate the result using the following formula
connection_uptime = time_between(s1, f2) / time_between(f1, now) * 100
where
time_between(s1, f2) - connection up time
time_between(f1, now) - total time between the first event (f1) and the current moment
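As a worked illustration of the ratio of connection up time to total time, here is a minimal Python sketch. It assumes events are kept as chronologically ordered `(timestamp, connected)` tuples. The `connection_uptime` function and the concrete timestamps are hypothetical and only meant to show the arithmetic for the `f1 s1 f2` case. Limiting the calculation to the configured time window is left out.

```python
from datetime import datetime, timedelta


def connection_uptime(events, now):
    """Percentage of time connected between the first event and now.

    events: chronologically ordered (timestamp, connected) tuples.
    """
    if not events:
        return 0.0
    total = (now - events[0][0]).total_seconds()
    if total <= 0:
        return 0.0
    up = 0.0
    # Pair each event with the next one; the last event is paired with now.
    for (at, connected), (next_at, _) in zip(events, events[1:] + [(now, None)]):
        if connected:
            up += (next_at - at).total_seconds()
    return up / total * 100


f1 = datetime(2019, 11, 15, 10, 0)   # first failed connection
s1 = f1 + timedelta(minutes=15)      # first successful connection
f2 = f1 + timedelta(minutes=45)      # second failed connection
now = f1 + timedelta(hours=1)
events = [(f1, False), (s1, True), (f2, False)]
uptime = connection_uptime(events, now)
print(f"{uptime:.1f}%")  # 50.0%
```

Here time_between(s1, f2) is 30 minutes and time_between(f1, now) is 60 minutes, so the agent was connected for half of the observed time.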