Provide metrics on HTTP responses and API endpoint responses in the debug/metrics endpoint
Description
Environment
Activity

Simon Mudd December 11, 2024 at 12:36 PM
In terms of endpoint location I was thinking of just extending: https://orchestrator.example.com/debug/metrics. An explicit API endpoint also works.

Simon Mudd December 11, 2024 at 12:34 PM
I tend to think in MySQL terms. So the 2 a / b look good.
I think it would also be useful to record the latency of each call and sum it over time, similar to MySQL’s P_S.<some_table>.SUM_TIMER_WAIT.
That sum can then be plotted, and the delta between samples shows whether call latency changes over time (the call counters can be collected and compared in the same way). Right now we have no insight into orchestrator load. We call the API quite frequently because orchestrator is integrated into our tooling, so being able to see metrics on the different API endpoints and their behaviour over time would be useful.
I’d see this as: endpoint / response code / { count, latency, success/failure indicator }

Kamil Holubicki October 30, 2024 at 8:37 AM
Hi,
If I understand correctly, what is requested is:
1. Add a new endpoint that provides metrics in the form of JSON.
2. The needed metrics are:
   a. An array of HTTP response codes:
      - HTTP response code: count
   b. An array of all endpoints:
      - endpointN: success count
      - endpointN: failure count
I’m not sure about the total latency gauge and the max latency value. Do you mean two global metrics, or two per endpoint?
Additionally, how should the total latency gauge be calculated? As an average of all previous requests, or an average of the N previous requests?
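
For discussion, one possible shape for the JSON described in the list above is sketched below in Go. The type names and JSON field names (`response_codes`, `endpoints`, `success`, `failures`) are hypothetical and not an existing orchestrator format. Latency fields are left out because their definition is still an open question here.

```go
package main

import "encoding/json"

// endpointCounts holds the per-endpoint success and failure counters.
type endpointCounts struct {
	Success  uint64 `json:"success"`
	Failures uint64 `json:"failures"`
}

// metricsPayload is a hypothetical JSON document for the metrics endpoint.
type metricsPayload struct {
	// ResponseCodes maps an HTTP status code (as a string) to its count.
	ResponseCodes map[string]uint64 `json:"response_codes"`
	// Endpoints maps an endpoint path to its success/failure counters.
	Endpoints map[string]endpointCounts `json:"endpoints"`
}

// renderMetrics serialises the payload; encoding/json sorts map keys,
// so the output is stable between calls.
func renderMetrics(p metricsPayload) (string, error) {
	b, err := json.Marshal(p)
	if err != nil {
		return "", err
	}
	return string(b), nil
}
```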

Aaditya Dubey September 27, 2024 at 11:28 AM
Hi
Thank you for the report and feedback.

Simon Mudd September 26, 2024 at 1:47 PM
I guess endpoint code is at:
web:
http:
Ideally this can be handled by wrapping the endpoint registration in something that can collect these metrics.
Also adding a total latency gauge would be useful as would a max_latency value.
Details
Assignee: Unassigned
Reporter: Simon Mudd
Labels: (none)
Needs QA: Yes
Components: (none)
Priority: Medium

It would be convenient to have counters in the debug/metrics API endpoint broken down by HTTP response code, as well as counters of successes and failures for each endpoint location.
This would allow orchestrator to provide better SLI metrics on its behaviour and make it easier to determine whether orchestrator is healthy for each of the endpoints it serves.
This is a nice-to-have improvement that would give us more confidence in how well orchestrator is behaving.