QAN agent for MySQL keeps adding workers
Description
How to test
How to document
AFFECTED CS IDs
Attachments
is blocked by
is duplicated by
Smart Checklist
Activity

Kamil Dziedzic August 17, 2018 at 4:18 PMEdited
I just wanted also to note, that as you can see, logs in `/var/log/pmm-mysql-queries-0.log` are different than those in UI. And you can't get both in one place. And the ones in UI are better than those in the file. Kind of an issue if we talk about collecting all PMM logs to logs.zip Maybe we should fix this.

Kamil Dziedzic August 17, 2018 at 4:15 PM
So the logs doesn't look clear. Why multiple "Updated mysql PS_NODE-1" and "Interval out of sequence"? Not even really sure what does that mean. Will need to check this. https://github.com/percona/qan-agent/blob/master/qan/analyzer/mysql/worker/perfschema/worker.go#L332-L338 I suspect that SetConfig func is called several times creating multiple never ending tickers and spawning never ending goroutines. This seems like a bad design. Quick hack could be to add mutex to protect access to w.ticker and w.collectExamples and stop old ticker before replacing it with new one. This would also stop old goroutine before starting new one. The proper solution would be to figure out why we have those strange logs. I think we could firstly ensure SetConfig never starting multiple goroutines and then worry about the other issues.

Lalit Choudhary August 17, 2018 at 3:58 PM
Logs generated after __
qan-agent logs:

Kamil Dziedzic August 17, 2018 at 3:49 PM
Yes. Thanks.
Timestamp | Service | Level | Msg |
---|---|---|---|
2018-08-17T15:37:06.054212859Z | agent | info | Cmd begin: Cmd[Service:agent Cmd:GetDefaults Ts:'0001-01-01 00:00:00 +0000 UTC' User: AgentUUID:6c5864b48f584fb3524eec2f46a41992 Id:8f0e6e7d-8960-458c-483a-6fc60fe42b73] |
2018-08-17T15:06:23.573917784Z | agent | info | Cmd ok: Reply[Cmd:GetDefaults Error:'' Id:4f3959a6-6a15-44ee-513f-f09b20f95b7d] |
2018-08-17T15:06:23.570908034Z | agent | info | Cmd begin: Cmd[Service:agent Cmd:GetDefaults Ts:'0001-01-01 00:00:00 +0000 UTC' User: AgentUUID:6c5864b48f584fb3524eec2f46a41992 Id:4f3959a6-6a15-44ee-513f-f09b20f95b7d] |
2018-08-17T14:35:54.980705125Z | agent | info | Cmd ok: Reply[Cmd:TableInfo Error:'' Id:fc68e0fb-813d-4096-66c5-210e864b6174] |
2018-08-17T14:35:54.979183731Z | agent | info | Cmd ok: Reply[Cmd:Explain Error:'' Id:509b9271-029f-4fb8-719d-533c53fb7e85] |
2018-08-17T14:35:54.978656361Z | agent | info | Cmd begin: Cmd[Service:query Cmd:TableInfo Ts:'0001-01-01 00:00:00 +0000 UTC' User: AgentUUID:6c5864b48f584fb3524eec2f46a41992 Id:fc68e0fb-813d-4096-66c5-210e864b6174] |
2018-08-17T14:35:54.976886043Z | agent | info | Cmd begin: Cmd[Service:query Cmd:Explain Ts:'0001-01-01 00:00:00 +0000 UTC' User: AgentUUID:6c5864b48f584fb3524eec2f46a41992 Id:509b9271-029f-4fb8-719d-533c53fb7e85] |
2018-08-17T14:35:44.232403535Z | agent | info | Cmd ok: Reply[Cmd:TableInfo Error:'' Id:3071558f-23df-464d-5a03-bc3dc1579f00] |
2018-08-17T14:35:44.227670566Z | agent | info | Cmd begin: Cmd[Service:query Cmd:TableInfo Ts:'0001-01-01 00:00:00 +0000 UTC' User: AgentUUID:6c5864b48f584fb3524eec2f46a41992 Id:3071558f-23df-464d-5a03-bc3dc1579f00] |
2018-08-17T14:35:44.227312211Z | agent | info | Cmd ok: Reply[Cmd:Explain Error:'' Id:3d2ed9ea-d044-4d51-6089-76c079521c02] |
2018-08-17T14:35:44.225445995Z | agent | info | Cmd begin: Cmd[Service:query Cmd:Explain Ts:'0001-01-01 00:00:00 +0000 UTC' User: AgentUUID:6c5864b48f584fb3524eec2f46a41992 Id:3d2ed9ea-d044-4d51-6089-76c079521c02] |
2018-08-17T14:35:36.369995572Z | agent | info | Cmd ok: Reply[Cmd:Explain Error:'' Id:3fcaeeca-8338-4360-4703-8318c7cdeb30] |
2018-08-17T14:35:36.368224604Z | agent | info | Cmd ok: Reply[Cmd:TableInfo Error:'' Id:92a6f850-743d-47f6-6922-2879ee2343e2] |
2018-08-17T14:35:36.366756106Z | agent | info | Cmd begin: Cmd[Service:query Cmd:Explain Ts:'0001-01-01 00:00:00 +0000 UTC' User: AgentUUID:6c5864b48f584fb3524eec2f46a41992 Id:3fcaeeca-8338-4360-4703-8318c7cdeb30] |
2018-08-17T14:35:36.366539998Z | agent | info | Cmd begin: Cmd[Service:query Cmd:TableInfo Ts:'0001-01-01 00:00:00 +0000 UTC' User: AgentUUID:6c5864b48f584fb3524eec2f46a41992 Id:92a6f850-743d-47f6-6922-2879ee2343e2] |
2018-08-17T14:35:22.579953701Z | agent | info | Cmd ok: Reply[Cmd:TableInfo Error:'' Id:ea28861f-ad5e-40a8-6521-e2134ecd693f] |
2018-08-17T14:35:22.578276113Z | agent | info | Cmd ok: Reply[Cmd:Explain Error:'' Id:bd6e2e40-24b2-4685-4845-2973f7d7bbf6] |
2018-08-17T14:35:22.576693995Z | agent | info | Cmd begin: Cmd[Service:query Cmd:TableInfo Ts:'0001-01-01 00:00:00 +0000 UTC' User: AgentUUID:6c5864b48f584fb3524eec2f46a41992 Id:ea28861f-ad5e-40a8-6521-e2134ecd693f] |
2018-08-17T14:35:22.57646567Z | agent | info | Cmd begin: Cmd[Service:query Cmd:Explain Ts:'0001-01-01 00:00:00 +0000 UTC' User: AgentUUID:6c5864b48f584fb3524eec2f46a41992 Id:bd6e2e40-24b2-4685-4845-2973f7d7bbf6] |
2018-08-17T14:35:14.097997856Z | agent | info | Cmd ok: Reply[Cmd:Explain Error:'' Id:a50fc649-a40b-413a-435c-58649eabe8d5] |
2018-08-17T14:35:14.096274901Z | agent | info | Cmd ok: Reply[Cmd:TableInfo Error:'' Id:e836b7e6-0a06-4fd0-7095-fb4c9878a161] |
2018-08-17T14:35:14.094839352Z | agent | info | Cmd begin: Cmd[Service:query Cmd:Explain Ts:'0001-01-01 00:00:00 +0000 UTC' User: AgentUUID:6c5864b48f584fb3524eec2f46a41992 Id:a50fc649-a40b-413a-435c-58649eabe8d5] |
2018-08-17T14:35:14.094615462Z | agent | info | Cmd begin: Cmd[Service:query Cmd:TableInfo Ts:'0001-01-01 00:00:00 +0000 UTC' User: AgentUUID:6c5864b48f584fb3524eec2f46a41992 Id:e836b7e6-0a06-4fd0-7095-fb4c9878a161] |
2018-08-17T14:35:06.724229002Z | agent | info | Cmd ok: Reply[Cmd:TableInfo Error:'' Id:0ec21d16-4db8-44f6-627c-231fdf162c53] |
2018-08-17T14:35:06.722599621Z | agent | info | Cmd begin: Cmd[Service:query Cmd:TableInfo Ts:'0001-01-01 00:00:00 +0000 UTC' User: AgentUUID:6c5864b48f584fb3524eec2f46a41992 Id:0ec21d16-4db8-44f6-627c-231fdf162c53] |
2018-08-17T14:35:06.722219443Z | agent | info | Cmd ok: Reply[Cmd:Explain Error:'' Id:f5a8e58e-2c35-4808-68c0-d59aa5462a5f] |
2018-08-17T14:35:06.720269637Z | agent | info | Cmd begin: Cmd[Service:query Cmd:Explain Ts:'0001-01-01 00:00:00 +0000 UTC' User: AgentUUID:6c5864b48f584fb3524eec2f46a41992 Id:f5a8e58e-2c35-4808-68c0-d59aa5462a5f] |
2018-08-17T14:30:19.812788849Z | agent | info | Cmd ok: Reply[Cmd:Explain Error:'' Id:434bb9c3-cadd-450c-5c1f-b10cf1f392da] |
2018-08-17T14:30:19.811283892Z | agent | info | Cmd ok: Reply[Cmd:TableInfo Error:'' Id:d2ef07c7-543f-4949-4195-389ffbf9df05] |
2018-08-17T14:30:19.811078518Z | agent | info | Cmd begin: Cmd[Service:query Cmd:Explain Ts:'0001-01-01 00:00:00 +0000 UTC' User: AgentUUID:6c5864b48f584fb3524eec2f46a41992 Id:434bb9c3-cadd-450c-5c1f-b10cf1f392da] |
2018-08-17T14:30:19.809476961Z | agent | info | Cmd begin: Cmd[Service:query Cmd:TableInfo Ts:'0001-01-01 00:00:00 +0000 UTC' User: AgentUUID:6c5864b48f584fb3524eec2f46a41992 Id:d2ef07c7-543f-4949-4195-389ffbf9df05] |
2018-08-17T14:30:19.05464536Z | agent | info | Cmd ok: Reply[Cmd:TableInfo Error:'' Id:aab00419-800c-4830-4854-d046f39cfda1] |
2018-08-17T14:30:19.053207739Z | agent | info | Cmd begin: Cmd[Service:query Cmd:TableInfo Ts:'0001-01-01 00:00:00 +0000 UTC' User: AgentUUID:6c5864b48f584fb3524eec2f46a41992 Id:aab00419-800c-4830-4854-d046f39cfda1] |
2018-08-17T14:30:19.050433557Z | agent | info | Cmd ok: Reply[Cmd:Explain Error:'' Id:2ed0dce5-80b6-41cc-655d-9db8a68ad284] |
2018-08-17T14:30:19.048602613Z | agent | info | Cmd begin: Cmd[Service:query Cmd:Explain Ts:'0001-01-01 00:00:00 +0000 UTC' User: AgentUUID:6c5864b48f584fb3524eec2f46a41992 Id:2ed0dce5-80b6-41cc-655d-9db8a68ad284] |
2018-08-17T10:32:15.085008912Z | agent | info | Cmd ok: Reply[Cmd:TableInfo Error:'' Id:d5abd2b6-1ed6-4466-782e-70af1b486b6d] |
2018-08-17T10:32:15.083504025Z | agent | info | Cmd begin: Cmd[Service:query Cmd:TableInfo Ts:'0001-01-01 00:00:00 +0000 UTC' User: AgentUUID:6c5864b48f584fb3524eec2f46a41992 Id:d5abd2b6-1ed6-4466-782e-70af1b486b6d] |
2018-08-17T10:32:15.081924035Z | agent | info | Cmd ok: Reply[Cmd:Explain Error:'' Id:b1f31d47-accf-4d7c-7a74-03c607d71455] |
2018-08-17T10:32:15.080171697Z | agent | info | Cmd begin: Cmd[Service:query Cmd:Explain Ts:'0001-01-01 00:00:00 +0000 UTC' User: AgentUUID:6c5864b48f584fb3524eec2f46a41992 Id:b1f31d47-accf-4d7c-7a74-03c607d71455] |
2018-08-17T10:32:13.056279413Z | agent | info | Cmd ok: Reply[Cmd:TableInfo Error:'' Id:431dea73-c816-4b7c-74df-3fafb41051f6] |
2018-08-17T10:32:13.054923288Z | agent | info | Cmd begin: Cmd[Service:query Cmd:TableInfo Ts:'0001-01-01 00:00:00 +0000 UTC' User: AgentUUID:6c5864b48f584fb3524eec2f46a41992 Id:431dea73-c816-4b7c-74df-3fafb41051f6] |
2018-08-17T10:32:13.053374526Z | agent | info | Cmd ok: Reply[Cmd:Explain Error:'' Id:e7d12907-ca75-4209-76a8-4befd8c6bfb2] |
2018-08-17T10:32:13.051795538Z | agent | info | Cmd begin: Cmd[Service:query Cmd:Explain Ts:'0001-01-01 00:00:00 +0000 UTC' User: AgentUUID:6c5864b48f584fb3524eec2f46a41992 Id:e7d12907-ca75-4209-76a8-4befd8c6bfb2] |
2018-08-17T10:32:12.745563118Z | agent | info | Cmd ok: Reply[Cmd:TableInfo Error:'' Id:8b9f1de2-e9cc-4f22-7598-1206c7c2556e] |
2018-08-17T10:32:12.744128279Z | agent | info | Cmd begin: Cmd[Service:query Cmd:TableInfo Ts:'0001-01-01 00:00:00 +0000 UTC' User: AgentUUID:6c5864b48f584fb3524eec2f46a41992 Id:8b9f1de2-e9cc-4f22-7598-1206c7c2556e] |
2018-08-17T10:32:12.742340651Z | agent | info | Cmd ok: Reply[Cmd:Explain Error:'' Id:3712d821-3354-442c-4541-dda583b46c60] |
2018-08-17T10:32:12.740353464Z | agent | info | Cmd begin: Cmd[Service:query Cmd:Explain Ts:'0001-01-01 00:00:00 +0000 UTC' User: AgentUUID:6c5864b48f584fb3524eec2f46a41992 Id:3712d821-3354-442c-4541-dda583b46c60] |
2018-08-17T10:30:23.301006884Z | agent | info | Cmd ok: Reply[Cmd:TableInfo Error:'' Id:f4656569-9b2c-4975-6d77-e385f3481919] |
2018-08-17T10:30:23.299602067Z | agent | info | Cmd ok: Reply[Cmd:Explain Error:'' Id:59f96000-9c16-4808-7607-bd6898605f90] |
2018-08-17T10:30:23.299354771Z | agent | info | Cmd begin: Cmd[Service:query Cmd:TableInfo Ts:'0001-01-01 00:00:00 +0000 UTC' User: AgentUUID:6c5864b48f584fb3524eec2f46a41992 Id:f4656569-9b2c-4975-6d77-e385f3481919] |
2018-08-17T10:30:23.297510864Z | agent | info | Cmd begin: Cmd[Service:query Cmd:Explain Ts:'0001-01-01 00:00:00 +0000 UTC' User: AgentUUID:6c5864b48f584fb3524eec2f46a41992 Id:59f96000-9c16-4808-7607-bd6898605f90] |
2018-08-17T10:30:13.26221588Z | agent | info | Cmd ok: Reply[Cmd:TableInfo Error:'' Id:c39a1fa7-cde8-4b5d-6442-5d7bfe505ed0] |
2018-08-17T10:30:13.260634415Z | agent | info | Cmd begin: Cmd[Service:query Cmd:TableInfo Ts:'0001-01-01 00:00:00 +0000 UTC' User: AgentUUID:6c5864b48f584fb3524eec2f46a41992 Id:c39a1fa7-cde8-4b5d-6442-5d7bfe505ed0] |
2018-08-17T10:30:13.255427657Z | agent | info | Cmd ok: Reply[Cmd:Explain Error:'' Id:ce5c28b7-1d52-4505-7cab-b231724f5749] |
2018-08-17T10:30:13.253721468Z | agent | info | Cmd begin: Cmd[Service:query Cmd:Explain Ts:'0001-01-01 00:00:00 +0000 UTC' User: AgentUUID:6c5864b48f584fb3524eec2f46a41992 Id:ce5c28b7-1d52-4505-7cab-b231724f5749] |
2018-08-17T10:30:12.911053331Z | agent | info | Cmd ok: Reply[Cmd:TableInfo Error:'' Id:5085b769-faae-4ea7-6b3c-c6cf592a0825] |
2018-08-17T10:30:12.909486376Z | agent | info | Cmd begin: Cmd[Service:query Cmd:TableInfo Ts:'0001-01-01 00:00:00 +0000 UTC' User: AgentUUID:6c5864b48f584fb3524eec2f46a41992 Id:5085b769-faae-4ea7-6b3c-c6cf592a0825] |
2018-08-17T10:30:12.906433142Z | agent | info | Cmd ok: Reply[Cmd:Explain Error:'' Id:6891c0c9-807a-4b97-682e-5d5eab355c4a] |
2018-08-17T10:30:12.904183698Z | agent | info | Cmd begin: Cmd[Service:query Cmd:Explain Ts:'0001-01-01 00:00:00 +0000 UTC' User: AgentUUID:6c5864b48f584fb3524eec2f46a41992 Id:6891c0c9-807a-4b97-682e-5d5eab355c4a] |
2018-08-17T04:51:00.000789534Z | qan-analyzer-mysql-37fe975d-worker | warning | Interval out of sequence: got 1, expected 289 |
2018-08-17T04:50:38.005849644Z | instance-manager | info | Updated mysql PS_NODE-1 |
2018-08-17T04:50:38.00280117Z | qan-analyzer-mysql-37fe975d | info | First interval begins in 22.0 seconds |
2018-08-17T04:50:38.001532186Z | mrms-monitor | info | mysql instance 37fe975d76074b3f7eefeeb965ab7813 restarted |
2018-08-17T00:03:00.000755689Z | qan-analyzer-mysql-37fe975d-worker | warning | Interval out of sequence: got 1, expected 564 |
2018-08-17T00:02:17.004177017Z | instance-manager | info | Updated mysql PS_NODE-1 |
2018-08-17T00:02:17.000905683Z | qan-analyzer-mysql-37fe975d | info | First interval begins in 43.0 seconds |
2018-08-17T00:02:17.000368284Z | mrms-monitor | info | mysql instance 37fe975d76074b3f7eefeeb965ab7813 restarted |
2018-08-16T14:40:00.000735908Z | qan-analyzer-mysql-37fe975d-worker | warning | Interval out of sequence: got 1, expected 244 |
2018-08-16T14:39:45.007974893Z | instance-manager | info | Updated mysql PS_NODE-1 |
2018-08-16T14:39:45.003683058Z | qan-analyzer-mysql-37fe975d | info | First interval begins in 15.0 seconds |
2018-08-16T14:39:45.002318682Z | mrms-monitor | info | mysql instance 37fe975d76074b3f7eefeeb965ab7813 restarted |
2018-08-16T10:37:00.000764525Z | qan-analyzer-mysql-37fe975d-worker | warning | Interval out of sequence: got 1, expected 18 |
2018-08-16T10:36:24.021212004Z | instance-manager | info | Updated mysql PS_NODE-1 |
2018-08-16T10:36:24.003158692Z | qan-analyzer-mysql-37fe975d | info | First interval begins in 36.0 seconds |
2018-08-16T10:36:24.002045892Z | mrms-monitor | info | mysql instance 37fe975d76074b3f7eefeeb965ab7813 restarted |
2018-08-16T10:19:45.91682691Z | qan-analyzer-mysql-37fe975d | info | First interval begins in 14.1 seconds |
2018-08-16T10:19:45.915409358Z | agent | info | Cmd ok: Reply[Cmd:StartTool Error:'' Id:41df0b7f-3051-46b2-6c2a-7f2618007b58] |
2018-08-16T10:19:45.915078776Z | qan | info | config.QAN{UUID:"37fe975d76074b3f7eefeeb965ab7813", CollectFrom:"perfschema", Interval:0x3c, ExampleQueries*bool)(0xc4201a1b2c), MaxSlowLogSize:0, SlowLogRotation*bool)(0xc4201a1b2d), RetainSlowLogs*int)(0xc4201a1b30), Start:[]string(nil), Stop:[]string(nil), ReportLimit:0x0} |
2018-08-16T10:19:45.914992394Z | agent | info | Cmd begin: Cmd[Service:qan Cmd:StartTool Ts:'0001-01-01 00:00:00 +0000 UTC' User:pmm-admin@ip-10-178-0-236 AgentUUID: Id:41df0b7f-3051-46b2-6c2a-7f2618007b58] |
2018-08-16T10:19:44.930532948Z | agent | info | Cmd ok: Reply[Cmd:GetAllConfigs Error:'' Id:13da30eb-c382-4743-6d12-3fb163fa521c] |
2018-08-16T10:19:44.9303082Z | agent | info | Cmd begin: Cmd[Service:agent Cmd:GetAllConfigs Ts:'2018-08-16 10:19:44.930091943 +0000 UTC' User:api AgentUUID: Id:13da30eb-c382-4743-6d12-3fb163fa521c] |
2018-08-16T10:19:44.92264758Z | instance-manager | info | Update all MySQL instances |
2018-08-16T10:19:44.922645095Z | instance-manager | info | Updated mysql PS_NODE-1 |
2018-08-16T10:19:44.917014508Z | log | info | Connected to API |
2018-08-16T10:19:44.916873636Z | agent | info | Connected to API |
2018-08-16T10:19:44.913867393Z | instance-manager | info | Updating MySQL instances |
2018-08-16T10:19:43.913938862Z | agent | info | Connecting to API |
2018-08-16T10:19:43.913782296Z | agent | info | Started version 1.0.5 |
2018-08-16T10:19:43.913282351Z | qan | info | Started |
2018-08-16T10:19:43.913260841Z | query | info | Started |
2018-08-16T10:19:43.913256795Z | data | info | Started |
2018-08-16T10:19:43.91325587Z | data-sender | info | Started |
2018-08-16T10:19:43.913233385Z | data-spooler | info | Started |
2018-08-16T10:19:43.913066761Z | instance-manager | info | Started |
2018-08-16T10:19:43.913062895Z | instance-repo | info | Loaded os ip-10-178-0-236 from /usr/local/percona/qan-agent/instance/ea62c78bb9a7491a6014216c213845d7.json |
2018-08-16T10:19:43.913027676Z | instance-repo | info | Loaded mysql PS_NODE-1 from /usr/local/percona/qan-agent/instance/37fe975d76074b3f7eefeeb965ab7813.json |
2018-08-16T10:19:43.91291674Z | mrms-manager | info | Started |
2018-08-16T10:19:43.912908071Z | log | info | Started |

Lalit Choudhary August 17, 2018 at 3:14 PMEdited
I'm able to reproduce the issue with following steps.
Restarting monitoring service will cause this issue, where a reported query will execute twice per second.
Steps to reproduce:
PMM version 1.13.0
Added mysql monitoring using pmm-admin
2. checking general logs for the query entries
So far a query executing 1/sec
3. Restarting monitoring service
4. checking general logs for query entries (Now query executing 2/sec)
I am using the QAN-agent for my home database in the performance schema mode. Normally, the agent should do this query:
SELECT DIGEST, CURRENT_SCHEMA, SQL_TEXT FROM performance_schema.events_statements_history
at every second since the ticker for the worker is defined as:
w.ticker = time.NewTicker(time.Millisecond * 1000)
At about every hours, it seems that a new worker is instantiated and the rate increase by 1/s (see the graph). I confirmed additional threads using "strace -fp pid-of-qan-agent" After a few weeks, my home database was doing 300+/s such query.