
peer-list is not restarting haproxy if a PXC pod is re-created during the DNS TTL

Done

Description

https://github.com/percona/percona-xtradb-cluster-operator/blob/1.11.0-CUSTOM-142/cmd/peer-list/main.go#L141

Kubernetes version: v1.26.7-gke.500

  1. pxc-2 pod deleted & re-created at 2023-12-20T20:46:27Z

  2. the restart + ready transition happened while the DNS cache was still valid

  3. the pxc-monit container of the haproxy pod does not initiate a haproxy reload/restart, because a cached DNS value was used while the PXC pod restarted (see the sketch after this list)

  4. the haproxy container marks pxc-2 as down, because it uses the IP address for checks

  5. the process repeats with the other two pods: pxc-1 at 2023-12-20T20:47:13Z, pxc-0 at 2023-12-20T20:46:27Z

  6. the haproxy servers mark all pxc servers as down; pxc-monit has not reloaded the haproxy containers to flush the DNS cache

  7. full cluster outage
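
For context, the change-detection pattern in peer-list looks roughly like the sketch below. This is a simplified illustration, not the actual operator code; the SRV lookup and the service name are assumptions. The point is that the reload hook fires only when the resolved peer set differs from the previous iteration, so an answer served from a still-valid DNS cache across the pod re-creation means no change is ever observed and haproxy is never reloaded:

package main

import (
	"log"
	"net"
	"sort"
	"strings"
	"time"
)

// resolvePeers returns the sorted peer hostnames behind the headless service.
func resolvePeers(svc string) (string, error) {
	_, srvs, err := net.LookupSRV("", "", svc) // may be answered from a still-valid DNS cache
	if err != nil {
		return "", err
	}
	names := make([]string, 0, len(srvs))
	for _, s := range srvs {
		names = append(names, strings.TrimSuffix(s.Target, "."))
	}
	sort.Strings(names)
	return strings.Join(names, ","), nil
}

func main() {
	const svc = "cluster1-pxc.pxc.svc.cluster.local" // hypothetical headless service name
	var last string
	for {
		peers, err := resolvePeers(svc)
		if err == nil && peers != last {
			// the real tool would run its on-change hook here, which reloads haproxy
			log.Println("peer set changed:", peers)
			last = peers
		}
		time.Sleep(1 * time.Second) // the 1-second loop period discussed in the comments below
	}
}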

A similar issue is described in the haproxy bug tracker:

https://github.com/haproxy/haproxy/issues/1278

The developers suggested tuning the resolver properly:

https://docs.haproxy.org/2.6/configuration.html#5.3

 

Possible solutions:

a) change the haproxy configuration to refresh the PXC IP addresses automatically (sketched below)

b) modify pxc-monit to track IP address + DNS name pairs and do a reload if the IP addresses change
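
A minimal sketch of option (a), using the resolvers syntax from the haproxy 2.6 docs linked above; the section name and parameters are illustrative, and the comments below converge on essentially this configuration:

resolvers kubernetes
  parse-resolv-conf

backend galera-nodes
  server cluster1-pxc-0 cluster1-pxc-0.cluster1-pxc.pxc.svc.cluster.local:3306 resolvers kubernetes check inter 10000 rise 1 fall 2 weight 1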

Environment

None

AFFECTED CS IDs

CS0042115

Details

Needs QA

Yes

Needs Doc

Yes

Created January 2, 2024 at 3:00 PM
Updated June 11, 2024 at 7:57 AM
Resolved January 15, 2024 at 8:35 AM

Activity

Slava Sarzhan, January 15, 2024 at 8:35 AM

Good news. Thank you for your testing. We have included this fix in the next PXCO release.

Nickolay Ihalainen, January 11, 2024 at 2:53 PM

I can confirm that having resolvers kubernetes in .spec.haproxy.configuration and HA_SERVER_OPTIONS: cmVzb2x2ZXJzIGt1YmVybmV0ZXMgY2hlY2sgaW50ZXIgMzAwMDAgcmlzZSAxIGZhbGwgNSB3ZWlnaHQgMQ== (decoded below) solves the issue.
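
For reference, the HA_SERVER_OPTIONS value above is just a base64-encoded server-options string:

$ echo 'cmVzb2x2ZXJzIGt1YmVybmV0ZXMgY2hlY2sgaW50ZXIgMzAwMDAgcmlzZSAxIGZhbGwgNSB3ZWlnaHQgMQ==' | base64 -d
resolvers kubernetes check inter 30000 rise 1 fall 5 weight 1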

  1. Set up a pxc cluster with the resolver enabled

  2. Add cluster1-env-vars-haproxy secret

  3. Freeze peer-list on haproxy-0

  4. Connect from haproxy-0 to localhost (i.e., connected through haproxy on cluster1-haproxy-0)

  5. Delete pxc-0

  6. new queries hang; after the pxc-0 restart, new queries work as expected, even though peer-list does not reload the configuration

ege.gunes, January 9, 2024 at 8:29 AM

I added a custom resolver to the HAProxy config:

resolvers kubernetes
  parse-resolv-conf

and configured servers to use this resolver:

backend galera-nodes
  mode tcp
  option srvtcpka
  balance roundrobin
  option external-check
  external-check command /usr/local/bin/check_pxc.sh
  server cluster1-pxc-0 cluster1-pxc-0.cluster1-pxc.pxc.svc.cluster.local:3306 resolvers kubernetes check inter 10000 rise 1 fall 2 weight 1 on-marked-up shutdown-backup-sessions
  server cluster1-pxc-2 cluster1-pxc-2.cluster1-pxc.pxc.svc.cluster.local:3306 resolvers kubernetes check inter 10000 rise 1 fall 2 weight 1 backup
  server cluster1-pxc-1 cluster1-pxc-1.cluster1-pxc.pxc.svc.cluster.local:3306 resolvers kubernetes check inter 10000 rise 1 fall 2 weight 1 backup

backend galera-admin-nodes
  mode tcp
  option srvtcpka
  balance roundrobin
  option external-check
  external-check command /usr/local/bin/check_pxc.sh
  server cluster1-pxc-0 cluster1-pxc-0.cluster1-pxc.pxc.svc.cluster.local:33062 resolvers kubernetes check inter 10000 rise 1 fall 2 weight 1 on-marked-up shutdown-backup-sessions
  server cluster1-pxc-2 cluster1-pxc-2.cluster1-pxc.pxc.svc.cluster.local:33062 resolvers kubernetes check inter 10000 rise 1 fall 2 weight 1 backup
  server cluster1-pxc-1 cluster1-pxc-1.cluster1-pxc.pxc.svc.cluster.local:33062 resolvers kubernetes check inter 10000 rise 1 fall 2 weight 1 backup

backend galera-replica-nodes
  mode tcp
  option srvtcpka
  balance roundrobin
  option external-check
  external-check command /usr/local/bin/check_pxc.sh
  server cluster1-pxc-0 cluster1-pxc-0.cluster1-pxc.pxc.svc.cluster.local:3306 resolvers kubernetes check inter 10000 rise 1 fall 2 weight 1
  server cluster1-pxc-1 cluster1-pxc-1.cluster1-pxc.pxc.svc.cluster.local:3306 resolvers kubernetes check inter 10000 rise 1 fall 2 weight 1
  server cluster1-pxc-2 cluster1-pxc-2.cluster1-pxc.pxc.svc.cluster.local:3306 resolvers kubernetes check inter 10000 rise 1 fall 2 weight 1

backend galera-mysqlx-nodes
  mode tcp
  option srvtcpka
  balance roundrobin
  option external-check
  external-check command /usr/local/bin/check_pxc.sh
  server cluster1-pxc-0 cluster1-pxc-0.cluster1-pxc.pxc.svc.cluster.local:33060 resolvers kubernetes check inter 10000 rise 1 fall 2 weight 1 on-marked-up shutdown-backup-sessions
  server cluster1-pxc-2 cluster1-pxc-2.cluster1-pxc.pxc.svc.cluster.local:33060 resolvers kubernetes check inter 10000 rise 1 fall 2 weight 1 backup
  server cluster1-pxc-1 cluster1-pxc-1.cluster1-pxc.pxc.svc.cluster.local:33060 resolvers kubernetes check inter 10000 rise 1 fall 2 weight 1 backup

To simulate the same scenario I’m doing the following:

  1. attach gdb to peer-list processes in each HAProxy pod (via `kubectl debug -it --profile general pod/cluster1-haproxy-0 --image debian --target pxc-monit -- bash`)

  2. delete cluster1-pxc-1 and cluster1-pxc-2

  3. wait until ready, then wait an additional 30 seconds (DNS TTL)

  4. continue peer-list execution

  5. peer-list does not detect that the two servers restarted; no messages in the log after continuing

After these steps I observe that svc/cluster1-haproxy-replicas sends queries to cluster1-pxc-1 and cluster1-pxc-2. So the problem seems to be fixed by the custom resolver, but I’m not fully confident in this fix.

The HAProxy docs say that it creates a resolvers section named "default":

At startup, HAProxy tries to generate a resolvers section named "default", if no section was named this way in the configuration. This section is used by default by the httpclient and uses the parse-resolv-conf keyword. If HAProxy failed to generate automatically this section, no error or warning are emitted.

If there’s a resolver called "default" that uses parse-resolv-conf, it should already behave like the above, because I’m not doing anything fancy in my resolver; it also just uses parse-resolv-conf. I’m a bit confused. You can also check it yourself using perconalab/percona-xtradb-cluster-operator:k8spxc-1339-haproxy-4 as the HAProxy image.
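
One way to check which resolvers sections HAProxy actually loaded (including an auto-generated "default" one) is the runtime API's "show resolvers" command, which dumps each resolvers section with its nameservers and query counters. This assumes the stats socket is enabled in the haproxy container and that socat is available; the socket path below is a hypothetical example:

$ echo "show resolvers" | socat stdio unix-connect:/var/run/haproxy.sock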

Nickolay Ihalainen, January 3, 2024 at 10:42 AM

During RCA for such cases we should know how long peer-list spent in a loop iteration. The current period is 1 second, but the real loop time could be significantly longer with slow DNS/network or a slow pxc-monit pod. It would be great to have an info/warning message (with possible throttling, e.g. “occurred N times”) for loops longer than 10-20 seconds.
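
A hypothetical sketch of that instrumentation (not operator code; the threshold and throttling scheme are just examples):

package main

import (
	"log"
	"time"
)

func main() {
	const warnAfter = 15 * time.Second // within the "10-20 seconds" range suggested above
	slow := 0
	for {
		start := time.Now()
		// ... resolve peers, run checks, call the on-change hook ...
		if d := time.Since(start); d > warnAfter {
			slow++
			if slow == 1 || slow%10 == 0 { // simple throttling: report "occurred N times"
				log.Printf("warning: loop iteration took %s (occurred %d times)", d, slow)
			}
		}
		time.Sleep(1 * time.Second) // current loop period
	}
}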

Nickolay Ihalainen, January 3, 2024 at 8:52 AM

Steps to simulate the problem (it’s hard to get exactly the same behavior as on the production system):

Situation A:

  1. Freeze the peer-list process (I’m attaching gdb on haproxy-0; see the sketch after this list)

  2. kubectl delete pod -n pxc cluster1-pxc-1 cluster1-pxc-2

  3. wait until ready, then wait an additional 30 seconds (DNS TTL)

  4. continue peer-list execution

  5. peer-list does not detect that the two servers restarted; no messages in the log after continuing

  6. haproxy-0 does not send any queries to the secondaries (pxc-1 and pxc-2)

  7. this persists indefinitely if pxc remains stable
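
The freeze/continue steps can be reproduced roughly as follows, assuming an ephemeral debug container that shares the pxc-monit process namespace (as in the kubectl debug command from the January 9 comment) and that gdb/procps can be installed in it; the process name match is an assumption:

# inside the debug container attached to the pxc-monit target
apt-get update && apt-get install -y gdb procps
gdb -p "$(pgrep -o peer-list)"   # attaching suspends (freezes) peer-list
# ... delete the PXC pods, wait for ready + the DNS TTL ...
# at the (gdb) prompt: "continue" resumes peer-list, "quit" detaches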

 

Situation B:

  1. Freeze peer-list process (I’m attaching gdb on haproxy-0)

  2. kubectl delete pod -n pxc cluster1-pxc-1 cluster1-pxc-2

  3. wait until ready

  4. kubectl delete pod -n pxc cluster1-pxc-0

  5. haproxy is not available

  6. continue peer-list execution

  7. haproxy is still not ready

  8. the haproxy container is restarted due to the liveness check