Issues
- DNS resolution problem forces haproxy to remove all pxc nodes, including alive ones (K8SPXC-1220, resolved; ege.gunes)
- k8s worker fault causes endless Terminating status for pxc pod and huge messages in logs (K8SPXC-1216; ege.gunes)
- mc: <ERROR> Unable to initialize new alias from the provided credentials. The secret key required to complete authentication could not be found. The region must be specified if this is not the home region for the tenancy. (K8SPXC-1128, resolved; Slava Sarzhan)
- PiTR support for self-signed S3 certificates (K8SPXC-1105, resolved; natalia.marukovich)
- mysqld_exporter is not restarted after monitor mysql password change (K8SPXC-1101, resolved)
- CrashLoopBackOff after password change with password_history or password validation (K8SPXC-1099, resolved; inel.pandzic)
- Operator uses "insecure" passwords not passing validation_plugin policies and password_history (K8SPXC-1097, resolved)
- Defining a sidecarVolume yields a broken StatefulSet (K8SPXC-1048, resolved; Tomislav Plavcic)
- Add labels/annotations to services (K8SPXC-1046, resolved)
- Provide mysqld_exporter image (K8SPXC-1045, resolved)
- Liveness check fails when XtraBackup is running and wsrep_sync_wait is set (K8SPXC-1036, resolved; Slava Sarzhan)
- cert-manager certificate renewal is not working after delete+apply (K8SPXC-1030, resolved; dmitriy.kostiuk)
- Can't backup 20k+ tables database; xbcloud: Failed to upload object. Error: Couldn't connect to server on amazonaws.com (K8SPXC-1024, resolved; Slava Sarzhan)
- Enable super_read_only on replicas (K8SPXC-1009)
- Full backups fail with socat error (K8SPXC-1004, resolved; Tomislav Plavcic)
- Log container starts failing with invalid stream_id, could not append content to multiline context (K8SPXC-1002, resolved)
- Misleading backup finished message (K8SPXC-1000, resolved; Slava Sarzhan)
- PODs are running out of memory (K8SPXC-995, resolved; Slava Sarzhan)
- get-pxc-state uses root connection (K8SPXC-994, resolved; Slava Sarzhan)
- Restore is failing: PXC Cluster and XtraBackup versions are not in sync (K8SPXC-993)
- Creating (SST) backups seems to fail on example configuration (K8SPXC-989, resolved; Slava Sarzhan)
- PITR fails due to incorrect binlog filtering logic (K8SPXC-985, resolved; ege.gunes)
- Port 3307 is missing from services (K8SPXC-980, resolved)
- [BUG] xtradb-operator fails to delete the PVCs and secrets if it crashes and restarts in the middle of deleteStatefulSet() (K8SPXC-979, resolved)
- typo `xtrabcupUser` (K8SPXC-975, resolved; Slava Sarzhan)
- Cannot apply annotations, labels, or resource limitations to backup pods (K8SPXC-965, resolved; Tomislav Plavcic)
- Document that both full backup and binlogs should be on S3 (K8SPXC-960, resolved; dmitriy.kostiuk)
- replicasServiceType set in helm chart not passed through to operator (K8SPXC-957, resolved; Tomislav Plavcic)
- HAProxy doesn't allow connections after minimum size of pxc cluster is formed (K8SPXC-951, resolved)
- MySQL broken after adding a sidecar to smart-updated cluster (K8SPXC-950; Slava Sarzhan)
- Resume doesn't work for pxc cluster (K8SPXC-938, resolved)
- Create secret for system users even if 'secretsName' option is commented in CR (K8SPXC-934, resolved; Slava Sarzhan)
- xtradb operator doesn't apply kube-api-access (volume mount) to pxc statefulset (K8SPXC-930, resolved)
- Failed smart update for one cluster makes the operator unusable for other clusters (K8SPXC-926; Andrii Dema)
- Updating the Percona Operator to 1.9.0 or 1.10.0 does not delete existing backup cronjobs (K8SPXC-925, resolved; dmitriy.kostiuk)
- [BUG] Operator always configures validation webhook with namespace percona (K8SPXC-923, resolved)
- CRDs not deployed by helm chart on createCRD=true (K8SPXC-922)
- Pods are not cleaned up when deleting failed backup resources (K8SPXC-921, resolved)
- Backup jobs fail intermittently (K8SPXC-920, resolved; Slava Sarzhan)
- Error after upgrading to v1.10.0 when configured without a proxy enabled (K8SPXC-919, resolved)
- [BUG] xtradb operator does not delete PVC after scaling down, leading to resource leak (K8SPXC-918)
- XtraBackup fails on primary node, causing SST failure (K8SPXC-912, resolved)
- Operator gets into crashloop on OpenShift (K8SPXC-911, resolved; Tomislav Plavcic)
- Operator constantly prints error msg in the logs (K8SPXC-910, resolved; Mykola Marzhan)
- PITR test issues (K8SPXC-905, resolved; Tomislav Plavcic)
- Operator tries to add SYSTEM_USER privilege on 5.7 for monitor user (K8SPXC-890, resolved; Dmytro Zghoba)
- '/var/lib/mysql/pxc-entrypoint.sh': Permission denied error (K8SPXC-879)
DNS resolution problem forces haproxy to remove all pxc nodes, including alive ones
Activity
Nickolay Ihalainen, June 14, 2023 at 4:25 PM
Hi @Slava Sarzhan, thank you for the suggestion, it was helpful for isolating the issue.
The network policy makes it possible to isolate only the haproxy servers from DNS, and existing connections are still able to access the database.
Scaling the coredns deployment to zero pods disables DNS everywhere, including both the haproxy and PXC pods.
PXC pods do not require DNS for the Galera communication, but every new connection is checked against a reverse DNS lookup. This feature is not useful in Kubernetes due to the random nature of the real application's domain names, and it creates redundant load on the DNS servers.
The permanent solution for unstable DNS setups is to disable this reverse lookup with the skip-name-resolve mysql option:
https://dev.mysql.com/doc/refman/8.0/en/host-cache.html
pxc:
  affinity:
    antiAffinityTopologyKey: kubernetes.io/hostname
  autoRecovery: true
  configuration: |
    [mysqld]
    skip-name-resolve
I think we should mention this in the documentation and close the bug without the peer-list changes being applied to the main tree.
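For illustration, the per-connection reverse lookup that mysqld performs when skip-name-resolve is not set can be mimicked with getent. The IP below is a hypothetical one from the TEST-NET-3 documentation range, chosen because it is guaranteed to have no PTR record; a random pod IP under an unstable DNS setup fails the same way:

```shell
# Mimic the reverse lookup mysqld does for each new connection when
# skip-name-resolve is NOT set. 203.0.113.7 is a documentation-only
# (TEST-NET-3) address, so the PTR lookup is expected to fail, just
# like lookups for ephemeral pod IPs when DNS is down or unstable.
client_ip=203.0.113.7
if getent hosts "$client_ip" >/dev/null 2>&1; then
  lookup_result="resolved"
else
  lookup_result="failed"
fi
echo "reverse lookup for $client_ip: $lookup_result"
```

With skip-name-resolve enabled, mysqld skips this lookup entirely and matches grants by IP, which is why hostname-based account entries must be reviewed before enabling it.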
Slava Sarzhan, June 12, 2023 at 3:05 PM
@Nickolay Ihalainen in my test I scaled kube-dns down to 0 to check it.
Nickolay Ihalainen, June 12, 2023 at 2:19 PM
The previous policy was incorrect (only DNS was allowed instead of allowing mysql):
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-dns
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/component: haproxy
  policyTypes:
    - Egress
  egress:
#    - to:
#        - namespaceSelector:
#            matchLabels:
#              kubernetes.io/metadata.name: kube-system
#          podSelector:
#            matchLabels:
#              k8s-app: kube-dns
#      ports:
#        - port: 53
#          protocol: UDP
#        - port: 53
#          protocol: TCP
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: pxc
          podSelector:
            matchLabels:
              app.kubernetes.io/part-of: percona-xtradb-cluster
Slava Sarzhan, June 8, 2023 at 3:38 PM
@Nickolay Ihalainen I have tried to improve it, but without any results. As far as I can see, the root of the issue is the liveness probe. As soon as I disable coredns, the probe restarts the pod, and the connection to the DB is interrupted. I have played with the haproxy config (using IPs instead of domain names, experimenting with different options), but I did not find a configuration that can work without DNS.
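A minimal sketch of why this happens (a hypothetical probe, not the operator's actual check script): any liveness check that resolves a pod's DNS name and exits non-zero on lookup failure fails closed during a DNS outage, so kubelet restarts an otherwise healthy container:

```shell
# Hypothetical probe sketch: the check resolves a backend's DNS name
# and exits non-zero on lookup failure. When coredns is down, kubelet
# sees an "unhealthy" container and restarts the pod, killing the
# live client connections along with it.
check_backend() {
  getent hosts "$1" >/dev/null 2>&1
}
if check_backend cluster1-pxc-0.cluster1-pxc.pxc.svc.cluster.local; then
  probe_status="healthy"
else
  probe_status="restart"   # DNS outage => probe failure => pod restart
fi
echo "probe verdict: $probe_status"
```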
Nickolay Ihalainen, June 6, 2023 at 6:37 PM
Hi @Slava Sarzhan,
I've made the same test with custom builds; now peer-list seems to work fine, but the haproxy backend checks are failing.
The test case is the same as before:
1. Create a connection to mysql via haproxy
2. Execute queries in a loop using this connection (do not open new mysql connections)
3. Stop coredns or filter port 53 traffic with a NetworkPolicy for the haproxy pod:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-dns
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/component: haproxy
  policyTypes:
    - Egress
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - port: 53
          protocol: UDP
        - port: 53
          protocol: TCP
bash-4.4$ while true ; do echo 'SELECT NOW();' ; sleep 1 ; done|mysql -N -uroot -p$MYSQL_ROOT_PASSWORD -h cluster1-haproxy
...
2023-06-06 17:57:32
2023-06-06 17:57:33
2023-06-06 17:57:34
ERROR 2013 (HY000) at line 153: Lost connection to MySQL server during query
PXC node 10.42.2.15 for backend galera-mysqlx-nodes is ok
[WARNING] (1) : Process 924 exited with code 0 (Exit)
ERROR 2003 (HY000): Can't connect to MySQL server on '10.42.0.7:33062' (111)
The following values are used for PXC node 10.42.0.7 in backend galera-nodes:
wsrep_local_state is ; pxc_maint_mod is ; wsrep_cluster_status is ; 3 nodes are available
PXC node 10.42.0.7 for backend galera-nodes is not ok
ERROR 2003 (HY000): Can't connect to MySQL server on '10.42.1.7:33062' (111)
The following values are used for PXC node 10.42.1.7 in backend galera-nodes:
wsrep_local_state is ; pxc_maint_mod is ; wsrep_cluster_status is ; 3 nodes are available
PXC node 10.42.1.7 for backend galera-nodes is not ok
ERROR 2003 (HY000): Can't connect to MySQL server on '10.42.2.15:33062' (111)
The following values are used for PXC node 10.42.2.15 in backend galera-nodes:
wsrep_local_state is ; pxc_maint_mod is ; wsrep_cluster_status is ; 3 nodes are available
PXC node 10.42.2.15 for backend galera-nodes is not ok
[WARNING] (1) : Process 990 exited with code 0 (Exit)
ERROR 2003 (HY000): Can't connect to MySQL server on '10.42.0.7:33062' (111)
The following values are used for PXC node 10.42.0.7 in backend galera-replica-nodes:
wsrep_local_state is ; pxc_maint_mod is ; wsrep_cluster_status is ; 3 nodes are available
PXC node 10.42.0.7 for backend galera-replica-nodes is not ok
ERROR 2003 (HY000): Can't connect to MySQL server on '10.42.2.15:33062' (111)
The following values are used for PXC node 10.42.2.15 in backend galera-admin-nodes:
wsrep_local_state is ; pxc_maint_mod is ; wsrep_cluster_status is ; 3 nodes are available
PXC node 10.42.2.15 for backend galera-admin-nodes is not ok
[WARNING] (1) : Process 1056 exited with code 0 (Exit)
[WARNING] (1) : Process 1079 exited with code 0 (Exit)
ERROR 2003 (HY000): Can't connect to MySQL server on '10.42.0.7:33062' (111)
The following values are used for PXC node 10.42.0.7 in backend galera-nodes:
wsrep_local_state is ; pxc_maint_mod is ; wsrep_cluster_status is ; 3 nodes are available
PXC node 10.42.0.7 for backend galera-nodes is not ok
[pod/cluster1-haproxy-0/haproxy] wsrep_local_state is 4; pxc_maint_mod is DISABLED; wsrep_cluster_status is Primary; 3 nodes are available
[pod/cluster1-haproxy-0/haproxy] PXC node 10.42.0.7 for backend galera-replica-nodes is ok
[pod/cluster1-haproxy-0/haproxy] [WARNING] (1) : Process 349 exited with code 0 (Exit)
[pod/cluster1-haproxy-0/haproxy] [WARNING] (1) : Process 415 exited with code 0 (Exit)
[pod/cluster1-haproxy-0/haproxy] [WARNING] (19) : kill 332
[pod/cluster1-haproxy-0/haproxy] [WARNING] (1) : Process 335 exited with code 0 (Exit)
[pod/cluster1-haproxy-0/haproxy] [WARNING] (19) : kill 354
[pod/cluster1-haproxy-0/haproxy] [WARNING] (1) : Process 357 exited with code 0 (Exit)
[pod/cluster1-haproxy-0/haproxy] [WARNING] (1) : Process 472 exited with code 0 (Exit)
[pod/cluster1-haproxy-0/haproxy] [WARNING] (19) : kill 363
[pod/cluster1-haproxy-0/haproxy] [WARNING] (1) : Process 366 exited with code 0 (Exit)
[pod/cluster1-haproxy-0/haproxy] [WARNING] (19) : kill 372
[pod/cluster1-haproxy-0/haproxy] [WARNING] (1) : Process 375 exited with code 0 (Exit)
[pod/cluster1-haproxy-0/haproxy] [WARNING] (1) : Process 484 exited with code 0 (Exit)
[pod/cluster1-haproxy-0/haproxy] [WARNING] (19) : kill 381
[pod/cluster1-haproxy-0/haproxy] [WARNING] (1) : Process 384 exited with code 0 (Exit)
[pod/cluster1-haproxy-0/haproxy] [WARNING] (19) : kill 390
[pod/cluster1-haproxy-0/haproxy] [WARNING] (1) : Process 393 exited with code 0 (Exit)
[pod/cluster1-haproxy-0/haproxy] [WARNING] (19) : kill 399
[pod/cluster1-haproxy-0/haproxy] [WARNING] (1) : Process 402 exited with code 0 (Exit)
[pod/cluster1-haproxy-0/haproxy] [WARNING] (19) : kill 420
[pod/cluster1-haproxy-0/haproxy] [WARNING] (1) : Process 423 exited with code 0 (Exit)
[pod/cluster1-haproxy-0/haproxy] [WARNING] (1) : Process 496 exited with code 0 (Exit)
[pod/cluster1-haproxy-0/haproxy] [WARNING] (19) : kill 429
[pod/cluster1-haproxy-0/haproxy] [WARNING] (1) : Process 432 exited with code 0 (Exit)
[pod/cluster1-haproxy-0/haproxy] [WARNING] (19) : kill 438
[pod/cluster1-haproxy-0/haproxy] [WARNING] (1) : Process 441 exited with code 0 (Exit)
[pod/cluster1-haproxy-0/haproxy] [WARNING] (19) : kill 447
[pod/cluster1-haproxy-0/haproxy] [WARNING] (1) : Process 450 exited with code 0 (Exit)
[pod/cluster1-haproxy-0/haproxy] [WARNING] (19) : kill 456
[pod/cluster1-haproxy-0/haproxy] [WARNING] (1) : Process 459 exited with code 0 (Exit)
[pod/cluster1-haproxy-0/haproxy] [WARNING] (1) : Process 517 exited with code 0 (Exit)
[pod/cluster1-haproxy-0/haproxy] [WARNING] (1) : Process 583 exited with code 0 (Exit)
[pod/cluster1-haproxy-0/haproxy] [WARNING] (19) : kill 501
[pod/cluster1-haproxy-0/haproxy] [WARNING] (19) : Server galera-replica-nodes/cluster1-pxc-1 is DOWN, reason: External check timeout, code: 0, check duration: 10002ms. 2 active and 0 backup servers left. 0 sess>
[pod/cluster1-haproxy-0/haproxy] [WARNING] (1) : Process 504 exited with code 0 (Exit)
[pod/cluster1-haproxy-0/haproxy] [WARNING] (19) : kill 522
[pod/cluster1-haproxy-0/haproxy] [WARNING] (19) : Server galera-replica-nodes/cluster1-pxc-2 is DOWN, reason: External check timeout, code: 0, check duration: 10001ms. 1 active and 0 backup servers left. 0 sess>
[pod/cluster1-haproxy-0/haproxy] [WARNING] (1) : Process 525 exited with code 0 (Exit)
[pod/cluster1-haproxy-0/pxc-monit] PXC node cluster1-pxc-0.cluster1-pxc.pxc.svc.cluster.local for backend is ok
[pod/cluster1-haproxy-0/pxc-monit] + for backup_server in ${NODE_LIST_BACKUP[@]}
[pod/cluster1-haproxy-0/pxc-monit] + echo 'shutdown sessions server galera-nodes/cluster1-pxc-1'
[pod/cluster1-haproxy-0/pxc-monit] + socat stdio /etc/haproxy/pxc/haproxy.sock
[pod/cluster1-haproxy-0/pxc-monit] No such server.
[pod/cluster1-haproxy-0/pxc-monit]
[pod/cluster1-haproxy-0/pxc-monit] + for backup_server in ${NODE_LIST_BACKUP[@]}
[pod/cluster1-haproxy-0/pxc-monit] + echo 'shutdown sessions server galera-admin-nodes/cluster1-pxc-1'
[pod/cluster1-haproxy-0/pxc-monit] + socat stdio /etc/haproxy/pxc/haproxy.sock
[pod/cluster1-haproxy-0/pxc-monit] No such server.
[pod/cluster1-haproxy-0/pxc-monit]
[pod/cluster1-haproxy-0/pxc-monit] + for backup_server in ${NODE_LIST_BACKUP[@]}
[pod/cluster1-haproxy-0/pxc-monit] + echo 'shutdown sessions server galera-nodes/cluster1-pxc-2'
[pod/cluster1-haproxy-0/pxc-monit] + socat stdio /etc/haproxy/pxc/haproxy.sock
[pod/cluster1-haproxy-0/pxc-monit] No such server.
[pod/cluster1-haproxy-0/pxc-monit]
[pod/cluster1-haproxy-0/pxc-monit] + for backup_server in ${NODE_LIST_BACKUP[@]}
[pod/cluster1-haproxy-0/pxc-monit] + echo 'shutdown sessions server galera-admin-nodes/cluster1-pxc-2'
[pod/cluster1-haproxy-0/pxc-monit] + socat stdio /etc/haproxy/pxc/haproxy.sock
[pod/cluster1-haproxy-0/pxc-monit] No such server.
[pod/cluster1-haproxy-0/pxc-monit]
[pod/cluster1-haproxy-0/pxc-monit] + '[' -S /etc/haproxy/pxc/haproxy-main.sock ']'
[pod/cluster1-haproxy-0/pxc-monit] + echo reload
[pod/cluster1-haproxy-0/pxc-monit] + socat stdio /etc/haproxy/pxc/haproxy-main.sock
[pod/cluster1-haproxy-0/pxc-monit] + exit 0
[pod/cluster1-haproxy-0/pxc-monit] 2023/06/06 17:55:55 lookup cluster1-pxc on 10.43.0.10:53: read udp 10.42.1.8:44126->10.43.0.10:53: i/o timeout
[pod/cluster1-haproxy-0/pxc-monit] 2023/06/06 17:56:36 lookup cluster1-pxc on 10.43.0.10:53: read udp 10.42.1.8:40623->10.43.0.10:53: i/o timeout
[pod/cluster1-haproxy-0/pxc-monit] 2023/06/06 17:57:17 lookup cluster1-pxc on 10.43.0.10:53: read udp 10.42.1.8:37546->10.43.0.10:53: i/o timeout
[pod/cluster1-haproxy-0/pxc-monit] 2023/06/06 17:57:58 lookup cluster1-pxc on 10.43.0.10:53: read udp 10.42.1.8:38746->10.43.0.10:53: i/o timeout
[pod/cluster1-haproxy-0/pxc-monit] 2023/06/06 17:58:34 lookup cluster1-pxc on 10.43.0.10:53: no such host
[pod/cluster1-haproxy-0/pxc-monit] 2023/06/06 17:58:35 Peer list updated
[pod/cluster1-haproxy-0/pxc-monit] was []
[pod/cluster1-haproxy-0/pxc-monit] now [cluster1-pxc-0.cluster1-pxc.pxc.svc.cluster.local cluster1-pxc-2.cluster1-pxc.pxc.svc.cluster.local]
[pod/cluster1-haproxy-0/pxc-monit] 2023/06/06 17:58:35 execing: /usr/bin/add_pxc_nodes.sh with stdin: cluster1-pxc-0.cluster1-pxc.pxc.svc.cluster.local
[pod/cluster1-haproxy-0/pxc-monit] cluster1-pxc-2.cluster1-pxc.pxc.svc.cluster.local
[pod/cluster1-haproxy-0/pxc-monit] 2023/06/06 17:58:35 Failed to execute /usr/bin/add_pxc_nodes.sh: + main
[pod/cluster1-haproxy-0/pxc-monit] + echo 'Running /usr/bin/add_pxc_nodes.sh'
[pod/cluster1-haproxy-0/pxc-monit] Running /usr/bin/add_pxc_nodes.sh
[pod/cluster1-haproxy-0/pxc-monit] + NODE_LIST=()
[pod/cluster1-haproxy-0/pxc-monit] + NODE_LIST_REPL=()
[pod/cluster1-haproxy-0/pxc-monit] + NODE_LIST_MYSQLX=()
[pod/cluster1-haproxy-0/pxc-monit] + NODE_LIST_ADMIN=()
[pod/cluster1-haproxy-0/pxc-monit] + NODE_LIST_BACKUP=()
[pod/cluster1-haproxy-0/pxc-monit] + firs_node=
[pod/cluster1-haproxy-0/pxc-monit] + firs_node_admin=
[pod/cluster1-haproxy-0/pxc-monit] + main_node=
[pod/cluster1-haproxy-0/pxc-monit] + SERVER_OPTIONS='check inter 10000 rise 1 fall 2 weight 1'
[pod/cluster1-haproxy-0/pxc-monit] + send_proxy=
[pod/cluster1-haproxy-0/pxc-monit] + path_to_haproxy_cfg=/etc/haproxy/pxc
[pod/cluster1-haproxy-0/pxc-monit] + [[ '' = \y\e\s ]]
[pod/cluster1-haproxy-0/pxc-monit] + read pxc_host
[pod/cluster1-haproxy-0/pxc-monit] + '[' -z cluster1-pxc-0.cluster1-pxc.pxc.svc.cluster.local ']'
[pod/cluster1-haproxy-0/pxc-monit] ++ echo cluster1-pxc-0.cluster1-pxc.pxc.svc.cluster.local
[pod/cluster1-haproxy-0/pxc-monit] ++ cut -d . -f -1
[pod/cluster1-haproxy-0/pxc-monit] + node_name=cluster1-pxc-0
[pod/cluster1-haproxy-0/pxc-monit] ++ echo cluster1-pxc-0
[pod/cluster1-haproxy-0/pxc-monit] ++ awk F '{print $NF}'
[pod/cluster1-haproxy-0/pxc-monit] + node_id=0
[pod/cluster1-haproxy-0/pxc-monit] + NODE_LIST_REPL+=("server $node_name $pxc_host:3306 $send_proxy $SERVER_OPTIONS")
[pod/cluster1-haproxy-0/pxc-monit] + '[' x0 == x0 ']'
[pod/cluster1-haproxy-0/pxc-monit] + main_node=cluster1-pxc-0.cluster1-pxc.pxc.svc.cluster.local
[pod/cluster1-haproxy-0/pxc-monit] + firs_node='server cluster1-pxc-0 cluster1-pxc-0.cluster1-pxc.pxc.svc.cluster.local:3306 check inter 10000 rise 1 fall 2 weight 1 on-marked-up shutdown-backup-sessions'
[pod/cluster1-haproxy-0/pxc-monit] + firs_node_admin='server cluster1-pxc-0 cluster1-pxc-0.cluster1-pxc.pxc.svc.cluster.local:33062 check inter 10000 rise 1 fall 2 weight 1 on-marked-up shutdown-backup-sessions'
[pod/cluster1-haproxy-0/pxc-monit] + firs_node_mysqlx='server cluster1-pxc-0 cluster1-pxc-0.cluster1-pxc.pxc.svc.cluster.local:33060 check inter 10000 rise 1 fall 2 weight 1 on-marked-up shutdown-backup-sessions'
[pod/cluster1-haproxy-0/pxc-monit] + continue
[pod/cluster1-haproxy-0/pxc-monit] + read pxc_host
[pod/cluster1-haproxy-0/pxc-monit] + '[' -z cluster1-pxc-2.cluster1-pxc.pxc.svc.cluster.local ']'
[pod/cluster1-haproxy-0/pxc-monit] ++ echo cluster1-pxc-2.cluster1-pxc.pxc.svc.cluster.local
The test case is the same as in:
https://jira.percona.com/browse/K8SPXC-1216
If the dead node runs coredns, it causes DNS timeouts:
kube-system   coredns-7796b77cd4-nz9f9   1/1   Terminating   0   3h6m   10.42.1.2   k3d-ihanick-cluster1-agent-1   <none>   <none>
pxc           cluster1-pxc-2             3/3   Terminating   0   139m   10.42.1.8   k3d-ihanick-cluster1-agent-1   <none>   <none>
2023/03/10 14:58:34 lookup cluster1-pxc on 10.43.0.10:53: read udp 10.42.0.6:55441->10.43.0.10:53: i/o timeout
2023/03/10 14:58:34 Peer list updated
was [cluster1-pxc-0.cluster1-pxc.pxc.svc.cluster.local cluster1-pxc-1.cluster1-pxc.pxc.svc.cluster.local cluster1-pxc-2.cluster1-pxc.pxc.svc.cluster.local]
now []
And all haproxy nodes go down, while the cluster is still alive (it has two pxc members in the ready state):
kubectl -n pxc get pods
NAME                                               READY   STATUS    RESTARTS        AGE
percona-xtradb-cluster-operator-566848cf48-s6lm4   1/1     Running   0               3h4m
cluster1-pxc-1                                     3/3     Running   0               136m
cluster1-pxc-0                                     3/3     Running   0               136m
cluster1-pxc-2                                     3/3     Running   0               138m
cluster1-haproxy-0                                 1/2     Running   1 (2m50s ago)   143m
cluster1-haproxy-2                                 1/2     Running   1 (2m50s ago)   144m
cluster1-haproxy-1                                 1/2     Running   1 (2m50s ago)   144m
There is a similar issue, https://jira.percona.com/browse/K8SPXC-953, but it shouldn't be related to this problem, because here the fault is simulated by killing pxc+haproxy pods, not by putting k8s workers down.
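The failure mode above can be sketched in plain bash (a hypothetical simplification of the Go peer watcher, which rebuilds the backend list from a DNS lookup of the headless service): a failed or timed-out lookup yields an empty peer list, and every backend is removed even though the PXC pods are still alive.

```shell
# Hypothetical sketch of the peer-list logic: the node list is rebuilt
# from a DNS lookup of the headless service name. If the lookup fails
# or times out, the list comes back empty and all backends are removed,
# even though the PXC pods themselves are still running.
resolve_peers() {
  getent hosts "$1" 2>/dev/null | awk '{print $2}'
}
peers=$(resolve_peers cluster1-pxc.pxc.svc.cluster.local)
if [ -z "$peers" ]; then
  action="remove all backends"
else
  action="keep $(echo "$peers" | wc -l) backends"
fi
echo "peer list: [${peers}] -> ${action}"
```

A more defensive design would treat a lookup error differently from an authoritative empty answer, keeping the last known peer list until DNS recovers.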