DBaaS - cluster created in GKE is not shown correctly

Description

I am using PMM 2.20 (upgraded from 2.15), connected to a GKE cluster I created, where my PXC Operator 1.9 preview is running with a MySQL cluster I created.

The cluster status is not shown correctly (CPU: 0, Memory: 0); see the screenshot.
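For comparison, the resources the cluster itself reports can be checked with something like the following (this is a sketch: it assumes the default namespace, the cluster name dcluster3 from my cr.yaml, and that the pxc short name for the PerconaXtraDBCluster CRD is registered; kubectl top needs metrics-server, which GKE provides):

  kubectl get pxc dcluster3   # cluster status as the operator sees it
  kubectl top pods            # actual CPU and memory usage of the cluster pods

If these show real values while PMM shows CPU: 0 and Memory: 0, the zeros would appear to be a display problem on the PMM/DBaaS side.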

 

I am also getting a red pop-up error message, something about cluster secrets that cannot be found, but I can't read the full text of the message because the pop-up disappears after 5 seconds. (Is the pop-up timeout something the PMM team built, or is it native Grafana? As an interim workaround, can that timeout be increased to maybe 10 seconds to allow more reaction time?)
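To dig into the secrets message from inside the cluster, something along these lines should show whether the secret referenced in my cr.yaml actually exists and what the operator logs about it (a sketch only: the secret and cluster names come from the cr.yaml below, the operator deployment name assumes the default install, and the namespace is assumed to be the current one):

  kubectl get secret my-cluster-secrets                                  # secretsName from cr.yaml
  kubectl describe pxc dcluster3                                         # events often carry the full error text
  kubectl logs deploy/percona-xtradb-cluster-operator | grep -i secret   # operator-side messages about secrets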

 

I created the MySQL cluster following the steps in our docs:

https://www.percona.com/doc/kubernetes-operator-for-pxc/gke.html
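For context, the flow on that page boils down to roughly the following (cluster name and zone are placeholders; the exact flags and extra steps are in the linked doc):

  gcloud container clusters create my-pxc-test --zone us-central1-a --num-nodes=3
  gcloud container clusters get-credentials my-pxc-test --zone us-central1-a
  kubectl apply -f deploy/bundle.yaml   # installs the PXC operator and its CRDs
  kubectl apply -f deploy/cr.yaml       # creates the PXC/MySQL cluster (my cr.yaml is in the comments below)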


How to test

None

How to document

None

Attachments: 1 (10 Aug 2021, 06:30 PM)


Activity

Roma Novikov August 11, 2021 at 1:38 PM

Thanks. All comments are valid. I have a request from the dev team to be more formal about supported versions and compatibility for our DBaaS feature; I will update the documentation accordingly.

Vadim Tkachenko August 11, 2021 at 11:45 AM

Roma,

 

A few comments.

  1. The documentation at https://www.percona.com/doc/percona-monitoring-and-management/2.x/using/dbaas.html does not say which Kubernetes versions are supported and which are not.

  2. The PMM DBaaS page says: "Register new Kubernetes Cluster". Again, it does not mention that only a subset of Kubernetes clusters is acceptable.

Roma Novikov August 11, 2021 at 8:22 AM

Here are several problems with this bug: 

  1. GKE is not yet supported in DBaaS. It is on the roadmap for next year, and it wasn't the top system to be supported in the original requirements.

  2. Support for "clusters created outside of PMM/DBaaS" was out of scope for the DBaaS feature from the beginning. We eventually ended up supporting this in some way, but not officially. I'll follow up with stakeholders to finalize whether we really need this.

  3. The PXC 1.9 operator is not yet supported in the current PMM - https://perconadev.atlassian.net/browse/PMM-8545#icft=PMM-8545

Vadim Tkachenko August 10, 2021 at 6:32 PM

My cr.yaml file:

# Active settings only; commented-out defaults from the operator's cr.yaml template are omitted here.
apiVersion: pxc.percona.com/v1-9-0
kind: PerconaXtraDBCluster
metadata:
  name: dcluster3
  finalizers:
    - delete-pxc-pods-in-order
spec:
  crVersion: 1.9.0
  secretsName: my-cluster-secrets
  vaultSecretName: keyring-secret-vault
  sslSecretName: my-cluster-ssl
  sslInternalSecretName: my-cluster-ssl-internal
  logCollectorSecretName: my-log-collector-secrets
  allowUnsafeConfigurations: false
  updateStrategy: SmartUpdate
  upgradeOptions:
    versionServiceEndpoint: https://check.percona.com
    apply: 8.0-recommended
    schedule: "0 4 * * *"
  pxc:
    size: 3
    image: percona/percona-xtradb-cluster:8.0.23-14.1
    autoRecovery: true
    resources:
      requests:
        memory: 10G
        cpu: "2"
    affinity:
      antiAffinityTopologyKey: "kubernetes.io/hostname"
    podDisruptionBudget:
      maxUnavailable: 1
    volumeSpec:
      persistentVolumeClaim:
        resources:
          requests:
            storage: 20G
    gracePeriod: 600
  haproxy:
    enabled: true
    size: 3
    image: percona/percona-xtradb-cluster-operator:1.9.0-haproxy
    resources:
      requests:
        memory: 1G
        cpu: 200m
    affinity:
      antiAffinityTopologyKey: "kubernetes.io/hostname"
    podDisruptionBudget:
      maxUnavailable: 1
    gracePeriod: 30
  proxysql:
    enabled: false
    size: 3
    image: percona/percona-xtradb-cluster-operator:1.9.0-proxysql
    resources:
      requests:
        memory: 1G
        cpu: 600m
    affinity:
      antiAffinityTopologyKey: "kubernetes.io/hostname"
    volumeSpec:
      persistentVolumeClaim:
        resources:
          requests:
            storage: 2G
    podDisruptionBudget:
      maxUnavailable: 1
    gracePeriod: 30
  logcollector:
    enabled: true
    image: percona/percona-xtradb-cluster-operator:1.9.0-logcollector
    resources:
      requests:
        memory: 200M
        cpu: 100m
  pmm:
    enabled: false
    image: percona/pmm-client:2.18.0
    serverHost: monitoring-service
    serverUser: admin
  backup:
    image: percona/percona-xtradb-cluster-operator:1.9.0-pxc8.0-backup
    pitr:
      enabled: false
      storageName: STORAGE-NAME-HERE
      timeBetweenUploads: 60
    storages:
      s3-us-west:
        type: s3
        s3:
          bucket: S3-BACKUP-BUCKET-NAME-HERE
          credentialsSecret: my-cluster-name-backup-s3
          region: us-west-2
      fs-pvc:
        type: filesystem
        volume:
          persistentVolumeClaim:
            accessModes: [ "ReadWriteOnce" ]
            resources:
              requests:
                storage: 6G
    schedule:
      - name: "sat-night-backup"
        schedule: "0 0 * * 6"
        keep: 3
        storageName: s3-us-west
      - name: "daily-backup"
        schedule: "0 0 * * *"
        keep: 5
        storageName: fs-pvc
Won't Do

Details

Assignee

Reporter

Priority

Needs QA

Yes

Needs Doc

Yes

Created August 10, 2021 at 6:31 PM
Updated October 3, 2024 at 11:03 AM
Resolved October 3, 2024 at 11:03 AM
