All PostgreSQL backends are frozen with the following stack trace:
1 do_futex_wait.prop,__new_sem_wait_slow.prop.0,PGSemaphoreLock,LWLockAcquire,pgsm_store,pgsm_ExecutorEnd,PortalCleanup,PortalDrop,exec_simple_query,PostgresMain,ServerLoop,PostmasterMain,main
This makes the primary pod unready, but does not cause the "database" container to be restarted, because Patroni still passes the liveness check. As a result the cluster loses servers one by one until a complete outage.
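For reference, a minimal sketch of how the LWLock waits from the trace above can be confirmed, assuming a still-responsive connection and the standard pg_stat_activity view; the DSN and driver choice here are placeholders, not part of the report:

```go
package main

import (
	"context"
	"database/sql"
	"fmt"
	"log"

	_ "github.com/lib/pq" // assumed driver; any database/sql Postgres driver works
)

func main() {
	// Placeholder DSN; point it at the affected instance.
	db, err := sql.Open("postgres", "host=primary dbname=postgres sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Backends blocked inside pgsm_store show up as waiting on an LWLock.
	rows, err := db.QueryContext(context.Background(),
		`SELECT pid, COALESCE(state, ''), wait_event_type, wait_event
		   FROM pg_stat_activity
		  WHERE wait_event_type = 'LWLock'`)
	if err != nil {
		log.Fatal(err)
	}
	defer rows.Close()

	for rows.Next() {
		var pid int
		var state, waitType, waitEvent string
		if err := rows.Scan(&pid, &state, &waitType, &waitEvent); err != nil {
			log.Fatal(err)
		}
		fmt.Printf("pid=%d state=%s wait=%s/%s\n", pid, state, waitType, waitEvent)
	}
	if err := rows.Err(); err != nil {
		log.Fatal(err)
	}
}
```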
Disabling pg_stat_monitor could be a workaround, but the operator re-installs it in all databases on every reconcile loop, and there is no .spec parameter to disable this behavior.
The pg_stat_monitor extension is enabled unconditionally in the reconcile loop.
This makes it impossible to disable the extension in case of serious issues:
https://github.com/percona/percona-postgresql-operator/blob/v2.2.0/internal/controller/postgrescluster/postgres.go#L244
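For illustration only, a hypothetical sketch of the kind of .spec gate that would make this behavior optional instead of unconditional; none of these names come from the operator code (ExtensionsSpec, PGStatMonitor and reconcileExtensions are made up), it only shows the pattern being requested:

```go
// Hypothetical sketch, not the actual operator code at postgres.go#L244.
package postgrescluster

import (
	"context"
	"database/sql"
)

// ExtensionsSpec is a made-up .spec field that would let users opt out.
type ExtensionsSpec struct {
	PGStatMonitor *bool `json:"pgStatMonitor,omitempty"`
}

// reconcileExtensions mirrors the reported behavior: on every reconcile the
// extension is (re-)created in each database, so dropping it manually does
// not stick. The guard below is the switch this issue asks for.
func reconcileExtensions(ctx context.Context, db *sql.DB, spec ExtensionsSpec) error {
	if spec.PGStatMonitor != nil && !*spec.PGStatMonitor {
		// Explicitly disabled: remove the extension instead of re-creating it.
		_, err := db.ExecContext(ctx, "DROP EXTENSION IF EXISTS pg_stat_monitor")
		return err
	}
	// Current behavior: always re-enable, with no way to turn it off.
	_, err := db.ExecContext(ctx, "CREATE EXTENSION IF NOT EXISTS pg_stat_monitor")
	return err
}
```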