pg_stat_monitor hangs primary instance and it's impossible to disable it

Description

The pg_stat_monitor extension is enabled unconditionally in the reconcile loop.
This makes it impossible to disable the extension in case of serious issues:

https://github.com/percona/percona-postgresql-operator/blob/v2.2.0/internal/controller/postgrescluster/postgres.go#L244

All PostgreSQL backends are frozen with the following stack trace:
1 do_futex_wait.prop,__new_sem_wait_slow.prop.0,PGSemaphoreLock,LWLockAcquire,pgsm_store,pgsm_ExecutorEnd,PortalCleanup,PortalDrop,exec_simple_query,PostgresMain,ServerLoop,PostmasterMain,main
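
The trace above appears to be in a collapsed format: a leading count followed by comma-separated frames, innermost first. As a rough sketch under that assumption, the following Python splits such a line and checks whether the backend is waiting on a lightweight lock taken directly from a pg_stat_monitor function. The helper names `parse_collapsed_stack` and `is_pgsm_lock_wait` are illustrative only and do not belong to any existing tool.

```python
from typing import List, Tuple


def parse_collapsed_stack(line: str) -> Tuple[int, List[str]]:
    """Split '<count> frame1,frame2,...' into (count, frames)."""
    count_text, _, frames_text = line.strip().partition(" ")
    return int(count_text), frames_text.split(",")


def is_pgsm_lock_wait(frames: List[str]) -> bool:
    """True when LWLockAcquire is called directly from a pgsm_* function."""
    # Frames are innermost first, so the caller follows the callee.
    for inner, outer in zip(frames, frames[1:]):
        if inner == "LWLockAcquire" and outer.startswith("pgsm_"):
            return True
    return False


# The stack trace quoted in this ticket, copied verbatim.
TRACE = (
    "1 do_futex_wait.prop,__new_sem_wait_slow.prop.0,PGSemaphoreLock,"
    "LWLockAcquire,pgsm_store,pgsm_ExecutorEnd,PortalCleanup,PortalDrop,"
    "exec_simple_query,PostgresMain,ServerLoop,PostmasterMain,main"
)

count, frames = parse_collapsed_stack(TRACE)
print(count, is_pgsm_lock_wait(frames))
```

A check like this could be run over every collapsed line of a stack dump to count how many backends are stuck at the same point.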

This makes the primary pod unready, but does not cause the "database" container to be restarted, because Patroni still passes the liveness check. As a result, the cluster loses servers one by one until a complete outage.

Disabling pg_stat_monitor could be a workaround, but the operator reinstalls it in all databases on every reconcile loop, and there is no .spec parameter to disable this behavior.
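
To illustrate why the workaround does not stick, here is a minimal sketch in Python (not the operator's Go code) of a reconcile step that decides which SQL to run in each database. The `enabled` switch and the function name `reconcile_pgsm` are hypothetical, and the exact statements the operator issues are an assumption based on standard PostgreSQL `CREATE EXTENSION` / `DROP EXTENSION` syntax.

```python
from typing import Dict, List


def reconcile_pgsm(databases: List[str], enabled: bool = True) -> Dict[str, str]:
    """Return the SQL a reconcile pass would run in each database.

    `enabled` is a hypothetical switch; per this ticket, v2.2.0 behaves
    as if it were always True.
    """
    if enabled:
        sql = "CREATE EXTENSION IF NOT EXISTS pg_stat_monitor;"
    else:
        sql = "DROP EXTENSION IF EXISTS pg_stat_monitor;"
    return {db: sql for db in databases}


# Unconditional behavior: every pass re-creates the extension everywhere,
# so a manual DROP EXTENSION is undone on the next reconcile.
print(reconcile_pgsm(["postgres", "app"]))

# Gated behavior: with a switch honored by the loop, the drop persists.
print(reconcile_pgsm(["postgres", "app"], enabled=False))
```

The design point is only that the decision has to be made inside the reconcile loop itself; any change made outside it is overwritten on the next pass.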

Environment

None

AFFECTED CS IDs

CS0041658

Activity

Slava Sarzhan November 9, 2023 at 6:05 PM

This was improved in https://github.com/percona/percona-postgresql-operator/commit/168a5360308703fbfcee7cb929019720bade0f2b
Extensions can now be enabled or disabled via the CR: https://github.com/percona/percona-postgresql-operator/blob/main/deploy/cr.yaml#L294-L296

Nickolay Ihalainen November 7, 2023 at 3:00 PM
Edited

PGSM bug: https://jira.percona.com/browse/PG-646

Done

Details

Assignee: Unassigned
Reporter: Nickolay Ihalainen (Deactivated)
Needs QA: Yes
Fix versions: 2.3.0
Affects versions: 2.2.0
Priority: Critical

Created November 7, 2023 at 12:41 PM

Updated March 8, 2024 at 2:10 PM

Resolved November 27, 2023 at 2:58 PM
