K8SPG-685: Cascading disk fill-up due to WAL file accumulation
Description
Environment: None
Attachments: 1
Details
Assignee: Unassigned
Reporter: Diogo Recharte
Needs QA: Yes
Affects versions: (none listed)
Priority: Medium
Created November 21, 2024 at 2:37 PM
Updated November 21, 2024 at 3:04 PM
Activity
Charly Batista, November 21, 2024 at 3:04 PM
Based on the symptoms, the problem appears to be related to a replication slot. My theory is that a replication slot was used by a replica that may have crashed and was later replaced by the operator, or never replaced, and that this slot prevented the primary from removing old WAL files. Another possibility is that the WAL retention configuration wasn't ideal.
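One way to check the replication slot theory is to look for inactive slots that still hold back WAL on the primary. The sketch below is only an illustration: the `pg_replication_slots` view and the `pg_wal_lsn_diff` / `pg_current_wal_lsn` functions are standard PostgreSQL (10 and later), but the helper name `suspicious_slots`, the 1 GiB threshold, and the sample slot names are assumptions made for this example and do not come from the operator or from the Everest report.

```python
# Hypothetical diagnostic helper; not part of the operator or of Everest.

# Query to run on the primary. It reports, per slot, how many bytes of WAL
# lie between the current write position and the slot's restart_lsn.
SLOT_QUERY = """
SELECT slot_name,
       active,
       pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) AS retained_bytes
FROM pg_replication_slots;
"""


def suspicious_slots(rows, max_retained_bytes=1024**3):
    """Return the names of inactive slots retaining more WAL than the threshold.

    rows: iterable of (slot_name, active, retained_bytes) tuples, in the
    column order produced by SLOT_QUERY.
    max_retained_bytes: arbitrary threshold (1 GiB here), an assumption.
    """
    flagged = []
    for slot_name, active, retained_bytes in rows:
        if retained_bytes is None:
            # restart_lsn is NULL: the slot has not reserved any WAL.
            continue
        if not active and retained_bytes > max_retained_bytes:
            flagged.append(slot_name)
    return flagged


if __name__ == "__main__":
    # Made-up sample rows standing in for the query result.
    sample = [
        ("replica_old", False, 5 * 1024**3),
        ("replica_live", True, 5 * 1024**3),
        ("unused_slot", False, None),
    ]
    print(suspicious_slots(sample))
```

An inactive slot with a large `retained_bytes` value would be consistent with the first theory. If no such slot exists, the WAL retention settings (for example `wal_keep_size` or `max_slot_wal_keep_size` on PostgreSQL 13 and later) and the archiving path to the repo host would be the next things to inspect.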
In the Everest GitHub repo, a user reported WAL file accumulation on the repo host that eventually filled up its PVC. This then led the primary and replica pods to fill up their respective PVCs as well, which crashed the cluster.
https://github.com/percona/everest/issues/781
We need to investigate the cause of this WAL file accumulation and assess possible solutions.