The self-healing test is failing on PG 14 and lower in step “13-read-from-all-pods”: one pod is still missing data after network loss was introduced to the master pod in the previous step.
Here’s how it looks:
logger.go:42: 16:02:47 | self-healing/13-read-from-all-pods | test step failed 13-read-from-all-pods
case.go:364: failed in step 13-read-from-all-pods
case.go:366: --- ConfigMap:kuttl-test-healthy-barnacle/13-read-from-3
+++ ConfigMap:kuttl-test-healthy-barnacle/13-read-from-3
@@ -4,9 +4,18 @@
   100500
   100501
   100502
-  100503
 kind: ConfigMap
 metadata:
+  managedFields:
+  - apiVersion: v1
+    fieldsType: FieldsV1
+    fieldsV1:
+      f:data:
+        .: {}
+        f:data: {}
+    manager: kubectl-create
+    operation: Update
+    time: "2023-12-19T15:02:16Z"
   name: 13-read-from-3
   namespace: kuttl-test-healthy-barnacle
case.go:366: resource ConfigMap:kuttl-test-healthy-barnacle/13-read-from-3: .data.data: value mismatch, expected:
100500
100501
100502
100503
!= actual:
100500
100501
100502
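For context, the object that the step asserts on, reconstructed from the diff above (the exact field layout of the assert file is an assumption), looks roughly like this:

apiVersion: v1
kind: ConfigMap
metadata:
  name: 13-read-from-3
  namespace: kuttl-test-healthy-barnacle
data:
  data: |
    100500
    100501
    100502
    100503

So the step expects four rows, and the affected pod is missing the last one (100503), which was written around the time the network loss was applied.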
And indeed, if we check manually, the data is still missing:
$ kubectl -n kuttl-test-healthy-barnacle exec pg-client-6cc584874-42gpr -- bash -c 'printf '\''\c myapp \\\ SELECT * from myApp;\n'\'' | psql -v ON_ERROR_STOP=1 -t -q postgres://'\''postgres:8KvTH9RTF6CrCL35yiy0qexO@self-healing-instance1-xkb2-0.self-healing-pods.kuttl-test-healthy-barnacle.svc'\'''
100500
100501
100502
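The same check can be written without the nested quoting. This is just a sketch of an equivalent command; it assumes the same pg-client pod and credentials, and passes the query via psql -c with the database name appended to the URI instead of using printf and \c:

$ kubectl -n kuttl-test-healthy-barnacle exec pg-client-6cc584874-42gpr -- \
    psql -v ON_ERROR_STOP=1 -t -q \
    'postgres://postgres:8KvTH9RTF6CrCL35yiy0qexO@self-healing-instance1-xkb2-0.self-healing-pods.kuttl-test-healthy-barnacle.svc/myapp' \
    -c 'SELECT * FROM myApp;'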
patronictl shows the following status:
bash-4.4$ patronictl list
+ Cluster: self-healing-ha (7314320279136440409) ---------------------------------+---------+---------+----+-----------+
| Member                        | Host                                            | Role    | State   | TL | Lag in MB |
+-------------------------------+-------------------------------------------------+---------+---------+----+-----------+
| self-healing-instance1-l48l-0 | self-healing-instance1-l48l-0.self-healing-pods | Leader  | running |  3 |           |
| self-healing-instance1-ps7w-0 | self-healing-instance1-ps7w-0.self-healing-pods | Replica | running |  3 |         0 |
| self-healing-instance1-xkb2-0 | self-healing-instance1-xkb2-0.self-healing-pods | Replica | running |  2 |        32 |
+-------------------------------+-------------------------------------------------+---------+---------+----+-----------+
So the affected pod has replication lag and is also stuck on a lower timeline (2 vs. 3).
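This can also be confirmed from inside the affected replica. The following is only a sketch: the database container name and local superuser access via psql are assumptions about the pod layout. It queries pg_stat_wal_receiver to see the WAL receiver state and the timeline it is receiving from, and checks that the instance is still in recovery:

$ kubectl -n kuttl-test-healthy-barnacle exec self-healing-instance1-xkb2-0 -c database -- \
    psql -U postgres -t -c \
    "SELECT status, received_tli FROM pg_stat_wal_receiver; SELECT pg_is_in_recovery();"

If the replica never followed the leader onto timeline 3, received_tli would still show 2 (or pg_stat_wal_receiver would return no rows if streaming is broken), matching the patronictl output above.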