Backup can change cluster status to error

Description

The problem can be reproduced on OpenShift and on a 4-node k3s cluster.

Steps to reproduce:
Set up a fresh cluster and install the operator.

Set up minio with Helm, merge the backup configuration into cr.yaml, and apply cr.yaml.
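The backup section merged into cr.yaml would look roughly like this (bucket name, credentials secret, and minio endpoint are placeholders assumed for this setup):

```yaml
backup:
  enabled: true
  storages:
    minio:
      type: s3
      s3:
        bucket: backups                                # placeholder bucket name
        credentialsSecret: my-cluster-name-backup-minio # placeholder secret with S3 keys
        endpointUrl: http://minio-service:9000          # in-cluster minio endpoint (assumed)
        region: us-east-1
```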

Apply backups one by one (changing the backup name each time) until the psmdb status becomes error:
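Each on-demand backup is a separate PerconaServerMongoDBBackup resource; a minimal sketch, assuming the cluster is named my-cluster-name and the storage is named minio as above:

```yaml
apiVersion: psmdb.percona.com/v1
kind: PerconaServerMongoDBBackup
metadata:
  name: backup1            # increment for each run: backup2, backup3, ...
spec:
  psmdbCluster: my-cluster-name
  storageName: minio
```

Applying several of these in sequence with kubectl apply reproduces the "apply backups one by one" step.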

While the cluster is in the error state, backups fail.

The operator produces periodic messages with a new error:

While in the error state, mongo clients connect slowly (SSL enabled):

The number of connections grows slowly:

The mongo server is reachable from the operator host:
curl -k https://my-cluster-name-rs0.default.svc.cluster.local:27017/
curl: (52) NSS: client certificate not found (nickname not specified)

The operator has many open TCP connections:

After kubectl delete pod percona-server-mongodb-operator-588db759d-fcgww,
the psmdb status returns to normal and backups are possible again.

The stale connections could be related to a similar issue:
https://jira.percona.com/browse/K8SPSMDB-271

The error can be reproduced consistently on my host when it is idle. If the CPU is busy with other tasks and Kubernetes is slow, the backup does not trigger the psmdb error state or the subsequent backup failures.

Environment

None

AFFECTED CS IDs

CS0012909

Activity

Sergey Pronin October 19, 2020 at 9:56 AM

Nickolay Ihalainen October 14, 2020 at 1:49 PM

The problem does not happen with "allowUnsafeConfigurations: true" (no SSL), and the number of connections stays stable.

Nickolay Ihalainen October 14, 2020 at 1:25 PM

Describe and log output:

Duplicate

Created October 14, 2020 at 1:22 PM
Updated March 5, 2024 at 5:04 PM
Resolved October 19, 2020 at 9:56 AM