[BUG] ssl-internal secret never gets created if the operator crashes and restarts at some particular point

Description

We find that the controller may never create the `ssl-internal` secret if it crashes in the middle of a reconcile and then restarts. Specifically, the controller uses `reconsileSSL` to create the secrets. The method works as follows:

    func (r *ReconcilePerconaServerMongoDB) reconsileSSL(cr *api.PerconaServerMongoDB) error {
        secretObj := corev1.Secret{}
        err := r.client.Get(context.TODO(), ...) // get the ssl secret
        if err == nil {
            return nil
        } else if !k8serrors.IsNotFound(err) {
            return fmt.Errorf("get secret: %v", err)
        }
        ...
        err = r.createSSLManualy(cr) // create the secrets if the ssl secret is not found
        ...
    }

Inside `createSSLManualy`, the controller creates two secrets, first `ssl` and then `ssl-internal`:

    err = r.client.Create(context.TODO(), &secretObj) // create ssl secret
    ...
    err = r.client.Create(context.TODO(), &secretObjInternal) // create ssl-internal secret

Put simply, the controller:

  1. checks whether the `ssl` secret exists and, if it does not:

  2. creates the `ssl` secret

  3. creates the `ssl-internal` secret

If the operator crashes between steps 2 and 3 and restarts, it faces a dirty state: the `ssl` secret has been created, but the `ssl-internal` secret has not. More importantly, the operator cannot recover from this state, because it checks only whether `ssl` exists when deciding whether to create the two secrets. Since `ssl` already exists, the operator will never create `ssl-internal`.
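The stuck state can be reproduced with a small in-memory model. This is only a sketch of the control flow described in this report, not the operator's code: the `secrets` map, the `reconcileOnce` function, and the `crashAfterSSL` flag are hypothetical names introduced for illustration, and the map stands in for the Kubernetes API.

```go
package main

// secrets models the cluster state as the set of secret names that exist.
// It is a hypothetical stand-in for the Kubernetes API.
type secrets map[string]bool

// reconcileOnce mimics the three-step flow from the report.
// crashAfterSSL simulates the operator dying after the ssl secret is
// created but before the ssl-internal secret is created.
func reconcileOnce(s secrets, crashAfterSSL bool) {
	// Only the ssl secret is checked before deciding to do any work.
	if s["ssl"] {
		return
	}
	// Create the ssl secret.
	s["ssl"] = true
	if crashAfterSSL {
		return // simulated crash: the second create never runs
	}
	// Create the ssl-internal secret.
	s["ssl-internal"] = true
}
```

Calling `reconcileOnce(s, true)` once on an empty map and then `reconcileOnce(s, false)` any number of times leaves `ssl-internal` absent, because every later call returns at the first check.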

Possible Solution
A potential solution is to check for the `ssl` secret and the `ssl-internal` secret separately and create whichever one is missing. We are willing to send a PR to help fix this issue.
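One way the proposed fix might look is sketched below. It is a minimal illustration under stated assumptions, not the operator's implementation: `SecretStore`, `ErrNotFound`, `ensureSecret`, `reconcileSSLSecrets`, and `memStore` are all hypothetical names, the store interface stands in for the Kubernetes client, and certificate generation is left out entirely.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotFound is a hypothetical stand-in for a Kubernetes "not found" error.
var ErrNotFound = errors.New("secret not found")

// SecretStore is a hypothetical, minimal stand-in for the Kubernetes client.
type SecretStore interface {
	Get(name string) error
	Create(name string) error
}

// ensureSecret creates the named secret only if it does not exist yet.
func ensureSecret(s SecretStore, name string) error {
	err := s.Get(name)
	if err == nil {
		return nil // already present, nothing to do
	}
	if !errors.Is(err, ErrNotFound) {
		return fmt.Errorf("get secret %s: %w", name, err)
	}
	if err := s.Create(name); err != nil {
		return fmt.Errorf("create secret %s: %w", name, err)
	}
	return nil
}

// reconcileSSLSecrets checks each secret independently, so a crash between
// the two creates is repaired on the next reconcile.
func reconcileSSLSecrets(s SecretStore, sslName, internalName string) error {
	for _, name := range []string{sslName, internalName} {
		if err := ensureSecret(s, name); err != nil {
			return err
		}
	}
	return nil
}

// memStore is an in-memory SecretStore used only to exercise the sketch.
type memStore map[string]bool

func (m memStore) Get(name string) error {
	if m[name] {
		return nil
	}
	return ErrNotFound
}

func (m memStore) Create(name string) error {
	m[name] = true
	return nil
}
```

Starting from a `memStore` that already contains only the `ssl` secret, a single call to `reconcileSSLSecrets` creates the missing `ssl-internal` entry and leaves the existing one untouched. Because each secret is checked on its own, the reconcile is idempotent regardless of where a previous run stopped.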

Environment

None

Activity

Lalit Choudhary December 2, 2021 at 2:47 PM

sieveteam October 15, 2021 at 1:59 PM

We found the bug using the Sieve tool: https://github.com/sieve-project/sieve

Done

Created October 14, 2021 at 11:52 PM
Updated March 5, 2024 at 4:47 PM
Resolved March 7, 2023 at 3:36 PM
