If an external source changes the annotations of the mongos service, the service gets deleted

Description

Bug description:

On GKE (after version 1.17.6-gke.7), because of [this feature|https://cloud.google.com/kubernetes-engine/docs/concepts/ingress#container-native_load_balancing], services may get annotated with:

cloud.google.com/neg: '{"ingress": true}'

This causes the service to get deleted by this mongos service check from the controller.

The operator then enters an endless loop of creating and deleting the service.

The workaround is simply to add this known annotation to the mongos service "ServiceAnnotations".

Environment

None

Activity

Sergey Pronin September 17, 2021 at 8:48 AM

We are going to fix this with https://jira.percona.com/browse/K8SPSMDB-558.

In a nutshell, we are going to ignore annotation changes.

Sergey Pronin June 24, 2021 at 9:08 AM

Exceptions always introduce complexity and maintenance cost. I believe that when you say "tolerate", you mean that the Operator should not do anything if an annotation was added to the Service manually:

cloud.google.com/neg: '{"ingress": true}'

I don't see how to do this easily without adding some weird logic. Do you have any ideas?

I agree with you though that we should:

  1. Have it documented

    1. say that such manual annotation changes can cause an infinite loop

    2. provide an exact example with Google Cloud

  2. Log such events properly

 

Guillaume Coupelant June 23, 2021 at 7:41 AM

Hello,
I understand this is the intended behavior and the workaround is quite straightforward. But maybe the operator could tolerate extra annotations?

Also, it took me a while to track down the issue. Since the operator was constantly re-creating the service, the resource was always in a failed state with a connection error. It would be nice for the error message to be more accurate.
Or at least update the documentation: I used a standard GKE deployment, so I think others will encounter this issue in the future.

Sergey Pronin June 23, 2021 at 6:55 AM

Hey, thank you for submitting this.

Obviously, this is the intended behavior of the Operator: it monitors the resources and primitives it creates and keeps their state in sync with the Custom Resource.

Adding this annotation to the mongos definition (spec.sharding.mongos.expose.serviceAnnotations) would solve the case.
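For illustration, the workaround might look like the fragment below. This is a minimal sketch, not a complete manifest: only the field path and the annotation value come from this ticket, the rest of the Custom Resource is omitted, and the exact field names should be checked against the Operator version in use.

```yaml
# Fragment of the cluster Custom Resource (other fields omitted).
# Declaring the GKE-added annotation here means the Service the Operator
# expects already matches what GKE produces, so there is no mismatch
# to trigger the delete/re-create loop.
spec:
  sharding:
    mongos:
      expose:
        serviceAnnotations:
          cloud.google.com/neg: '{"ingress": true}'
```

The value is quoted so that YAML treats the JSON as a plain string, matching the annotation as GKE writes it.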

 

Do you have any ideas on how this could be tackled differently?

Guillaume Coupelant June 19, 2021 at 3:18 PM

Sorry, the first link was not formatted properly. Here's the GKE feature: https://cloud.google.com/kubernetes-engine/docs/concepts/ingress#container-native_load_balancing

Duplicate

Details

Created June 19, 2021 at 3:16 PM
Updated March 5, 2024 at 4:51 PM
Resolved September 17, 2021 at 8:48 AM
