Add support for cluster-wide operators

Description

Our operator design currently requires one operator per namespace. We still want this to be the default behavior, but support for a single operator managing many namespaces is a highly requested feature from customers, partners, and community members. We should validate the capability and enable it in how we initialize the operator with the Operator Framework SDK.

Environment

None

Activity

Tomislav Plavcic May 7, 2022 at 8:00 AM

So the issues that one might expect for PSMDB right now are probably similar to what PXC had; just search for "cluster wide" in Jira under the K8SPXC project: https://jira.percona.com/issues/?jql=project%20%3D%20K8SPXC%20AND%20text%20~%20%22cluster%20wide%22%20ORDER%20BY%20created%20DESC%2C%20priority%20DESC%2C%20updated%20DESC

Anyway, our test suite needs to be updated to run the tests in cluster-wide mode, the Helm chart needs an update, exploratory testing needs to be done, etc. There are tasks around this to make it work fully.

Kilian Ries May 6, 2022 at 8:54 AM

Just found out that there is a problem with backup scheduling: psmdb creates a cronjob that should bind to the operator service account, which has privileges to create a PerconaServerMongoDBBackup API object. Normally, when the operator runs in the same namespace, that's no problem because the SA is already created. In Kubernetes it's not possible to bind a pod to a service account in a different namespace (where the operator is deployed). I think that instead of creating / duplicating the SA in the correct namespace, the backup job should be handled like pxc does it. So the PerconaServerMongoDBBackup API object should be created directly by the operator, and the operator should also schedule the backup pod (rather than, as it currently does, just creating a cronjob -> job -> backup pod).

Kilian Ries May 5, 2022 at 3:54 PM

Just tried it and it seems to work. Just follow the guide for pxc cluster-wide mode: https://www.percona.com/doc/kubernetes-operator-for-pxc/cluster-wide.html

 

Just create a cluster-role instead of a role and do:

cw is the cluster-wide role / binding / serviceAccount I created manually beforehand.

Damiano Albani December 22, 2021 at 8:18 PM

I'm curious what kind of (technical) hurdles are expected when implementing this feature.

Because I've tried some things out, and setting WATCH_NAMESPACE to an empty string was pretty much the only change required to get it working.
Actually, it was that plus the fact that a service account needs to be present in each namespace where a PSMDB cluster is running, because cron jobs are configured to use this service account at the moment.
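
To make the WATCH_NAMESPACE behavior described above concrete, here is a minimal Go sketch of how an operator might interpret that variable at startup. It is not taken from the PSMDB operator source: the helper name `watchedNamespaces` is hypothetical, and accepting a comma-separated list of namespaces is an assumption added for illustration. The only behavior grounded in this thread is that an empty value means cluster-wide mode.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// watchedNamespaces interprets a raw WATCH_NAMESPACE value.
// An empty (or whitespace-only) value is treated as "watch every namespace":
// the returned slice is nil and clusterWide is true.
// Splitting on commas is an assumption of this sketch.
func watchedNamespaces(raw string) (namespaces []string, clusterWide bool) {
	for _, ns := range strings.Split(raw, ",") {
		if ns = strings.TrimSpace(ns); ns != "" {
			namespaces = append(namespaces, ns)
		}
	}
	return namespaces, len(namespaces) == 0
}

func main() {
	namespaces, clusterWide := watchedNamespaces(os.Getenv("WATCH_NAMESPACE"))
	if clusterWide {
		fmt.Println("watching all namespaces")
		return
	}
	fmt.Println("watching namespaces:", strings.Join(namespaces, ", "))
}
```

Keeping the decision in one small function would make it straightforward to keep the single-namespace mode as the default while still allowing cluster-wide mode.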

So a solution would be to make the operator create service accounts dynamically in each namespace where it's active.
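
As a rough illustration of that first option, the sketch below generates one ServiceAccount manifest per namespace using only the Go standard library. The account name `psmdb-backup`, the namespaces, and the helper `serviceAccountsFor` are hypothetical; a real operator would more likely create these objects through a Kubernetes client and would also need to bind the required permissions to each account, which this sketch does not cover.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// objectMeta and serviceAccount mirror only the fields of a core/v1
// ServiceAccount that this sketch needs.
type objectMeta struct {
	Name      string `json:"name"`
	Namespace string `json:"namespace"`
}

type serviceAccount struct {
	APIVersion string     `json:"apiVersion"`
	Kind       string     `json:"kind"`
	Metadata   objectMeta `json:"metadata"`
}

// serviceAccountsFor returns one ServiceAccount manifest per namespace,
// all sharing the same account name.
func serviceAccountsFor(name string, namespaces []string) []serviceAccount {
	out := make([]serviceAccount, 0, len(namespaces))
	for _, ns := range namespaces {
		out = append(out, serviceAccount{
			APIVersion: "v1",
			Kind:       "ServiceAccount",
			Metadata:   objectMeta{Name: name, Namespace: ns},
		})
	}
	return out
}

func main() {
	// Hypothetical account name and namespaces, for illustration only.
	for _, sa := range serviceAccountsFor("psmdb-backup", []string{"team-a", "team-b"}) {
		manifest, err := json.MarshalIndent(sa, "", "  ")
		if err != nil {
			panic(err)
		}
		fmt.Println(string(manifest))
	}
}
```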

An alternative solution could also be to have all the cron jobs in the namespace where the operator is running, thereby avoiding the need to create service accounts dynamically.
And by including the namespace in the name of the cron job resources, the uniqueness of cron job names across all PSMDB clusters could be guaranteed, I suppose.
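
One caveat with that naming idea is that plain concatenation can collide (namespace `a-b` with cluster `c` versus namespace `a` with cluster `b-c`), and Kubernetes limits CronJob names to 52 characters. The Go sketch below shows one possible way to handle both by appending a short hash. The function `cronJobName` and the `<namespace>-<cluster>-<task>` layout are assumptions for illustration, not the operator's actual naming scheme.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// maxCronJobNameLen is the CronJob name length limit documented by Kubernetes.
const maxCronJobNameLen = 52

// cronJobName builds a CronJob name from namespace, cluster and backup task.
// The hash is computed over the parts joined with "/", a character that cannot
// appear in Kubernetes object names, so different (namespace, cluster, task)
// triples produce different hash inputs even when their "-" joined forms match.
// The readable prefix is truncated so the full name stays within the limit.
func cronJobName(namespace, cluster, task string) string {
	sum := sha256.Sum256([]byte(namespace + "/" + cluster + "/" + task))
	suffix := hex.EncodeToString(sum[:4]) // 8 hex characters
	base := namespace + "-" + cluster + "-" + task
	if limit := maxCronJobNameLen - len(suffix) - 1; len(base) > limit {
		base = base[:limit]
	}
	return base + "-" + suffix
}

func main() {
	// Hypothetical namespace, cluster and task names.
	fmt.Println(cronJobName("team-a", "my-cluster", "daily-backup"))
}
```

With a truncated hash, uniqueness is very likely rather than strictly guaranteed, so an operator adopting this approach might still want to detect name conflicts when creating the resource.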

What do you think?

Tyler Duzan June 1, 2020 at 5:30 PM

This could have impacts on how we implement Sharding, as well, so noting this with a relation link.

Done

Details

Assignee

Reporter

Needs QA

Yes

Needs Doc

Yes

Time tracking

1h 50m logged

Fix versions

Priority

Created June 1, 2020 at 5:29 PM
Updated March 5, 2024 at 5:08 PM
Resolved September 15, 2022 at 4:55 PM