"unable to decode an event from the watch stream" during cluster update
General
Escalation
Description
Hi,
During a cluster-wide (CW) update of the operator from 1.16.2 to 1.17.0, errors appear in the 1.16.2 operator log. This happens both when applying the bundle and when applying crd.yaml, rbac.yaml, and operator.yaml separately. The update itself finishes OK.
STR 1 - use bundles:
1. Update cw-bundle.yaml 1.16.2 with WATCH_NAMESPACE and apply the bundle in the psmdb-operator namespace:
     - name: WATCH_NAMESPACE
       value: "psmdb"
2. Start a psmdb cluster using the 1.16.2 cr.yaml.
3. Update cw-bundle.yaml 1.17.0 with the same WATCH_NAMESPACE and apply cw-bundle.yaml 1.17.0 in the psmdb-operator namespace.
The 1.16.2 Operator Pod log contains these errors during shutdown:
2024-09-05T19:49:39.669Z INFO added to shard {"controller": "psmdb-controller", "object": {"name":"my-cluster-name","namespace":"psmdb2"}, "namespace": "psmdb2", "name": "my-cluster-name", "reconcileID": "6f394093-4e1d-47cc-94ec-160c03c595b0", "rs": "rs0"}
2024-09-05T19:55:57.521Z INFO Stopping and waiting for non leader election runnables
2024-09-05T19:55:57.521Z INFO Stopping and waiting for leader election runnables
2024-09-05T19:55:57.521Z INFO Shutdown signal received, waiting for all workers to finish {"controller": "psmdbbackup-controller"}
2024-09-05T19:55:57.521Z INFO Shutdown signal received, waiting for all workers to finish {"controller": "psmdbrestore-controller"}
2024-09-05T19:55:57.521Z INFO Shutdown signal received, waiting for all workers to finish {"controller": "psmdb-controller"}
2024-09-05T19:55:57.521Z INFO All workers finished {"controller": "psmdbbackup-controller"}
2024-09-05T19:55:57.521Z INFO All workers finished {"controller": "psmdb-controller"}
2024-09-05T19:55:57.521Z INFO All workers finished {"controller": "psmdbrestore-controller"}
2024-09-05T19:55:57.521Z INFO Stopping and waiting for caches
W0905 19:55:57.522159 1 reflector.go:470] pkg/mod/k8s.io/client-go@v0.30.0/tools/cache/reflector.go:232: watch of *v1.PerconaServerMongoDB ended with: an error on the server ("unable to decode an event from the watch stream: context canceled") has prevented the request from succeeding
2024-09-05T19:55:57.522Z INFO Stopping and waiting for webhooks
2024-09-05T19:55:57.522Z INFO Stopping and waiting for HTTP servers
W0905 19:55:57.522364 1 reflector.go:470] pkg/mod/k8s.io/client-go@v0.30.0/tools/cache/reflector.go:232: watch of *v1.PerconaServerMongoDBBackup ended with: an error on the server ("unable to decode an event from the watch stream: context canceled") has prevented the request from succeeding
2024-09-05T19:55:57.522Z INFO controller-runtime.metrics Shutting down metrics server with timeout of 1 minute
2024-09-05T19:55:57.522Z INFO shutting down server {"name": "health probe", "addr": "[::]:8081"}
2024-09-05T19:55:57.522Z INFO Wait completed, proceeding to shutdown the manager
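To check whether these reflector warnings only show up after the shutdown sequence starts, the saved operator log can be scanned with a small script. This is a rough sketch, not part of the operator: the function name `watch_warnings` is made up for illustration, and the two patterns are simply copied from the log lines quoted above.

```python
import re

# Marker copied from the first shutdown line in the 1.16.2 log above.
SHUTDOWN_MARKER = "Stopping and waiting for non leader election runnables"

# Matches the reflector warning text quoted above and captures the watched kind.
WATCH_WARNING = re.compile(
    r"watch of \*v1\.(\w+) ended with: .*"
    r"unable to decode an event from the watch stream: context canceled"
)


def watch_warnings(log_text):
    """Return a list of (kind, after_shutdown) for each matching warning.

    after_shutdown is True when the warning appears after the shutdown
    marker line in the same log text.
    """
    results = []
    shutting_down = False
    for line in log_text.splitlines():
        if SHUTDOWN_MARKER in line:
            shutting_down = True
        match = WATCH_WARNING.search(line)
        if match:
            results.append((match.group(1), shutting_down))
    return results
```

If every returned entry has `after_shutdown` set to `True`, the warnings are at least consistent with watches being cancelled while the old operator Pod shuts down. That would not by itself prove they are harmless.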
The 1.17.0 log looks okay:
% k logs -f percona-server-mongodb-operator-59b4bccf5f-b5hrr
2024-09-05T19:55:56.570Z INFO setup Manager starting up {"gitCommit": "5019408f1fe40483fc5effaf61ab3f672765b189", "gitBranch": "release-1-17-0", "goVersion": "go1.22.6", "os": "linux", "arch": "amd64"}
2024-09-05T19:55:56.602Z INFO server version {"platform": "kubernetes", "version": "v1.29.7-gke.1104000"}
2024-09-05T19:55:56.613Z INFO starting server {"name": "health probe", "addr": "[::]:8081"}
2024-09-05T19:55:56.613Z INFO controller-runtime.metrics Starting metrics server
I0905 19:55:56.613552 1 leaderelection.go:250] attempting to acquire leader lease psmdb-operator2/08db0feb.percona.com...
2024-09-05T19:55:56.613Z INFO controller-runtime.metrics Serving metrics server {"bindAddress": ":8080", "secure": false}
I0905 19:56:16.087228 1 leaderelection.go:260] successfully acquired lease psmdb-operator2/08db0feb.percona.com
2024-09-05T19:56:16.088Z INFO Starting EventSource {"controller": "psmdbrestore-controller", "source": "kind source: *v1.PerconaServerMongoDBRestore"}
2024-09-05T19:56:16.089Z INFO Starting EventSource {"controller": "psmdbrestore-controller", "source": "kind source: *v1.Pod"}
2024-09-05T19:56:16.088Z INFO Starting EventSource {"controller": "psmdb-controller", "source": "kind source: *v1.PerconaServerMongoDB"}
2024-09-05T19:56:16.089Z INFO Starting EventSource {"controller": "psmdbbackup-controller", "source": "kind source: *v1.PerconaServerMongoDBBackup"}
2024-09-05T19:56:16.092Z INFO Starting EventSource {"controller": "psmdbbackup-controller", "source": "kind source: *v1.Pod"}
2024-09-05T19:56:16.092Z INFO Starting Controller {"controller": "psmdbbackup-controller"}
2024-09-05T19:56:16.092Z INFO Starting Controller {"controller": "psmdb-controller"}
2024-09-05T19:56:16.093Z INFO Starting Controller {"controller": "psmdbrestore-controller"}
2024-09-05T19:56:16.345Z INFO Starting workers {"controller": "psmdbrestore-controller", "worker count": 1}
2024-09-05T19:56:16.405Z INFO Starting workers {"controller": "psmdbbackup-controller", "worker count": 1}
2024-09-05T19:56:16.405Z INFO Starting workers {"controller": "psmdb-controller", "worker count": 1}
2024-09-05T19:56:17.664Z INFO add new job {"controller": "psmdb-controller", "object": {"name":"my-cluster-name","namespace":"psmdb2"}, "namespace": "psmdb2", "name": "my-cluster-name", "reconcileID": "6acc1f2b-1dbc-47b2-9658-2d327778cce1", "name": "ensure-version/psmdb2/my-cluster-name", "schedule": "0 2 * * *"}