Replica sets can't connect to config server when TLS is used

Description

I've got a MongoDB 4.0.11 cluster set up with routers, a config server, and two replica sets. Strict SSL checking is used, but I've got the same set of certs on each node. pbm-agent is configured using the URI

PBM_MONGODB_URI=mongodb://pbmAgent:pass@mongo1-cmgo-replica-0.mongo1-cmgo.default.svc.cluster.local:27017/?authSource=admin&replicaSet=mongo1&tls=true&tlsCertificateKeyFile=/data/db/private/client.pem&tlsCAFile=/data/db/private/cacert.pem

and can talk to the local mongod process. pbm is also talking to the config servers. When I execute the pbm backup command, the backup hangs. Looking at the logs, pbm-agent is failing to connect to the config server:

2019/11/22 23:37:20 connect to mongodb: create mongo connection to configsvr: mongo ping: server selection error: server selection timeout

info current topology: Type: Unknown
info Servers:
Addr: mongocnf-cmgo-replica-0.mongocnf-cmgo.default.svc.cluster.local:27017, Type: Unknown, State: Connected, Average RTT: 0, Last error: connection() : connection(mongocnf-cmgo-replica-0.mongocnf-cmgo.default.svc.cluster.local:27017[-187]) unable to decode message length: EOF
23:37:20 docker-entrypoint.sh: LOG:info Addr: mongocnf-cmgo-replica-1.mongocnf-cmgo.default.svc.cluster.local:27017, Type: Unknown, State: Connected, Average RTT: 0, Last error: connection() : connection(mongocnf-cmgo-replica-1.mongocnf-cmgo.default.svc.cluster.local:27017[-186]) unable to decode message length: EOF
23:37:20 docker-entrypoint.sh: LOG:info Addr: mongocnf-cmgo-replica-2.mongocnf-cmgo.default.svc.cluster.local:27017, Type: Unknown, State: Connected, Average RTT: 0, Last error: connection() : connection(mongocnf-cmgo-replica-2.mongocnf-cmgo.default.svc.cluster.local:27017[-185]) unable to decode message length: EOF

On the receiving side, the mongod process is complaining that the incoming connections are not SSL:

I NETWORK [conn9279] Error receiving request from client: SSLHandshakeFailed: The server is configured to only allow SSL connections. Ending connection from 192.168.1.198:60382 (connection id: 9279)

192.168.1.198 is the address of the mongo replica that is doing part of the backup. The other replica set has a node going crazy this way as well.

I don't see anything in the documentation about how to configure the SSL from the replicas to the config server. If they just used the URI that the PBM backup command was given, it would work.
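To illustrate what I mean by reusing the URI, here is a minimal Python sketch. It is not PBM's code; the function name configsvr_uri, the set of options it carries over, and the config server replica set name "mongocnf" are assumptions made up for the example. It keeps the credentials, authSource and TLS options from the agent URI and swaps in the config server hosts:

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Options carried over in this sketch (an assumption, not a list taken from PBM).
CARRIED_OPTIONS = {"authSource", "tls", "ssl", "tlsCertificateKeyFile", "tlsCAFile"}


def configsvr_uri(agent_uri, configsvr_hosts, configsvr_replset):
    """Build a config server URI that reuses the credentials and TLS options of agent_uri."""
    parts = urlsplit(agent_uri)
    # Everything before the last "@" in the authority is "user:password".
    userinfo = parts.netloc.rpartition("@")[0]
    # Keep only the carried-over options; the agent's own replicaSet is dropped.
    options = [(k, v) for k, v in parse_qsl(parts.query) if k in CARRIED_OPTIONS]
    options.append(("replicaSet", configsvr_replset))
    netloc = (userinfo + "@" if userinfo else "") + ",".join(configsvr_hosts)
    # safe="/" keeps the certificate paths readable instead of percent-encoding them.
    return urlunsplit((parts.scheme, netloc, "/", urlencode(options, safe="/"), ""))


if __name__ == "__main__":
    agent = (
        "mongodb://pbmAgent:pass@mongo1-cmgo-replica-0.mongo1-cmgo.default.svc.cluster.local:27017/"
        "?authSource=admin&replicaSet=mongo1&tls=true"
        "&tlsCertificateKeyFile=/data/db/private/client.pem&tlsCAFile=/data/db/private/cacert.pem"
    )
    hosts = [
        "mongocnf-cmgo-replica-%d.mongocnf-cmgo.default.svc.cluster.local:27017" % i
        for i in range(3)
    ]
    # "mongocnf" is a hypothetical replica set name for the config servers.
    print(configsvr_uri(agent, hosts, "mongocnf"))
```

A connection opened with the resulting URI would at least attempt a TLS handshake, which is what the config server's SSLHandshakeFailed message says is missing.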

Environment

CentOS Linux release 7.6.1810 (Core) (docker container)

percona-server-mongodb-server-4.0.12-6.el7.x86_64

percona-backup-mongodb-1.0.0-1.el7.x86_64

Activity

Peter Schwaller January 18, 2020 at 12:42 AM

Please let the bot auto-close this issue (just to actually test this code in production...).

Akira Kurogane December 10, 2019 at 12:54 AM

Hi

TLS testing hasn't been done yet, but I did promise to share once the (testing) package was available.

If you install the percona-release repo management tool and execute sudo percona-release enable tools testing, you will get the version that is currently in testing.

I'm guessing you've already installed percona-release, so I'll just show the step of installing percona-backup-mongodb (which includes pbm, pbm-agent, and the systemd service unit files for them).
Debian/Ubuntu case

On RHEL, use yum instead.

Akira Kurogane December 3, 2019 at 1:19 AM

Sorry, I can't put an exact date on it yet, but testing and release after development stops (the stage we have just entered) can take up to two weeks.

Mike Norton December 2, 2019 at 7:43 PM

Thank you, that would be excellent. I'm hoping the 1.1.0 release isn't too far away; I can't find a schedule.

Akira Kurogane December 2, 2019 at 6:57 AM

Hi, FYI: we think the patch only needs testing. If a build is ready more than a few days ahead of the likely v1.1.0 release date, I'll let you know. Building the master branch code (current git commit = 000c0910) will also work.

Done

Details

Needs QA

Yes

Time tracking

7h logged

Created November 23, 2019 at 12:04 AM
Updated March 5, 2024 at 7:20 PM
Resolved January 15, 2020 at 4:48 PM