Percona XtraDB 8.0.33-25-1 Nodes 2 and 3 cannot join the cluster

Description

To start with, I did not experience this issue with Percona Galera XtraDB 8.0.26. My server specs are below.

I have a Percona Galera XtraDB Cluster with 3 nodes:
miniongal001-dev3
miniongal002-dev3
miniongal003-dev3

Percona Galera XtraDB version : 

 

OS Version : Ubuntu 20 bullseye/sid

When either miniongal00[2,3]-dev3 tries to join the cluster, it fails with the same error message on the joiner

 

I have attached the full log of miniongal002-dev3 showing the error message above.

I have also attached the config file for the node.

 

Environment

None

Attachments

5

Activity

Thomas O'Brien November 20, 2023 at 4:28 PM

We've found the same problem running 8.0.34 as mysql:mysql on Alma EL9 hosts. The SST script cannot dereference the symlink for the exe in /proc. Our patch for now is to comment out the readlink call.

/usr/bin/wsrep_sst_common, lines 294-299:

    if which readlink >/dev/null; then
        # Check to see if the symlink for the exe exists
        if [[ -L /proc/${WSREP_SST_OPT_PARENT}/exe ]]; then
            MYSQLD_PATH=$(readlink -f /proc/${WSREP_SST_OPT_PARENT}/exe)
        fi
    fi
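
A less invasive alternative to commenting the lookup out might be to let readlink fail without aborting the script. The following is only a sketch, not the shipped code: the helper name resolve_exe_path is hypothetical, and falling back to the current shell's PID ($$) when WSREP_SST_OPT_PARENT is unset is an assumption made purely so the snippet runs on its own.

```shell
# Hypothetical helper, not part of wsrep_sst_common.
# Prints the resolved executable path for a PID, or an empty line if the
# /proc symlink is missing or cannot be dereferenced.
resolve_exe_path() {
    pid="$1"
    link="/proc/${pid}/exe"
    resolved=""
    if command -v readlink >/dev/null 2>&1 && [ -L "$link" ]; then
        # The "|| resolved=''" guard keeps a failing readlink from
        # terminating a script that runs under "set -e".
        resolved=$(readlink -f "$link" 2>/dev/null) || resolved=""
    fi
    printf '%s\n' "$resolved"
}

# Assumption: fall back to the current shell's PID for demonstration only.
MYSQLD_PATH=$(resolve_exe_path "${WSREP_SST_OPT_PARENT:-$$}")
```

With this shape an unreadable symlink leaves MYSQLD_PATH empty instead of stopping the SST, which matches the expectation that the script should carry on when the path cannot be determined.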

Kamil Holubicki November 2, 2023 at 10:36 AM

Hi,

Any news?

Kamil Holubicki October 26, 2023 at 6:48 AM
Edited

>We're not sure how this works in AWS with the exact same code. Or... are you running mysql as the root user or something different?

No, I'm using it in the standard way. Installed as described here, then sudo service mysql start / sudo service mysql stop, so the script is executed in the mysql:mysql user context.

I'm glad that it finally worked for you, but I would be thankful if you investigated the root cause of why 'readlink' couldn't be executed. Does it require root privileges?

 

Mikael Gbai October 25, 2023 at 6:02 PM

I worked with our sysadmin today and we were able to make this work. We backed up wsrep_sst_xtrabackup-v2 and edited that script by adding "set +e" right before get_mysqld_path. Below is the diff showing the only line we changed.

Adding set +e tells bash not to exit on failed commands. We noticed a combination of adding (set +e) and removing (set -e) all over the scripts.
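
For anyone following along, here is a small standalone illustration of that behaviour. It is not taken from the SST scripts: the function names are made up for the demo, and `false` simply stands in for any command that fails inside a command substitution.

```shell
# Illustration only: "false" stands in for a failing command.

# With errexit active, the failed assignment ends the script before the echo.
errexit_on() {
    sh -c 'set -e; path=$(false); echo continued'
}

# Turning errexit off first (the workaround described above) lets it continue.
errexit_off() {
    sh -c 'set -e; set +e; path=$(false); echo continued'
}

# A narrower option: guard only the one command and leave errexit on.
guarded() {
    sh -c 'set -e; path=$(false) || path=""; echo continued'
}

errexit_on  || echo "errexit_on stopped early with status $?"
errexit_off
guarded
```

The guarded form keeps set -e protection for the rest of the script, which may be preferable to disabling it globally, though which one fits the SST script is a judgment call.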

We're not sure how this works in AWS with the exact same code. Or... are you running mysql as the root user or something different?

With that line in place in wsrep_sst_xtrabackup-v2, all 3 of our nodes are up and running now. We can now use the latest version, 8.0.33, in our environment.

Kamil Holubicki October 25, 2023 at 12:22 PM

Hi, interesting case indeed.

The interesting part in the provided logs is:

So for some reason, it was not possible to get the mysqld path. Even so, the SST script should have continued, but it didn't.

Could you please:

  1. Remove 'set -ox' from the SST script. It was only there to produce this log; the script will not work with this setting, but we learned something.

  2. Provide from the joiner node:

    1. the output of 'aa-status'

    2. the output of 'journalctl -xe' just after the failure

    3. the output of 'which readlink'

    4. the output of 'ls -la /bin/sh'
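
If it helps, the four outputs can be gathered in one go with something like the sketch below. The function name collect_diagnostics is hypothetical, and --no-pager is added to journalctl (an assumption on my side) so the output can be redirected to a file; aa-status and journalctl may need to be run with sudo to show everything.

```shell
# Hypothetical helper: runs each diagnostic command and labels its output.
# With no arguments it runs the four commands requested above.
collect_diagnostics() {
    if [ "$#" -eq 0 ]; then
        set -- "aa-status" "journalctl -xe --no-pager" \
               "which readlink" "ls -la /bin/sh"
    fi
    for cmd in "$@"; do
        printf '=== %s ===\n' "$cmd"
        # $cmd is left unquoted on purpose so it splits into command + args.
        # A missing or failing command is reported instead of stopping the run.
        $cmd 2>&1 || printf '(failed or unavailable, exit status %s)\n' "$?"
    done
}

# Example: collect_diagnostics > joiner-diagnostics.txt
```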

Initially, I thought that it was AppArmor, but I did some experiments on AWS instances and it is not. We've got two profiles used by PXC: one for mysqld and another one for the SST script. They are installed in complain mode by default, so they should not cause any problems. But just in case you find them in enforce mode, please do:

and try again.

As I said above, I did several tests on AWS instances running Ubuntu 20.04 and everything works. Maybe something else is restricting the use of readlink in your environment?

Could you try to reproduce the problem on AWS instances and provide steps? 

Done

Details

Needs QA

Yes

Created October 16, 2023 at 11:57 PM
Updated June 6, 2024 at 7:58 AM
Resolved January 12, 2024 at 2:18 PM