Docker Way Upgrade from PMM-Server 2.17.0 to 2.18.0 is Broken
General
Escalation
Description
How to test
None
How to document
None
Attachments
3
relates to
Smart Checklist
Activity
Denys Kondratenko June 1, 2021 at 7:25 AM
It is actually a blocker for the release. Increasing limits on the VM or host makes perfect sense, but not inside the container, where supervisord can do nothing about it.
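For reference, if the limit has to be satisfied from the Docker side rather than from inside the container, a minimal sketch would be to pass an explicit `--ulimit` when starting the container (the image tag, container name, and port mapping here are assumptions, not the ticket's exact setup):

```sh
# Sketch only: start PMM Server with a raised open-file limit so that
# supervisord's minfds requirement can be met inside the container.
# Image tag, container name, and port mapping are assumptions.
docker run -d \
  --name pmm-server \
  --ulimit nofile=1048576:1048576 \
  -p 443:443 \
  percona/pmm-server:2.18.0
```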
Puneet Kala May 31, 2021 at 9:13 AM (edited)
Had to apply these changes: https://github.com/Percona-Lab/jenkins-pipelines/pull/984/files. After restarting and setting up the container again, I couldn't reproduce this issue. Hence closing the ticket.
Puneet Kala May 31, 2021 at 7:57 AM
Might be related to the changes done for https://jira.percona.com/browse/PMM-3352; it could also be related to the configuration of the test instance.
Alex Demidoff May 29, 2021 at 9:49 PM (edited)
What do you recommend raising `minfds` to? One million, 1.2 million, or some other value?
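To see how far apart the two values are on an affected instance, a small diagnostic sketch could look like this (the container name `pmm-server` and the supervisord config paths are assumptions):

```sh
# Compare the open-file limit the container actually gets with the minfds
# value supervisord is configured to require (names and paths are assumptions).
docker exec pmm-server sh -c 'grep "open files" /proc/1/limits'
docker exec pmm-server grep -r minfds /etc/supervisord.conf /etc/supervisord.d/ 2>/dev/null
```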
Done
Created May 29, 2021 at 12:13 PM
Updated March 6, 2024 at 2:39 AM
Resolved June 1, 2021 at 12:27 PM
During release testing, while upgrading the performance testing instance via the Docker-way upgrade, pmm-server is stuck in restarting status.
Checking the Docker logs, it is clear the failure is related to the file descriptor limit (supervisord's `minfds` setting).
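A minimal triage sketch for confirming the restart loop and pulling the relevant log lines (the container name `pmm-server` is an assumption):

```sh
# Show the container status (should display "Restarting") and the last log
# lines emitted by supervisord inside the container.
docker ps -a --filter name=pmm-server
docker logs --tail 100 pmm-server
```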
Impact on User: the Docker-way upgrade to 2.18.0 is broken, which would impact all users who use PMM to monitor a high number of instances.
Steps to recreate:
1) Set up PMM Server 2.16.0 and add a decent number of nodes and services for monitoring; in our test instance, a total of 33 nodes and 187 services were being monitored.
2) Keep the instance running for about 2 days, then perform a Docker-way upgrade to 2.17.0 and also upgrade all client instances.
3) Keep this instance running for another day.
4) Perform a Docker-way upgrade of PMM Server to 2.18.0-rc (the upgrade flow is sketched after these steps).
pmm-server starts going into a restart loop; on checking the logs, we can see the error mentioned above.
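For context, the Docker-way upgrade in step 4 roughly follows this flow; the sketch below is illustrative only, and the container names, data volume, port mapping, and image tag are assumptions rather than the exact commands used on the test instance:

```sh
# Illustrative docker-way upgrade flow (names, volumes, ports, and tag assumed).
docker stop pmm-server
docker rename pmm-server pmm-server-2.17.0   # keep the old container as a fallback
docker pull percona/pmm-server:2.18.0
docker run -d \
  --volumes-from pmm-data \
  --name pmm-server \
  --restart always \
  -p 443:443 \
  percona/pmm-server:2.18.0
```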