Issue upgrading dashboards in RC when doing a container update

Description

I encountered this while trying to upgrade a PMM server from 2.26.0 to the latest RC (released this morning, Apr 6).

I stopped the old pmm server and ran the following command to get the RC installed:
docker run -d -p 8080:80 -p 8443:443 --volumes-from pmm-data --name pmm-server --restart always -e PMM_DEBUG=1 -e GF_SMTP_ENABLED=true -e GF_SMTP_HOST=smtp.gmail.com:587 -e GF_SMTP_USER=steve.hoffman@percona.com -e GF_SMTP_PASSWORD=xxxxx -e GF_SMTP_SKIP_VERIFY=false -e GF_SMTP_FROM_ADDRESS=steve.hoffman@percona.com -e GF_SMTP_FROM_NAME=Grafana -e GF_AUTH_LDAP_ENABLED=true -e GF_AUTH_LDAP_CONFIG_FILE=/srv/grafana/ldap.toml perconalab/pmm-server:2.27.0-rc

After the container was created I checked its status. It went to unhealthy after about 5-10 seconds, so I looked in /srv/logs to see what was happening.
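The health state mentioned above can be read directly from Docker. This is a minimal sketch, assuming the container is named pmm-server as in the command above and that the image defines a health check (without one, the template has no Health field to print). The helper name container_health is hypothetical.

```shell
# Print the health status Docker reports for a container
# (starting, healthy, or unhealthy).
# container_health is a hypothetical helper name, not part of PMM.
container_health() {
    docker inspect --format '{{.State.Health.Status}}' "$1"
}
```

Running `container_health pmm-server` a few times after starting the container should show whether it moves from starting to healthy or to unhealthy.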

Tailing pmm-update-perform-init.log, two errors jumped out at me:

TASK [sqlite-to-postgres : Start grafana again] ********************************
fatal: [localhost]: FAILED! => {"changed": false, "msg": "grafana: ERROR (not running)\ngrafana: ERROR (spawn error)\n"}
...ignoring

TASK [sqlite-to-postgres : Check if initial data were created] *****************
FAILED - RETRYING: Check if initial data were created (3 retries left).
FAILED - RETRYING: Check if initial data were created (2 retries left).
FAILED - RETRYING: Check if initial data were created (1 retries left).
fatal: [localhost]: FAILED! => {"attempts": 3, "changed": false, "query": "SELECT 1 FROM org WHERE id=1", "query_result": [], "rowcount": 0, "statusmessage": "SELECT 0"}
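Failures like the two above can be pulled out of a long Ansible log without reading it line by line. The following is a rough sketch, assuming the log uses the `TASK [...]` and `fatal:` line prefixes shown in the excerpt. The helper name failed_tasks is hypothetical.

```shell
# List each Ansible task that ended in "fatal:" together with its error line.
# failed_tasks is a hypothetical helper name; pass the path to the log file.
failed_tasks() {
    awk '
        /^TASK \[/ { task = $0 }          # remember the most recent task header
        /^fatal:/  { print task; print }  # report the header and the failure
    ' "$1"
}
```

For example, `failed_tasks /srv/logs/pmm-update-perform-init.log` (assuming the log sits directly under /srv/logs) would print each failed task header followed by its `fatal:` line.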

The failure to start Grafana caught my attention, so I looked deeper and found this in the Grafana log:

Failed to start grafana. error: failed to create admin user: pq: invalid input syntax for integer: "false"
failed to create admin user: pq: invalid input syntax for integer: "false"

 

I did not attempt to create the user by hand. I stopped the container, removed both the pmm-server and pmm-data containers, and recreated them. After that I was able to get in, but lost all the config data.

I didn't see an existing ticket for this. If it is a duplicate, please close it, or merge it if I provided anything useful.

How to test

None

How to document

None

Activity

vasyl.yurkovych September 12, 2023 at 5:12 PM

Upgrade fails. Need to check this again after the 12494 fix.

Puneet Kala April 25, 2022 at 6:20 AM

This is not reproducible anymore. I believe this was also a side effect of the defect in the SQLite to Postgres migration task.

Puneet Kala April 7, 2022 at 6:10 AM

For the first 1-2 minutes after replacing the container with the RC, the Docker container reports an unhealthy status.

Done

Created April 6, 2022 at 8:51 PM
Updated March 6, 2024 at 1:22 AM
Resolved September 27, 2023 at 9:50 AM