PBM logs nonexistent PITR error during physical restore
Environment: None
Assignee: Unassigned
Reporter: Boris Ilijic
Needs QA: Yes
Created 2 days ago
Updated 2 days ago
Problem description
During a physical restore, PITR is stopped, but the main PITR loop still reports a database connection error because MongoDB is down while the restore is in progress.
The error is logged with severity E, which is misleading:
2025-03-14T09:42:35.000+0000 E [pitr] init: get conf: get: server selection error: context canceled, current topology: { Type: ReplicaSetNoPrimary, Servers: [{ Addr: cfg01:27017, Type: Unknown, Last error: dial tcp: lookup cfg01 on 127.0.0.11:53: no such host }, ... ] }
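To spot these entries in collected agent logs, something like the following might help. It is only a sketch: the regular expression is inferred from the single log line quoted above (timestamp, one-letter severity, bracketed event, message), and the names `parse_line` and `is_pitr_conn_error` are hypothetical helpers, not part of PBM.

```python
import re
from typing import NamedTuple, Optional

# Assumed layout, based only on the sample line in this ticket:
# <timestamp> <severity letter> [<event>] <message>
LOG_LINE = re.compile(
    r"^(?P<ts>\S+)\s+(?P<sev>[A-Z])\s+\[(?P<event>[^\]]+)\]\s+(?P<msg>.*)$"
)


class LogEntry(NamedTuple):
    timestamp: str
    severity: str
    event: str
    message: str


def parse_line(line: str) -> Optional[LogEntry]:
    """Split one log line into fields; return None if it does not match."""
    m = LOG_LINE.match(line.strip())
    if m is None:
        return None
    return LogEntry(m["ts"], m["sev"], m["event"], m["msg"])


def is_pitr_conn_error(entry: LogEntry) -> bool:
    """True for severity-E [pitr] entries that report a server selection error."""
    return (
        entry.severity == "E"
        and entry.event == "pitr"
        and "server selection error" in entry.message
    )
```

Entries flagged this way that fall inside the time window of a physical restore are candidates for the misleading error described here; the script cannot tell on its own whether a restore was running.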
STR:
Start a physical restore that takes more than a few minutes.
The log entry appears on at least one pbm-agent.
Proposed solution
The log entry should be omitted entirely. If that state is hard to distinguish from a genuine database-down error, the entry should be logged with severity D instead.
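The intended behavior could be sketched roughly as below. This is an illustration of the decision only, written in Python for brevity and not taken from the PBM code base; `severity_for_pitr_conn_error` and its `restore_running` argument are hypothetical. Keeping severity E when no restore is running is an assumption, since the ticket does not say how a genuine outage should be logged.

```python
from typing import Optional


def severity_for_pitr_conn_error(restore_running: Optional[bool]) -> Optional[str]:
    """Pick a severity for a connection error seen by the PITR loop.

    restore_running:
        True  -> a physical restore is known to be in progress
        None  -> the agent cannot tell whether a restore is in progress
        False -> no restore is running (assumed to be a genuine outage)
    Returns None when the entry should not be logged at all.
    """
    if restore_running is True:
        return None  # expected downtime: omit the entry
    if restore_running is None:
        return "D"   # cannot distinguish the two cases: demote to debug
    return "E"       # assumption: a real connection failure stays an error
```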