PostgreSQL backup verification is failing in v2.8.1 and v2.9.0 #132
Comments
I had to dig into this issue because verification is failing when testing my latest changes from #139. I started by testing a known working backup created using 2.6.0: verification succeeds with 2.6.0 as well as with the latest version. Comparing the sql files from each backup shows that the difference consists in the newer (post
@bolyachevets, I see you introduced the change, likely to fix issues with the pager duty exporter process and related custom roles. Do you have any insight into what exactly is different in your use case compared with more "plain" backup-container usage? It looks like, while things may work in that scenario (which I can't test myself), they will consistently break for standard usage.
I think the way we start the local server (see here) is creating the users that are in the role dump, and that is possibly causing the issues. I will test this. If that is the case, we might need a breaking change so that, at least during verification, roles come only from the dump and no custom users are specified during server startup. In that case, older backup files without roles will likely no longer be verifiable.
@esune, I think you're headed in the right direction with identifying the issue. The code you pointed out uses the built-in initialization scripts in the container to start the postgres server locally (mimicking how a new database instance is initialized in OpenShift). That includes creating the user and admin accounts and credentials based on the environment variable settings. So if the creation of any of those accounts is duplicated in the dump, there are going to be errors. This issue seems to affect the restore process, which is the first step in the verification process. We need to ensure the restore process continues to work without error, otherwise we lose the whole purpose of the backup-container.
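To make the duplication described above concrete, here is a minimal sketch that lists roles a dump tries to create even though server startup has already created them. The function name `conflicting_roles` and the sample role names are hypothetical, and the regular expression only assumes simple `CREATE ROLE name;` statements; it is not how backup-container itself checks anything.

```python
import re

# Assumption: roles appear as simple "CREATE ROLE name;" statements,
# optionally with the name in double quotes.
CREATE_ROLE_RE = re.compile(r'^\s*CREATE ROLE\s+"?([^";\s]+)"?\s*;', re.MULTILINE)


def conflicting_roles(dump_sql, precreated_roles):
    """Return roles the dump creates that already exist after server startup."""
    created = CREATE_ROLE_RE.findall(dump_sql)
    precreated = set(precreated_roles)
    return [role for role in created if role in precreated]


if __name__ == "__main__":
    sample = (
        "CREATE ROLE postgres;\n"
        "ALTER ROLE postgres WITH SUPERUSER;\n"
        'CREATE ROLE "app-user";\n'
    )
    # Hypothetical: startup scripts already created the "postgres" account.
    print(conflicting_roles(sample, {"postgres"}))
```

Any role returned here would presumably trigger an error when the dump is restored into the freshly initialized server.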
When upgrading to v2.9.0 our PostgreSQL 13.13 backup verifications are failing with the error:
When looking into the problem it was found that everything works fine with v2.8.0, but v2.8.1 and v2.9.0 fail in this same way.
The backups up to and including v2.8.0 have the `PostgreSQL database dump` at the top of the backup file, followed by the `PostgreSQL database cluster dump` that is:

However, in v2.8.1 and v2.9.0, the order of these two sections is reversed, with the `PostgreSQL database cluster dump` containing the `CREATE ROLE postgres` coming before the `PostgreSQL database dump`. It looks like the order was switched in commit eda584f.

The logs for the failure look like:
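For anyone wanting to check which layout a given backup has, a rough sketch like the following reports the order in which the two section headers first appear. It assumes the backup has already been decompressed to plain SQL text and that the headers appear as the usual `-- PostgreSQL database dump` and `-- PostgreSQL database cluster dump` comment lines; `section_order` is a hypothetical helper, not part of backup-container.

```python
DB_DUMP = "-- PostgreSQL database dump"
CLUSTER_DUMP = "-- PostgreSQL database cluster dump"


def section_order(dump_sql):
    """Return the section kinds in order of first appearance."""
    order = []
    for line in dump_sql.splitlines():
        marker = line.strip()
        # Exact match, so the "... dump complete" trailer lines are ignored.
        if marker == DB_DUMP and "database" not in order:
            order.append("database")
        elif marker == CLUSTER_DUMP and "cluster" not in order:
            order.append("cluster")
    return order


if __name__ == "__main__":
    reversed_layout = (
        "--\n-- PostgreSQL database cluster dump\n--\n"
        "CREATE ROLE postgres;\n"
        "--\n-- PostgreSQL database dump\n--\n"
    )
    print(section_order(reversed_layout))
```

A result that starts with `"database"` corresponds to the v2.8.0 layout described above, while one that starts with `"cluster"` corresponds to the v2.8.1 and v2.9.0 layout.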
To complicate things, the same error does occur in v2.8.0, even though the verification is marked as a success:
So if v2.8.0 was continuing with the verification even when there were errors (which is bad), is it possible that commit 7e46504 didn't get included in the build until v2.8.1?
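As a point of comparison for the "continues despite errors" behaviour, here is a sketch of a restore that stops on the first SQL error by setting psql's `ON_ERROR_STOP` variable and then checking the exit status. The helper names, the default host and user, and the assumption of a plain (uncompressed) SQL file are all illustrative; this is not a description of what commit 7e46504 or backup-container actually does.

```python
import subprocess


def build_restore_command(sql_path, dbname, host="127.0.0.1", user="postgres"):
    """Build a psql invocation that aborts on the first SQL error."""
    return [
        "psql",
        "-v", "ON_ERROR_STOP=1",  # stop instead of continuing past errors
        "-h", host,
        "-U", user,
        "-d", dbname,
        "-f", sql_path,
    ]


def restore_succeeded(sql_path, dbname):
    """Run the restore and report success only on a zero exit status."""
    result = subprocess.run(build_restore_command(sql_path, dbname))
    return result.returncode == 0


if __name__ == "__main__":
    # Hypothetical path and database name, shown without running psql.
    print(" ".join(build_restore_command("/tmp/backup.sql", "mydb")))
```

Without `ON_ERROR_STOP`, psql keeps executing the rest of the script after an error, which would match a verification being marked successful even though the log contains errors.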