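PostgreSQL is looking for a checkpoint record in the transaction log that probably doesn't exist or is corrupted. You can check whether this is the case by running pg_resetwal (pg_resetxlog on versions before 10) against the data directory, without `-f`.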
If the transaction log is corrupt, you'll see a message like:
The database server was not shut down cleanly.
Resetting the transaction log might cause data to be lost.
If you want to proceed anyway, use `-f` to force reset.
You can then follow the instructions and rerun the same command with `-f` to force the reset.
That should reset the transaction log. However, it could leave your database in an indeterminate state as explained in the PostgreSQL documentation on pg_resetwal:
If pg_resetwal complains that it cannot determine valid data for
pg_control, you can force it to proceed anyway by specifying the
-f (force) option. In this case plausible values will be substituted
for the missing data. Most of the fields can be expected to match, but
manual assistance might be needed for the next OID, next transaction
ID and epoch, next multitransaction ID and offset, and WAL starting
location fields. These fields can be set using the options discussed
below. If you are not able to determine correct values for all these
fields, -f can still be used, but the recovered database must be
treated with even more suspicion than usual: an immediate dump and
reload is imperative. Do not execute any data-modifying operations in
the database before you dump, as any such action is likely to make the
corruption worse.
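The "immediate dump" the documentation calls for could be sketched as below. This is only a minimal sketch: the function name `dump_cluster` is hypothetical, and it assumes the server starts after the reset and that a superuser named `postgres` exists.

```shell
# Hypothetical helper: dump the whole cluster right after a forced reset,
# before running any data-modifying statement.
# Assumption: the server is running and the superuser is called "postgres"
# unless a different name is passed as the second argument.
dump_cluster() (
    outfile=$1
    user=${2:-postgres}
    # pg_dumpall writes every database plus roles to a single SQL script.
    pg_dumpall -U "$user" -f "$outfile"
)
```

For example, `dump_cluster /backups/all.sql` (the path is illustrative). The resulting script would then be loaded into a freshly initialized cluster, for instance with `psql -U postgres -f /backups/all.sql postgres`.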
Do you use continuous archiving? If a backup was in progress at the time, you may find it more prudent to remove backup_label instead. pg_resetxlog is a drastic measure.
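If you go the backup_label route, one cautious way to do it is to rename the file rather than delete it. The sketch below assumes the server is stopped; the function name `set_aside_backup_label` and the `.old` suffix are hypothetical choices, not anything PostgreSQL prescribes.

```shell
# Hypothetical helper: set aside a stale backup_label instead of deleting it.
# Assumption: the PostgreSQL server is stopped while this runs.
# Usage: set_aside_backup_label /path/to/data/directory
set_aside_backup_label() (
    datadir=$1
    label="$datadir/backup_label"
    if [ -f "$label" ]; then
        # Rename rather than remove, so the file can be put back if needed.
        mv "$label" "$label.old" && echo "moved $label to $label.old"
    else
        echo "no backup_label in $datadir"
    fi
)
```

After that, start the server again and check whether the checkpoint error is gone.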
As the log says, Postgres could not locate a valid checkpoint record: it can't find a usable WAL segment under the $PGDATA/pg_xlog/ directory.
Try to use pg_resetxlog
As indicated here, pg_resetxlog should not be run; the answers that recommend it are bad advice. Assuming the error occurred while copying or replicating an instance, the link describes a more succinct way of doing the copy/replication with pg_basebackup.
This error will result in the container getting constantly killed and restarted.
The first step is to get the container up and running so that we can exec into it and run pg_resetwal or pg_resetxlog.
In the postgres Docker image layer info, we can see that
ENTRYPOINT is ["docker-entrypoint.sh"] and CMD is ["postgres"].
The docker-entrypoint.sh script will run any Linux command passed as an argument.
If you are on Docker, passing /bin/bash overrides the default CMD and gives you a shell inside the container:
docker run -it -v /my_data:/var/lib/postgresql/data postgres:9.6.22 /bin/bash
Here, /var/lib/postgresql/data is the Postgres data directory inside the container.
Once inside the container, run one of the commands below, depending on your Postgres version. This resets the transaction log (WAL).
On postgres >= 10
pg_resetwal /var/lib/postgresql/data
On postgres < 10
pg_resetxlog /var/lib/postgresql/data
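If you are not sure which version the data directory belongs to, the PG_VERSION file inside it records the major version. A minimal sketch of picking the tool from that file follows; the function name `reset_tool_for` is hypothetical.

```shell
# Hypothetical helper: print the reset tool that matches the data directory.
# PG_VERSION contains e.g. "9.6" or "14".
reset_tool_for() (
    datadir=$1
    version=$(cat "$datadir/PG_VERSION") || exit 1
    # Keep only the part before the first dot: "9.6" -> "9", "14" -> "14".
    major=${version%%.*}
    if [ "$major" -ge 10 ]; then
        echo pg_resetwal
    else
        echo pg_resetxlog
    fi
)
```

For example, `reset_tool_for /var/lib/postgresql/data` prints the name of the command to run against that directory.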
The top-rated answer in this thread explains more about the pg_resetwal command.
Finally, you can exit this container and start the postgres DB container with its original CMD.
Some Additional Info
If you see the error below, the data directory you specified above is probably incorrect.
pg_resetxlog: could not open file "PG_VERSION" for reading: No such file or directory
You can check the PGDATA env variable for the right path.
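A quick sanity check before running the reset tool could look like the sketch below. It relies only on the PG_VERSION file named in the error message; the function name `check_datadir` is hypothetical.

```shell
# Hypothetical helper: succeed only if the directory looks like a
# PostgreSQL data directory. Falls back to $PGDATA when no argument is given.
check_datadir() (
    datadir=${1:-${PGDATA:-}}
    if [ -z "$datadir" ]; then
        echo "no directory given and PGDATA is not set" >&2
        exit 1
    fi
    if [ -f "$datadir/PG_VERSION" ]; then
        echo "$datadir looks like a data directory (version $(cat "$datadir/PG_VERSION"))"
    else
        echo "$datadir has no PG_VERSION file" >&2
        exit 1
    fi
)
```

Inside the container, calling `check_datadir` with no argument checks whatever PGDATA points to.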
With a container orchestrator such as Kubernetes or Rancher v1, where we can't run docker commands directly, we have to start the container with a placeholder process such as sleep. Pass one of the following as cmd or args in your orchestrator manifest.
sleep infinity
or
sh -c 'while sleep 3600; do :; done'
Then enter the container using a tool like kubectl exec. Once inside, you can run the pg_resetwal/pg_resetxlog commands.