ADBDEV-6936 Avoid reusing timeline for interrupted promotion #1173
Each Postgres instance (which is what a GPDB segment is) should use its
own timeline, so that WAL segment names do not collide between instances
derived from a common ancestor. When a new instance is derived from a
backup or a replica is promoted, Postgres picks the next free timeline
number to identify the new incarnation. Postgres persists the selected
timeline in the control data, either with a checkpoint record or in
association with an end-of-recovery record. It also records the switch
point in the history file. These two entities help Postgres handle the
timeline switch after a failure and the subsequent recovery. However,
Postgres used to remove the signal files before persisting the new
timeline information. As a result, if something went wrong shortly
afterwards, Postgres had no way to remember the promotion or to retry
it, so it performed ordinary crash recovery and continued on the
previous timeline. The instance could then overwrite subsequent WAL
segments and push them to the archive under the same names, with
unpredictable results depending on the archive_command implementation.
Even without archiving, the orphaned timeline left in the history file
may cause an unexpected error during the next replica switch, because
the LSN of that switch point precedes the latest checkpoint record
created on the reused timeline.
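
To make the name collision concrete, here is a minimal sketch of how a WAL segment file name is built from a timeline ID and an LSN. The helper `wal_segment_name` is hypothetical and written only for illustration. It assumes the 24-hex-digit naming scheme and a 16 MB segment size (the upstream PostgreSQL default; a GPDB build may use a different size).

```python
# Assumption: 16 MB WAL segments (upstream PostgreSQL default).
WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def wal_segment_name(timeline_id: int, lsn: int,
                     segment_size: int = WAL_SEGMENT_SIZE) -> str:
    """Hypothetical helper: name of the WAL segment that holds `lsn`."""
    segments_per_xlog_id = 0x100000000 // segment_size
    seg_no = lsn // segment_size
    # Layout: 8 hex digits of timeline, then the high and low parts
    # of the segment number, 8 hex digits each.
    return "%08X%08X%08X" % (
        timeline_id,
        seg_no // segments_per_xlog_id,
        seg_no % segments_per_xlog_id,
    )


lsn = 0x3000000  # LSN 0/3000000

# Two instances that share timeline 1 produce the same file name.
same_timeline = wal_segment_name(1, lsn)
# A promoted instance on timeline 2 produces a distinct name.
new_timeline = wal_segment_name(2, lsn)
```

With these inputs the two names differ only in the leading timeline digits, which is why reusing a timeline lets one instance overwrite segments that were already archived.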
The optimal solution is to retry the interrupted promotion; FTS, at
least, is ready to do so. To achieve this and enter archive recovery
again, we delay removing the signal files until the new timeline
information has been persisted. The only minor drawback of this
approach is that one extra timeline ID may be assigned if we fail right
after the history file is created.
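
The effect of the reordering can be sketched with a toy model. This is not the Postgres code: `Instance`, `promote`, `restart` and `crash_then_restart` are hypothetical names, and the model reduces promotion to three steps (write the history file, remove the signal file, persist the new timeline), with a crash injected right after the history file is created.

```python
from dataclasses import dataclass, field


class Crash(Exception):
    """Simulated failure in the middle of promotion."""


@dataclass
class Instance:
    control_timeline: int = 1   # timeline recorded in the control data
    history_files: set = field(default_factory=set)  # timelines with a history file
    signal_file: bool = True    # signal file still present


def promote(inst: Instance, delay_signal_removal: bool,
            crash_after_history: bool = False) -> None:
    # Pick the next free timeline and record the switch point.
    new_tli = max(inst.history_files | {inst.control_timeline}) + 1
    inst.history_files.add(new_tli)
    if not delay_signal_removal:
        inst.signal_file = False        # old order: removed too early
    if crash_after_history:
        raise Crash()
    inst.control_timeline = new_tli     # new timeline is persisted
    inst.signal_file = False            # new order: removed only now


def restart(inst: Instance, delay_signal_removal: bool) -> int:
    if inst.signal_file:
        # Signal file survived: archive recovery, promotion is retried.
        promote(inst, delay_signal_removal)
    # Otherwise: plain crash recovery on the old timeline.
    return inst.control_timeline


def crash_then_restart(delay_signal_removal: bool) -> Instance:
    inst = Instance()
    try:
        promote(inst, delay_signal_removal, crash_after_history=True)
    except Crash:
        pass
    restart(inst, delay_signal_removal)
    return inst


old_order = crash_then_restart(delay_signal_removal=False)
new_order = crash_then_restart(delay_signal_removal=True)
```

In this model the old order ends up back on timeline 1 with an orphaned history file for timeline 2, while the delayed removal retries the promotion and lands on timeline 3, consuming one extra timeline ID.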