
VACUUM for ModifyColumnMigration should not be done at full speed #109

Open

jonathanjouty opened this issue Feb 13, 2024 · 3 comments

@jonathanjouty (Contributor)

The implementation for ModifyColumnMigration uses a VACUUM step, but it runs at full speed; ideally it should be throttled so that it does not overwhelm the database being operated on.

This can be achieved using Cost-based Vacuum Delay (note: the documentation's introductory paragraph on those PostgreSQL parameters is worth reading). In particular, vacuum_cost_delay must be non-zero to enable this feature for a manual VACUUM (for clarity: a manual VACUUM is what the implementation issues).
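As a rough sketch of what "a throttled manual VACUUM" means in practice (the helper name and defaults below are illustrative, not the actual implementation), the migration could issue session-level SETs before the VACUUM itself:

```python
def throttled_vacuum_sql(table, cost_delay_ms=2.0, cost_limit=200):
    """Build the statement sequence for a cost-limited manual VACUUM.

    A non-zero vacuum_cost_delay is what switches on cost-based delay
    for a manual VACUUM. The default values here are only placeholders,
    not tuned recommendations.
    """
    return [
        # SET is session-local, so other connections are unaffected.
        f"SET vacuum_cost_delay = '{cost_delay_ms}ms'",
        f"SET vacuum_cost_limit = {cost_limit}",
        f"VACUUM {table}",
    ]
```

Running the SETs in the same session as the VACUUM keeps the throttling scoped to the migration and avoids touching server-wide configuration.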

After chatting with @marco44: these values should ideally be inferred from the table's equivalent autovacuum_* settings, since those are assumed to be safe (i.e. low risk of degrading DB performance). However, they can be -1 (which means the corresponding vacuum_* value is used instead), so they need to be checked rather than blindly copied.
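The fallback rule described above could be resolved with something like the following sketch (function name is made up; the -1 semantics are PostgreSQL's documented behaviour for the autovacuum_* cost settings):

```python
def effective_vacuum_cost_settings(autovacuum_cost_delay, autovacuum_cost_limit,
                                   vacuum_cost_delay, vacuum_cost_limit):
    """Resolve a table's autovacuum_* cost settings, honouring -1 fallback.

    In PostgreSQL, -1 for autovacuum_vacuum_cost_delay or
    autovacuum_vacuum_cost_limit means "use the corresponding
    vacuum_* setting instead", which is why the per-table values
    cannot be copied blindly.
    """
    delay = vacuum_cost_delay if autovacuum_cost_delay == -1 else autovacuum_cost_delay
    limit = vacuum_cost_limit if autovacuum_cost_limit == -1 else autovacuum_cost_limit
    return delay, limit
```

The per-table values themselves would come from the table's reloptions (or the global autovacuum_* GUCs when no reloption is set).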

@arybczak (Collaborator) commented Sep 30, 2024

This was implemented in #49, and AFAIR the vacuum step as it exists there was added after consultation with @marco44 (see #49 (comment)). I'm guessing things have changed since then?

@jonathanjouty (Contributor, Author)

Not sure if things have changed since #49. vacuum_cost_delay has existed since at least PG 12 (from a quick glance at the docs), so I suspect #49 was a good starting point; further examination (and @marco44's Engineering Demo) revealed that we could further minimise the impact on running applications during a ModifyColumnMigration.

There is not much in our Slack, apart from me announcing this very issue 🤪
https://scrive.slack.com/archives/C01CRN6D4HJ/p1707837775942649

I'll let @marco44 chime in if there's more :)

@marco44 commented Oct 1, 2024

Nothing has changed. It's just that having cost limits in place makes a less aggressive migration possible. It's a trade-off: being able to do it gently may be useful when we have something really nasty to do (like a migration on a very large table) and we know it's going to take a long time anyway…
I suppose 99% of the time we don't care, because the tables are not that large, so we'll mostly hit the cache when vacuuming.
