Why is the volume of data on the target DB much larger than on the source? #194

Open
Jamic28 opened this issue Nov 30, 2024 · 4 comments


Jamic28 commented Nov 30, 2024

I am running synchronous replication from a standalone PostgreSQL 14 instance to a PostgreSQL 16 database cluster. The source database holds 270 GB of data, while the target cluster has already reached 500 GB and the data copy is still in progress (no errors).

root@db-1:/home/administrator# du -sh /var/lib/postgresql/16/main/base/
500G    /var/lib/postgresql/16/main/base/
"replication_stats_count_by_state": {
    "replicating": 128,
    "data_is_being_copied": 148
  },
  "message_lsn_receipts": [
    {
      "received_lsn": "394/3B0D3A58",
      "last_msg_send_time": "2024-11-30 14:38:54 UTC",
      "last_msg_receipt_time": "2024-11-30 14:38:49 UTC",
      "latest_end_lsn": "394/3B0D3A58",
      "latest_end_time": "2024-11-30 14:38:54 UTC"
    },
    {
      "received_lsn": null,
      "last_msg_send_time": "2024-11-30 14:38:04 UTC",
      "last_msg_receipt_time": "2024-11-30 14:38:04 UTC",
      "latest_end_lsn": null,
      "latest_end_time": "2024-11-30 14:38:04 UTC"
    },
    {
      "received_lsn": null,
      "last_msg_send_time": "2024-11-30 14:37:48 UTC",
      "last_msg_receipt_time": "2024-11-30 14:37:48 UTC",
      "latest_end_lsn": null,
      "latest_end_time": "2024-11-30 14:37:48 UTC"
    }
  ],
  "sync_started_at": "2024-11-30 08:22:38 UTC",
  "sync_failed_at": null,
  "switchover_completed_at": null

Why is the volume of data on the target DB so much larger than on the source?
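
For scale, the figures reported above can be put side by side. This is only a rough sketch using the numbers from this comment; the variable names are illustrative, and it assumes each entry in `replication_stats_count_by_state` counts one table.

```python
from datetime import datetime

# Figures copied from the report above.
source_gb = 270
target_gb = 500            # still growing at the time of the report
replicating = 128          # assumed: tables that finished the initial copy
copying = 148              # assumed: tables still in "data_is_being_copied"

fmt = "%Y-%m-%d %H:%M:%S"
started = datetime.strptime("2024-11-30 08:22:38", fmt)   # sync_started_at
latest = datetime.strptime("2024-11-30 14:38:54", fmt)    # latest_end_time

growth_ratio = target_gb / source_gb
tables_total = replicating + copying
done_fraction = replicating / tables_total
elapsed = latest - started

print(f"target/source size ratio: {growth_ratio:.2f}")
print(f"tables past initial copy: {replicating}/{tables_total} ({done_fraction:.0%})")
print(f"elapsed since sync start: {elapsed}")
```

So after roughly six hours the target was already about 1.85 times the size of the source, with fewer than half of the tables past the initial copy.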

Author

Jamic28 commented Dec 1, 2024

The inspection table in mydb is 237 GB on the source:

public | inspection                                         | table | mydb | permanent   | heap          | 237 GB     | 

On the target it has kept growing since the sync started:

 public | inspection                                         | table | mydb | permanent   | heap          | 500 GB     | 

Could this be related to the fact that records are being written to the table on the source database during synchronization?

@joetynan
Contributor

I've run into this as well a couple of times, and I'm not entirely sure why, but the target table is VASTLY larger.

Owner

shayonj commented Dec 26, 2024

Does running VACUUM on it help?
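
One way to check whether vacuum would have anything to reclaim is to look at the dead-tuple estimates on the target. The sketch below only holds the SQL text and summarizes a result row; `dead_fraction` and the sample numbers are hypothetical, while `pg_stat_user_tables` and its `n_live_tup` / `n_dead_tup` columns are standard PostgreSQL statistics. The table name is the one mentioned in this issue.

```python
# Query to run on the target (for example via psql).
DEAD_TUPLE_SQL = """
SELECT relname, n_live_tup, n_dead_tup, last_vacuum, last_autovacuum
FROM pg_stat_user_tables
WHERE schemaname = 'public' AND relname = 'inspection';
"""

def dead_fraction(n_live_tup: int, n_dead_tup: int) -> float:
    """Share of tuples estimated to be dead; 0.0 for an empty table."""
    total = n_live_tup + n_dead_tup
    return n_dead_tup / total if total else 0.0

# Hypothetical numbers for illustration only (not taken from the issue).
print(f"{dead_fraction(1_000_000, 1_000_000):.0%} dead")
```

Keep in mind that a plain VACUUM makes dead space reusable but usually does not shrink the files on disk; VACUUM FULL rewrites the table and returns the space, at the cost of an ACCESS EXCLUSIVE lock.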

Contributor

joetynan commented Jan 2, 2025

Upon further review, it looks like (in my instance at least) a table with a large volume of reads/writes (say, a cache table) will generate a rather large TOAST table on the target side. Vacuum can't run until the table is in a replicating state (I'm guessing there's a lock that prevents it while the data copy is running), so the extra space can't be cleaned up until the copy has finished.
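
To confirm whether TOAST is where the extra space sits, the table size can be split into heap, TOAST, and index portions on both source and target. This is a minimal sketch: the query uses the built-in `pg_relation_size`, `pg_total_relation_size`, and `pg_indexes_size` functions plus `pg_class.reltoastrelid`, while `toast_share` is a hypothetical helper and the table name is the one from this issue.

```python
# Run the same query on source and target and compare the columns.
SIZE_BREAKDOWN_SQL = """
SELECT c.relname,
       pg_relation_size(c.oid) AS heap_bytes,
       CASE WHEN c.reltoastrelid <> 0
            THEN pg_total_relation_size(c.reltoastrelid)  -- TOAST table plus its index
            ELSE 0
       END AS toast_bytes,
       pg_indexes_size(c.oid) AS index_bytes
FROM pg_class c
WHERE c.oid = 'public.inspection'::regclass;
"""

def toast_share(heap_bytes: int, toast_bytes: int, index_bytes: int) -> float:
    """Fraction of the (approximate) total size taken up by TOAST data."""
    total = heap_bytes + toast_bytes + index_bytes
    return toast_bytes / total if total else 0.0
```

If `toast_bytes` dominates on the target but not on the source, that would support the explanation above; if the heap itself is larger, something else is going on.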
