Yezzey is a Greenplum extension that makes data offloading in Greenplum easy.
The extension defines an API for creating data offloading policies and attaching those policies to tables.
Data offloading means physically moving relation data to external storage, namely S3.
--- some more info here (TBD)
Conditions:
- no generic WAL available in GP6 (PG 9.4)
- no custom access methods in PG 9.4
- no custom WAL redo routines
This means we need to
Greenplum compatibility
Avoid binary incompatibility, that is, avoid custom WAL records, which vanilla GP would be unable to redo.
Try to avoid custom relation forks: their use must be WAL-logged or handled separately, which means additional backup/restore complexity and more corner cases to handle.
Yezzey defines a custom smgr for AO/AOCS-related storage operations.
The table reading logic is as follows:
Case: AO/AOCS segment files are read through the yezzey SMGR. The filename to be accessed has the form
base/<dboid>/<tableoid>.<segnum>
Read/write logic in GP for AO/AOCS tables works as follows:
- In case of read operation,
- Open AO/AOCS segment file.
- Set read/write offset to either 0 or $logicalEof
- Read/Write X bytes
- Close file
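The open/seek/transfer/close pattern above can be sketched as follows. This is only an illustration of the access pattern the storage layer has to serve, not Greenplum code. The function names `read_at` and `write_at_eof` are hypothetical, and the sketch assumes that reads start at a caller-supplied offset while writes start at the logical EOF.

```python
def read_at(path: str, offset: int, nbytes: int) -> bytes:
    """Open the segment file, seek to `offset`, read `nbytes`, close."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(nbytes)


def write_at_eof(path: str, logical_eof: int, payload: bytes) -> int:
    """Open the segment file, seek to the logical EOF, write, close.

    Returns the new logical EOF (assumption: the caller stores it elsewhere).
    """
    with open(path, "r+b") as f:
        f.seek(logical_eof)
        f.write(payload)
    return logical_eof + len(payload)
```

Because every operation is a single positioned read or write, a replacement storage layer only needs to answer "give me X bytes starting at offset N" for a given segment file.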
So, our read logic is as follows:
- Check whether base/<dboid>/<tableoid>. is present locally. If it is, the table (and this segment file) has not (yet) been offloaded to external storage, so process it normally.
- If not, search external storage (S3) for files with the prefix segment/base//..<current_read_offset>* and the highest epoch number. If there are any, read them in lexicographically ascending order. The sum of the sizes of the external files should be >= the logical EOF (which can be found in the pg_aoseg.pg_aoseg_ table).
- Read these files until they are exhausted.
- Any failure probably indicates corruption caused by unknown bugs in the implementation.
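The external lookup described above can be sketched roughly as below. This is a simplified model, not the extension's code. It assumes the key layout `segment/base/<dboid>/<tableoid>.<segnum>.<offset>.<epoch>` used for writes later in this document. The function `chunks_for_read` and the `objects` mapping (external key to size in bytes, as a listing call might return) are hypothetical. The sketch orders chunks by numeric offset, which matches lexicographic order only when the offsets have the same number of digits.

```python
import re
from typing import Dict, List

# Assumed key layout: segment/base/<dboid>/<tableoid>.<segnum>.<offset>.<epoch>
KEY_RE = re.compile(r"^segment/base/(\d+)/(\d+)\.(\d+)\.(\d+)\.(\d+)$")


def chunks_for_read(objects: Dict[str, int], dboid: int, tableoid: int,
                    segnum: int, logical_eof: int) -> List[str]:
    """Return the external keys to read, in ascending offset order."""
    by_epoch = {}
    for key, size in objects.items():
        m = KEY_RE.match(key)
        if not m:
            continue
        db, tbl, seg, offset, epoch = map(int, m.groups())
        if (db, tbl, seg) != (dboid, tableoid, segnum):
            continue  # chunk of another segment file
        by_epoch.setdefault(epoch, []).append((offset, key, size))
    if not by_epoch:
        raise FileNotFoundError("no external chunks for this segment file")

    # Only the chunks of the highest epoch are considered.
    chunks = sorted(by_epoch[max(by_epoch)])
    total = sum(size for _, _, size in chunks)
    if total < logical_eof:
        # Matches the "probably a corruption" case in the text.
        raise IOError("external chunks are shorter than the logical EOF")
    return [key for _, key, _ in chunks]
```

For example, with chunks at offsets 0 and 100 in epoch 3 and an older chunk in epoch 2, only the two epoch-3 keys are returned, and a logical EOF larger than their combined size raises an error.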
AO/AOCS table offloading algorithm:
- Lock the AO/AOCS relation in pg_class in exclusive mode. This prevents other concurrent sessions from reading or writing anything in this table.
- Write all table segment files to S3, one by one. Write them with the name segment/base//..0 (the last number means that this file's logical EOF start is zero).
- After that, add the relation file nodes to the current transaction's on-commit pending deletion list.
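A toy model of the upload and deferred-deletion steps is sketched below, using dictionaries in place of the local filesystem and S3. The exclusive lock is not modelled. The names `offload_relation` and `commit`, the `epoch` argument, and the exact external key layout (taken from the write path described later in this document) are assumptions made for illustration.

```python
from typing import Dict, List


def offload_relation(local_files: Dict[str, bytes], external: Dict[str, bytes],
                     relation_prefix: str, epoch: int) -> List[str]:
    """Copy every local segment file of one relation to `external`.

    `local_files` maps paths such as base/<dboid>/<tableoid>.<segnum> to
    their contents; `relation_prefix` is base/<dboid>/<tableoid>.
    Returns the local paths to delete when the transaction commits.
    """
    pending_delete = []
    for path in sorted(local_files):
        if not path.startswith(relation_prefix + "."):
            continue  # belongs to another relation
        # Offset 0: the whole file is uploaded as one chunk that starts
        # at logical offset zero.
        external[f"segment/{path}.0.{epoch}"] = local_files[path]
        pending_delete.append(path)
    return pending_delete


def commit(local_files: Dict[str, bytes], pending_delete: List[str]) -> None:
    """Local files disappear only here, after all uploads have succeeded."""
    for path in pending_delete:
        del local_files[path]
```

Deferring the deletion to commit means that, in this model, the local copy is still in place if the upload fails part-way and `commit` is never called.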
The logic for writing to an already offloaded AO/AOCS segment is as follows:
- For each AO/AOCS segment, first resolve the highest epoch in which changes were made. This is the last number of the lexicographically largest segment/base/<dboid>/<tableoid>.<segnum>.0.* file. Let it be Y.
- Write a new file named segment/base/<dboid>/<tableoid>.<segnum>.<current_write_offset=logical_end_of_file>.Y
- Success.
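The naming rule for a new chunk can be sketched as below. The function `next_write_key` and its arguments are hypothetical. As a simplification, the sketch picks the numerically largest epoch among the `.0.*` keys, whereas the text speaks of the lexicographically largest file; the two agree when the epoch numbers have the same number of digits.

```python
from typing import Iterable


def next_write_key(existing_keys: Iterable[str], dboid: int, tableoid: int,
                   segnum: int, logical_eof: int) -> str:
    """Build the external key for a chunk appended at `logical_eof`."""
    prefix = f"segment/base/{dboid}/{tableoid}.{segnum}.0."
    epochs = [
        int(key[len(prefix):])
        for key in existing_keys
        if key.startswith(prefix) and key[len(prefix):].isdigit()
    ]
    if not epochs:
        raise LookupError("segment file has not been offloaded yet")
    y = max(epochs)  # highest epoch, called Y in the text
    return f"segment/base/{dboid}/{tableoid}.{segnum}.{logical_eof}.{y}"
```

With existing keys ending in `.0.2`, `.0.3` and `.100.3` for one segment file and a logical EOF of 150, the new chunk gets the suffix `.150.3`.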
 /------\              Cloud (external storage) e.g. S3
|\______/|                 /-------\
|   GP   |                (         )
| segment|     ----->      \vvvvvvv/
|\______/|
|        |
|        |
 \______/
The relfilenode needs to be changed when vacuuming yezzey relations, since yezzey does not support the truncate operation properly.
Do not read this; this trash will be moved to separate doc/test files and explained fully later.
problems:
pg_aoseg.pg_aoseg_<tableoid> cannot be locked:
src/include/catalog/pg_class.h:172:#define RELKIND_AOSEGMENTS 'o' /* AO segment files and eof's */
LockRelationAppendOnlySegmentFile -> LOCKACQUIRE_ALREADY_HELD
Install:
psql postgres -f ./gpcontrib/yezzey/test/regress/yezzey.sql
gpconfig -c yezzey.storage_prefix -v 'wal-e/mdbtvdnna6t7oqaioeaj/6/segments_005'
gpconfig -c yezzey.storage_bucket -v 'yandexcloud-dbaas-mdbtvdnna6t7oqaioeaj'
gpconfig -c yezzey.storage_host -v 's3.mds.yandex.net'
gpconfig -c yezzey.storage_config -v '/home/gpadmin/yezzey_conf/yezzey_s3.conf'
gpconfig -c yezzey.storage_prefix -v "'wal-e/mdb8i7f8cr8ker9ec6a8/6/segments_005'"
gpconfig -c yezzey.storage_bucket -v "'yandexcloud-dbaas-mdb8i7f8cr8ker9ec6a8'"
gpconfig -c yezzey.storage_config -v "'/home/gpadmin/gpconfigs/yezzey.conf'"
gpconfig -c yezzey.storage_host -v "'s3.mds.yandex.net'"
gpconfig -c yezzey.walg_bin_path -v "'/usr/bin/wal-g-gp'"
gpconfig -c yezzey.walg_config_path -v "'/etc/wal-g/wal-g.yaml'"
gpconfig -c yezzey.gpg_key_id -v "'4993C0545AF16F9F'"
gpconfig -c yezzey.storage_prefix -v 'wal-e/mdb8i7f8cr8ker9ec6a8/6/segments_005'
gpconfig -c yezzey.storage_bucket -v 'yandexcloud-dbaas-mdb8i7f8cr8ker9ec6a8'
gpconfig -c yezzey.storage_config -v '/home/gpadmin/gpconfigs/yezzey.conf'
gpconfig -c yezzey.storage_host -v 's3.mds.yandex.net'
gpconfig -c yezzey.walg_bin_path -v '/usr/bin/wal-g-gp'
gpconfig -c yezzey.walg_config_path -v '/etc/wal-g/wal-g.yaml'
gpconfig -c yezzey.gpg_key_id -v '4993C0545AF16F9F'
gpconfig -c shared_preload_libraries -v yezzey
gpstop -a -i && gpstart -a