Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add option to disable periodic hflush during multithreaded writes? #1869

Open
julianhess opened this issue Jan 2, 2025 · 1 comment
Open
Assignees

Comments

@julianhess
Copy link

For performance reasons, the output stream is periodically flushed during a multithreaded BGZF write:

htslib/bgzf.c

Lines 1438 to 1449 in c705bec

/*
* Periodically call hflush (which calls fsync when on a file).
* This avoids the fsync being done at the bgzf_close stage,
* which can sometimes cause significant delays. As this is in
* a separate thread, spreading the sync delays throughout the
* program execution seems better.
* Frequency of 1/512 has been chosen by experimentation
* across local XFS, NFS and Lustre tests.
*/
if (++mt->flush_pending % 512 == 0)
if (hflush(fp->fp) != 0)
goto err;

However, when writing to certain filesystems that do not allow appending (e.g. mounted cloud storage buckets), calling fsync marks the file as "finished" and prohibits further writes to it. For instance,

$ samtools view -@4 -b /path/to/my.bam > /mnt/s3_bucket/out.bam # /mnt/s3_bucket is a bucket mounted via mountpoint-s3
samtools view: writing to standard output failed
samtools view: error reading file "/mnt/s3_bucket/out.bam"
samtools view: error closing standard output: -1

Note that this works fine in singlethreaded mode, since stdout (or any output file descriptor) only gets flushed at the very end.

I'm currently getting around this by piping the output stream through cat to absorb the fsync calls (i.e. samtools view -@4 ... | cat > ...), but it would be nice if htslib natively supported this (e.g. to allow for samtools commands with -o functionality to write directly to files). Maybe there could be an environment variable that universally disables the periodic flushing.

@daviesrob
Copy link
Member

Hm, I'd argue that mountpoint-s3's fsync() is broken if it departs so radically from the expected behaviour, although at least they do document what happens when you use it. They do also say that it should work more as expected on S3 Express One Zone, but no doubt that comes with a cost...

On the other hand, the periodic flushing was put in place to fix a problem on an ancient system which would fill the entire memory with buffered output, and then pretty much lock up solid as it tried to write all the data to storage. That was probably running a ten year old OS, so it would be worth revisiting to see if more modern systems behave better. We'd still want to call fsync() on file close though, as it provides much better assurance than close() that the data really has ended up somewhere permanent.

@daviesrob daviesrob self-assigned this Jan 9, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants