File locking module data store #3139

doriable · 2024-07-08T16:09:49Z

This adds file locking to the cache implementation of ModuleDataStore, using a
module.lock file. It fixes a race condition in which concurrent invocations of
the buf CLI read and wrote module data in the cache at the same time.

In this implementation, we take a shared lock on <module_key>/module.lock
when reading from the cache and an exclusive lock when writing.
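
For readers less familiar with shared versus exclusive file locks, here is a minimal sketch of the semantics. It is not the code in this PR: it assumes a Unix-like system and calls `syscall.Flock` from the standard library directly rather than going through the flock package, and `tryLock` and `demo` are hypothetical helpers.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

// tryLock opens path (creating it if needed) and attempts a non-blocking
// flock of the given kind: syscall.LOCK_SH (shared) or syscall.LOCK_EX
// (exclusive). Closing the returned file releases the lock.
func tryLock(path string, kind int) (*os.File, error) {
	f, err := os.OpenFile(path, os.O_CREATE|os.O_RDWR, 0o644)
	if err != nil {
		return nil, err
	}
	if err := syscall.Flock(int(f.Fd()), kind|syscall.LOCK_NB); err != nil {
		f.Close()
		return nil, err
	}
	return f, nil
}

// demo reports whether two readers could hold the lock together, whether a
// writer was refused while they held it, and whether the writer succeeded
// once both readers had released it.
func demo() (readersShare, writerBlocked, writerAfter bool, err error) {
	dir, err := os.MkdirTemp("", "modulelock")
	if err != nil {
		return false, false, false, err
	}
	defer os.RemoveAll(dir)
	lockPath := filepath.Join(dir, "module.lock")

	// Two readers, each with its own file descriptor, share the lock.
	r1, err := tryLock(lockPath, syscall.LOCK_SH)
	if err != nil {
		return false, false, false, err
	}
	r2, err := tryLock(lockPath, syscall.LOCK_SH)
	if err != nil {
		r1.Close()
		return false, false, false, err
	}
	readersShare = true

	// A writer cannot take the exclusive lock while readers hold it.
	w, lockErr := tryLock(lockPath, syscall.LOCK_EX)
	switch {
	case errors.Is(lockErr, syscall.EWOULDBLOCK):
		writerBlocked = true
	case lockErr == nil:
		w.Close()
	}

	r1.Close()
	r2.Close()

	// With the readers gone, the exclusive lock is available.
	w, err = tryLock(lockPath, syscall.LOCK_EX)
	if err != nil {
		return readersShare, writerBlocked, false, err
	}
	w.Close()
	return readersShare, writerBlocked, true, nil
}

func main() {
	fmt.Println(demo())
}
```

The sketch uses non-blocking attempts only so the behavior is observable within one process. A real reader or writer would wait, or retry until a timeout, instead of failing immediately.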

When clearing out the module directory, the module.lock file is deleted last
to ensure that we hold the lock while everything else is being cleared out.
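
A rough illustration of that ordering, assuming plain `os` calls instead of the storage bucket APIs the store uses. `clearModuleDir` is a hypothetical name, and acquiring the lock is left out so the example stays short.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

const lockFileName = "module.lock"

// clearModuleDir removes every entry in dir and deletes module.lock last.
// It returns the names in the order they were removed. The caller is assumed
// to already hold the exclusive lock on module.lock.
func clearModuleDir(dir string) ([]string, error) {
	entries, err := os.ReadDir(dir)
	if err != nil {
		return nil, err
	}
	var removed []string
	for _, entry := range entries {
		if entry.Name() == lockFileName {
			continue // Keep the lock file until everything else is gone.
		}
		if err := os.RemoveAll(filepath.Join(dir, entry.Name())); err != nil {
			return removed, err
		}
		removed = append(removed, entry.Name())
	}
	if err := os.Remove(filepath.Join(dir, lockFileName)); err != nil {
		return removed, err
	}
	return append(removed, lockFileName), nil
}

// demo populates a temporary module directory and clears it.
func demo() ([]string, error) {
	dir, err := os.MkdirTemp("", "moduledir")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)
	for _, name := range []string{"a.proto", "module.lock", "module.yaml"} {
		if err := os.WriteFile(filepath.Join(dir, name), []byte("x"), 0o644); err != nil {
			return nil, err
		}
	}
	return clearModuleDir(dir)
}

func main() {
	fmt.Println(demo())
}
```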

When writing data to the cache, we first take a shared lock and check module.yaml. If
module.yaml contains valid data, we skip the write, since it would be redundant. If not,
we upgrade to an exclusive lock and check module.yaml again: if it is now valid, we do
not write; otherwise we proceed to write the data.
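
The second check matters because another process may have written the data between the two lock acquisitions. A minimal sketch of the check, upgrade, re-check sequence follows, assuming a Unix-like system and `syscall.Flock` rather than the flock package. `putIfNeeded` and `isValid` are hypothetical stand-ins, and "valid" is reduced to "module.yaml exists and is non-empty".

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

// isValid is a stand-in for the real module.yaml validation: here any
// readable, non-empty file counts as valid.
func isValid(moduleYAMLPath string) bool {
	data, err := os.ReadFile(moduleYAMLPath)
	return err == nil && len(data) > 0
}

// putIfNeeded writes data to dir/module.yaml unless valid data is already
// there. It reports whether it performed the write.
func putIfNeeded(dir string, data []byte) (bool, error) {
	lockFile, err := os.OpenFile(filepath.Join(dir, "module.lock"), os.O_CREATE|os.O_RDWR, 0o644)
	if err != nil {
		return false, err
	}
	defer lockFile.Close() // Closing the file releases whichever lock is held.
	fd := int(lockFile.Fd())
	moduleYAMLPath := filepath.Join(dir, "module.yaml")

	// First check under a shared lock, so concurrent readers are not blocked.
	if err := syscall.Flock(fd, syscall.LOCK_SH); err != nil {
		return false, err
	}
	if isValid(moduleYAMLPath) {
		return false, nil
	}
	// Converting a flock from shared to exclusive is not guaranteed to be
	// atomic, so another process may write in between. Check again.
	if err := syscall.Flock(fd, syscall.LOCK_EX); err != nil {
		return false, err
	}
	if isValid(moduleYAMLPath) {
		return false, nil
	}
	if err := os.WriteFile(moduleYAMLPath, data, 0o644); err != nil {
		return false, err
	}
	return true, nil
}

// demo calls putIfNeeded twice on a fresh directory and reports whether
// each call wrote.
func demo() (first, second bool, err error) {
	dir, err := os.MkdirTemp("", "modulecache")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(dir)
	if first, err = putIfNeeded(dir, []byte("version: v1\n")); err != nil {
		return false, false, err
	}
	second, err = putIfNeeded(dir, []byte("version: v1\n"))
	return first, second, err
}

func main() {
	fmt.Println(demo())
}
```

In this sketch the first call writes and the second one returns early after the shared-lock check.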

A separate module.lock file is used instead of locking an existing file such as module.yaml,
for two reasons. First, this is the intended use of the flock package, which does not expose
the file handle from the lock. Second, Windows prevents a process from accessing, through any
other handle, a region of a file on which it holds an exclusive lock. From the LockFileEx
documentation (https://learn.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-lockfileex):

> If the locking process opens the file a second time, it cannot access the specified region through this second handle until it unlocks the region.

This PR also adds support for setting a default lock timeout and retry delay on filelock.Locker using LockerOptions when constructing the Locker. Any timeout or retry delay passed to Lock through LockOptions takes precedence over these defaults.
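
To make the precedence concrete, here is a small sketch of the functional-options pattern with construction-time defaults and per-call overrides. It is not the filelock package itself: every type and helper name below, and the built-in fallback values, are assumptions made for illustration, and no actual locking is performed.

```go
package main

import (
	"fmt"
	"time"
)

// lockConfig holds the settings that control how long Lock keeps trying.
type lockConfig struct {
	timeout    time.Duration
	retryDelay time.Duration
}

// LockerOption sets a default on the Locker. LockOption overrides it for a
// single Lock call. All names here are hypothetical.
type LockerOption func(*lockConfig)
type LockOption func(*lockConfig)

func LockerWithTimeout(d time.Duration) LockerOption {
	return func(c *lockConfig) { c.timeout = d }
}

func LockerWithRetryDelay(d time.Duration) LockerOption {
	return func(c *lockConfig) { c.retryDelay = d }
}

func LockWithTimeout(d time.Duration) LockOption {
	return func(c *lockConfig) { c.timeout = d }
}

func LockWithRetryDelay(d time.Duration) LockOption {
	return func(c *lockConfig) { c.retryDelay = d }
}

// Locker carries the defaults chosen at construction time.
type Locker struct {
	defaults lockConfig
}

// NewLocker starts from assumed built-in values and applies LockerOptions.
func NewLocker(options ...LockerOption) *Locker {
	l := &Locker{defaults: lockConfig{timeout: 3 * time.Second, retryDelay: 200 * time.Millisecond}}
	for _, option := range options {
		option(&l.defaults)
	}
	return l
}

// effectiveConfig resolves the settings for one Lock call: a copy of the
// Locker defaults with any per-call LockOptions applied on top.
func (l *Locker) effectiveConfig(options ...LockOption) lockConfig {
	config := l.defaults
	for _, option := range options {
		option(&config)
	}
	return config
}

// demo overrides only the timeout for a single call.
func demo() (timeout, retryDelay string) {
	locker := NewLocker(
		LockerWithTimeout(10*time.Second),
		LockerWithRetryDelay(50*time.Millisecond),
	)
	config := locker.effectiveConfig(LockWithTimeout(time.Second))
	return config.timeout.String(), config.retryDelay.String()
}

func main() {
	fmt.Println(demo()) // prints "1s 50ms"
}
```

The per-call timeout replaces the Locker default, while the retry delay, which the call did not set, falls back to the value given at construction.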

This adds a test for concurrent cache reads/writes from _different_ caches. This is meant to capture the race condition that occurs when concurrent, independent invocations of the CLI interact with the cache.

github-actions · 2024-07-08T16:10:08Z

The latest Buf updates on your PR. Results from workflow Buf CI / buf (pull_request).

| Build | Format | Lint | Breaking | Updated (UTC) |
| --- | --- | --- | --- | --- |
| `✅ passed` | `✅ passed` | `✅ passed` | `✅ passed` | Aug 21, 2024, 9:15 PM |

pkwarren · 2024-07-11T01:09:43Z

We might consider landing #3123 before this so we can update to the latest filelock version. It requires Go 1.21+ in all of the more recent releases.

bufdev · 2024-08-01T21:06:08Z

private/bufpkg/bufmodule/bufmodulestore/module_data_store.go

+		data, err := storage.ReadPath(ctx, moduleCacheBucket, externalModuleDataFileName)
+		p.logDebugModuleKey(
+			moduleKey,
+			fmt.Sprintf("module data store put read check %s", externalModuleDataFileName),


Was this copied from previous code? I don't understand what this is, perhaps it was my fault though.

doriable · Aug 1, 2024

It was not. This is the check described in the code comment above it:

// Before writing to the module directory, first get a shared lock and check module.yaml

We read module.yaml preemptively before attempting to write new data to the cache.

bufdev · 2024-08-01T21:08:04Z

private/bufpkg/bufmodule/bufmodulestore/module_data_store.go

-	}
-	return p.bucket.DeleteAll(ctx, dirPath)
-}
-
 func (p *moduleDataStore) putModuleData(


It's really difficult to understand what the code in here is trying to accomplish now. We need specific documentation as we go, and comments within the code explaining what each block is trying to accomplish. Something like "we're going to first check for X, if that's not the case, we're going to do this overwrite like this, etc."

doriable · Aug 1, 2024

getModuleDataForModuleKey currently only does read operations, and each step of that read is documented. We can add more documentation, but I need some clarification on which parts.

As the last comment in getModuleDataForModuleKey explains, the validity of the content is always determined by module.yaml, which is always expected to be written last.

Any error or invalid module.yaml is returned as an error, which is then handled by fetching new data and putting it into the cache. Nothing is overwritten or changed by getModuleDataForModuleKey; it only reads the cache.

bufdev · Aug 14, 2024

Can you point me to the documentation? I can't follow what is going on here.

bufdev · 2024-08-01T21:09:04Z

private/bufpkg/bufmodule/bufmodulestore/module_data_store.go

-	}
-	return p.bucket.DeleteAll(ctx, dirPath)
-}
-
 func (p *moduleDataStore) putModuleData(


I don't see anywhere where invalid data is cleaned up anymore. Perhaps I missed it. What happens if I have invalid data, say file "a.proto", and the valid data is only the file "b.proto", both written to the same cache directory... will "a.proto" not be there anymore? Basically, I don't understand how the deleteInvalidModuleData case is now handled. It appears to only be handled for tar files (getModuleDataForModuleKey calls bucket.Delete).

doriable · Aug 1, 2024

The validity of the contents of v3/modules/<digest_type>/<module_key> is determined by the presence of a valid module.yaml file. As documented at the end of getModuleDataForModuleKey, we rely on module.yaml being written last.

The case you're describing, with an invalid a.proto and a valid b.proto, occurs when the process is interrupted while writing a.proto. In that case no module.yaml is written, the directory is not considered valid, and then new data is fetched, etc.
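
A small sketch of the "module.yaml is written last" convention described in this reply, using plain `os` calls. `writeModule` and `isValidModuleDir` are hypothetical helpers, the validity check is reduced to "module.yaml exists", and the sketch does not address whether stale files left by an interrupted write are removed.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writeModule writes the module's files and then module.yaml last. When
// interrupted is true it stops before module.yaml, simulating a process
// that died partway through the write.
func writeModule(dir string, files []string, interrupted bool) error {
	for _, name := range files {
		if err := os.WriteFile(filepath.Join(dir, name), []byte("syntax = \"proto3\";\n"), 0o644); err != nil {
			return err
		}
	}
	if interrupted {
		return nil
	}
	return os.WriteFile(filepath.Join(dir, "module.yaml"), []byte("version: v1\n"), 0o644)
}

// isValidModuleDir treats the directory as valid only if module.yaml exists.
// The real store also validates the contents of module.yaml.
func isValidModuleDir(dir string) bool {
	_, err := os.Stat(filepath.Join(dir, "module.yaml"))
	return err == nil
}

// demo reports validity after an interrupted write and after a complete one.
func demo() (afterInterrupted, afterComplete bool, err error) {
	dir, err := os.MkdirTemp("", "moduledir")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(dir)
	files := []string{"a.proto", "b.proto"}

	if err := writeModule(dir, files, true); err != nil {
		return false, false, err
	}
	afterInterrupted = isValidModuleDir(dir)

	// The reader saw an invalid directory, so the data is fetched and
	// written again, this time to completion.
	if err := writeModule(dir, files, false); err != nil {
		return afterInterrupted, false, err
	}
	return afterInterrupted, isValidModuleDir(dir), nil
}

func main() {
	fmt.Println(demo())
}
```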

bufdev · 2024-08-14T17:26:58Z

private/buf/bufcli/cache.go

+	// This directory is used to store lock files for synchronizing reading and writing module data from the cache.
+	//
+	// Normalized.
+	v3CacheModuleLockRelDirPath = normalpath.Join("v3", "module_locks")


wellknowntypes doesn't use underscores, so this shouldn't either, for consistency: modulelocks

bufdev · 2024-08-14T17:28:55Z

private/bufpkg/bufmodule/bufmodulestore/module_data_store.go

+				if err := p.bucket.Delete(ctx, tarPath); err != nil {
+					return nil, err
+				}
+				// Return a path error indicating the module data was not found


This shouldn't be a PR comment - this should be explained in the code for future readers who may question it. It should be clear to a reader at first glance, from the code comments, why we are returning an fs.PathError instead of the error itself - perhaps this wasn't the case in the past, but it needs to be going forward.
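
For context, one plausible reading of the pattern under discussion, sketched with plain `os` calls rather than the storage bucket: an invalid cached archive is deleted, and the caller is told the data was "not found" so it takes the ordinary cache-miss path. `readTar` and the `validate` callback are hypothetical, and the rationale in the comments is an assumption, not a statement of the author's intent.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// readTar returns the cached archive at tarPath. If the content fails
// validation, the file is deleted and an fs.PathError wrapping
// fs.ErrNotExist is returned, so callers that already handle "not found"
// treat the invalid entry as a cache miss.
func readTar(tarPath string, validate func([]byte) error) ([]byte, error) {
	data, err := os.ReadFile(tarPath)
	if err != nil {
		return nil, err
	}
	if validateErr := validate(data); validateErr != nil {
		if err := os.Remove(tarPath); err != nil {
			return nil, err
		}
		// Report "not found" rather than validateErr: after the delete,
		// the entry really is absent from the cache.
		return nil, &fs.PathError{Op: "read", Path: tarPath, Err: fs.ErrNotExist}
	}
	return data, nil
}

// demo reads a corrupt entry and reports whether the error looks like
// "not found" and whether the file was removed.
func demo() (isNotExist, fileGone bool, err error) {
	dir, err := os.MkdirTemp("", "tarcache")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(dir)
	tarPath := filepath.Join(dir, "module.tar")
	if err := os.WriteFile(tarPath, []byte("corrupt"), 0o644); err != nil {
		return false, false, err
	}
	rejectAll := func([]byte) error { return errors.New("digest mismatch") }

	_, readErr := readTar(tarPath, rejectAll)
	_, statErr := os.Stat(tarPath)
	return errors.Is(readErr, fs.ErrNotExist), errors.Is(statErr, fs.ErrNotExist), nil
}

func main() {
	fmt.Println(demo())
}
```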

bufdev · 2024-08-14T17:31:57Z

private/bufpkg/bufmodule/bufmodulestore/module_data_store.go

 		}
-	}()
+		// Only attempt to get a file lock when storing individual files


I don't know what this means - what is an "individual file"? It should be clear from the code comments as to what this code does. This may take a paragraph or two of comments.

doriable added 5 commits July 4, 2024 15:13

Add test for concurrent cache reads + writes

c4cad75

This adds a test for concurrent cache reads/writes from _different_ caches. This is meant to capture the race condition that occurs when concurrent, independent invocations of the CLI interact with the cache.

Implement file locking on module.yaml in for ModuleDataStore

d4eb36a

Skip test

a9dffdd

Fix lint

584b8f1

Merge remote-tracking branch 'origin/main' into file-locking-module-d…

f1953ef

…ata-store

doriable mentioned this pull request Jul 8, 2024

File locking module data provider #3140

Closed

doriable requested a review from pkwarren July 8, 2024 17:27

pkwarren reviewed Jul 8, 2024

doriable added 5 commits July 8, 2024 19:13

Updates from conversations + comments

c40e2fa

Merge remote-tracking branch 'origin/main' into file-locking-module-d…

e9589be

…ata-store

Close moduleDir before other delete operations

89a076d

Refactor test away from errgroup

0e49421

Use a lock file instead of locking on module.yaml

812da0c

doriable requested a review from pkwarren July 9, 2024 15:35

doriable added 3 commits July 9, 2024 13:58

Check module.yaml during PutModules to avoid writing extra data

caaad59

Fix delete invalid to use storage APIs

1ac3afd

Add locker options for filelock.Locker

ad470e5

pkwarren reviewed Jul 10, 2024

doriable added 2 commits July 10, 2024 14:21

Address comments + fix delete in delete invalid files

05ea4bb

Remove extra comment

7332a7d

pkwarren reviewed Jul 10, 2024

private/bufpkg/bufmodule/bufmodulestore/module_data_store.go

Only avoid calling deleteInvalidModuleData if module.yaml was not found

e7bf3cd

pkwarren reviewed Jul 10, 2024

private/bufpkg/bufmodule/bufmodulestore/module_data_store.go

pkwarren reviewed Jul 10, 2024

private/bufpkg/bufmodule/bufmodulestore/module_data_store.go

pkwarren reviewed Jul 10, 2024

private/bufpkg/bufmodule/bufmodulecache/bufmodulecache_test.go

doriable added 2 commits July 11, 2024 09:32

Address small comments

2b73dc4

Change filelocker as a required arg and adjust tests.

cbf7021

This adds some module.lock files to tests that read `testdata` directly as a cache.

Separate dir for file locking

e98467d

doriable requested review from pkwarren and bufdev July 24, 2024 17:04

pkwarren reviewed Jul 25, 2024

doriable and others added 6 commits July 25, 2024 10:23

Apply suggestions from code review

6a9416a

Co-authored-by: Philip K. Warren <[email protected]>

Merge remote-tracking branch 'origin/main' into file-locking-module-d…

b52c371

…ata-store

Delete lock files

4266bd0

Address comments

d4340a7

Fix lint

baa4f31

Remove nil checks for filelocker and use NopLocker when passed nil in…

1939643

… constructor.

pkwarren approved these changes Jul 25, 2024

Merge branch 'main' into file-locking-module-data-store

ede95e9

bufdev reviewed Aug 1, 2024

doriable added 4 commits August 1, 2024 18:20

Address some comments

d5d9178

Fix tests

5c3baa2

Merge remote-tracking branch 'origin/main' into file-locking-module-d…

44f3966

…ata-store

Merge remote-tracking branch 'origin/main' into file-locking-module-d…

2e46398

…ata-store

bufdev reviewed Aug 14, 2024

doriable added 2 commits August 21, 2024 13:24

Merge remote-tracking branch 'origin/main' into file-locking-module-d…

7fa96d1

…ata-store

Address comments

a6c476a

bufdev added the Needs review label Aug 21, 2024

doriable added 5 commits August 21, 2024 14:07

Unexpand dockerignore glob

c520685

Merge remote-tracking branch 'origin/main' into file-locking-module-d…

723e85c

…ata-store

Add in-line comments to document module read/write processes.

61a38bb

Merge remote-tracking branch 'origin/main' into file-locking-module-d…

fbd040f

…ata-store

Add comment explaining module.yaml as the source of validity at the m…

4598d8c

…odule data store layer.

bufdev removed the Needs review label Aug 21, 2024

bufdev approved these changes Aug 21, 2024

bufdev merged commit cb33efb into main Aug 21, 2024
12 checks passed

bufdev deleted the file-locking-module-data-store branch August 21, 2024 21:15
