Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

S3 support #2

Merged
merged 127 commits into from
Jan 24, 2025
Merged

S3 support #2

merged 127 commits into from
Jan 24, 2025

Conversation

cgoina
Copy link
Collaborator

@cgoina cgoina commented Jan 24, 2025

This is a large PR that implements support for accessing S3 buckets using AWS java2 SDK.

Cristian Goina added 30 commits July 30, 2024 18:42
…d original implementations so that I can override them
…alues for root storage uri; handle the null root storage uri
…ntService; implemented two endpoints from VolumeStorageResource and fixed unit tests
…ys takes bucket as an arg so the path should no longer contain the bucket
Copy link
Member

@krokicki krokicki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to poke around and find the most important changes but there are too many files to go through one by one. For large PRs it's helpful to have a high level overview about where to look.

None of my comments should prevent merging, they can be addressed later.

private static final ConcurrentMap<String, S3Adapter> S3_ADAPTERS = new ConcurrentHashMap<>();

public S3Adapter getS3Adapter(String bucket, String endpoint, JADEOptions s3Options) {
return S3_ADAPTERS.computeIfAbsent(bucket, (b) -> new S3Adapter(b, endpoint, s3Options));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you have a bucket with the same name in both AWS and VAST, this would only cache one of them. Why not key on endpoint+bucket?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's a good point - will do it

return credentialsProvider;
}

private boolean isTryAnonymousFirst(JADEOptions s3Options) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would be nice to explain options like this. What does it mean to try anonymous first? Why would you want to set this? It's not documented here or in JADEOptions.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is something that should be removed in the end and leave the anonymous credentials always first in the chain of credentials. Being first is needed in order to access open bucket that do not require a signature but the AWSCRT async client fails when that is first in the list.

private Path getContentPath(String contentLocation) {
Path contentPath = Paths.get(contentLocation);
if (Files.notExists(contentPath)) {
return null;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this null, but the IOException causes an exception? Shouldn't it be FileNotFoundException or something?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know what is the best behaviour here. My option was to return an empty list which is in line with 'aws cli ls' does when you provide an invalid prefix


private List<ContentNode> listContentFromPath(Path contentPath, ContentAccessParams contentAccessParams) {
if (contentPath == null) {
return Collections.emptyList();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems dangerous. If the file doesn't exist, the content path is set to null and passed into this method. Then this method claims that a non existing path has no files. Not sure that makes sense. That means if you pass a non existing path to listContentNodes you will get an empty list which can be very misleading.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am changing it to throw the exception

throw new ContentException("Content found at " + contentLocation + " is not a regular file");
}
} else {
throw new NoContentFoundException("No object found at " + contentLocation);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here file not found is treated correctly, as an exception. Why doesn't listContentNodes use the same pattern?

@cgoina cgoina merged commit 112fae3 into master Jan 24, 2025
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants