Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SOLR-17302: Convert /cluster filestore APIs to JAX-RS #2470

Conversation

gerlowskija
Copy link
Contributor

@gerlowskija gerlowskija commented May 20, 2024

https://issues.apache.org/jira/browse/SOLR-17302

Description

Solr has slowly been modifying its v2 APIs to align around a more REST-ful model, implemented using the JAX-RS framework. Many APIs have already been through this re-alignment, but many more remain including Solr's filestore APIs (used internally by package store).

Solution

This commit migrates several of the filestore APIs, namely those under the /cluster/files path to use the JAX-RS framework. The APIs themselves are unchanged from an end-user perspective.

Other endpoints, particularly the "fetch filestore entry" (GET /api/node/files/somepath.txt) and "delete local filestore entry" (DELETE /api/node/files/somepath.txt) are not attempted in this PR, but left for another.

Tests

Existing filestore and package-store tests continue to pass.

Checklist

Please review the following and check all that apply:

  • I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
  • I have created a Jira issue and added the issue ID to my pull request title.
  • I have given Solr maintainers access to contribute to my PR branch. (optional but recommended)
  • I have developed this patch against the main branch.
  • I have run ./gradlew check.

This commit moves some of Solr's "filestore" APIs, those located under
the `/api/cluster/files` path, to JAX-RS.  Still needs some tidying up
and sanity-checking, but running a few of the relevant tests pass which
is promising given earlier results.
Copy link
Contributor

@epugh epugh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like some great work here! I tagged some places where it would be nice ot have more descriptive variable names, however I suspect you are trying to minimize change. Do we still need DistribFileStore in a world with ClusterFileStore?


@Path("/cluster")
public interface ClusterFileStoreApis {
// TODO Better understand the purpose of the 'sig' parameter and improve docs here.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah... There is some signing of stuff, and it probably should be a more global capability for managing uploading things. Maybe you "sign" a configset for example. Or treat a file the same ways we treat a configset?

@@ -634,7 +635,7 @@ public Map<String, byte[]> getKeys() throws IOException {
// reads local keys file
private static Map<String, byte[]> _getKeys(Path solrHome) throws IOException {
Map<String, byte[]> result = new HashMap<>();
Path keysDir = _getRealPath(FileStoreAPI.KEYS_DIR, solrHome);
Path keysDir = _getRealPath(ClusterFileStore.KEYS_DIR, solrHome);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sigh I hate you _ methods and objects!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, I don't love the convention either. There's several of these in DistribFileStore though - enough that I'm worried renaming will bloat this PR too much 😦

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was it "errorprone" that we used to help us clean up our code and enforce some conventions? For another JIRA, see if errorprone would flag any "_" methods for example. We do need to make all our conventions enforced via automated check, there are too many people over too many years to otherwise keep code standards intact...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I sort of thought maybe FileStoreAPI.java would be totally removed in favour of the new classes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ultimately it will, but for scope reasons this PR only converts some of FileStoreAPI API's to JAX-RS.

I initially tried to do all the APIs at once, but the change got a bit unmanageable as time went on so I've split it into two PRs (of which this is the first).

solr/core/src/java/org/apache/solr/pkg/PackageAPI.java Outdated Show resolved Hide resolved
solr/core/src/java/org/apache/solr/pkg/PackageAPI.java Outdated Show resolved Hide resolved
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need DistribFileStore when we have ClusterFileStore?

@gerlowskija
Copy link
Contributor Author

Thanks for the quick review Eric!

I tagged some places where it would be nice ot have more descriptive variable names, however I suspect you are trying to minimize change

Your suspicion is correct. This PR aims to convert the API(s) to JAX-RS without delving too deeply into cleaning up the underlying implementation. I think some of the stuff you commented on is innocuous enough that I can be tempted into making a few cleanups though, particularly some of the naming and javadoc/comment issues.

Do we still need DistribFileStore in a world with ClusterFileStore?

Yes, we do. There's some naming-related confusion going on here that I've definitely caused:

  1. FileStore is an interface that defines the operations that Solr's file-store has to support.
  2. DistribFileStore is the lone implementation of the FileStore interface.
  3. ClusterFileStore is the entrypoint for the relevant v2 APIs.

ClusterFileStore is, on its surface, a pretty bad naming choice given the pre-existing FileStore interface. It 100% appears to be an implementation of that interface, but it's not.

I do have an excuse, though it may be a bad one 😛 I chose ClusterFileStore for the v2 API class name to be consistent with a naming convention we've had since creating the "api" gradle module earlier this year. (i.e. interfaces in "api" module are FooApi, with classes in solr-core being Foo).

The filestore (for reasons I still don't really understand) makes APIs available under two paths: /api/cluster/files and /api/node/files. So I chose ClusterFileStoreApi/ClusterFileStore partially in an attempt to distinguish between these two sets of filestore APIs, and partially to maintain consistency with the existing naming convention.

The end result is clearly confusing; very open if anyone has other suggestions that'd avoid the ambiguity and confusion!?

@epugh
Copy link
Contributor

epugh commented May 21, 2024

If DistribFileStore is the lone implementation of FileStore, then do we need FileStore? I never liked the whole "Let me have an interface for everything" cause it doubles the files to navigate through. In the future, if we need a FileStore implementation because some wrote NonDistribFileStore then we can make it then... We aren't C code, we don't need a seperate .h file for every .cpp file!

@gerlowskija
Copy link
Contributor Author

If DistribFileStore is the lone implementation of FileStore, then do we need FileStore?

I suspect the intention was that having an interface leaves the door open to adding a Solr-standalone filestore implementation at some point if someone is motivated enough.

I don't feel strongly on whether the interface remains or not. Maybe that's worth cleaning up at some point, but IMO it's not necessary here since this PR isn't changing anything about how the FileStore is implemented, just some APIs around it.

@epugh
Copy link
Contributor

epugh commented May 24, 2024

If DistribFileStore is the lone implementation of FileStore, then do we need FileStore?

I suspect the intention was that having an interface leaves the door open to adding a Solr-standalone filestore implementation at some point if someone is motivated enough.

I don't feel strongly on whether the interface remains or not. Maybe that's worth cleaning up at some point, but IMO it's not necessary here since this PR isn't changing anything about how the FileStore is implemented, just some APIs around it.

Assuming we do it as a seperate JIRA, what do you think of renaming DistribFileStore to FileStore and removing the interface?

@epugh
Copy link
Contributor

epugh commented May 24, 2024

Reading back some comments, ClusterFileStore is the entry point that implements ClusterFileStoreAPI, is there an argument that ClusterFileStore should do everything DistribFileStore does?

So honestly, since the goal of this JIRA is just to go get /cluster over to jax-rs, we should probably tee up a seperate JIRA to try and untangle these classes post this one being committed.... Heck, and then maybe we take another whack at BlobHandler and maybe have a proper interface that works across all the implementations! I look forward to this being merged.

@gerlowskija
Copy link
Contributor Author

is there an argument that ClusterFileStore should do everything DistribFileStore does?

I guess? Idk about "do everything", but ClusterFileStore/ClusterFileStoreApi will eventually expose everything that DistribFileStore supports.

And I don't think that's a bad thing - it's an intentional "separation of concerns" thing. "ClusterFileStore"/"ClusterFileStoreApi" define the HTTP interface an end-user sees, and some of the API concerns around that. "DistribFileStore" is concerned with how we actually implement the "Filestore" idea in a SolrCloud context. In theory, we could rewrite the filestore entirely without having to touch the API code, and that's a good thing!

The names are a mess and that definitely needs straightened out, but even in the long term IMO each of these classes has a distinct purpose that make them worth keeping around. Totally agree about the other TODOs you mentioned in your comment above though - it'd be great if we could migrate all the BlobHandler usage onto the filestore. (Though as you mentioned - those are different tickets for sure.)

@gerlowskija gerlowskija merged commit 8c5ae1b into apache:main May 28, 2024
3 checks passed
@gerlowskija gerlowskija deleted the SOLR-17302-convert-filestore-write-apis-to-jaxrs branch May 28, 2024 18:12
gerlowskija added a commit that referenced this pull request May 28, 2024
This commit moves some of Solr's "filestore" APIs (those located under
the `/api/cluster/files` path) to JAX-RS.
@gerlowskija
Copy link
Contributor Author

I've since noticed two problems with this PR:

  1. The generated SolrRequest class for uploading files doesn't handle the request body correctly, making it largely unusable. We can fix this by modifying the JAX-RS annotations to be similar to those used in ZooKeeperReadAPI.
  2. This breaks a test that @AndreyBozhko had in-flight on PR SOLR-17304: PKG_VERSIONS not honored when loading the schema plugins #2471.

I'm going to revert until I've had a chance to fix (1) and so that Andrey isn't blocked.

gerlowskija added a commit that referenced this pull request May 30, 2024
gerlowskija added a commit that referenced this pull request May 30, 2024
gerlowskija added a commit to gerlowskija/solr that referenced this pull request Jun 4, 2024
This commit moves some of Solr's "filestore" APIs (those located under
the `/api/cluster/files` path) to JAX-RS.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants