Documentation for Livy REST API #18

Open
nixent opened this issue May 26, 2024 · 7 comments

Comments

@nixent

nixent commented May 26, 2024

Is there any documentation for the Livy endpoint, similar to the one for Synapse?

The Fabric REST API documentation describes other elements of Fabric but not Livy.

@jcvdodson
Contributor

jcvdodson commented Jun 1, 2024

There's not yet full public documentation for the Livy endpoint, but much of the functionality is the same as what is documented for Synapse. The endpoints are different, but they give a good idea of what Livy's functionality is.

Is there a particular thing you want to do?

@nixent
Author

nixent commented Jun 4, 2024

@jcvdodson I'm looking for details about the endpoints. The lakehouse_endpoint method suggests that lakehouseid is part of the endpoint base URL, which is then used in the sessions endpoint, for instance. My understanding is that a session belongs to a workspace and not to a lakehouse, since a lakehouse is equivalent to a database and Spark uses a two-tier namespace (database_name.table_name).

    def lakehouse_endpoint(self) -> str:
        # TODO: Construct Endpoint of the lakehouse from the 
        return f'{self.endpoint}/workspaces/{self.workspaceid}/lakehouses/{self.lakehouseid}/livyapi/versions/2023-12-01'

I'd like to understand how access to tables in different lakehouses is managed via the Livy endpoint.
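
For reference, the URL construction in the method above can be sketched as a standalone function. This is a minimal sketch, not the adapter's actual code: the base URL `https://example.invalid/v1` and the IDs are placeholders, and the `/sessions` suffix is an assumption based on the Apache Livy REST API convention rather than on Fabric documentation.

```python
def lakehouse_endpoint(endpoint: str, workspace_id: str, lakehouse_id: str,
                       api_version: str = "2023-12-01") -> str:
    """Build the Livy base URL using the path layout from the snippet above."""
    return (
        f"{endpoint.rstrip('/')}/workspaces/{workspace_id}"
        f"/lakehouses/{lakehouse_id}/livyapi/versions/{api_version}"
    )


def sessions_url(base: str) -> str:
    """Assumed sessions path, following the Apache Livy '/sessions' convention."""
    return f"{base}/sessions"


if __name__ == "__main__":
    # Placeholder host and IDs, for illustration only.
    base = lakehouse_endpoint("https://example.invalid/v1", "ws-123", "lh-456")
    print(base)
    print(sessions_url(base))
```

Note that the lakehouse ID appears in the base URL even though the session itself is scoped to the workspace.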

@jcvdodson
Contributor

jcvdodson commented Jun 4, 2024

You are correct that a Livy session belongs to a particular workspace. Livy itself doesn't really manage access to tables -- it just executes the SQL statements. A Livy session is considered "hosted" by a specific Lakehouse within the workspace, but as you can see in the submitLivyCode method, we don't actually submit statements against that Lakehouse, just to the Livy session.

You can execute statements against tables in different lakehouses by prefixing the table name with the lakehouse name in the statement, as in [LH_name].[table_name].
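
As an illustration of the cross-lakehouse naming described above, here is a minimal sketch that builds a SQL statement joining tables from two lakehouses and wraps it in a statement request body. The lakehouse, table, and column names are hypothetical, the square brackets above are treated as placeholder notation, and the `{"code": ..., "kind": "sql"}` body follows the Apache Livy statement format; whether the Fabric Livy endpoint accepts exactly this shape is an assumption.

```python
import json


def qualified(lakehouse: str, table: str) -> str:
    """Return a two-part name: lakehouse (database) followed by table."""
    return f"{lakehouse}.{table}"


def statement_payload(sql: str) -> str:
    """Serialize a statement body in the Apache Livy format (assumed to apply here)."""
    return json.dumps({"code": sql, "kind": "sql"})


if __name__ == "__main__":
    # Hypothetical lakehouses 'sales_lh' and 'ref_lh' in the same workspace.
    sql = (
        f"SELECT o.order_id, c.name "
        f"FROM {qualified('sales_lh', 'orders')} o "
        f"JOIN {qualified('ref_lh', 'customers')} c ON o.customer_id = c.id"
    )
    print(statement_payload(sql))
```

The payload would be sent to the session's statements URL regardless of which lakehouse hosts the session; only the qualified table names select the lakehouse.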

@nixent
Author

nixent commented Jun 5, 2024

Thank you for the clarification.
If I have multiple lakehouses, which one would you recommend setting in the dbt profile?

@jcvdodson
Contributor

I think they would all be equivalent. To be transparent, though, I haven't tested on large lakehouses. My inclination is to say that the profiles.yml lakehouse should be the largest lakehouse or the one you expect to interact with most, but I'm not sure that this really has any practical effect.

@nixent
Author

nixent commented Jun 5, 2024

Could you provide clarification on why the Livy endpoint is integrated within the lakehouse, whereas other endpoints are located under the workspace? Are there specific design aspects or considerations that users should be aware of in this context?

@jcvdodson
Contributor

The hosting artifact (the Lakehouse) is there for internal reasons related to authentication flows. In practical effect (i.e. statement execution), Livy is tied to the workspace. The hosting artifact has no impact on statement execution -- so, there aren't any special considerations you need to be aware of when using Livy across Lakehouses within a Workspace.
