Minio local endpoint is broken #2303
Comments
MinIO is an S3 drop-in replacement, so your bucket_url needs to follow S3 notation: "s3://my_bucket" |
Sorry @sh-rp, I was not showing my bucket's actual value. The value did have the "s3://" prefix, and DLT was able to write to it. But it failed to read from it, and that's the issue. |
@sh-rp you may be right that the issue could be somewhere else. I cannot share all of the code showing how I run DLT in a concurrent environment, where I see DLT struggle: it works locally on Mac and fails to find tables in Kubernetes. Now there is this problem of being able to write but not read. While debugging the DLT code I saw that you pass all the DuckDB secret settings correctly. But when interacting with DLT, it fails to get a correct endpoint URL. I'm not sure whether that's related to DuckDB thread-safety specifics or something else. Anyway, I dropped DLT for reading and replaced it with a plain pyarrow dataset. I will likely get back to DLT when I need more complex data relationships than a single table. Sorry for too many details. IMO the important bit is that DLT |
OK. I am not quite sure what the problem is though, maybe there is something going on with your minio server and it only accepts one connection at a time or something like that? I would need some code that is locally reproducible. |
@sh-rp the thing is that the endpoint_url ("http://localhost:9000") has its protocol cut off when the request to MinIO is made. Please see the error "...HTTP GET to '//localhost:9000/ ...". Note the missing "http:" piece here. I will do my best to provide working code to reproduce this when I have some time. |
@smasyutin I have attached a PR which should fix this problem for you. Now http (without SSL) URLs are interpreted correctly. You may also need to set the url_style to path, depending on whether you have set up your MinIO to support vhost path styles or not. Can you let me know if this fixes your problem? |
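For readers following along, here is a minimal sketch of the idea behind the fix, not the code from the PR. It assumes the reader goes through a DuckDB S3 secret, whose ENDPOINT option takes host and port without a scheme, and whose USE_SSL and URL_STYLE options carry the rest. The helper name duckdb_s3_settings is hypothetical and exists only for illustration.

```python
from urllib.parse import urlsplit


def duckdb_s3_settings(endpoint_url: str, url_style: str = "path") -> dict:
    """Split an S3-compatible endpoint URL into DuckDB-style secret options.

    Hypothetical helper for illustration only; it is not part of dlt or DuckDB.
    """
    parts = urlsplit(endpoint_url)
    if parts.scheme not in ("http", "https"):
        # A scheme-less value such as '//localhost:9000' is exactly the
        # symptom reported in this issue, so reject it early.
        raise ValueError(f"endpoint_url must start with http:// or https://: {endpoint_url!r}")
    return {
        "ENDPOINT": parts.netloc,            # host:port only, no scheme
        "USE_SSL": parts.scheme == "https",  # plain http means SSL is off
        "URL_STYLE": url_style,              # 'path' or 'vhost'
    }


settings = duckdb_s3_settings("http://localhost:9000")
print(settings)
```

The point of the design is that the scheme is not thrown away: it is converted into the USE_SSL flag, so a plain-http MinIO endpoint does not end up being requested as '//localhost:9000/'.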
Hi @sh-rp . Thanks for the update and sorry for the delay. |
@sh-rp I tested it with local dlt from the branch and it works for me with |
dlt version
1.6.1
Describe the problem
I want to read a pipeline dataset from my local minio instance. I create pipeline in code as
But then when I try to read it as
I get an error
IO Error: Could not establish connection error for HTTP GET to '//localhost:9000/?encoding-type=url&list-type=2&prefix=...'
The "http:" was cut out of the URL.
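As a quick check of the symptom, the URL from the error message can be parsed with the Python standard library. This is only a diagnostic sketch; the "..." in the prefix is the elision from the error output, kept as-is.

```python
from urllib.parse import parse_qs, urlsplit

# URL copied verbatim from the error message, including the elided prefix.
failing_url = "//localhost:9000/?encoding-type=url&list-type=2&prefix=..."

parts = urlsplit(failing_url)
query = parse_qs(parts.query)

print(repr(parts.scheme))   # empty string: the scheme is missing
print(parts.netloc)         # host and port survived
print(query["list-type"])   # list-type=2 is an S3 ListObjectsV2 request
```

Host, port, and query string are intact, so only the scheme was lost before the listing request was issued.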
Expected behavior
I can work flawlessly with dlthub regardless of whether the storage is local, cloud, or self-hosted like MinIO.
Steps to reproduce
I want to read a pipeline dataset from my local minio instance. I create pipeline in code as
But then when I try to read it as
I get an error
IO Error: Could not establish connection error for HTTP GET to '//localhost:9000/?encoding-type=url&list-type=2&prefix=...'
The "http:" was cut out of the URL.
Operating system
macOS
Runtime environment
Local
Python version
3.11
dlt data source
No response
dlt destination
Filesystem & buckets
Other deployment details
No response
Additional information
No response