HDFS data node hostname issue #46

Closed
bugzyz opened this issue May 17, 2021 · 2 comments

Comments

bugzyz commented May 17, 2021

Hi! We are using the HDFS service from outside the Kubernetes cluster, and we have run into a problem with DataNode hostnames.

The NameNode returns only the internal DataNode hostnames/IPs to clients, so a service outside the Kubernetes cluster cannot reach the DataNodes at those addresses. Can anyone help with this scenario? Thanks!

@cgiraldo
Member

We don't yet support using HDFS from outside the Kubernetes cluster. Doing so would require integrating Hadoop internals with Kubernetes, which is not a straightforward task: we would need to modify the hostname published by the DataNodes, create a NodePort or hostPort for each individual DataNode, etc.

If you need to access HDFS from outside, you can try the httpfs service with a NodePort or even an Ingress.
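
For anyone trying the httpfs route, here is a minimal sketch of what an external client call might look like. HttpFS speaks the WebHDFS REST API, so a directory listing is a plain HTTP GET. The node address, the NodePort value (30014), the user name, and the helper names `httpfs_url` and `list_status` are all assumptions for illustration; nothing here comes from the chart itself.

```python
import json
import urllib.request
from urllib.parse import quote, urlencode


def httpfs_url(host, port, hdfs_path, op, user, **params):
    """Build a WebHDFS-style REST URL for an HttpFS endpoint (hypothetical helper)."""
    query = urlencode({"op": op, "user.name": user, **params})
    return f"http://{host}:{port}/webhdfs/v1{quote(hdfs_path)}?{query}"


def list_status(host, port, hdfs_path, user):
    """List an HDFS directory through HttpFS; the client never contacts a DataNode."""
    url = httpfs_url(host, port, hdfs_path, "LISTSTATUS", user)
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)["FileStatuses"]["FileStatus"]


# Example call (assumed values: a node IP reachable from outside the cluster
# and whatever NodePort was assigned to the httpfs Service):
#   for entry in list_status("10.0.0.5", 30014, "/tmp", "hadoop"):
#       print(entry["pathSuffix"], entry["type"])
```

Because HttpFS proxies the data itself, the client only needs to reach that single endpoint, which sidesteps the internal DataNode hostnames.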

imtiny commented Jun 8, 2021

The current HDFS chart seems to support only a choice of whether to enable the Ingress or not.

I tried to upload a large file (4 GB) to HDFS through the nginx Ingress to httpfs, but the client always raised exceptions...

I spent a lot of time configuring the nginx Ingress to support large file uploads and transfer-encoding: chunked, but failed.

Finally, I changed the Service in httpfs-svc.yaml to NodePort and successfully uploaded the large file.
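
In case it helps others, a large upload against the NodePort could be sketched roughly as follows. This is not the exact client used above, only an illustration with Python's standard library. It assumes HttpFS accepts a `PUT` with `op=CREATE&data=true` and `Content-Type: application/octet-stream`, as described in the Hadoop HttpFS documentation; the host, port, paths, user name, and the helper names `create_request_target` and `upload_file` are hypothetical.

```python
import http.client
import os
from urllib.parse import quote, urlencode


def create_request_target(hdfs_path, user, overwrite=False):
    """Build the request path and query for an HttpFS CREATE call (hypothetical helper)."""
    query = urlencode({
        "op": "CREATE",
        "data": "true",  # tells HttpFS the request body carries the file data
        "user.name": user,
        "overwrite": str(overwrite).lower(),
    })
    return f"/webhdfs/v1{quote(hdfs_path)}?{query}"


def upload_file(host, port, local_path, hdfs_path, user):
    """Stream a local file to HDFS through HttpFS without loading it into memory."""
    headers = {
        "Content-Type": "application/octet-stream",
        # An explicit length keeps http.client from falling back to chunked encoding.
        "Content-Length": str(os.path.getsize(local_path)),
    }
    conn = http.client.HTTPConnection(host, port, timeout=600)
    try:
        with open(local_path, "rb") as body:
            # http.client reads the file object block by block while sending.
            conn.request("PUT", create_request_target(hdfs_path, user),
                         body=body, headers=headers)
            resp = conn.getresponse()
            resp.read()
            return resp.status
    finally:
        conn.close()


# Example call (assumed values):
#   status = upload_file("10.0.0.5", 30014, "big.bin", "/data/big.bin", "hadoop")
```

Sending an explicit `Content-Length` rather than a chunked body is a design choice in this sketch; whether it would also have avoided the Ingress failures described above is untested.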

@cgiraldo cgiraldo closed this as completed Sep 6, 2021