Error when running gravitino-fileset-example.ipynb #76

Open
shaofengshi opened this issue Sep 19, 2024 · 0 comments

Start the playground Docker images, then open Jupyter and run gravitino-fileset-example.ipynb.

In the second step, I got this error:

from hdfs import InsecureClient

# Create a HDFS connector client
hdfs_client = InsecureClient('http://hive:50070', user='root')

# List HDFS file and directories
print(hdfs_client.list('/user/datastrato'))


HdfsError                                 Traceback (most recent call last)
Cell In[2], line 7
      4 hdfs_client = InsecureClient('http://hive:50070/', user='root')
      6 # List HDFS file and directories
----> 7 print(hdfs_client.list('/user/datastrato'))
      9 # hdfs_client.delete("/user/datastrato")

File /opt/conda/lib/python3.11/site-packages/hdfs/client.py:1118, in Client.list(self, hdfs_path, status)
   1116 _logger.info('Listing %r.', hdfs_path)
   1117 hdfs_path = self.resolve(hdfs_path)
-> 1118 statuses = self._list_status(hdfs_path).json()['FileStatuses']['FileStatus']
   1119 if len(statuses) == 1 and (
   1120   not statuses[0]['pathSuffix'] or self.status(hdfs_path)['type'] == 'FILE'
   1121   # HttpFS behaves incorrectly here, we sometimes need an extra call to
   1122   # make sure we always identify if we are dealing with a file.
   1123 ):
   1124   raise HdfsError('%r is not a directory.', hdfs_path)

File /opt/conda/lib/python3.11/site-packages/hdfs/client.py:118, in _Request.to_method.<locals>.api_handler(client, hdfs_path, data, strict, **params)
    116   if err.exception not in ('RetriableException', 'StandbyException'):
    117     if strict:
--> 118       raise err
    119     return res
    121 attempted_hosts.add(host)

HdfsError: File /user/datastrato does not exist.

Then I looked into HDFS: there is no "/user/datastrato" folder. Under "/user" there is only "/user/hive".

root@3c98b46a73b9:/# hadoop fs -ls /user
OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=256m; support was removed in 8.0
24/09/19 06:50:14 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 1 items
drwxr-xr-x - root hdfs 0 2024-09-19 06:32 /user/hive
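
A possible stopgap for the notebook cell, sketched under assumptions rather than offered as the fix: check whether the directory exists and create it before listing. The helper name `list_or_create` is hypothetical. It relies only on the `status(path, strict=False)`, `makedirs(path)` and `list(path)` methods of the `hdfs` client used above. `FakeClient` is a made-up in-memory stand-in so the sketch runs without a cluster. This only avoids the HdfsError; it does not explain why the playground did not create "/user/datastrato".

```python
def list_or_create(client, path):
    """List `path`, creating the directory first if it is missing."""
    # With strict=False, status() returns None for a missing path
    # instead of raising HdfsError.
    if client.status(path, strict=False) is None:
        client.makedirs(path)
    return client.list(path)


class FakeClient:
    """Hypothetical in-memory stand-in for hdfs.InsecureClient (demo only)."""

    def __init__(self, dirs):
        self.dirs = set(dirs)

    def status(self, path, strict=True):
        if path in self.dirs:
            return {"type": "DIRECTORY"}
        if strict:
            raise FileNotFoundError(path)
        return None

    def makedirs(self, path):
        self.dirs.add(path)

    def list(self, path):
        # Return the names of direct children only.
        prefix = path.rstrip("/") + "/"
        children = (d[len(prefix):] for d in self.dirs if d.startswith(prefix))
        return sorted(c for c in children if "/" not in c)


# Mirrors the state reported above: only /user/hive exists under /user.
fake = FakeClient({"/user", "/user/hive"})
print(list_or_create(fake, "/user/datastrato"))  # empty list: directory was just created
print(list_or_create(fake, "/user"))
```

In the notebook, the call would presumably be `list_or_create(hdfs_client, '/user/datastrato')` with the `hdfs_client` created in the failing cell.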
