File file:/user/hive/warehouse/helloworld does not exist #32

Closed
CloudMarc opened this issue Aug 5, 2020 · 2 comments
CloudMarc commented Aug 5, 2020

Environment:

  • GKE
  • Helm 3
  • Jupyter server

Steps to reproduce:

  1. helm install gradiant/hive --generate-name
  2. Get the IP address of the Hive server (Thrift)
  3. Run this simple example code: https://saagie.zendesk.com/hc/en-us/articles/360007829439-Read-Write-from-Hive
from impala.dbapi import connect
from impala.util import as_pandas
import pandas as pd
import os

# Connection
conn = connect(host=os.environ['IP_HIVE'], port=10000, user=os.environ['USER'],
               password=os.environ['PASSWORD'], auth_mechanism='PLAIN')

# Writing to a Hive table
cursor = conn.cursor()
cursor.execute('CREATE TABLE default.helloworld (hello STRING,world STRING)')
cursor.execute("insert into default.helloworld values ('hello1','world1')")

Expected: values are inserted

Actual:

  • Table is created
```
df = as_pandas(cursor)
print(df.head())
```
```
     tab_name
0  helloworld
```

  • Insert fails
	at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:380)
	at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:257)
	at org.apache.hive.service.cli.operation.SQLOperation.access$800(SQLOperation.java:91)
	at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:348)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746)
	at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:362)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.FileNotFoundException: File file:/user/hive/warehouse/helloworld2 does not exist
	at org.apache.hadoop.hive.ql.metadata.Hive.copyFiles(Hive.java:2886)
	at org.apache.hadoop.hive.ql.metadata.Hive.copyFiles(Hive.java:3297)
	at org.apache.hadoop.hive.ql.metadata.Hive.loadTable(Hive.java:2022)
	at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:360)
	at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:199)
	at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
	at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2183)
	at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1839)
	at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1526)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1237)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1232)
	at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:255)
	... 11 more
Caused by: java.io.FileNotFoundException: File file:/user/hive/warehouse/helloworld2 does not exist
	at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:611)
	at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:824)
	at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:601)
	at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:428)
	at org.apache.hadoop.hive.io.HdfsUtils$HadoopFileStatus.<init>(HdfsUtils.java:211)
	at org.apache.hadoop.hive.ql.metadata.Hive.copyFiles(Hive.java:2884)
	... 22 more
@cgiraldo (Member)

I can reproduce the problem in my local environment with microk8s.

It looks like a permissions problem for the hive user on HDFS files, introduced when the Hive container was changed to run as a non-root user.

I will try to fix it ASAP.

@cgiraldo (Member)

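It is a misconfiguration in hive-metastore: the warehouse directory resolves to the local filesystem (file:/user/hive/warehouse) instead of HDFS.

You can fix it manually by setting this configuration in your values.yaml: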

metastore:
  conf:
    hiveSite:
      hive.metastore.warehouse.dir: hdfs://$HDFS_SERVICE_NAME:8020/user/hive/warehouse

We will publish a fix in the coming days that makes this value the default in the hive-metastore chart.
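
One way to confirm that the override took effect is to check the URI scheme of the table location. The sketch below is only an illustration: the helper names `location_scheme` and `is_on_hdfs` and the namenode host `my-hdfs-namenode` are hypothetical and not part of the chart or of impyla. It shows that the path from the stack trace resolves to the local filesystem, while a location under the overridden warehouse directory resolves to HDFS.

```python
from urllib.parse import urlparse


def location_scheme(location: str) -> str:
    """Return the filesystem scheme of a Hive location URI ('' if none)."""
    return urlparse(location).scheme


def is_on_hdfs(location: str) -> bool:
    """True when the location points at HDFS rather than the local filesystem."""
    return location_scheme(location) == "hdfs"


# Path taken from the error message: it uses the local 'file' scheme.
print(is_on_hdfs("file:/user/hive/warehouse/helloworld"))

# Hypothetical location after the fix; 'my-hdfs-namenode' is an assumed
# placeholder for whatever $HDFS_SERVICE_NAME expands to in your release.
print(is_on_hdfs("hdfs://my-hdfs-namenode:8020/user/hive/warehouse/helloworld"))
```

The location string itself can be read from the `Location` row returned by `DESCRIBE FORMATTED default.helloworld`, run through the same cursor as in the reproduction steps.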

imtiny pushed a commit to student-drawing/charts that referenced this issue Jul 6, 2021