Blaze's query results don't match the spark query results #789
Comments
Correction to the bug description: the problem is not that the query returns incorrect results, but that the data cannot be found at all. After adding an id condition to the query, no matter |
orc schema:
driver log:
Physical Plan
|
We found a bug in our environment: Blaze's query results are inconsistent with Spark's query results. In Spark, the result of the `label` query is `0.00`, while in Blaze it is `null`, which causes deviations during model calculation. Querying this table on its own on our data platform also reproduces the problem. Here are our table creation statements and query statements. The table creation statement is the output of the `show create table xxx` command run in beeline.

As a side note, when this data is written to another ORC-formatted table, Blaze queries it correctly. The original table was created a long time ago through beeline, and its data was most likely written with Spark 2, whereas we are now using Spark 3.
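To narrow down which rows are affected, the two result sets can be compared by id once they have been exported from each engine. The following is only a minimal sketch in plain Python, not part of Blaze or Spark: the function name `diff_labels`, the `(id, label)` row shape, the assumption that `label` is a decimal column, and the sample rows are all hypothetical and made up for illustration.

```python
from decimal import Decimal
from typing import List, Optional, Tuple

# Hypothetical row shape: (id, label). A label of None stands for SQL NULL.
Row = Tuple[int, Optional[Decimal]]


def diff_labels(spark_rows: List[Row], blaze_rows: List[Row]) -> List[Tuple[int, Optional[Decimal], Optional[Decimal]]]:
    """Return (id, spark_label, blaze_label) for every id whose label differs.

    An id that is missing from one side is treated the same as a NULL label
    on that side, so missing rows also show up as mismatches.
    """
    spark_by_id = dict(spark_rows)
    blaze_by_id = dict(blaze_rows)
    mismatches = []
    for row_id in sorted(spark_by_id.keys() | blaze_by_id.keys()):
        spark_label = spark_by_id.get(row_id)
        blaze_label = blaze_by_id.get(row_id)
        if spark_label != blaze_label:
            mismatches.append((row_id, spark_label, blaze_label))
    return mismatches


# Made-up sample data mirroring the symptom described above:
# Spark returns 0.00 where Blaze returns NULL.
spark_rows = [(1, Decimal("0.00")), (2, Decimal("1.50"))]
blaze_rows = [(1, None), (2, Decimal("1.50"))]

mismatches = diff_labels(spark_rows, blaze_rows)
for row_id, spark_label, blaze_label in mismatches:
    print(f"id={row_id}: spark={spark_label} blaze={blaze_label}")
```

Running the same comparison against the rewritten ORC table, where Blaze reportedly returns correct results, would be expected to produce an empty list, which would help confirm that the difference is tied to the files of the original table.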