The MR execution engine does not seem to provide a reliable value for the `hive.io.file.readcolumn.names` property when multiple tables are read in the same query. As a result, we can't properly support column pruning under MR and have to select all the columns (i.e. `SELECT *`).
This is unfortunately quite inefficient. Tez, however, does not have that issue.
See more info here: https://lists.apache.org/thread/g464zybq4g6c7p2h6nd9jmmznq472785
We need to investigate to see if we can come up with a workaround, or figure out how to get the subset of read columns from some property or variable.
Relevant part of the codebase here:
hive-bigquery-connector/connector/src/main/java/com/google/cloud/hive/bigquery/connector/input/BigQueryInputSplit.java
Lines 176 to 191 in af82dcc
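To make the trade-off concrete, here is a minimal sketch of the fallback described above: trust `hive.io.file.readcolumn.names` only when the engine is Tez, and otherwise read every column. This is not the connector's actual code. The class `ReadColumnsSketch` and the method `selectedColumns` are hypothetical, a plain `Map` stands in for the job configuration, and the sketch assumes the property holds a comma-separated list of column names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of the connector.
class ReadColumnsSketch {
    static final String READ_COLUMN_NAMES = "hive.io.file.readcolumn.names";
    static final String EXECUTION_ENGINE = "hive.execution.engine";

    /**
     * Returns the columns to read. The projection property is trusted only
     * under Tez; for any other engine (e.g. MR) or a missing/empty value,
     * all columns are returned, which is equivalent to SELECT *.
     */
    static List<String> selectedColumns(Map<String, String> conf, List<String> allColumns) {
        String engine = conf.get(EXECUTION_ENGINE);
        String raw = conf.get(READ_COLUMN_NAMES);
        boolean trustProjection = "tez".equalsIgnoreCase(engine);
        if (!trustProjection || raw == null || raw.trim().isEmpty()) {
            return new ArrayList<>(allColumns);
        }
        List<String> selected = new ArrayList<>();
        for (String name : raw.split(",")) {
            String trimmed = name.trim();
            // Keep only names that actually exist in the table schema.
            if (!trimmed.isEmpty() && allColumns.contains(trimmed)) {
                selected.add(trimmed);
            }
        }
        // If nothing usable was listed, fall back to reading everything.
        return selected.isEmpty() ? new ArrayList<>(allColumns) : selected;
    }
}
```

A real workaround would need a way to tell, per table, which columns MR actually requires. The engine check above only avoids reading a value that is known to be unreliable.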