Support array_concat for Utf8View #14378
/// Returns the wider type among arguments `lhs` and `rhs`.
/// The wider type is the type that can safely represent values from both types
/// without information loss. Returns an Error if types are incompatible.
pub fn get_wider_type(lhs: &DataType, rhs: &DataType) -> Result<DataType> {
This was used in array_concat only to compute the return type, and array_concat errors at runtime anyway if you pass it arrays with different types.
Thus this code is unnecessary, so I propose removing it.
> select array_concat([arrow_cast('1', 'LargeUtf8')], ['2']);
Arrow error: Invalid argument error: It is not possible to concatenate arrays of different data types.
The best optimization .. no code
    arg_type.clone()
} else if !expr_type.equals_datatype(arg_type) {
    return plan_err!(
        "It is not possible to concatenate arrays of different types. Expected: {}, got: {}", expr_type, arg_type
The implementation of array_concat requires the field types of the ListArray inputs to be the same and errors at runtime if they aren't.
Previously the return type was calculated only to then hit a runtime error. I simply moved the check to planning time.
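For readers following along, the planning-time check described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in this PR: `ElemType` is a hypothetical, simplified stand-in for arrow's `DataType`, `concat_return_type` is an assumed name, and plain `String` errors stand in for `plan_err!`.

```rust
// Hypothetical, simplified stand-in for arrow's DataType (assumption, not the real API).
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum ElemType {
    Null,
    Int64,
    Utf8,
    LargeUtf8,
    Utf8View,
    List(Box<ElemType>),
}

/// Sketch of a return-type computation that rejects mismatched list types
/// while planning, instead of leaving the failure to the runtime concat.
/// The function name and the String error type are assumptions for illustration.
fn concat_return_type(arg_types: &[ElemType]) -> Result<ElemType, String> {
    let mut expr_type = ElemType::Null;
    for arg_type in arg_types {
        // Only list arguments are accepted in this sketch.
        if !matches!(arg_type, ElemType::List(_)) {
            return Err(format!("expected a list argument, got: {:?}", arg_type));
        }
        if expr_type == ElemType::Null {
            // The first list argument fixes the expected type.
            expr_type = arg_type.clone();
        } else if &expr_type != arg_type {
            // Every later argument must match it exactly; no widening is attempted.
            return Err(format!(
                "It is not possible to concatenate arrays of different types. Expected: {:?}, got: {:?}",
                expr_type, arg_type
            ));
        }
    }
    Ok(expr_type)
}
```

With this shape, two `List(Utf8View)` arguments yield `List(Utf8View)`, while `List(Utf8View)` followed by `List(Utf8)` is rejected before execution starts.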
----
[1, 2, 3]

# Concatenating stringview
these three queries work fine on main
----
[1, 2, 3]

# Concatenating Mixed types (doesn't work)
Both of these queries fail on main, though the error for the string view one is different (see the first commit of this PR).
It seems to me like this display is 🤮 due to the Display impl of DataType being crap.
I'll file an upstream ticket to make this easier to understand.
👍🏻
Thank you for the review @jayzhan211 and @Omega359
Which issue does this PR close?
45.0.0 #14008
Rationale for this change
As part of completing #13504 it is important to support Utf8View in array_concat.
It turns out the only user of get_wider_type is array_concat, as noted by @jayzhan211 and @Omega359 in #13370 (comment), and in fact get_wider_type is not necessary.
What changes are included in this PR?
Tests for array_concat with different string types (see first commit)
Removal of get_wider_type
Are these changes tested?
Yes, new tests
Are there any user-facing changes?
Error messages change slightly