Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove implicit interval type coercion from ScalarValue comparison #7514

Merged

Conversation

tustvold
Copy link
Contributor

@tustvold tustvold commented Sep 9, 2023

Which issue does this PR close?

Part of #7353

Rationale for this change

We have explicit type coercion machinery during plan construction that handles this, doing this implicitly within ScalarValue introduces inconsistency and is unnecessary. I am also not sure the type coercion logic is correct, it is certainly not consistent with the arrow-cast kernel.

Potentially more critically this definition of PartialEq is inconsistent with the definition of Hash, which could cause some very strange behaviour

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

@@ -377,28 +351,10 @@ impl PartialOrd for ScalarValue {
}
(TimestampNanosecond(_, _), _) => None,
(IntervalYearMonth(v1), IntervalYearMonth(v2)) => v1.partial_cmp(v2),
(IntervalYearMonth(v1), IntervalDayTime(v2)) => {
ym_to_milli(v1).partial_cmp(&dt_to_milli(v2))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is simply wrong as it makes gross assumptions about the number of days in months, etc...

@@ -431,122 +387,6 @@ impl PartialOrd for ScalarValue {
}
}

/// This function computes the duration (in milliseconds) of the given
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These functions are somewhat misleading as it is not clearly conveyed that they are a gross approximation, assuming certain numbers of days in months, etc... Given these don't appear to be used elsewhere, I think the best option is to remove them

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you -- this makes a lot of sense to me. I think some of this was left over from when the arrow cast kernels didn't support all the different types of coercion

cc @berkaysynnada

@tustvold tustvold added the api change Changes the API exposed to users of the crate label Sep 9, 2023
@tustvold tustvold merged commit 7b9bb08 into apache:main Sep 9, 2023
@berkaysynnada
Copy link
Contributor

Thank you -- this makes a lot of sense to me. I think some of this was left over from when the arrow cast kernels didn't support all the different types of coercion

cc @berkaysynnada

Yes, you are right. We did this before arrow kernels, when I didn't fully know the purpose of Interval type. I can now say that they were erroneous and they are unnecessary at the moment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api change Changes the API exposed to users of the crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants