Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: add support for Decimal128 and Decimal256 types in interval arithmetic #14126

Merged
merged 8 commits into from
Jan 25, 2025

Conversation

waynexia
Copy link
Member

@waynexia waynexia commented Jan 14, 2025

…hmetic

Which issue does this PR close?

Closes #14124

Rationale for this change

Fixes the panic when decimal computation overflows

What changes are included in this PR?

This patch contains two fixes:

  • fix the logic of checking if a value is negative.
  • implement get_extreme_value for decimal 128 and 256 types

Are these changes tested?

adding test

Are there any user-facing changes?

@waynexia waynexia changed the title fix: add support for Decimal128 and Decimal256 types in interval arit… fix: add support for Decimal128 and Decimal256 types in interval arithmetic Jan 14, 2025
@waynexia waynexia requested a review from Copilot January 14, 2025 13:12

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot reviewed 1 out of 1 changed files in this pull request and generated no comments.

Signed-off-by: Ruihang Xia <[email protected]>
Copy link
Contributor

@comphead comphead left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @waynexia
I feed we can also add the initial problematic query into .slt file?

SELECT ('0.54321543215432154321543215432154321'::DECIMAL(35,35) + 10000)::VARCHAR

Signed-off-by: Ruihang Xia <[email protected]>
@github-actions github-actions bot added the sqllogictest SQL Logic Tests (.slt) label Jan 15, 2025
@waynexia
Copy link
Member Author

thanks @waynexia I feed we can also add the initial problematic query into .slt file?

SELECT ('0.54321543215432154321543215432154321'::DECIMAL(35,35) + 10000)::VARCHAR

Done 👍, added one in select.slt

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the bug fix @waynexia

I am not sure about the code for min/max extremes however

@@ -76,6 +76,14 @@ macro_rules! get_extreme_value {
DataType::Interval(IntervalUnit::MonthDayNano) => {
ScalarValue::IntervalMonthDayNano(Some(IntervalMonthDayNano::$extreme))
}
DataType::Decimal128(precision, scale) => {
ScalarValue::Decimal128(Some(i128::$extreme), *precision, *scale)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this correct for min/max of i128? It seems like the minimum value of Decimal128(precision, scale) would be the minimum value for the precision and scale separately rather than the min value of the overall i128 🤔

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh, that makes sense. These huge arrays remind me it should be all of 9s...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sadly the decimal 256 version is not public 😢 let me find a workaround

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe get_extreme_value should be part of arrow?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sadly the decimal 256 version is not public 😢 let me find a workaround

I recommend

  1. copy/pasting the value into DataFusion for now
  2. File a ticket in arrow-rs to make it public (or maybe even a PR!)
  3. leave a comment next to the copy in DataFUsion referencing the upstream ticket

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I filed a PR in arrow to expose these constants:

evenyag added a commit to GreptimeTeam/greptimedb that referenced this pull request Jan 16, 2025
@alamb alamb marked this pull request as draft January 17, 2025 15:22
@alamb alamb marked this pull request as ready for review January 17, 2025 15:23
github-merge-queue bot pushed a commit to GreptimeTeam/greptimedb that referenced this pull request Jan 23, 2025
* change dep

Signed-off-by: Ruihang Xia <[email protected]>

* feat: adapt to arrow's interval array

* chore: fix compile errors in datatypes crate

* chore: fix api crate compiler errors

* chore: fix compiler errors in common-grpc

* chore: fix common-datasource errors

* chore: fix deprecated code in common-datasource

* fix promql and physical plan related

Signed-off-by: Ruihang Xia <[email protected]>

* wip: upgrading network deps

Signed-off-by: Ruihang Xia <[email protected]>

* block on updating `sqlparser`

* upgrade sqlparser

Signed-off-by: Ruihang Xia <[email protected]>

* adapt new df's trait requirements

Signed-off-by: Ruihang Xia <[email protected]>

* chore: fix compiler errors in mito2

* chore: fix common-function crate errors

* chore: fix catalog errors

* change import path

Signed-off-by: Ruihang Xia <[email protected]>

* chore: fix some errors in query crate

* chore: fix some errors in query crate

* aggr expr and some other tiny fixes

Signed-off-by: Ruihang Xia <[email protected]>

* chore: fix expr related errors in query crate

* chore: fix query serializer and admin command

* chore: fix grpc services

* feat: axum serve

* chore: fix http server

* remove handle_error handler
* refactor timeout layer
* serve axum

* chore: fix flow aggr functions

* chore: fix flow

* feat: fix errors in meta-srv

* boxed()
* use TokioIo

* feat!: Remove script crate and python feature (#5321)

* feat: exclude script crate

* chore: simplify feature

* feat: remove the script crate

* chore: remove python feature and some comments

* chore: fix warning

* chore: fix servers tests compiler errors

* feat: fix tests-integration errors

* chore: fix unused

* test: fix catalog test

* chore: fix compiler errors for crates using common-meta

testing feature is enabled when check with --workspace

* test: use display for logical plan test

* test: implement rewrite for ScanHintRule

* fix: http server build panic

* test: fix mito test

* fix: sql parser type alias error

* test: fix TestClient not listen

* test: some flow tests

* test(flow): more fix

* fix: test_otlp_logs

* test: fix promql test that using deprecated method fun()

* fix: sql type replace supports Int8 ~ Int64, UInt8 ~ UInt64

* test: fix infer schema test case

* test: fix tests related to plan display

* chore: fix last flow test

* test: fix function format related assertion

* test: use larger port range for tests

* fix: test_otlp_traces

* fix: test_otlp_metrics

* fix range query and dist plan

Signed-off-by: Ruihang Xia <[email protected]>

* fix: flow handle distinct use deprecated field

* fix: can't pass Join plan expressions to LogicalPlan::with_new_exprs

* test: fix deserialize test

* test: reduce split key case num

* tests: lower case aggr func name

* test: fix some sqlness tests

* tests: more sqlness fix

* tests: fixed sqlness test

* commit non-bug changes

Signed-off-by: Ruihang Xia <[email protected]>

* fix: make our udf correct

* fix: implement empty methods of ContextProvider for DfContextProviderAdapter

* test: update sqlness test result

* chore: remove unused

* fix: provide alias name for AggregateExprBuilder in range plan

* test: update range query result

* fix: implement missing ContextProvider methods for DfContextProviderAdapter

* test: update timestamps, cte result

* fix: supports empty projection in mito

* test: update comment for cte test

* fix: support projection for numbers

* test: update test cases after projection fix

* fix: fix range select first_value/last_value

* fix: handle CAST and time index conflict

* fix: handle order by correctly in range first_value/last_value

* test: update sqlness result

* test: update view test result

* test: update decimal test

wait for apache/datafusion#14126 to fix this

* feat: remove redundant physical optimization

todo(ruihang): Check if we can remove this.

* test: update sqlness test result

* chore: range select default sort use nulls_first = false

* test: update filter push down test result

* test: comment deciaml test to avoid different panic message

* test: update some distributed test result

* test: update test for distributed count and filter push down

* test: update subqueries test

* fix: SessionState may overwrite our UDFs

* chore: fix compiler errors after merging main

* fix: fix elasticsearch and dashboard router panic

* chore: fix common-functions tests

* chore: update sqlness result

* test: fix id keyword and update sqlness result

* test: fix flow_null test

* fix: enlarge thread size in debug mode to avoid overflow

* chore: fix warnings in common-function

* chore: fix warning in flow

* chore: fix warnings in query crate

* chore: remove unused warnings

* chore: fix deprecated warnings for parquet

* chore: fix deprecated warning in servers crate

* style: fix clippy

* test: enlarge mito cache tttl test ttl time

* chore: fix typo

* style: fmt toml

* refactor: reimplement PartialOrd for RangeSelect

* chore: remove script crate files introduced by merge

* fix: return error if sql option is not kv

* chore: do not use ..default::default()

* chore: per review

* chore: update error message in BuildAdminFunctionArgsSnafu

Co-authored-by: jeremyhi <[email protected]>

* refactor: typed precision

* update sqlness view case

Signed-off-by: Ruihang Xia <[email protected]>

* chore: flow per review

* chore: add example in comment

* chore: warn if parquet stats of timestamp is not INT64

* style: add a newline before derive to make the comment more clear

* test: update sqlness result

* fix: flow from substrait

* chore: change update_range_context log to debug level

* chore: move axum-extra axum-macros to workspace

---------

Signed-off-by: Ruihang Xia <[email protected]>
Co-authored-by: Ruihang Xia <[email protected]>
Co-authored-by: luofucong <[email protected]>
Co-authored-by: discord9 <[email protected]>
Co-authored-by: shuiyisong <[email protected]>
Co-authored-by: jeremyhi <[email protected]>
Signed-off-by: Ruihang Xia <[email protected]>
@waynexia
Copy link
Member Author

waynexia commented Jan 23, 2025

Hi @alamb, I've copied constant vectors from arrow, please take another look. A new dep (paste) is imported to concat variable name.

Signed-off-by: Ruihang Xia <[email protected]>
Signed-off-by: Ruihang Xia <[email protected]>
Signed-off-by: Ruihang Xia <[email protected]>
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @waynexia ❤️ -- looks great to me

@waynexia waynexia merged commit 1dad545 into apache:main Jan 25, 2025
27 checks passed
@waynexia waynexia deleted the fix-14124 branch January 25, 2025 13:24
@waynexia
Copy link
Member Author

Thank you @comphead and @alamb for reviewing!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sqllogictest SQL Logic Tests (.slt)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Regression] Panic when handling Decimal128 overflow
3 participants