-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Decimal multiply kernel should not cause precision loss #5675
Conversation
There is more broader issue on type coercion. I'm working on this and will update. |
e5e3bcb
to
b108159
Compare
b108159
to
9ba3285
Compare
rust-decimal only supports precision <= 28... |
@@ -125,7 +125,7 @@ select | |||
sum(l_quantity) as sum_qty, | |||
sum(l_extendedprice) as sum_base_price, | |||
sum(l_extendedprice * (1 - l_discount)) as sum_disc_price, | |||
sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge, | |||
sum(cast(l_extendedprice as decimal(12,2)) * (1 - l_discount) * (1 + l_tax)) as sum_charge, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This decimal calculation actually causes overflow on decimal multiplication because the kernel DataFusion uses doesn't allow precision loss. Previously it accidentally works because we implicitly truncate all input decimal value before multiplying...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I add a cast to get the test result unchanged. To get rid of this cast
later, we can use new multiply kernel at arrow-rs which allows precision-loss.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
perhaps we can add a comment here referencing the new arrow-rs kernel / work. Otherwise this context will likely get forgotten
big_decimal_to_str( | ||
BigDecimal::from_str(&Decimal::from_i128_with_scale(value, scale).to_string()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rust-decimal
doesn't allow precision > 28. We can simply use Decimal128Type
from arrow-rs to get string of decimal.
32e5dfc
to
e727f5a
Compare
cast(cast(sum(case | ||
when nation = 'BRAZIL' then volume | ||
else 0 | ||
end) / sum(volume) as mkt_share | ||
end) as decimal(12,2)) / cast(sum(volume) as decimal(12,2)) as decimal(15,2)) as mkt_share |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Current decimal divide kernel fails to compute this. The result decimal type is 38 scale. While divide_dyn_opt_decimal
multiplies left
array with 10.pow(scale)
, obviously it overflows because the multiply kernel doesn't allow precision-loss multiplication.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With inner cast
s wrapping two sum
, we can use a result scale which won't cause overflow. The result decimal is correct.
But the answer file is fixed as decimal(15, 2) so the result still fails the check. I only can add another cast
to make it as same decimal type to pass this result check.
b1292ba
to
fb9a27a
Compare
More changes than I expected, but finally making DataFusion all tests happy... |
I'll try and find time to review this PR tomorrow |
cc @liukun4515 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @viirya
I took a look through this code -- the coercion rule changes make sense to me. However, I don't undertand the need for Expr::PromotePrecision
or the new data_type field on Expr. They don't seem to be related to improving the coercion for decimals
@@ -385,6 +408,20 @@ impl Cast { | |||
} | |||
} | |||
|
|||
/// Cast expression |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we please add documentation about what PromotePrecision
does and in what circumstances it is needed?
@@ -125,7 +125,7 @@ select | |||
sum(l_quantity) as sum_qty, | |||
sum(l_extendedprice) as sum_base_price, | |||
sum(l_extendedprice * (1 - l_discount)) as sum_disc_price, | |||
sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge, | |||
sum(cast(l_extendedprice as decimal(12,2)) * (1 - l_discount) * (1 + l_tax)) as sum_charge, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
perhaps we can add a comment here referencing the new arrow-rs kernel / work. Otherwise this context will likely get forgotten
@@ -39,6 +40,13 @@ pub trait ExprSchemable { | |||
|
|||
/// cast to a type with respect to a schema | |||
fn cast_to<S: ExprSchema>(self, cast_to_type: &DataType, schema: &S) -> Result<Expr>; | |||
|
|||
/// promote to a type with respect to a schema |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please add some comments to clarify what the difference between "promote" and "cast" is?
@@ -234,12 +236,33 @@ pub struct BinaryExpr { | |||
pub op: Operator, | |||
/// Right-hand side of the expression | |||
pub right: Box<Expr>, | |||
/// The data type of the expression, if known |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand the reason for this change -- is it no longer possible to calculate the output type of a BinaryExpr
from its inputs?
If the desired output type is different then couldn't we cast
the expr to the type?
The coercion rule that modifies sides of arithmetic op is not idempotent. Multiple runs of the rule will change it to incorrect result. So we need something to prevent the rule on coerced sides. For the new |
This makes sense -- thank you My biggest concern with the approach in this PR is that it adds some special case for decimal (which seems reasonable in itself) but that special case bleeds out over the rest of the code / plans, which is probably why it got so big. I wonder would it be possible to use Perhaps something like the following. /// The target decimal type the expression should be output to
struct PromotePrecision {
expr: Box<Expr>,
precision: u8,
scale: u8,
} One of the downsides of this approach is that PromotePrecision is now clearly special purpose for Decimal. I am not sure if it would have other uses in the future 🤔 |
Keeping precision/scale in |
I think then it would be best to keep the data type in |
Thanks for your mention I will take a look it carefully tomorrow. @jackwener has moved the type coercion to the analysis phase. Maybe we have some same comments about the rule of type coercion. |
Yea, seems I need to rebase this. |
Does this new |
It is for Decimal type only. Yea, the idea is from Spark. It was removed in latest Spark as the precision promotion is moved to where the computation happens. I was not sure if moving it into decimal kernels is a good idea for DataFusion. Let me re-think about it and maybe refactor this. |
marking as draft while it gets worked on |
Thanks @alamb. I'll find some time on this. :) |
This new approach is proposed at #5980. |
Which issue does this PR close?
Closes #5674.
Rationale for this change
Currently decimal multiplication in DataFusion silently truncates precision of result. It happens generally for regular decimal multiplication which doesn't overflow. Looks like DataFusion uses incomplete decimal precision coercion rule from Spark to coerce sides of decimal multiplication (and other arithmetic operators). The coerced type on two sides of decimal multiplication is not the resulting decimal type of multiplication. This (and how we computes decimal multiplication in the kernels) leads to truncated precision in the result decimal type.
What changes are included in this PR?
This patch updates type coercion rule for decimal arithmetic operators. To prevent the type coercion rule applied on sides of decimal operators, we need one expression node to wrap coerced sides.
Are these changes tested?
Are there any user-facing changes?