chore: upgrade to DataFusion 46.0.0 (WORK IN PROGRESS) #3261
ACTION NEEDED: delta-rs follows the Conventional Commits specification for release automation. The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.
crates/sql/src/planner.rs
@@ -44,6 +44,7 @@ impl<'a, S: ContextProvider> DeltaSqlToRel<'a, S> {
  enable_ident_normalization: self.options.enable_ident_normalization,
  support_varchar_with_length: false,
  enable_options_value_normalization: false,
+ collect_spans: false,
What are spans?
The source code locations for expressions:
https://docs.rs/sqlparser/latest/sqlparser/tokenizer/struct.Span.html
We are starting to gather this information and plumb it into DataFusion. Setting `collect_spans: false` means the SQL planner won't try to pass the span information along. For now, spans are only used for debug / help messages.
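To make the idea of a span concrete, here is a minimal sketch of how a source location range can drive a help message. `Location`, `Span`, and `underline` are hypothetical stand-ins written for this illustration, not the sqlparser or DataFusion definitions, and the inclusive end column is an assumption of the sketch.

```rust
/// Simplified stand-in for a source location (1-based line and column).
/// Hypothetical type for illustration; not the sqlparser definition.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Location {
    line: usize,
    column: usize,
}

/// Simplified stand-in for a span: where an expression starts and ends.
/// Assumption of this sketch: `end.column` is inclusive.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Span {
    start: Location,
    end: Location,
}

/// Render a help message that underlines a single-line span in `sql`.
/// Returns `None` if the span crosses lines or falls outside the input.
fn underline(sql: &str, span: Span, message: &str) -> Option<String> {
    if span.start.line != span.end.line
        || span.start.column == 0
        || span.end.column < span.start.column
    {
        return None;
    }
    // Lines are 1-based, so line 0 is rejected by `checked_sub`.
    let line = sql.lines().nth(span.start.line.checked_sub(1)?)?;
    if span.end.column > line.chars().count() {
        return None;
    }
    let padding = " ".repeat(span.start.column - 1);
    let carets = "^".repeat(span.end.column - span.start.column + 1);
    Some(format!("{}\n{}{} {}", line, padding, carets, message))
}
```

Calling `underline` with the span of `colx` in `SELECT colx FROM t` yields the SQL line followed by a row of carets under the identifier. If the planner does not collect spans, there is no location to attach and only the bare message remains.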
Ah ok thanks for explaining :) I learned something new about datafusion!
FWIW this is a new feature that was starting to be added in DataFusion 46
Looks like from CI we also need to update Rust to 1.82: https://github.com/delta-io/delta-rs/actions/runs/13506223447/job/37736308124?pr=3261
I have a few cleanup PRs I plan to make as I work on this. Here is the first one:
I also started breaking this PR up into some smaller ones:
  #[derive(Default)]
- struct ParquetPredicateVisitor {
+ struct ParquetVisitor {
I had to change how the two visitors find the parquet scan information, and I combined them at the same time, since they were mostly boilerplate copy/paste.
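As a rough illustration of merging two near-identical visitors into one, the sketch below walks a plan tree once and records two pieces of scan information together. `PlanNode` and its fields are hypothetical and far simpler than DataFusion's plan types, and what exactly the real visitors collect is assumed here (file names and predicates).

```rust
/// Hypothetical, heavily simplified plan tree; not DataFusion's plan types.
enum PlanNode {
    ParquetScan {
        files: Vec<String>,
        predicate: Option<String>,
    },
    Filter {
        input: Box<PlanNode>,
    },
    Union {
        inputs: Vec<PlanNode>,
    },
}

/// One visitor that gathers everything of interest from parquet scans,
/// instead of two visitors that each repeat the same traversal code.
#[derive(Default, Debug)]
struct ParquetVisitor {
    files: Vec<String>,
    predicates: Vec<String>,
}

impl ParquetVisitor {
    /// Depth-first walk; scan details are recorded in visit order.
    fn visit(&mut self, node: &PlanNode) {
        match node {
            PlanNode::ParquetScan { files, predicate } => {
                self.files.extend(files.iter().cloned());
                if let Some(p) = predicate {
                    self.predicates.push(p.clone());
                }
            }
            PlanNode::Filter { input } => self.visit(input),
            PlanNode::Union { inputs } => {
                for child in inputs {
                    self.visit(child);
                }
            }
        }
    }
}
```

With a single visitor, a change to how scans are located only has to be made in one `match` arm rather than in two copies.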
I filed a ticket in datafusion explaining the current test failures
d300241 to c9b1903
Signed-off-by: Andrew Lamb <[email protected]>
enable_options_value_normalization: false,
},
);
let parser_options = ParserOptions::new()
This uses the nice new API from @kosiew in
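For readers unfamiliar with the builder style that `ParserOptions::new()` starts, here is a self-contained sketch of the pattern. `SketchParserOptions`, its `with_*` methods, its defaults, and `delta_like_options` are hypothetical names invented for this illustration; only the field names and the `false` values come from the diffs in this PR, and the real DataFusion method names may differ.

```rust
/// Hypothetical stand-in mirroring the option fields shown in this PR's diff.
#[derive(Debug, Clone, PartialEq)]
struct SketchParserOptions {
    enable_ident_normalization: bool,
    support_varchar_with_length: bool,
    enable_options_value_normalization: bool,
    collect_spans: bool,
}

impl SketchParserOptions {
    /// Defaults here are an assumption of this sketch, not DataFusion's defaults.
    fn new() -> Self {
        Self {
            enable_ident_normalization: true,
            support_varchar_with_length: true,
            enable_options_value_normalization: false,
            collect_spans: false,
        }
    }

    // Each setter takes `self` by value and returns it, so calls can be chained.
    fn with_enable_ident_normalization(mut self, value: bool) -> Self {
        self.enable_ident_normalization = value;
        self
    }

    fn with_support_varchar_with_length(mut self, value: bool) -> Self {
        self.support_varchar_with_length = value;
        self
    }

    fn with_enable_options_value_normalization(mut self, value: bool) -> Self {
        self.enable_options_value_normalization = value;
        self
    }

    fn with_collect_spans(mut self, value: bool) -> Self {
        self.collect_spans = value;
        self
    }
}

/// Builds options with the same values as the struct literal in the diff.
fn delta_like_options(enable_ident_normalization: bool) -> SketchParserOptions {
    SketchParserOptions::new()
        .with_enable_ident_normalization(enable_ident_normalization)
        .with_support_varchar_with_length(false)
        .with_enable_options_value_normalization(false)
        .with_collect_spans(false)
}
```

The practical benefit over a struct literal is that adding a new field upstream (as happened with `collect_spans`) does not break callers, because unset fields fall back to the defaults in `new()`.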
Signed-off-by: Andrew Lamb <[email protected]>
@@ -20,7 +20,7 @@ jobs:
  uses: actions-rs/toolchain@v1
  with:
  profile: default
- toolchain: '1.81'
+ toolchain: '1.82'
DataFusion 46 requires Rust 1.82
You can see CI fail without these changes
https://github.com/delta-io/delta-rs/actions/runs/13591787541/job/38000195037?pr=3261
I don't know what the MSRV policy in delta is, so we probably can't merge this PR until it is ok to increase the MSRV in delta.
I think this PR will pass -- maybe someone could trigger the CI so I can show a clean run?
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #3261 +/- ##
=======================================
Coverage 72.11% 72.11%
=======================================
Files 143 143
Lines 45530 45545 +15
Branches 45530 45545 +15
=======================================
+ Hits 32833 32846 +13
- Misses 10618 10620 +2
Partials 2079 2079
🤔 there are some python failures here. Seeing if I can figure out what is going on https://github.com/delta-io/delta-rs/actions/runs/13592071811/job/38008047004?pr=3261
Could someone tell me how to run these tests locally or give me a rust stack trace? I don't really understand what is failing here
cd python
make develop
RUST_BACKTRACE=1 uv run pytest tests/test_cdf.py -s -k "test_read_cdf_partitioned_projection"
Description
chore: upgrade to DataFusion 46.0.0
This is a work in progress. I am testing the effects of upgrading to DataFusion 46
Related Issue(s)
46.0.0
apache/datafusion#14123
Documentation