-
Notifications
You must be signed in to change notification settings - Fork 66
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Helper methods for CDF Physical to Logical Transformation #579
Helper methods for CDF Physical to Logical Transformation #579
Conversation
Tests are currently failing because we they depend on the FileMeta fix. |
94d8cbb
to
5622874
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added few more comments (should be quick) but still LGTM
kernel/src/table_changes/scan.rs
Outdated
ColumnType::Selected(CHANGE_TYPE_COL_NAME.to_string()), | ||
ColumnType::Selected(COMMIT_VERSION_COL_NAME.to_string()), | ||
ColumnType::Selected(COMMIT_TIMESTAMP_COL_NAME.to_string()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually (and sorry for the churn) I think I like the literal strings here: this let's us check that they are the actual strings we expect in the protocol (failure case we prevent is someone changes the const and then this test would change and pass)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
aha that makes sense. So we get to assert more about the code through this test 👍
scan_file: &CdfScanFile, | ||
global_state: &GlobalScanState, | ||
logical_schema: &StructType, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice! i like taking the more constrained data
@@ -198,6 +213,7 @@ mod tests { | |||
use crate::expressions::{column_expr, Scalar}; | |||
use crate::scan::ColumnType; | |||
use crate::schema::{DataType, StructField, StructType}; | |||
use crate::table_changes::COMMIT_VERSION_COL_NAME; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can just do use super::*?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't seem like it. super
would point to the scan module, but we need its parent.
What changes are proposed in this pull request?
This PR introduces methods methods that will be useful to read CDF data and transform it from its physical form into the logical schema. The methods are:
get_cdf_columns
: Generates a map from cdf column name to expression that fills that columnphysical_to_logical_expression
: Generates the physical to logical expression used to transform the engine datascan_file_read_schema
: Gets the physical schema. This depends on the cdf scan file typeWe also introduce the method
Scalar::timestamp_ntz_from_millis
which converts from ani64
millisecond value to aScalar::TimestampNtz
.How was this change tested?
We test that
physical_to_logical_expression
generates the correct expression for aCdfScanFile
with:_change_type
column in the case of add, cdc, and remove CdfScanFile.