snowflake column identifier #520

pawel-big-lebowski · 2022-06-09T09:17:26Z

Signed-off-by: Pawel Leszczynski [email protected]

resolves #519

According to the documentation, Snowflake allows:

querying data staged in files with an identifier of the form [<alias>.]$<file_col_num>[.<element>]
copying into a table with a column identifier containing the $ character, e.g. select t.$1,t.$2,t.$3 from @~/datafile.csv.gz t;
get-path expressions, e.g. select v:attr[0].name from vartab;

Although the feature seems to be Snowflake-specific, it cannot be implemented within the Snowflake dialect alone, because tokens starting with the $ character are tokenized into Token::Placeholder. That is why a change to the parser is required.
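
To make the tokenization issue concrete, here is a minimal, self-contained sketch. It is a toy model and not the sqlparser-rs tokenizer: the `Token` enum and the `tokenize` function are simplified assumptions introduced only for illustration. It shows that for `t.$1` the parser receives a placeholder token after the period rather than a word, which is the case the parser would have to handle.

```rust
// Toy model only: `Token` and `tokenize` are illustrative assumptions,
// not the real sqlparser-rs types or API.
#[derive(Debug, Clone, PartialEq)]
pub enum Token {
    Word(String),
    Placeholder(String),
    Period,
}

/// Tokenize a dotted column reference such as `t.$1`.
/// Anything starting with `$` becomes a `Placeholder`, mirroring the
/// behaviour described in the PR text.
pub fn tokenize(input: &str) -> Vec<Token> {
    let chars: Vec<char> = input.chars().collect();
    let mut tokens = Vec::new();
    let mut i = 0;
    while i < chars.len() {
        let c = chars[i];
        if c == '.' {
            tokens.push(Token::Period);
            i += 1;
        } else if c == '$' || c.is_alphanumeric() || c == '_' {
            // Consume the first character, then any word characters after it.
            let start = i;
            i += 1;
            while i < chars.len() && (chars[i].is_alphanumeric() || chars[i] == '_') {
                i += 1;
            }
            let text: String = chars[start..i].iter().collect();
            if c == '$' {
                tokens.push(Token::Placeholder(text));
            } else {
                tokens.push(Token::Word(text));
            }
        } else {
            // Skip whitespace and any character this toy model does not cover.
            i += 1;
        }
    }
    tokens
}
```

In this model, `tokenize("t.$1")` yields a word, a period, and a placeholder, so a rule that only looks at word tokens never sees the `$1` part.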

Initial OpenLineage project issue: OpenLineage/OpenLineage#814.

coveralls · 2022-06-12T10:05:52Z

Pull Request Test Coverage Report for Build 2487464140

38 of 42 (90.48%) changed or added relevant lines in 4 files are covered.
No unchanged relevant lines lost coverage.
Overall coverage increased (+0.02%) to 89.813%

Changes Missing Coverage	Covered Lines	Changed/Added Lines	%
src/ast/mod.rs	12	14	85.71%
src/parser.rs	22	24	91.67%

Totals
Change from base Build 2450394803:	0.02%
Covered Lines:	8755
Relevant Lines:	9748

💛 - Coveralls

alamb · 2022-06-12T10:24:45Z

There appear to be a few CI failures on this PR

alamb

Thank you for the contribution, @pawel-big-lebowski. I will try to find time to review this PR tomorrow.

alamb

Thank you @pawel-big-lebowski -- I spent some time researching this feature.

First, I think it might help this PR if you split it into 2 pieces so we can discuss each feature ($identifiers and get path) separately.

Also, 🤔 it seems as though snowflake also supports identifiers like column$FILENAME -- is this something you want to explore?

https://docs.snowflake.com/en/user-guide/querying-metadata.html#metadata-columns

alamb · 2022-06-15T10:50:17Z

src/ast/mod.rs

@@ -217,6 +217,18 @@ pub enum Expr {
    Identifier(Ident),
    /// Multi-part identifier, e.g. `table_alias.column` or `schema.table.col`
    CompoundIdentifier(Vec<Ident>),
+    /// Multi-part identifier with a column number, e.g. [alias.]$file_col_num[.element] (snowflake)
+    CompoundIdentifierWithColumnNumber {


In terms of supporting things like a.$1 -- what would you think about modeling this as a two-part CompoundIdentifier ["a", "$1"] rather than a new variant?

I think the parser would still have to be changed, but that approach would keep the AST, and the code that consumes it, smaller, with fewer dialect-specific structures.
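
As a rough sketch of that suggestion (not the actual sqlparser-rs code), the toy types below reuse the existing compound-identifier shape for `a.$1`. `Ident`, `Expr`, `Token`, and `parse_identifier` are simplified, hypothetical stand-ins, and the sketch covers only the dotted form; a bare `$1` is deliberately left to whatever placeholder handling already exists.

```rust
// Hypothetical, simplified stand-ins for illustration only.
#[derive(Debug, Clone, PartialEq)]
pub struct Ident(pub String);

#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Identifier(Ident),
    CompoundIdentifier(Vec<Ident>),
}

#[derive(Debug, Clone, PartialEq)]
pub enum Token {
    Word(String),
    Placeholder(String),
    Period,
}

/// Parse `word(.part)*`, where a part after a period may be a word or a
/// `$`-prefixed placeholder. No new AST variant is needed.
pub fn parse_identifier(tokens: &[Token]) -> Result<Expr, String> {
    let mut parts: Vec<Ident> = Vec::new();
    let mut expect_part = true;
    for token in tokens {
        match (token, expect_part) {
            (Token::Word(w), true) => {
                parts.push(Ident(w.clone()));
                expect_part = false;
            }
            // `$1` is accepted as an identifier part only after a period.
            (Token::Placeholder(p), true) if !parts.is_empty() => {
                parts.push(Ident(p.clone()));
                expect_part = false;
            }
            (Token::Period, false) => expect_part = true,
            (other, _) => return Err(format!("unexpected token: {:?}", other)),
        }
    }
    if expect_part {
        return Err("expected an identifier part".to_string());
    }
    Ok(if parts.len() == 1 {
        Expr::Identifier(parts.remove(0))
    } else {
        Expr::CompoundIdentifier(parts)
    })
}
```

With this shape, consumers that already handle multi-part identifiers would see `a.$1` as the parts `a` and `$1` without matching on an extra dialect-specific variant.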

alamb · 2022-06-15T10:52:33Z

tests/sqlparser_snowflake.rs

@@ -99,6 +99,12 @@ fn test_single_table_in_parenthesis() {
    );
 }

+#[test]
+fn test_snowflake_variables() {


I think it would be good to have a test that the parsed structure is actually what is expected -- for example https://github.com/sqlparser-rs/sqlparser-rs/blob/c884fbc/tests/sqlparser_common.rs#L138-L168
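
To illustrate the difference between a round-trip test and a structural one, here is a small self-contained sketch. `Expr` and `parse_column_ref` are hypothetical stand-ins, not the real parser or its test helpers; the point is only that comparing the printed SQL passes for any AST shape that prints the same text, while asserting on the value pins down the variant and its parts.

```rust
use std::fmt;

// Hypothetical, simplified AST for illustration only.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Identifier(String),
    CompoundIdentifier(Vec<String>),
}

impl fmt::Display for Expr {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Expr::Identifier(name) => write!(f, "{}", name),
            Expr::CompoundIdentifier(parts) => write!(f, "{}", parts.join(".")),
        }
    }
}

/// Stand-in for the parser: splits a dotted reference into its parts.
pub fn parse_column_ref(sql: &str) -> Expr {
    let parts: Vec<String> = sql.split('.').map(str::to_string).collect();
    if parts.len() == 1 {
        Expr::Identifier(parts[0].clone())
    } else {
        Expr::CompoundIdentifier(parts)
    }
}

#[test]
fn test_snowflake_column_number_structure() {
    let expr = parse_column_ref("t.$1");
    // Round-trip check: only verifies that the text prints back unchanged.
    assert_eq!(expr.to_string(), "t.$1");
    // Structural check: verifies the exact variant and its parts.
    assert_eq!(
        expr,
        Expr::CompoundIdentifier(vec!["t".to_string(), "$1".to_string()])
    );
}
```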

alamb · 2022-08-11T10:47:20Z

Marking this as a draft to show that it is waiting on some changes, so I can easily filter out which PRs need review. Please mark it as "ready for review" when it is next ready.

AugustoFKL · 2023-02-25T21:54:00Z

@alamb @pawel-big-lebowski I'm closing this for now because it has gone almost a year without an update.

@pawel-big-lebowski, ping me if you plan to work on this code again, and I will reopen the PR :)

alamb · 2023-02-26T06:44:14Z

Thanks for cleaning this up @AugustoFKL

coveralls · 2024-10-04T07:45:44Z

Pull Request Test Coverage Report for Build 2487464140

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

For more information on this, see Tracking coverage changes with pull request builds.
To avoid this issue with future PRs, see these Recommended CI Configurations.
For a quick fix, rebase this PR at GitHub. Your next report should be accurate.

Details

38 of 42 (90.48%) changed or added relevant lines in 4 files are covered.
No unchanged relevant lines lost coverage.
Overall coverage increased (+0.02%) to 89.813%

Changes Missing Coverage	Covered Lines	Changed/Added Lines	%
src/ast/mod.rs	12	14	85.71%
src/parser.rs	22	24	91.67%

Totals
Change from base Build 2450394803:	0.02%
Covered Lines:	8755
Relevant Lines:	9748

💛 - Coveralls

snowflake column identifier

17ac30c

Signed-off-by: Pawel Leszczynski <[email protected]>

pawel-big-lebowski force-pushed the snowflake-variable-support branch from 396ea62 to 17ac30c Compare June 13, 2022 09:41

pawel-big-lebowski mentioned this pull request Jun 13, 2022

sql: Allow SQL parsing to recognize and support variables and special characters OpenLineage/OpenLineage#814

Closed

alamb reviewed Jun 13, 2022

alamb reviewed Jun 15, 2022

alamb marked this pull request as draft August 11, 2022 10:47

AugustoFKL closed this Feb 25, 2023
