Incorrect backslash treatment in string literals in DataFusion CLI #13286

Open
Tracked by #14123
findepi opened this issue Nov 7, 2024 · 3 comments · May be fixed by #14844
Labels: bug (Something isn't working), correctness

Comments

@findepi
Member

findepi commented Nov 7, 2024

In standard SQL, the \ character has no special meaning in '...' varchar literals.

In PostgreSQL:

postgres=# SELECT '\', '\\'
postgres-# ;
 ?column? | ?column?
----------+----------
 \        | \\
(1 row)

In DataFusion sqllogictest, the behavior is the same:

query T
SELECT '\'
----
\

query T
SELECT '\\'
----
\\

query T
SELECT '\\\'
----
\\\

query T
SELECT '\\\\'
----
\\\\

However, DataFusion CLI behaves differently:

$ cargo run
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.15s
     Running `target/debug/datafusion-cli`
DataFusion CLI v43.0.0
> SELECT '\\';
+-----------+
| Utf8("\") |
+-----------+
| \         |
+-----------+
1 row(s) fetched.
Elapsed 0.049 seconds.

> SELECT '\';  🤔 Invalid statement: SQL error: TokenizerError("unsupported escape char: '\\''")

Because DataFusion CLI is widely used to test and verify DataFusion's behavior, it is important that the CLI treat its input correctly.
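
To make the two behaviors concrete, here is a minimal Rust sketch of the two ways the body of a `'...'` literal can be decoded. It is only an illustration and not DataFusion's or sqlparser's actual code: both function names are hypothetical, the backslash rules are an assumed MySQL-like subset, and both functions operate on the text between the quotes rather than on a full SQL statement.

```rust
/// Standard-SQL decoding of the text between the single quotes:
/// the only escape is a doubled quote (''), and a backslash is an
/// ordinary character. (Hypothetical helper, for illustration only.)
fn decode_standard(body: &str) -> String {
    body.replace("''", "'")
}

/// Decoding with backslash escapes enabled (assumed MySQL-like rules,
/// hypothetical helper). A trailing lone backslash has nothing to
/// escape, so it is reported as an error.
fn decode_with_backslash_escapes(body: &str) -> Result<String, String> {
    let mut out = String::new();
    let mut chars = body.chars().peekable();
    while let Some(c) = chars.next() {
        match c {
            '\\' => match chars.next() {
                Some('n') => out.push('\n'),
                Some('t') => out.push('\t'),
                // Any other escaped character stands for itself, so \\ becomes \.
                Some(other) => out.push(other),
                None => return Err("dangling backslash at end of literal".to_string()),
            },
            '\'' => {
                // A doubled quote collapses to a single quote.
                if chars.peek() == Some(&'\'') {
                    chars.next();
                }
                out.push('\'');
            }
            _ => out.push(c),
        }
    }
    Ok(out)
}
```

With these definitions, `decode_standard` leaves a body of two backslashes unchanged, as in the PostgreSQL and sqllogictest results above, while `decode_with_backslash_escapes` collapses it to a single backslash, as in the CLI output. A body consisting of one backslash is returned as-is by the first function and rejected by the second. In a real tokenizer the failure for `'\'` arises differently, because the backslash consumes the closing quote, so the error path here is a simplification.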

@alamb
Contributor

alamb commented Feb 5, 2025

BTW @pmcgleenon hit this when updating ClickBench as well:

ClickHouse/ClickBench#301

@Lordworms
Contributor

take

@alamb
Contributor

alamb commented Feb 25, 2025

@Lordworms has a proposed PR here to fix this:

However, it seems the way to get behavior consistent with sqllogictest (and psql) is to avoid unescaping in datafusion-cli 🤔

Any thoughts on this?
