Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(query builders): quoting identifiers by default #1658

Merged
merged 1 commit into from
Mar 5, 2025

Conversation

scaliseraoul
Copy link
Contributor

@scaliseraoul scaliseraoul commented Mar 5, 2025


Important

Default quoting of SQL identifiers added to BaseQueryBuilder and ViewQueryBuilder, with tests updated to reflect this change.

  • Behavior:
    • Default quoting of SQL identifiers using quote_identifiers in BaseQueryBuilder and ViewQueryBuilder.
    • Affects build_query() and get_head_query() methods in both classes.
  • Tests:
    • Updated expected SQL strings in test_sql_loader.py, test_group_by.py, test_query_builder.py, test_sql_transformation_manager.py, and test_view_query_builder.py to include quoted identifiers.
    • Ensures that all SQL queries are correctly formatted with quotes around identifiers.

This description was created by Ellipsis for c461eaa. It will automatically update as commits are pushed.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Reviewed everything up to c461eaa in 1 minute and 26 seconds

More details
  • Looked at 1144 lines of code in 7 files
  • Skipped 0 files when reviewing.
  • Skipped posting 11 drafted comments based on config settings.
1. pandasai/query_builders/base_query_builder.py:42
  • Draft comment:
    Good: Using query.transform(quote_identifiers) ensures all identifiers are quoted consistently, enhancing SQL injection safety and readability.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50%
    None
2. pandasai/query_builders/view_query_builder.py:83
  • Draft comment:
    Nice update: The view query builder correctly applies quote_identifiers transformation. Ensure sanitize_view_column_name is properly unit tested.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50%
    None
3. tests/unit_tests/data_loader/test_sql_loader.py:45
  • Draft comment:
    The expected SQL queries now include quoted identifiers. This improves SQL safety and clarity. Verify that the quoting is consistent across all sources.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50%
    None
4. tests/unit_tests/query_builders/test_group_by.py:80
  • Draft comment:
    Group by clauses now include properly quoted identifiers. This change enhances consistency; ensure that any changes to identifier quoting update all related tests.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50%
    None
5. tests/unit_tests/query_builders/test_query_builder.py:217
  • Draft comment:
    The tests now check for correctly quoted table and column names, which is a positive update to prevent SQL injection.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50%
    None
6. tests/unit_tests/query_builders/test_sql_transformation_manager.py:295
  • Draft comment:
    All transformation tests now validate SQL using fully quoted identifiers. This is beneficial for consistency and security.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50%
    None
7. tests/unit_tests/query_builders/test_view_query_builder.py:144
  • Draft comment:
    The view query builder tests now expect properly quoted aliases and table names. The injection tests with altered schema names are comprehensive.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50%
    None
8. pandasai/query_builders/base_query_builder.py:61
  • Draft comment:
    Consider applying the quote_identifiers transformation to the get_row_count() query for consistency with build_query() and get_head_query().
  • Reason this comment was not posted:
    Comment was on unchanged code.
9. tests/unit_tests/query_builders/test_query_builder.py:144
  • Draft comment:
    Expected SQL strings are built inline and are verbose; consider extracting helper functions or constants to DRY up test string construction and improve maintainability.
  • Reason this comment was not posted:
    Comment was on unchanged code.
10. tests/unit_tests/query_builders/test_view_query_builder.py:475
  • Draft comment:
    The test for column name comment injection expects the sanitized alias 'column___'; ensure that the replacement logic is well-documented and consistently applied.
  • Reason this comment was not posted:
    Comment was on unchanged code.
11. Overall:1
  • Draft comment:
    Good job applying quote_identifiers by default across query builders to mitigate SQL injection risks. The added tests cover multiple injection scenarios comprehensively.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.

Workflow ID: wflow_mfzVOvL8uU5FRMMD


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

@gventuri gventuri merged commit 869f048 into sinaptik-ai:main Mar 5, 2025
12 checks passed
scaliseraoul added a commit to scaliseraoul/pandas-ai that referenced this pull request Mar 5, 2025
* commit '869f048b60976c1fc5be313922368c626308a44d':
  fix(query builders): quoting identifiers by default (sinaptik-ai#1658)
  fix(agent): remove wrong double validation (sinaptik-ai#1659)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants