Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance Impact: Allow For Disabling SQLx Type Checking Queries, As They Add Significant Latency #3611

Open
dnlsndr opened this issue Nov 27, 2024 · 2 comments
Labels
enhancement New feature or request

Comments

@dnlsndr
Copy link

dnlsndr commented Nov 27, 2024

Problem Description

When executing queries with custom types (enums, etc.), SQLx performs two preliminary SQL queries before the actual query:

  1. SELECT
      oid :: regtype :: text,
      typtype,
      typcategory,
      typrelid,
      typelem,
      typbasetype
    FROM
      pg_catalog.pg_type
    WHERE
      oid = $1
  2.  SELECT
       enumlabel
     FROM
       pg_catalog.pg_enum
     WHERE
       enumtypid = $1
     ORDER BY
       enumsortorder

These type validation queries significantly impact performance, especially with high DB RTTs:

  • Single query: ~8ms
  • With type checks: ~24ms (3x slower)

This runtime type checking creates unnecessary overhead in production systems where types are stable and tracked in code. I understand that this is required in many cases and is a good base setup, but allowing for disabling this functionality would greatly improve performance in many cases.

Proposed Solution

Add a configuration option to disable type checking queries, implemented either:

  1. At build time through the query_as! macro system
  2. At runtime via Pool settings (similar to test_before_acquire), e.g., validate_types_before_execution: bool

Attempted Workarounds

  • query_as_unchecked!()
  • Direct query!() with manual struct construction
    Neither approach resolved the performance issue.

Environment

  • PostgreSQL 16
  • SQLx v0.8.2
  • Rust 1.82.0

Code Context

Type Definitions

#[sqlx(transparent)]
pub struct UserId(i64);

#[derive(Debug, Clone, sqlx::Type)]
#[sqlx(type_name = "user_gender", rename_all = "SCREAMING_SNAKE_CASE")]
pub enum SystemGender {
    Male,
    Female,
    Other,
}

#[derive(Debug, Clone, FromRow)]
#[cfg_attr(debug_assertions, derive(Default))]
pub struct UserRow {
    pub id: UserId,
    pub name: String,
    pub family_name: Option<String>,
    pub email: String,
    pub email_verified: bool,
    pub phone: SystemPhoneNumber,
    pub phone_verified: bool,
    pub password_hash: String,
    pub locale: String,
    pub jurisdiction_code: SystemCountry,
    pub birthdate: chrono::NaiveDate,
    pub gender: Option<SystemGender>,
    pub time_zone: String,
    pub blocked_until: Option<chrono::DateTime<Utc>>,
    pub excluded_until: Option<chrono::DateTime<Utc>>,
    pub created_at: chrono::DateTime<Utc>,
    pub updated_at: chrono::DateTime<Utc>,
    pub deleted_at: Option<chrono::DateTime<Utc>>,
}

Note

SystemPhoneNumber and SystemCountry are just wrapper types around Strings

Database Schema

CREATE TYPE user_gender AS ENUM (
    'MALE',
    'FEMALE',
    'OTHER'
);

CREATE TABLE users (
    id bigint GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
    name text NOT NULL,
    family_name text,
    email text NOT NULL UNIQUE,
    email_verified boolean NOT NULL DEFAULT false,
    phone text UNIQUE NOT NULL,
    phone_verified boolean NOT NULL DEFAULT false,
    password_hash text NOT NULL,
    language text NOT NULL,
    jurisdiction_code text NOT NULL REFERENCES jurisdictions (code),
    birthdate date NOT NULL,
    gender user_gender,
    time_zone text NOT NULL,
    blocked_until timestamptz,
    excluded_until timestamptz,
    created_at timestamptz NOT NULL DEFAULT now(),
    updated_at timestamptz NOT NULL DEFAULT now(),
    deleted_at timestamptz
);

Query Implementation

async fn get_user_by_id<'a>(
    &self,
    tx: &mut Box<dyn TransactionContext<'a>>,
    user_id: UserId,
) -> RepositoryErrorResult<Option<UserRow>> {
    let user = sqlx::query_as!(
        UserRow,
        r#"
        SELECT
            id as "id: UserId",
            name,
            family_name,
            email,
            email_verified,
            phone as "phone: SystemPhoneNumber",
            phone_verified,
            password_hash,
            locale,
            jurisdiction_code as "jurisdiction_code: SystemCountry",
            birthdate,
            gender as "gender: SystemGender",
            time_zone,
            blocked_until,
            excluded_until,
            created_at,
            updated_at,
            deleted_at
        FROM users
        WHERE id = $1
          AND deleted_at IS NULL
        "#,
        user_id.inner()
    )
    .fetch_optional(&mut *tx.conn().await?)
    .await?;
    Ok(user)
}
@dnlsndr dnlsndr added the enhancement New feature or request label Nov 27, 2024
@CommanderStorm
Copy link
Contributor

Could you please provide a complete, runnable, minimal example instead of the current version in the future?
You are making reproducing your issue more difficult than it needs to be.

Here is a minimised version of above:

docker run -e POSTGRES_USER=postgres -e POSTGRES_PASSWORD=CHANGE_ME -e POSTGRES_DB=postgres -p 5432:5432 postgres:latest postgres -c log_statement=all
CREATE TYPE user_gender AS ENUM (
    'MALE',
    'FEMALE',
    'OTHER'
);
#[derive(Debug, Clone, sqlx::Type)]
#[sqlx(type_name = "user_gender", rename_all = "SCREAMING_SNAKE_CASE")]
pub enum SystemGender {
    Male,
    Female,
    Other,
}

async fn get_user(pool:&sqlx::PgPool)->Option<SystemGender>{
    sqlx::query_scalar!(
            r#"SELECT gender as "gender: SystemGender" FROM users"#
        )
        .fetch_optional(pool)
        .await.unwrap()
}

#[tokio::main]
async fn main(){
    let pool = sqlx::PgPool::connect("postgres://postgres:CHANGE_ME@localhost:5432/postgres").await.unwrap();
    dbg!(get_user(&pool).await);
    dbg!(get_user(&pool).await);
}

I think we currently track this (why type resolution is dynamic) here:

@abonander
Copy link
Collaborator

abonander commented Nov 28, 2024

@dnlsndr these queries are only executed if the types are not in-cache. They are not executed every time.

The caches are per-connection so they do need to be looked up on every new connection, but once the caches are warm and the pool is populated you shouldn't see this happening often.

We could possibly share the type cache through the ConnectOptions, however.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants