varchar columns are empty when reading from a Trino source #302

Open
jhatcher1 opened this issue May 23, 2024 · 0 comments
Labels
bug Something isn't working

Comments

@jhatcher1

Issue Description

  • Description of the issue:

    I am reading data from a Trino database, and saving it to both CSV and parquet files. In both cases, the sling CLI generates the files without errors.

    Upon inspecting the resulting CSV and parquet files, any columns of type varchar or varchar(n) are blank in the resulting file.

    I was able to reproduce the issue by running Trino locally and creating a simple table:

    trino> CREATE SCHEMA iceberg.example_schema;
    CREATE SCHEMA
    trino> CREATE TABLE iceberg.example_schema.example_sling (name varchar, id bigint, description varchar);
    CREATE TABLE
    trino> INSERT INTO iceberg.example_schema.example_sling VALUES ('my_name', 123456, 'my_description');
    INSERT: 1 row
    trino> SELECT * FROM iceberg.example_schema.example_sling;
      name   |   id   |  description
    ---------+--------+----------------
     my_name | 123456 | my_description
    (1 row)
    

    I then ran the following to export the table to CSV:

    sling conns set LOCAL_TRINO type=trino http_url="http://admin@localhost:8080?catalog=iceberg&schema=example_schema"
    sling run --src-conn LOCAL_TRINO --src-stream "example_schema.example_sling" --tgt-object "file://example_sling.csv"
    

    Inspecting the CSV I see:

    $ cat example_sling.csv
    name,id,description
    ,123456,
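
For larger exports, a quick scripted check can confirm which columns come out blank. The following is a minimal sketch, not part of sling: the helper name `blank_columns` is hypothetical, and the sample text is copied from the CSV output above.

```python
import csv
import io


def blank_columns(csv_text):
    """Return the names of columns whose value is empty in every data row.

    Hypothetical helper for illustration only; it is not part of sling.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    if not rows:
        # No data rows, so nothing can be reported as blank.
        return []
    return [
        name
        for name in reader.fieldnames
        if all((row[name] or "") == "" for row in rows)
    ]


# Sample content copied from the exported example_sling.csv shown above.
sample = "name,id,description\n,123456,\n"
print(blank_columns(sample))  # ['name', 'description']
```

To check a real export, read the file contents (for example `open("example_sling.csv").read()`) and pass them to the helper. Note that this flags columns that are empty in every row, so a column that is legitimately all empty in the source would be flagged as well.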
    

    I observe the same when saving to a parquet file and opening it in DBeaver (via DuckDB):

    [Image: Screenshot 2024-05-23 at 3 26 38 PM]

    I tried both an iceberg catalog and a postgres catalog when creating the table to determine whether the issue was catalog-specific, but I saw the same behaviour regardless of the catalog used.

  • Sling version (sling --version): 1.2.10

  • Operating System (linux, mac, windows): mac (arm)

  • Log Output (please run command with -d):

$ sling conns set LOCAL_TRINO_POSTGRES type=trino http_url="http://admin@localhost:8080?catalog=postgres&schema=public"
$ sling run --src-conn LOCAL_TRINO_POSTGRES --src-stream "postgres.public.example_sling" --tgt-object "file://example_sling_pg.csv" -d
2024-05-23 15:10:43 DBG Sling version: 1.2.10 (darwin arm64)
2024-05-23 15:10:43 DBG type is db-file
2024-05-23 15:10:43 DBG using source options: {"empty_as_null":false,"null_if":"NULL","datetime_format":"AUTO","max_decimals":-1}
2024-05-23 15:10:43 DBG using target options: {"header":true,"compression":"auto","concurrency":7,"datetime_format":"auto","delimiter":",","file_max_rows":0,"file_max_bytes":0,"max_decimals":-1,"use_bulk":true,"add_new_columns":true,"column_casing":"source"}
2024-05-23 15:10:43 INF connecting to source database (trino)
2024-05-23 15:10:43 DBG opened "trino" connection (conn-trino-Xqx)
2024-05-23 15:10:43 INF reading from source database
2024-05-23 15:10:43 DBG select * from "postgres"."public"."example_sling"
2024-05-23 15:10:43 INF writing to target file system (file)
2024-05-23 15:10:43 DBG writing to file://example_sling_pg.csv [fileRowLimit=0 fileBytesLimit=0 compression=auto concurrency=7 useBufferedStream=false fileFormat=csv]
2024-05-23 15:10:43 DBG wrote 30 B: 1 rows [6 r/s]
2024-05-23 15:10:43 INF wrote 1 rows [6 r/s] to file://example_sling_pg.csv
2024-05-23 15:10:43 DBG closed "trino" connection (conn-trino-Xqx)
2024-05-23 15:10:43 INF execution succeeded
@flarco added the bug label on May 23, 2024