Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Default hint of "unique" does not add a unique index, and does not say why #1848

Open
boxydog opened this issue Sep 19, 2024 · 4 comments
Open
Assignees
Labels
community This issue came from slack community workspace enhancement New feature or request question Further information is requested

Comments

@boxydog
Copy link

boxydog commented Sep 19, 2024

dlt version

1.0.0

Describe the problem

Default hint of "unique" does not add a unique index.

See my test code and the Postgres table description below. id2 has a unique index (because I explicitly declared one), but id does not.

                   Table "public.a_test_unique"
    Column    |       Type        | Collation | Nullable | Default 
--------------+-------------------+-----------+----------+---------
 id           | bigint            |           |          | 
 id2          | bigint            |           |          | 
 _dlt_load_id | character varying |           | not null | 
 _dlt_id      | character varying |           | not null | 
Indexes:
    "a_test_unique__dlt_id_key" UNIQUE CONSTRAINT, btree (_dlt_id)
    "a_test_unique_id2_key" UNIQUE CONSTRAINT, btree (id2)

Expected behavior

I declare a default hint that "id" should be unique. For the Postgres destination, "unique" should put a unique index on the field, or it should produce a warning message saying why it is not. It does neither.

Steps to reproduce

  • tar xvfz test_unique_hint.tgz
  • Set up a postgres destination
  • ./test_unique.py
  • $ psql $DESTINATION__CREDENTIALS -c '\d a_test_unique'

id should have a unique index, but it does not. Separately, id2 does have a unique index, but it was declared explicitly.

test_unique_hint.tgz

Operating system

macOS

Runtime environment

Local

Python version

3.11

dlt data source

json

dlt destination

Postgres

Other deployment details

No response

Additional information

Using hints to declare indexes may not be supported behavior. It's not documented anywhere.

However, it is what I expected, given https://dlthub.com/docs/devel/dlt-ecosystem/destinations/postgres#supported-column-hints says a unique field will get an index.

@rudolfix rudolfix added the bug Something isn't working label Sep 23, 2024
@VioletM VioletM added the community This issue came from slack community workspace label Oct 21, 2024
@burnash
Copy link
Collaborator

burnash commented Jan 27, 2025

Hi @boxydog, sorry for the delayed response. Thanks for reporting this issue and for providing the code to reproduce it.

I attempted to reproduce the issue, but in the test_unique_hint.tgz in test_unique.schema.yaml, I noticed that the hint set for id is primary_key, not unique:

  a_test_unique:
    columns:
      id:
        data_type: bigint
        primary_key: true
      id2:
        data_type: bigint
        unique: true
      _dlt_load_id:
        data_type: text
        nullable: false
      _dlt_id:
        data_type: text
        nullable: false
        unique: true

As of version 1.5.0, dlt does not create an index for primary_key hints in PostgreSQL. We welcome a pull request for this feature if you're interested in implementing it.

If you're set the unique hint for id, dlt will create a unique index just like it does for id2. Let me know if this resolves your issue.

@burnash burnash added the question Further information is requested label Jan 27, 2025
@boxydog
Copy link
Author

boxydog commented Jan 27, 2025

Thanks for your response.

This does not resolve my issue. If the default hint of unique for an id is not going to add a unique index for a field with a primary_key designation (which seems reasonable), then dlt should fail with this as an error message, so I know what's going on, or at least have a command-line tool that can show me schema warnings ("Warning: 'id' has a default hint of 'unique' but is declared a 'primary_key'").

This issue is a part of my "quality of life" series of issues (#1839, #1840, #1848, #1958). I've found dlt is working well, but if I do anything wrong, it can be hard to figure out what's happening. I'm trying to push dlt to report schema issues.

In this case, I want id to always be unique, so I wanted to use a default hint, and didn't notice for awhile that it doesn't work.

As of version 1.5.0, dlt does not create an index for primary_key hints in PostgreSQL.
We welcome a pull request for this feature if you're interested in implementing it.

Is this filed as a feature request? Should I file it?

@burnash burnash added the enhancement New feature or request label Feb 3, 2025
@burnash
Copy link
Collaborator

burnash commented Feb 3, 2025

Thanks for clarification, @boxydog

Is this filed as a feature request? Should I file it?

No this has not been yet filed as a feature request. You can file a new issue or update the description of this one and we change the label from bug to enhancement.

@boxydog boxydog changed the title Default hint of "unique" does not add a unique index Default hint of "unique" does not add a unique index, and does not say why Feb 3, 2025
@boxydog
Copy link
Author

boxydog commented Feb 3, 2025

You can file a new issue or update the description of this one

I changed the headline and description.

Whether you label it as bug or enhancement is your choice. I still consider it a bug to silently fail, but I understand you may prioritize other things.

@burnash burnash removed the bug Something isn't working label Feb 16, 2025
@burnash burnash moved this from Planned to Todo in dlt core library Mar 3, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
community This issue came from slack community workspace enhancement New feature or request question Further information is requested
Projects
Status: Todo
Development

No branches or pull requests

4 participants