make the use of loader_file_format parameter consistent #815

Open
rudolfix opened this issue Dec 11, 2023 · 0 comments
Labels
good first issue Good for newcomers

Comments

@rudolfix
Collaborator

Background
loader_file_format can currently be configured at the pipeline level, which is fine. Technically, though, it is a normalize setting, so we should be able to configure it there as well.

Also, not every data type is supported by every file format on every destination. We should detect unsupported combinations and fail early.
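
A minimal sketch of the fail-early idea, not the project's actual implementation. The table `UNSUPPORTED_TYPES`, the exception class `UnsupportedDataTypeError` and the function `check_column_types` are hypothetical names, and the entries in the table are placeholders rather than real destination limitations.

```python
from typing import Dict, Set, Tuple

# Hypothetical table: (destination, file format) -> data types that cannot be loaded.
# The entries are placeholders for illustration, not real destination limitations.
UNSUPPORTED_TYPES: Dict[Tuple[str, str], Set[str]] = {
    ("destination_a", "parquet"): {"complex"},
    ("destination_b", "jsonl"): {"binary", "time"},
}


class UnsupportedDataTypeError(ValueError):
    """Raised when a column type cannot be written in the selected file format."""


def check_column_types(
    destination: str, file_format: str, columns: Dict[str, str]
) -> None:
    """Raise before any file is written if a column uses an unsupported type.

    `columns` maps column names to data type names.
    """
    unsupported = UNSUPPORTED_TYPES.get((destination, file_format), set())
    offending = sorted(
        name for name, data_type in columns.items() if data_type in unsupported
    )
    if offending:
        raise UnsupportedDataTypeError(
            f"Columns {offending} use data types that file format "
            f"'{file_format}' does not support on destination '{destination}'."
        )
```

Calling `check_column_types("destination_a", "parquet", {"payload": "complex"})` would raise with a message naming the `payload` column, while the same columns with a format that has no entry in the table pass silently.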

Tasks

    • add loader_file_format to the normalize configuration and use it to set the file format. If an explicit value is passed from the pipeline, it takes precedence.
    • add exceptions to the destination capabilities, declaring the unsupported types per file format. Take the current cases from the code of particular destinations (e.g. BigQuery and Snowflake have plenty of exceptions)
    • detect such exceptions in normalize (preferably in the worker: before updating a schema, check whether any type is on the exception list and raise an exception with an explanation)
    • the behavior above must be covered by tests
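
The precedence rule from the first task could look roughly like the sketch below. The function name `resolve_loader_file_format` and its parameters are hypothetical, and the final fallback to a destination's preferred format is an assumption that the issue does not state.

```python
from typing import Optional


def resolve_loader_file_format(
    pipeline_value: Optional[str],
    normalize_value: Optional[str],
    destination_preferred: str,
) -> str:
    """Pick the file format to use when normalizing.

    An explicit value passed from the pipeline wins over the normalize
    configuration. Falling back to the destination's preferred format
    is an assumption made for this sketch.
    """
    if pipeline_value is not None:
        return pipeline_value
    if normalize_value is not None:
        return normalize_value
    return destination_preferred
```

A test for the last task could then assert each branch, for example that `resolve_loader_file_format("parquet", "jsonl", "jsonl")` returns `"parquet"` and that `resolve_loader_file_format(None, "jsonl", "parquet")` returns `"jsonl"`.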
Projects
Status: Todo
Development

No branches or pull requests

2 participants