INFO Required : python dataproc_templates GCSTOJDBC - Input Options #794
Replies: 5 comments 2 replies
-
I haven't used Airflow, but I can run the template the way you describe:
I used Oracle rather than PostgreSQL, but the parameters are accepted and the table is created:
What error are you getting?
-
Hi nj1973, thanks a lot for the swift response! I am not getting an error when running the GCSTOJDBC template from Airflow, but the first line of the file (the header) is ingested as a single column, and each remaining row is also loaded into one column. We are running with main.py, not start.sh. Could you please share the line of code where the Spark options for header (true/false) and delimiter type are added? A test file is attached for reference. Thanks.
-
Looking at your test file, it appears that your fields are separated by spaces rather than tabs?
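A quick way to check this before loading is to count how many lines actually contain a tab character. The sketch below is only an illustration: `guess_separator`, `summarize`, and the sample lines are hypothetical and not part of dataproc-templates. It treats a run of two or more spaces as a sign of space-separated fields, which is an assumption that may not hold for every file.

```python
import re
from typing import Dict, Iterable


def guess_separator(line: str) -> str:
    """Classify one line as 'tab', 'spaces', or 'unknown'."""
    if "\t" in line:
        return "tab"
    # Assumption: two or more consecutive spaces suggest space-aligned columns.
    if re.search(r" {2,}", line):
        return "spaces"
    return "unknown"


def summarize(lines: Iterable[str]) -> Dict[str, int]:
    """Count how many lines fall into each separator category."""
    counts = {"tab": 0, "spaces": 0, "unknown": 0}
    for line in lines:
        counts[guess_separator(line.rstrip("\n"))] += 1
    return counts


# Hypothetical sample: one real TSV line and one space-aligned line.
sample = ["id\tname\tcity\n", "1    alice    paris\n"]
print(summarize(sample))
```

If most lines land in the `spaces` bucket, a tab separator will not split them, and each row would arrive as a single column.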
-
The function call here is the one that ingests the data using Spark and adds options:
This is the line within that function that generates a dict of the options:
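To make the idea of "a dict of the options" concrete, here is a minimal sketch of how prefixed command-line arguments could be turned into a dict of Spark reader options. It is not the repository's actual code: `collect_spark_options` is a hypothetical helper, and it assumes every argument comes as a flag followed by a value. The prefix is taken from the arguments quoted in this thread.

```python
from typing import Dict, List

# Prefix as it appears in the arguments quoted in this thread.
PREFIX = "--gcs.jdbc.input."


def collect_spark_options(argv: List[str], prefix: str = PREFIX) -> Dict[str, str]:
    """Build a dict of reader options from flag/value pairs.

    Assumption: argv is a flat list of alternating flags and values.
    """
    options: Dict[str, str] = {}
    for flag, value in zip(argv[::2], argv[1::2]):
        if flag.startswith(prefix):
            # Strip the prefix so "--gcs.jdbc.input.sep" becomes "sep".
            options[flag[len(prefix):]] = value
    return options


argv = [
    "--gcs.jdbc.input.header", "false",
    "--gcs.jdbc.input.sep", "\t",
]
options = collect_spark_options(argv)
print(options)

# With PySpark available, a dict like this could be handed to the reader,
# for example: spark.read.format("csv").options(**options).load(path)
```

In Python source, `"\t"` is a single tab character, so the `sep` value in the resulting dict is a real tab.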
-
I've tested with your TSV file to PostgreSQL and it imported successfully:
My test command:
-
Hi team,
Could you please help with the query below?
We are using the GCSTOJDBC template. How do we pass the arguments below? I have tried adding the parameters below, but they are not being parsed and have no effect.
Could you please let me know how we can pass header, inferschema, and csv_sep (as a tab) for the GCSTOJDBC template?
https://github.com/GoogleCloudPlatform/dataproc-templates/blob/main/python/dataproc_templates/gcs/gcs_to_jdbc.py
"--gcs.jdbc.input.header",
"false",
"--gcs.jdbc.input.sep",
"\t",
We are using Airflow and the Dataproc template. The arguments are attached.
discussion_git.txt