Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Windows ODBC data source always uses :memory: no matter what you set it to in Control Panel - ODBC Data Sources #29

Closed
shwivel opened this issue Jul 30, 2024 · 32 comments · Fixed by #53

Comments

@shwivel
Copy link

shwivel commented Jul 30, 2024

On a Windows system after installing the ODBC driver a data source name of "DuckDB" is created with the default database set to :memory: but of course we may wish to specify a specific database file.

When you change :memory: to some file (ie. C:\some_file.db ) and you try to use that data source, you'll get "table not found" errors (if running a query) or no tables will list (if listing tables) because it's actually using the transient :memory: database even though you've specified a persistent one, as depicted below:

image

It shows up in regedit fine:

image

But it doesn't actually use the given database. For example if you open Excel or Power BI and click Data > Get Data > From other sources > From ODBC > DuckDB > no tables will be listed because none are there (it's using :memory: and not the one specified in Control Panel > ODBC data sources). You can manually set the database file location and get the tables to list, within every Excel or whatever file using the DSN, however this is not only annoying to have to do everywhere and every time, but certain features become limited when you set a manual connection string. In any case, I figure this cannot be intended, because why would you allow the specification of a database (with a default of :memory:) if no matter what you change it to, it is still going to use :memoy:?

Let me know if you have any questions. Thanks

@renbud
Copy link

renbud commented Aug 11, 2024

I have this issue as well on Windows 11 with the 64 bit drivers and power BI/Excel.

Also noting that a couple of the tests are failing when I run test_odbc.exe that comes with the drivers.

D:\a\duckdb\duckdb\tools\odbc\test\tests\connect.cpp(132): FAILED:
{Unknown expression after the reported line}
due to unexpected exception with message:
Could not find storage_version.db file.

Test SQLColAttribute for a query that returns an int

===============================================================================
test cases: 28 | 27 passed | 1 failed
assertions: 45118 | 45117 passed | 1 failed

@malcook
Copy link

malcook commented Sep 16, 2024

I also find this issue prevents from using window's version DuckDB ODBC Driver with a connection to database other than :memory:.

Symptoms include empty list of tables displayed using Excel > Data > Get Data > From Other Sources > From ODBC > choose my configured DuckDB connection > Navigator

image

FWIW I note this issue is surfacing elsewhere

Might this help ??? ... I too found some of the tests failing:

 .\test_odbc.exe
[6/31] (19%): Test SQLConnect and SQLDriverConnect
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
test_odbc.exe is a Catch v2.13.7 host application.
Run with -? for options

-------------------------------------------------------------------------------
Test SQLConnect and SQLDriverConnect
-------------------------------------------------------------------------------
D:\a\duckdb-odbc\duckdb-odbc\test\tests\connect.cpp(132)
...............................................................................

D:\a\duckdb-odbc\duckdb-odbc\test\tests\connect.cpp(132): FAILED:
  {Unknown expression after the reported line}
due to unexpected exception with message:
  Could not find storage_version.db file.

[29/31] (93%): Test SQLColAttribute for a query that returns an interval        SQLColAttribute: Success with info
[30/31] (96%): Test SQLColAttribute for a query that returns a uuid             SQLColAttribute: Success with info
[31/31] (100%): Test SQLColAttribute for a query that returns a uuid
===============================================================================
test cases:    31 |    30 passed | 1 failed
assertions: 45345 | 45344 passed | 1 failed

and

\test_connection_odbc.exe
[0/1] (0%): Test SQLConnect with Ini File                                       SQLExecDirect (FROM string_values): Error: Error

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
test_connection_odbc.exe is a Catch v2.13.7 host application.
Run with -? for options

-------------------------------------------------------------------------------
Test SQLConnect with Ini File
-------------------------------------------------------------------------------
D:\a\duckdb-odbc\duckdb-odbc\test\tests\connect_with_ini.cpp(9)
...............................................................................

D:\a\duckdb-odbc\duckdb-odbc\test\common.cpp(27): FAILED:
  REQUIRE( (((ret)&(~1))==0) )
with expansion:
  false

[1/1] (100%): Test SQLConnect with Ini File
===============================================================================
test cases: 1 | 1 failed
assertions: 6 | 5 passed | 1 failed

success here:

.\SystemDataODBC_tests.exe
[1/1] (100%): System.Data.ODBC
===============================================================================
All tests passed (26 assertions in 1 test case)

I also installed bleeding edge from https://duckdb.org/docs/installation/?version=main&environment=odbc&platform=win&download_method=direct&architecture=x86_64 and found this issue persists.

@shwivel
Copy link
Author

shwivel commented Sep 16, 2024

It is clear they just messed up the setting of the database file location in ODBC data sources (regardless of what you set it to, it just uses :memory: no matter what) If they didn't want to enable users to set a database file location, there would be no reason for this field. But the field is there and it just doesn't work.

Until the team fixes this, the solution is to specify the database file in the connection string of power query, like:
image

The downside is if you ever change where that file is, you'd have to update this string in every file, rather than the configuration of the DSN in ODBC data sources, which would be just one single change to one thing in one place.

Undoubtedly this is not intended. Hopefully they will fix it soon.

@malcook
Copy link

malcook commented Sep 16, 2024

Follow up with a hint and possible step toward a workaround...

Excel > Data > Get Data > From Other Sources > From ODBC > choose my configured DuckDB connection allows providing "Advanced Options" which can include a an option "SQL Statement".

I put the following in as the SQL statement:

attach 'C:\path\to\my\results.duckdb';
select * from results.mytable

and I got a step further as mytable is displayed in a grid...

...however when I click on "Load" I am advised:

image

@shwivel
Copy link
Author

shwivel commented Sep 16, 2024

If you edit your query and click "Advanced Editor" I believe the following would give you what you expect:

let
   Source = Odbc.Query("driver={DuckDB Driver};database=C:\path\to\my\results.duckdb;access_mode=read_only;dsn=DuckDB", "select * from mytable;")
in
    Source

It should not be necessary to specify the location of the file, of course. But right now that's what I do, under the circumstances.

@malcook
Copy link

malcook commented Sep 16, 2024

@shwivel thanks for that tip - i saw it before too in duckdb/duckdb#11380 (comment) - I just don't know how to apply it to my use of Excel, as opposed to your use of Power Query. Guidance much appreciated!

@shwivel
Copy link
Author

shwivel commented Sep 16, 2024

You are in the correct place. On the very first screen (depicted below) you would separate the connection string and query as depicted below:

image

Later, it is helpful for formatting purposes to use the advanced editor. (ribbon menu > Data > Queryies & Connections [this opens a side panel on the right with all your queries] then right click your query (from that side panel), click edit, and in the window that pops up, use the "Advanced Editor" button on the top left of ribbon menu. Then you'll see what I had pasted above and can more easily tweak the connection string and query. (still easier to to copy and paste from a syntax editor, but at least you get the line breaks and such) This looks like hell compared to the equivalent functionality on Google Sheets, but it is what it is and is mostly not terrible. Not as good as it could be, for sure.

@malcook
Copy link

malcook commented Sep 16, 2024

Thanks @shwivel - I'm so close!

Using your suggestion, I'm now getting this error message:

image

... whether or not I qualify the table name with main..

I'm pretty confident I am connecting to my database since if I mistype the database path the error message instead refers to :memory:.

I am also confident my select statement is correct - it works when run connecting using CLI (linux side), with or without the main. prefix.

Do you possibly see any other variation I should be trying?

@shwivel
Copy link
Author

shwivel commented Sep 16, 2024

If you go to the advanced editor (where the first word should be "let") you see a connection string and your query in the format depicted here, correct? #29 (comment)

Are you attaching databases (within SQL) or solely using the database you've specified in the connection string?

What happens if you just run a query like:

select 1 as x;
Do you get an error (or the single row/column result)?

To confirm for certain you are connecting to the correct .db file you could also run this query:

select * from duckdb_settings() where name = 'temp_directory';

Should return a value of something like "c:\your_path\your_name.your_extension.tmp" (the db file you're connecting to, with .tmp added at the end). If you don't see that, and you see merely ".tmp" that means you've connected to :memory:

@malcook
Copy link

malcook commented Sep 16, 2024

Hi!

Yes, it begins with "let".

re:

To confirm for certain you are connecting to the correct .db file you could also run this query:

I'm not sure I appreciate how this test provides any diagnositic as to whether I've connected to my database. Here I demo I'm connected to my db, but the temp_directory is still just '.tmp'.

image

The above was using CLI unix side.

I am now unsure whether I was ever attaching to my database using the query connection string in Excel.

I am quite sure I can issue an explicit attach statement and query my database with success:
image

but if I comment out the attach the query fails, despite the connection string setting the database to the same path:
image

So, it is pretty clear to me given the above that the connect string is not being respected. I did try quoting the path to database to no effect.

Thanks for all your suggestions.

@shwivel
Copy link
Author

shwivel commented Sep 16, 2024

By default the temp file will be in the same folder as the database file (that you're connected to), and the temp file will be named the same, but with .tmp at the end. If you've connected to :memory: then the result will just be .tmp. You mention having running that test query via CLI on a linux machine, but that's not where you need to test because it is not where you're having trouble connecting. I would test out running that query in Excel (rather than the query you're actually running) to make sure that you're actually connecting to the database you want to connect to, within Excel.

Attached is an Excel file named test_excel.xlsx with a query that accesses a database named testdb.db at path C:\data\testdb.db which has one table called poo. An image below depicts the setup.

image

Excel file:
test_excel.xlsx

Database file:
https://drive.google.com/file/d/1MVAAxcs5-9Yz12j99QfxJfUN1kRx30DR/view?usp=sharing

If you put the database file onto your Windows machine at path C:\data\ and this Excel file will not allow you to refresh the query result, then something is going on environment-wise, because I can do so, with these two files.

@malcook
Copy link

malcook commented Sep 17, 2024

Yes, thanks you so much for helping my sleuth this out. Something is indeed going on environment-wise. I'll try on another PC later that may help me resolve my issue.

I am getting the same issue trying to connect within Excel, despite everything being "in the right place", viz:
image

FWIW: I connect to your testdb.db just fine using DBeaver.

@malcook
Copy link

malcook commented Sep 17, 2024

FWIW: my workaround for now is to mirror the duckdb in sqlite for purposes of slapping an excel data dashboard in front of it. 😄

@malcook
Copy link

malcook commented Sep 17, 2024

Alas: the issue persists for me on two other windows computers.

re: environment-wise possibilities - I retested after installing latest supported Microsoft Visual C++ Redistributable (and restarting). No fix 😢

Still hoping this issue just gets fixed upstream...

@HildebKa
Copy link

The error persists even with nightly build.

https://artifacts.duckdb.org/duckdb-odbc/latest/odbc-windows-amd64.zip

staticlibs added a commit to staticlibs/duckdb-odbc that referenced this issue Jan 26, 2025
When connection string includes a trailing comma, existing impl in
Connect::ParseInputStr was doing an early exit without setting DSN
value to dbc->dsn. Because of this DSN was not used and default
:memory: instance was created.

The problem is not OS specific, but was happening on Windows all the
time because Windows ODBC Driver Manager is appending comma to the
passed connection string when calling SQLDriverConnect.

test_connect_odbc was extended to cover the trailing comma and also
parameters overriding in connection string.

Fixes: duckdb#29, duckdb#48
staticlibs added a commit to staticlibs/duckdb-odbc that referenced this issue Jan 26, 2025
When connection string includes a trailing comma, existing impl in
Connect::ParseInputStr was doing an early exit without setting DSN
value to dbc->dsn. Because of this DSN was not used and default
:memory: instance was created.

The problem is not OS specific, but was happening on Windows all the
time because Windows ODBC Driver Manager is appending comma to the
passed connection string when calling SQLDriverConnect.

test_connect_odbc was extended to cover the trailing comma and also
parameters overriding in connection string.

Fixes: duckdb#29, duckdb#48
@staticlibs
Copy link
Contributor

The problem is not Windows specific, it can be also reproduced on Linux when the connection string ends with a semicolon:

On Linux, with the following odbcinst.ini:

[DuckDB Driver]
Description=Driver for DuckDB
Driver=/path/to/libduckdb_odbc.so

And the followig odbc.ini:

[DuckDB]
Driver = DuckDB Driver
database = /path/to/test.db

Running with PyODBC:

Python 3.12.8 (main, Dec  6 2024, 00:00:00) [GCC 14.2.1 20240912 (Red Hat 14.2.1-3)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyodbc
>>> print(pyodbc.connect("DSN=DuckDB").cursor().execute("select current_catalog()").fetchone())
DSN=DuckDB
('test',)
>>> print(pyodbc.connect("DSN=DuckDB;").cursor().execute("select current_catalog()").fetchone())
DSN=DuckDB;
('memory',)

However on Windows the problem is worse because Windows ODBC Driver Manager always appends a semicolon to the connection string before passing it to SQLDriverConnect. I've filed #53 to fix this.

@malcook
Copy link

malcook commented Feb 3, 2025

In my hands @Mytherin's changes do not resolve the initially reported issue by @shwivel.

As @HildebKa previously reported "The issue still persists in nightly build"

I think it should be re-opened, but I lack the privs to do so.

I installed nightly bleeding edge with files dated 2/3.

I find that as before, testing using Excel to make an ODBC connection (Alt-A PN OD) to a well formed DDB database yields an empty list, as before.

As a possible workaround, I tried setting the 'Advanced Options > SQL Statementto attach to the database and select from a table as [earlier discussed](https://github.com/duckdb/duckdb-odbc/issues/29#issuecomment-2353756824) with the same effect (being that the contents of the table are displayed, and after hitting enter:[DataFormat.Errorl We cannot return the data produced by this query because of its complexity.`)

Note, I first tried just attaching to the database without the select, and got this possibly informative error.

Unable to connect
We encountered an error while trying to connect.
Details: 'This native database query isn't currently supported.-

Perhaps these test results are informative:

 ls


    Directory: C:\Users\mec\Desktop\duckdb_odbc-windows-amd64


Mode                 LastWriteTime         Length Name
----                 -------------         ------ ----
-a----          2/3/2025   9:29 AM       30064640 duckdb_odbc.dll
-a----          2/3/2025   9:29 AM          76288 duckdb_odbc_setup.dll
-a----          2/3/2025   9:29 AM          18432 odbc_install.exe
-a----          2/3/2025   9:29 AM        1289216 SystemDataODBC_tests.exe
-a----          2/3/2025   9:46 AM         536576 test.duckdb
-a----          2/3/2025   9:29 AM         437248 test_connection_odbc.exe
-a----          2/3/2025   9:29 AM         751104 test_odbc.exe
-a----          2/3/2025   9:40 AM         274432 test_odbc_named.db
-a----          2/3/2025   9:39 AM          12288 test_odbc_named_ua.db
 .\SystemDataODBC_tests.exe
[1/1] (100%): System.Data.ODBC
===============================================================================
All tests passed (26 assertions in 1 test case)
 .\test_odbc.exe
[6/31] (19%): Test SQLConnect and SQLDriverConnect
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
test_odbc.exe is a Catch v2.13.7 host application.
Run with -? for options

-------------------------------------------------------------------------------
Test SQLConnect and SQLDriverConnect
-------------------------------------------------------------------------------
D:\a\duckdb-odbc\duckdb-odbc\test\tests\connect.cpp(132)
...............................................................................

D:\a\duckdb-odbc\duckdb-odbc\test\tests\connect.cpp(132): FAILED:
  {Unknown expression after the reported line}
due to unexpected exception with message:
  Could not find storage_version.db file.

[29/31] (93%): Test SQLColAttribute for a query that returns an interval        SQLColAttribute: Success with info
[30/31] (96%): Test SQLColAttribute for a query that returns a uuid             SQLColAttribute: Success with info
[31/31] (100%): Test SQLColAttribute for a query that returns a uuid
===============================================================================
test cases:    31 |    30 passed | 1 failed
assertions: 45429 | 45428 passed | 1 failed

 .\test_connection_odbc.exe
[0/2] (0%): Test SQLConnect with Ini File                                       SQLExecDirect (FROM string_values): Error: Error

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
test_connection_odbc.exe is a Catch v2.13.7 host application.
Run with -? for options

-------------------------------------------------------------------------------
Test SQLConnect with Ini File
-------------------------------------------------------------------------------
D:\a\duckdb-odbc\duckdb-odbc\test\tests\connect_with_ini.cpp(9)
...............................................................................

D:\a\duckdb-odbc\duckdb-odbc\test\common.cpp(27): FAILED:
  REQUIRE( (((ret)&(~1))==0) )
with expansion:
  false

[1/2] (50%): Test SQLConnect with Ini File with extra options                   SQLExecDirect (FROM string_values): Error: Error
-------------------------------------------------------------------------------
Test SQLConnect with Ini File with extra options
-------------------------------------------------------------------------------
D:\a\duckdb-odbc\duckdb-odbc\test\tests\connect_with_ini.cpp(32)
...............................................................................

D:\a\duckdb-odbc\duckdb-odbc\test\common.cpp(27): FAILED:
  REQUIRE( (((ret)&(~1))==0) )
with expansion:
  false

[2/2] (100%): Test SQLConnect with Ini File with extra options
===============================================================================
test cases:  2 |  0 passed | 2 failed
assertions: 12 | 10 passed | 2 failed

@shwivel
Copy link
Author

shwivel commented Feb 3, 2025

I just downloaded the nightly build and this issue is not resolved.

Please test it on Windows so that you can see it does not work.

@Mytherin Mytherin reopened this Feb 3, 2025
@staticlibs
Copy link
Contributor

@shwivel

Checking this on Windows using the latest nightly build with the following duckdb_odbc.dll:

  • size: 30,064,640 bytes
  • modified: Sunday, ‎February ‎2, ‎2025, ‏‎12:24:02 PM
  • sha256: ab909dfcbdf07ba71dcdf798245466421bebaac231576a5b18dfcbb04d660092

No DuckDB driver registered before running odbc_install.exe. The registry records after running it:

$ reg query "HKLM\SOFTWARE\ODBC\ODBCINST.INI\DuckDB Driver" /s

HKEY_LOCAL_MACHINE\SOFTWARE\ODBC\ODBCINST.INI\DuckDB Driver
    UsageCount    REG_DWORD    0x1
    Driver    REG_SZ    C:\projects\duck\duckdb_odbc-windows-amd64\duckdb_odbc.dll
    Setup    REG_SZ    C:\projects\duck\duckdb_odbc-windows-amd64\duckdb_odbc_setup.dll
    APILevel    REG_SZ    1
    ConnectFunctions    REG_SZ    YYN
    DriverODBCVer    REG_SZ    3.0
    SQLLevel    REG_SZ    3
$ reg query "HKLM\SOFTWARE\ODBC\ODBC.INI\DuckDB" /s

HKEY_LOCAL_MACHINE\SOFTWARE\ODBC\ODBC.INI\DuckDB
    Driver    REG_SZ    C:\projects\duck\duckdb_odbc-windows-amd64\duckdb_odbc.dll
    database    REG_SZ    :memory:

Change the database parameter from :memory: to test1.db:

$ reg add "HKLM\SOFTWARE\ODBC\ODBC.INI\DuckDB" /v database /t REG_SZ /d test1.db /f
The operation completed successfully.
$ reg query "HKLM\SOFTWARE\ODBC\ODBC.INI\DuckDB" /s

HKEY_LOCAL_MACHINE\SOFTWARE\ODBC\ODBC.INI\DuckDB
    Driver    REG_SZ    C:\projects\duck\duckdb_odbc-windows-amd64\duckdb_odbc.dll
    database    REG_SZ    test1.db

Check with PyODBC:

$ python
Python 3.9.2 (tags/v3.9.2:1a79785, Feb 19 2021, 13:44:55) [MSC v.1928 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyodbc
>>> print(pyodbc.connect("DSN=DuckDB").cursor().execute("select current_catalog()").fetchone())
('test1',)
>>> print(pyodbc.connect("DSN=DuckDB;").cursor().execute("select current_catalog()").fetchone())
('test1',)

Check that test1 actually created in current dir:

$ dir
 Volume in drive C has no label.
 Volume Serial Number is 4446-172C

 Directory of C:\projects\duck\odbc_test

02/03/2025  05:41 PM    <DIR>          .
02/03/2025  05:41 PM    <DIR>          ..
02/03/2025  05:41 PM            12,288 test1.db
               1 File(s)         12,288 bytes
               2 Dir(s)  56,153,931,776 bytes free

So I believe the OP issue is indeed fixed in the latest nightly build.

And to troubleshoot the problem in your env, can you share the output of all following registry queries:

reg query "HKLM\SOFTWARE\ODBC\ODBC.INI\DuckDB" /s
reg query "HKLM\SOFTWARE\ODBC\ODBCINST.INI\DuckDB Driver" /s
reg query "HKCU\SOFTWARE\ODBC\ODBC.INI\DuckDB" /s
reg query "HKCU\SOFTWARE\ODBC\ODBCINST.INI\DuckDB Driver" /s

And also share the details (SHA-256 sum) of the duckdb_odbc.dll that is listed in registry?

@staticlibs
Copy link
Contributor

@malcook

Perhaps these test results are informative:

I can confirm that the nightly version you've used is passing all the test-suite on Windows including the test_connection_odbc.exe part. These tests are run on GitHub actions, just they require additional registry setup before running them, see details in ODBC.yml worflow file.

This is an excerpt from this nightly run:

Run ./build/release/bin/Release/test_connection_odbc.exe

[0/2] (0%): Test SQLConnect with Ini File                                       
[1/2] (50%): Test SQLConnect with Ini File with extra options                   
[2/2] (100%): Test SQLConnect with Ini File with extra options                  
===============================================================================
All tests passed (44 assertions in 2 test cases)

I find that as before, testing using Excel to make an ODBC connection

I can investigate/debug DuckDB ODBC usage from Excel, but I don't have the cycles to do that right now. Would you mind filing a separate issue with the details how exactly you are setting up the ODBC in Excel (or any other ODBC-capable Windows apps you may use)?

@shwivel
Copy link
Author

shwivel commented Feb 3, 2025

@staticlibs I provided a complete example with screenshots in my comment. This is not Excel-specific. It does not matter what software you use to connect via ODBC, for example if you use Power BI the result is the same. It uses the transient memory database and not the database you specify in ODBC Data Sources. The instructions on the DuckDB website state that it will use the database specified in ODBC Data Sources. It does not.

@malcook
Copy link

malcook commented Feb 3, 2025

@staticlibs - thanks for educating me regarding configuring the environment for executing tests GitHub actions and that indeed they should pass

I am configuring ODBC in Excel exactly as in @shwivel 's report.

@staticlibs
Copy link
Contributor

@shwivel

Would it be possible for you to install Python and PyODBC in your environment? I can guide you step by step if necessary.

It is possible (though I think - unlikely) that ODBC access from Excel uses a different code path in ODBC driver than access from PyODBC. So lets check first that PyODBC example works in your env to narrow down the problem.

@shwivel
Copy link
Author

shwivel commented Feb 3, 2025

@staticlibs

The behavior is the same when using python. It uses the transient memory database (when a persistent database is not specified) rather than the one I've specified in ODBC data sources.

For example, if you set the database to C:\data\testdb.db in ODBC data sources, using the test database linked in my previous comment, and run the following python script, you will get no rows returned:

import duckdb
duckdb.sql("SELECT * from information_schema.tables").show()

Result:

┌───────────────┬──────────────┬────────────┬───┬────────────────────┬──────────┬───────────────┬───────────────┐
│ table_catalog │ table_schema │ table_name │ … │ is_insertable_into │ is_typed │ commit_action │ TABLE_COMMENT │
│    varchar    │   varchar    │  varchar   │   │      varchar       │ varchar  │    varchar    │    varchar    │
├───────────────┴──────────────┴────────────┴───┴────────────────────┴──────────┴───────────────┴───────────────┤
│                                                    0 rows                                                     │
└───────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

This is despite the existence of a table named poo in that database. (the one I set in ODBC data sources)

I can specify the database file in the Python script like this:

import duckdb
con = duckdb.connect("c:/data/testdb.db")
con.sql("SELECT * from information_schema.tables").show()

Then I get the results I expect:

┌───────────────┬──────────────┬────────────┬───┬────────────────────┬──────────┬───────────────┬───────────────┐
│ table_catalog │ table_schema │ table_name │ … │ is_insertable_into │ is_typed │ commit_action │ TABLE_COMMENT │
│    varchar    │   varchar    │  varchar   │   │      varchar       │ varchar  │    varchar    │    varchar    │
├───────────────┼──────────────┼────────────┼───┼────────────────────┼──────────┼───────────────┼───────────────┤
│ testdb        │ main         │ poo        │ … │ YES                │ NO       │ NULL          │ NULL          │
├───────────────┴──────────────┴────────────┴───┴────────────────────┴──────────┴───────────────┴───────────────┤
│ 1 rows                                                                                   13 columns (7 shown) │
└───────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

But of course the whole point is to be able to specify the database in one place, and for it to default to that database, rather than the empty, transient, "memory" database.

@staticlibs
Copy link
Contributor

staticlibs commented Feb 3, 2025

@shwivel

The behavior is the same when using python.

I believe there is a confusion about the usage of DuckDB from Python. Accessing the existing DB file with the Python API of DuckDB (sources) is NOT the same as accessing it with PyODBC using DuckDB ODBC driver (sources).

The Python API (import duckdb) does NOT use the ODBC driver (that is a standartized C wrapper over C++ API of DuckDB), it uses pybind11 instead to call DuckDB C++ API directly. It is the preferred way to use the DuckDB from Python, but it knows nothing about the ODBC DSNs and cannot be configured with ODBC tools or HKLM\SOFTWARE\ODBC\ODBC.INI\DuckDB registry keys.

The PyODBC (import pyodbc) Python lib is a generic library to use any ODBC data source from Python. It will call Windows ODBC Driver manager and will use the ODBC settings from registry (they are actually read from the ODBC driver itself). In case of DuckDB it is convenient to use it for testing ODBC DSN configuration that is set in registry. If the ODBC configuration of DuckDB is successfully read from Python using PyODBC - that means that this configuration is correct in principle and then problems using it from other apps (like Excel) may lie elsewhere. But if it does not work from PyODBC - this means that the ODBC config in registry is incorrect and there is no point even trying it from Excel.

So, to troubleshoot the problem in your env, lets make the PyODBC working first, writing the detailed setup below, checked on a clean Windows:

  1. install the Latest Microsoft Visual C++ Redistributable Version for x64 (direct download link):

Image

  1. install Python 64-bit:

Image

  1. install PyODBC by running the following in Admin command prompt:
pip install pyodbc

Image

  1. install DuckDB ODBC driver by running odbc_install.exe:

Image

  1. set the database parameter in registry:

Image

  1. run the following in python command prompt:
import pyodbc
print(pyodbc.connect("DSN=DuckDB;").cursor().execute("select current_catalog()").fetchone())

If the DSN read successfully from the registry, then the output will be:

('test1',)

Otherwise, when DSN is not found (that was always the case on Windows before the latest nightly build), the output will be:

('memory',)

And then check that the test1.db file was actually created in the current directory:

Image

edit: redist link fix

@shwivel
Copy link
Author

shwivel commented Feb 3, 2025

@staticlibs

That works in python:

Image

But the same does not work in Excel:

Image

It does work in Power BI:

Image

@shwivel
Copy link
Author

shwivel commented Feb 3, 2025

@staticlibs

Oddly, simply omitting the driver specification from the connection string solves the issue:

Image

However, there is a reason I did that. The reason is because setting access_mode to read_only does not otherwise seem possible.

For example, this does not work (see error) (trailing semicolon does not matter whether included or not):

Image

Simply removing access_mode=read_only fixes the issue:

Image

But I want to connect in read only mode. It does not seem that we can connect in read only mode unless we specify the database file location (and the name of the driver). Setting the location in the data source name, and then using that named data source name (rather than the file location) in a connection string seems to prohibit the setting of the mode to read_only. Is that intentional? If that could be changed, it would be ideal. I cannot use it otherwise. Ultimately, I'd still need to set it manually in every file.

@shwivel
Copy link
Author

shwivel commented Feb 3, 2025

By manually adding a string value to the registry named access_mode subsequent use of the given DSN appears to obey that setting (see screenshot below). So I'm all set.

Image

It might be helpful to modify the UI for the ODBC configuration such that you can specify that setting (and perhaps others) rather than exclusively the database file location. So that the user doesn't need to manually add string values to the registry, afterward. There does not appear to be a way to set the setting in the connection string, unless your connecting string specifies the database file location (it won't work if just using a named DSN).

@staticlibs
Copy link
Contributor

@shwivel

Oddly, simply omitting the driver specification from the connection string solves the issue:

Yes, this thing comes from ODBC itself (not specific to DuckDB - details). Basically either DRIVER or DSN can be specified in the connection string. DRIVER will only use registry settings from HKLM\SOFTWARE\ODBC\ODBCINST.INI. While DSN will use settings from HKLM\SOFTWARE\ODBC\ODBC.INI. If both are specified - DRIVER takes the preference.

There does not appear to be a way to set the setting in the connection string

Additional DuckDB options can be specified in the connection string itself. Though they currently only work properly when DSN is specified (I've filed #56 to track this problem when DRIVER is specified). The settings in connection string will take preference over the settings in registry:

print(pyodbc.connect("DSN=DuckDB;").cursor().execute("select current_setting('access_mode')").fetchone())
('automatic',)
print(pyodbc.connect("DSN=DuckDB;access_mode=read_only").cursor().execute("select current_setting('access_mode')").fetchone())
('read_only',)
print(pyodbc.connect("DSN=DuckDB;").cursor().execute("select current_catalog()").fetchone())
('test1',)
print(pyodbc.connect("DSN=DuckDB;database=another_test2.db").cursor().execute("select current_catalog()").fetchone())
('another_test2',)

@shwivel
Copy link
Author

shwivel commented Feb 3, 2025

Got it, thanks @staticlibs

@shwivel shwivel closed this as completed Feb 3, 2025
@malcook
Copy link

malcook commented Feb 3, 2025

@shwivel I'm confused - it appears to me that your original observation still holds

When you change :memory: to some file (ie. C:\some_file.db ) and you try to use that data source, you'll get "table not found" errors (if running a query) or no tables will list (if listing tables) because it's actually using the transient :memory: database even though you've specified a persistent one,

Why would you want to close this issue? What am I missing?

@shwivel
Copy link
Author

shwivel commented Feb 4, 2025

@malcook Per my screenshot from my last comment, with the new ODBC driver, Excel is now using the database I have specified in Windows ODBC data sources. Previously, with the old ODBC driver, I had to specify the location of the database in Excel. Now I don't. Let me know if you have any questions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants