Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue performing geometry operation in GeoJSON file "Error: Referenced column "geometry" not found in FROM clause" #487

Open
monasound opened this issue Jan 21, 2025 · 4 comments

Comments

@monasound
Copy link

monasound commented Jan 21, 2025

Hi,

I am exploring Duckdb to work with spatial files.

I am trying to import GeoJSON files and perform ST_UNION. I can read the files without any issue, but when performing ST_UNION operation, it result in the following error:

An error occurred while performing ST_UNION: Binder Error: Referenced column "geometry" not found in FROM clause!
Candidate bindings: "geo_data_0.geom", "geo_data_0.Reference_Geography", "geo_data_0.ObjectId"
LINE 5: SELECT geometry FROM geo_data_0

  1. Can someone help me explain the issue?
  2. Looks like there is no compatibility in reading feature class of ESRI Geodatabase. Please confirm.

FYI: Somebody raised similar error a year ago. However, I have no idea whether it is updated or not. Please kindly share your comments

@Maxxen
Copy link
Member

Maxxen commented Jan 21, 2025

Hello!

  1. What is the actual SQL query you're trying to run?
    As the error message explains, you're trying to pass a column named "geometry" to the ST_Union function, but there is no column with that name. Supposedly you got a table with a column named "geom" when you imported the geojson file.

  2. Im not too familiar with ESRI Geodatabases, but what have you tried? I assume you've tried to pass a gdb file to st_read? Can you share the GDB file?

  3. If there are other related issues it's generally a good idea to link them so that we know what you're talking about.

@monasound
Copy link
Author

monasound commented Jan 21, 2025

Below is the code I used.

  1. Yes, there is a geometry column which I can see.
  2. It is one of the geodatabase that I have downloaded and it opens correctly. What I encountered was error with respect to path. I am trying to fix this. If I have issues, I will share my code.

Thank you for your kind time.

`
import duckdb
import requests

geojson_urls = [
'https://services-eu1.arcgis.com/BuS9rtTsYEV5C0xh/arcgis/rest/services/CensusHub2022_T1_2_LEA/FeatureServer/0/query?outFields=&where=1%3D1&f=geojson',
'https://services-eu1.arcgis.com/BuS9rtTsYEV5C0xh/ArcGIS/rest/services/grid_population_2022_view/FeatureServer/0/query?outFields=
&where=1%3D1&f=geojson' # Replace with your second URL
]

con = duckdb.connect()

con.execute("INSTALL spatial;")
con.execute("LOAD spatial;")

for index, geojson_url in enumerate(geojson_urls):

response = requests.get(geojson_url)
response.raise_for_status()  # Raise an error for bad responses


if 'application/json' not in response.headers.get('Content-Type', ''):
    raise ValueError(f"The URL {geojson_url} did not return a valid GeoJSON file.")


local_geojson_path = f'data_{index}.geojson'
with open(local_geojson_path, 'w', encoding='utf-8') as f:
    f.write(response.text)


try:
    query = f'''
    CREATE TABLE geo_data_{index} AS SELECT * FROM ST_Read('{local_geojson_path}');
    '''
    
    con.execute(query)

except Exception as e:
    print(f"An error occurred while reading the GeoJSON file from {local_geojson_path}: {e}")

try:
union_query = '''
CREATE TABLE geo_data_union AS
SELECT ST_Union(geometry) AS geometry
FROM (
SELECT geometry FROM geo_data_0
UNION ALL
SELECT geometry FROM geo_data_1
);
'''

con.execute(union_query)


union_result = con.execute("SELECT * FROM geo_data_union").fetchdf()


print("Union of data_0 and data_1:")
print(union_result)

except Exception as e:
print(f"An error occurred while performing ST_UNION: {e}")

con.close()`

@Maxxen
Copy link
Member

Maxxen commented Jan 21, 2025

Geometry is the name of the type not the column. The geometry column you get from ST_Read is called "geom", not "geometry"

@monasound
Copy link
Author

Thank you for your reply.

I have issue using that url of GeoJSON. I have downloaded two GeoJSON files for a country and tried to do the same operations. It works and creates output.

Columns in GeoJSON_data1: [('OBJECTID', 'INTEGER', 'YES', None, None, None), ('FCsubtype', 'INTEGER', 'YES', None, None, None), ('geom', 'GEOMETRY', 'YES', None, None, None)]

`import duckdb

Connect to DuckDB

conn = duckdb.connect()

Install and load the spatial extension

conn.execute("INSTALL spatial;")
conn.execute("LOAD spatial;")

Load the first GeoJSON file

conn.execute("""
CREATE TABLE geo_data1 AS
SELECT * FROM ST_Read('M:\R\DuckDb\gEOjson\Built_Up_Areas.geojson')
""")

Load the second GeoJSON file

conn.execute("""
CREATE TABLE geo_data2 AS
SELECT * FROM ST_Read('M:\R\DuckDb\gEOjson\Vegetation_Areas.geojson')
""")

Check the column names in geo_data1

columns_geo_data1 = conn.execute("DESCRIBE geo_data1").fetchall()
print("Columns in geo_data1:", columns_geo_data1)

Check the column names in geo_data2

columns_geo_data2 = conn.execute("DESCRIBE geo_data2").fetchall()
print("Columns in geo_data2:", columns_geo_data2)
for row in columns_geo_data2:
print(row)

Example of performing a geometry operation: calculating the area of geometries in the second GeoJSON

area_result = conn.execute("""
SELECT ST_Area(geom) AS area
FROM geo_data1
""").fetchall()

Print the area results

print("Areas of geometries in the second GeoJSON file:")
for area in area_result:
print(area)

Example of a spatial join or union between the two datasets

union_result = conn.execute("""
SELECT a.OBJECTID, b.OBJECTID, ST_Union(a.geom, b.geom) AS union_geom
FROM geo_data1 AS a
JOIN geo_data2 AS b
ON ST_Intersects_Extent(a.geom, b.geom)
""").fetchall()

Create a new table for the union results

conn.execute("""
CREATE TABLE union_table AS
SELECT a.OBJECTID AS id_a, b.OBJECTID AS id_b, ST_AsText(ST_Union(a.geom, b.geom)) AS union_geom
FROM geo_data1 AS a
JOIN geo_data2 AS b
ON ST_Intersects_Extent(a.geom, b.geom)
""")

Export the union results to a CSV file

output_directory = 'M:\R\DuckDb\union_results.csv'
conn.execute(f"COPY union_table TO '{output_directory}' (FORMAT CSV, HEADER)")

Close the connection

conn.close()

print(f"Union results exported to {output_directory}")
`

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants