diff --git a/notebooks/README.md b/notebooks/README.md index daf20b44a..064d37ed8 100644 --- a/notebooks/README.md +++ b/notebooks/README.md @@ -13,7 +13,6 @@ documentation tree, ## Notebook Information Notebook Title | Data set(s) | Notebook Description | External Download (Size) --- | --- | --- | --- -[NYC Taxi Years Correlation](nyc_taxi_years_correlation.ipynb) | [NYC Taxi Yellow 01/2016, 01/2017, taxi zone data](https://www1.nyc.gov/site/tlc/about/tlc-trip-record-data.page) | Demonstrates using Point in Polygon to correlate the NYC Taxi datasets pre-2017 `lat/lon` locations with the post-2017 `LocationID` for cross format comparisons. | Yes (~3GB) [Stop Sign Counting By Zipcode Boundary](ZipCodes_Stops_PiP_cuSpatial.ipynb) | [Stop Sign Locations](https://wiki.openstreetmap.org/wiki/Tag:highway%3Dstop) [Zipcode Boundaries](https://catalog.data.gov/dataset/tiger-line-shapefile-2019-2010-nation-u-s-2010-census-5-digit-zip-code-tabulation-area-zcta5-na) [USA States Boundaries](https://wiki.openstreetmap.org/wiki/Tag:boundary%3Dadministrative) | Demonstrates Quadtree Point-in-Polygon to categorize stop signs by zipcode boundaries. | Yes (~1GB) [Taxi Dropoff Reverse Geocoding (GTC 2023)](Taxi_Dropoff_Reverse_Geocoding.ipynb) | [National Address Database](https://nationaladdressdata.s3.amazonaws.com/NAD_r12_TXT.zip) [NYC Taxi Zones](https://d37ci6vzurychx.cloudfront.net/misc/taxi_zones.zip) [taxi2015.csv](https://rapidsai-data.s3.us-east-2.amazonaws.com/viz-data/nyc_taxi.tar.gz) | Reverse Geocoding of 22GB of datasets in NYC delivered for GTC 2023 | Yes (~22GB) diff --git a/notebooks/nyc_taxi_years_correlation.ipynb b/notebooks/nyc_taxi_years_correlation.ipynb deleted file mode 100644 index d1c377e6c..000000000 --- a/notebooks/nyc_taxi_years_correlation.ipynb +++ /dev/null @@ -1,516 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Using cuSpatial to Correlate Taxi Data after a Format Change\n", - "In 2017, the NYC Taxi data switched from giving their pickup and drop off locations in `lat/lon` to one of 262 `LocationID`s. While these `LocationID`s made it easier to determine some regions and borough information that was lacking in the previous datasets, it made it difficult to compare datasets before and after this transition. \n", - "\n", - "\n", - "\n", - "By using cuSpatial `Points in Polygon` (PIP), we can quickly and easily map the latitude and longitude of the pre-2017 taxi dataset to the `LocationID`s of the 2017+ dataset. In this notebook, we will show you how to do so. cuSpatial 0.14 PIP only works on 31 polygons per call, so we will show how to process this larger 263 polygon shapefile with minimal memory impact. cuSpatial 0.15 will eliminate the 31 polygon limitation and provide substantial additional speedup.\n", - "\n", - "You may need a 16GB card or larger." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Imports" - ] - }, - { - "cell_type": "code", - "execution_count": 1, - "metadata": {}, - "outputs": [], - "source": [ - "import cuspatial\n", - "import geopandas as gpd\n", - "import cudf\n", - "from numba import cuda\n", - "import numpy as np" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Download the Data\n", - "We're going to download the January NYC Taxi datasets for 2016 and 2017. 
We also need the NYC Taxi Zones " - ] - }, - { - "cell_type": "code", - "execution_count": 2, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "tzones_lonlat.json found\n", - "taxi2016.csv found\n", - " % Total % Received % Xferd Average Speed Time Time Time Current\n", - " Dload Upload Total Spent Left Speed\n", - "100 128M 100 128M 0 0 53.5M 0 0:00:02 0:00:02 --:--:-- 53.4M\n" - ] - } - ], - "source": [ - "!if [ ! -f \"tzones_lonlat.json\" ]; then curl \"https://data.cityofnewyork.us/api/geospatial/d3c5-ddgc?method=export&format=GeoJSON\" -o tzones_lonlat.json; else echo \"tzones_lonlat.json found\"; fi\n", - "!if [ ! -f \"taxi2016.csv\" ]; then curl https://storage.googleapis.com/anaconda-public-data/nyc-taxi/csv/2016/yellow_tripdata_2016-01.csv -o taxi2016.csv; else echo \"taxi2016.csv found\"; fi \n", - "!if [ ! -f \"taxi2017.parquet\" ]; then curl https://d37ci6vzurychx.cloudfront.net/trip-data/yellow_tripdata_2017-01.parquet -o taxi2017.parquet; else echo \"taxi2017.csv found\"; fi" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Read in the Taxi Data with cuDF\n", - "Let's read in the pickups and dropoffs for 2016 and 2017." - ] - }, - { - "cell_type": "code", - "execution_count": 3, - "metadata": {}, - "outputs": [], - "source": [ - "taxi2016 = cudf.read_csv(\"taxi2016.csv\")\n", - "taxi2017 = cudf.read_parquet(\"taxi2017.parquet\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Let's have a look at the columns in `taxi2016` and `taxi2017` to verify the difference." - ] - }, - { - "cell_type": "code", - "execution_count": 4, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "{'DOLocationID', 'PULocationID', 'airport_fee', 'congestion_surcharge'}" - ] - }, - "execution_count": 4, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "set(taxi2017.columns).difference(set(taxi2016.columns))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Read in the Spatial Data with cuSpatial\n", - "\n", - "cuSpatial loads polygons into a `cudf.Series` of _Feature offsets_, a `cudf.Series` of _ring offsets_, and a `cudf.DataFrame` of `x` and `y` coordinates (which can be used for lon/lat as in this case) with the `read_polygon_shapefile` function. We're working on more advanced I/O integrations and nested `Columns` this year." 
- ] - }, - { - "cell_type": "code", - "execution_count": 5, - "metadata": {}, - "outputs": [], - "source": [ - "tzones = gpd.GeoDataFrame.from_file('tzones_lonlat.json')\n", - "taxi_zones = cuspatial.from_geopandas(tzones).geometry\n", - "taxi_zone_rings = cuspatial.GeoSeries.from_polygons_xy(\n", - " polygons_xy=taxi_zones.polygons.xy,\n", - " ring_offset=taxi_zones.polygons.ring_offset,\n", - " part_offset=taxi_zones.polygons.part_offset,\n", - " geometry_offset=cudf.Series(range(len(taxi_zones.polygons.part_offset)))\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Converting lon/lat coordinates to LocationIDs with cuSpatial\n", - "Looking at the taxi zones and the taxi2016 data, you can see that\n", - "- 10.09 million pickup locations\n", - "- 10.09 million dropoff locations\n", - "- 263 LocationID features\n", - "- 354 LocationID rings\n", - "- 98,192 LocationID coordinates\n", - "\n", - "Now that we've collected the set of pickup locations and dropoff locations, we can use `cuspatial.point_in_polygon` to quickly determine which pickups and dropoffs occur in each borough.\n", - "\n", - "To do this in a memory efficient way, instead of creating two massive 10.09 million x 263 arrays, we're going to use the 31 polygon limit to our advantage and map the resulting true values in the array a new `PULocationID` and `DOLocationID`, matching the 2017 schema. Locations outside of the `LocationID` areas are `264` and `265`. We'll be using 264 to indicate our out-of-bounds zones." - ] - }, - { - "cell_type": "code", - "execution_count": 6, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "[0, 31, 62, 93, 124, 155, 186, 217, 248, 263]\n" - ] - } - ], - "source": [ - "pip_iterations = list(np.arange(0, 263, 31))\n", - "pip_iterations.append(263)\n", - "print(pip_iterations)" - ] - }, - { - "cell_type": "code", - "execution_count": 7, - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "def make_geoseries_from_lonlat(lon, lat):\n", - " lonlat = cudf.DataFrame({\"lon\": lon, \"lat\": lat}).interleave_columns()\n", - " return cuspatial.GeoSeries.from_points_xy(lonlat)\n", - "\n", - "pickup2016 = make_geoseries_from_lonlat(taxi2016['pickup_longitude'] , taxi2016['pickup_latitude'])\n", - "dropoff2016 = make_geoseries_from_lonlat(taxi2016['dropoff_longitude'] , taxi2016['dropoff_latitude'])" - ] - }, - { - "cell_type": "code", - "execution_count": 8, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "CPU times: user 9.85 s, sys: 6.31 s, total: 16.2 s\n", - "Wall time: 16 s\n" - ] - } - ], - "source": [ - "%%time\n", - "taxi2016['PULocationID'] = 264\n", - "taxi2016['DOLocationID'] = 264\n", - "for i in range(len(pip_iterations)-1):\n", - " start = pip_iterations[i]\n", - " end = pip_iterations[i+1]\n", - " \n", - " zone = taxi_zone_rings[start:end]\n", - " pickups = cuspatial.point_in_polygon(pickup2016, zone)\n", - " dropoffs = cuspatial.point_in_polygon(dropoff2016, zone)\n", - "\n", - " for j in pickups.columns:\n", - " taxi2016['PULocationID'].loc[pickups[j]] = j\n", - " for j in dropoffs.columns:\n", - " taxi2016['DOLocationID'].loc[dropoffs[j]] = j" - ] - }, - { - "cell_type": "code", - "execution_count": 9, - "metadata": {}, - "outputs": [], - "source": [ - "del pickups\n", - "del dropoffs" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "I wonder how many taxi rides in 2016 started and ended in the same location." 
- ] - }, - { - "cell_type": "code", - "execution_count": 10, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "0.09205741858888485\n" - ] - } - ], - "source": [ - "print(taxi2016['DOLocationID'].corr(taxi2016['PULocationID']))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Not nearly as many as I thought. How many exactly?" - ] - }, - { - "cell_type": "code", - "execution_count": 11, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "0.19 %\n" - ] - } - ], - "source": [ - "print(format((taxi2016['DOLocationID'] == taxi2016['PULocationID']).sum()/taxi2016.shape[0], '.2f'), '%')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Something with perhaps a higher correlation: It seems likely that pickups and dropoffs by zone are not likely to change much from year to year, especially within a given month. Let's see how similar the pickup and dropoff patterns are in January 2016 and 2017:" - ] - }, - { - "cell_type": "code", - "execution_count": 12, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "0.07718639422194801\n", - "0.06962159908563305\n" - ] - } - ], - "source": [ - "print(taxi2016['DOLocationID'].value_counts().corr(taxi2017['DOLocationID'].value_counts()))\n", - "print(taxi2016['PULocationID'].value_counts().corr(taxi2017['PULocationID'].value_counts()))" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "tags": [] - }, - "source": [ - "## Bringing Them All Together\n", - "If you wanted to include this as part of a larger clean up of Taxi data, you'd then concatenate this dataframe into a `dask_cudf` dataframe and delete its `cuDF` version, or convert it into arrow memory format and process it similar to how we did in the mortgage notebook. For now, as we are only working on a couple of GBs, we'll concatenate in cuDF." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### A little Data checking and cleaning before merging..." 
- ] - }, - { - "cell_type": "code", - "execution_count": 13, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "VendorID int64\n", - "tpep_pickup_datetime datetime64[us]\n", - "tpep_dropoff_datetime datetime64[us]\n", - "passenger_count int64\n", - "trip_distance float64\n", - "RatecodeID int64\n", - "store_and_fwd_flag object\n", - "PULocationID int64\n", - "DOLocationID int64\n", - "payment_type int64\n", - "fare_amount float64\n", - "extra float64\n", - "mta_tax float64\n", - "tip_amount float64\n", - "tolls_amount float64\n", - "improvement_surcharge float64\n", - "total_amount float64\n", - "congestion_surcharge object\n", - "airport_fee object\n", - "dtype: object" - ] - }, - "execution_count": 13, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "taxi2017.dtypes" - ] - }, - { - "cell_type": "code", - "execution_count": 14, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "VendorID int64\n", - "tpep_pickup_datetime object\n", - "tpep_dropoff_datetime object\n", - "passenger_count int64\n", - "trip_distance float64\n", - "pickup_longitude float64\n", - "pickup_latitude float64\n", - "RatecodeID int64\n", - "store_and_fwd_flag object\n", - "dropoff_longitude float64\n", - "dropoff_latitude float64\n", - "payment_type int64\n", - "fare_amount float64\n", - "extra float64\n", - "mta_tax float64\n", - "tip_amount float64\n", - "tolls_amount float64\n", - "improvement_surcharge float64\n", - "total_amount float64\n", - "PULocationID int64\n", - "DOLocationID int64\n", - "dtype: object" - ] - }, - "execution_count": 14, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "taxi2016.dtypes" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Seems that we have type differences in `tpep_pickup_datetime` and `tpep_dropoff_datetime` to fix" - ] - }, - { - "cell_type": "code", - "execution_count": 15, - "metadata": {}, - "outputs": [], - "source": [ - "taxi2016[\"tpep_pickup_datetime\"]=taxi2016[\"tpep_pickup_datetime\"].astype(\"datetime64[us]\")\n", - "taxi2016[\"tpep_dropoff_datetime\"]=taxi2016[\"tpep_dropoff_datetime\"].astype(\"datetime64[us]\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Now we can safely concat" - ] - }, - { - "cell_type": "code", - "execution_count": 16, - "metadata": {}, - "outputs": [], - "source": [ - "df = cudf.concat([taxi2017, taxi2016])" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Final Check\n", - "Now to test to see if both years are present as expected." - ] - }, - { - "cell_type": "code", - "execution_count": 17, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - " VendorID tpep_pickup_datetime tpep_dropoff_datetime passenger_count \\\n", - "4487880 1 2017-01-15 17:33:37 2017-01-15 17:35:32 1 \n", - "\n", - " trip_distance RatecodeID store_and_fwd_flag PULocationID \\\n", - "4487880 0.0 5 N 204 \n", - "\n", - " DOLocationID payment_type ... tip_amount tolls_amount \\\n", - "4487880 204 1 ... 
10.0 0.0 \n", - "\n", - " improvement_surcharge total_amount congestion_surcharge \\\n", - "4487880 0.3 103.22 \n", - "\n", - " airport_fee pickup_longitude pickup_latitude dropoff_longitude \\\n", - "4487880 \n", - "\n", - " dropoff_latitude \n", - "4487880 \n", - "\n", - "[1 rows x 23 columns]\n" - ] - } - ], - "source": [ - "print(df.query('PULocationID == 204'))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "As you can see, 2017 values lack longitude and latitude. It is trivial and fast using our existing `DataFrame` to compute the mean of each `PULocationID` and `DOLocationID`s. We could inject those into the missing values to easily see how well the new LocationIDs map to pickup and dropoff locations." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Back To Your Workflow\n", - "So now you've seen how to use cuSpatial to clean and correlate your spatial data using the NYC taxi data. You can now perform multi year analytics across the entire range of taxi datasets using your favorite RAPIDS libraries," - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.11.9" - } - }, - "nbformat": 4, - "nbformat_minor": 4 -} diff --git a/python/cuspatial/cuspatial/_lib/distance.pyx b/python/cuspatial/cuspatial/_lib/distance.pyx index 4864970cc..6b4b6073f 100644 --- a/python/cuspatial/cuspatial/_lib/distance.pyx +++ b/python/cuspatial/cuspatial/_lib/distance.pyx @@ -1,10 +1,10 @@ -# Copyright (c) 2022-2024, NVIDIA CORPORATION. +# Copyright (c) 2022-2025, NVIDIA CORPORATION. 
from libcpp.memory cimport make_shared, shared_ptr, unique_ptr from libcpp.utility cimport move, pair -from cudf._lib.column cimport Column -from pylibcudf cimport Table as plc_Table +from cudf.core.column.column import Column +from pylibcudf cimport Column as plc_Column, Table as plc_Table from pylibcudf.libcudf.column.column cimport column from pylibcudf.libcudf.column.column_view cimport column_view from pylibcudf.libcudf.table.table_view cimport table_view @@ -26,7 +26,12 @@ from cuspatial._lib.cpp.types cimport collection_type_id, geometry_type_id from cuspatial._lib.types cimport collection_type_py_to_c -cpdef haversine_distance(Column x1, Column y1, Column x2, Column y2): +cpdef haversine_distance( + plc_Column x1, + plc_Column y1, + plc_Column x2, + plc_Column y2, +): cdef column_view c_x1 = x1.view() cdef column_view c_y1 = y1.view() cdef column_view c_x2 = x2.view() @@ -37,13 +42,13 @@ cpdef haversine_distance(Column x1, Column y1, Column x2, Column y2): with nogil: c_result = move(cpp_haversine_distance(c_x1, c_y1, c_x2, c_y2)) - return Column.from_unique_ptr(move(c_result)) + return Column.from_pylibcudf(plc_Column.from_libcudf(move(c_result))) def directed_hausdorff_distance( - Column xs, - Column ys, - Column space_offsets, + plc_Column xs, + plc_Column ys, + plc_Column space_offsets, ): cdef column_view c_xs = xs.view() cdef column_view c_ys = ys.view() @@ -60,8 +65,9 @@ def directed_hausdorff_distance( ) ) - owner_col = Column.from_unique_ptr( - move(result.first), data_ptr_exposed=True + owner_col = Column.from_pylibcudf( + plc_Column.from_libcudf(move(result.first)), + data_ptr_exposed=True ) cdef plc_Table plc_owner_table = plc_Table( [owner_col.to_pylibcudf(mode="read")] * result.second.num_columns() @@ -75,8 +81,8 @@ def directed_hausdorff_distance( def pairwise_point_distance( lhs_point_collection_type, rhs_point_collection_type, - Column points1, - Column points2, + plc_Column points1, + plc_Column points2, ): cdef collection_type_id lhs_point_multi_type = collection_type_py_to_c( lhs_point_collection_type @@ -102,12 +108,12 @@ def pairwise_point_distance( c_multipoints_lhs.get()[0], c_multipoints_rhs.get()[0], )) - return Column.from_unique_ptr(move(c_result)) + return Column.from_pylibcudf(plc_Column.from_libcudf(move(c_result))) def pairwise_linestring_distance( - Column multilinestrings1, - Column multilinestrings2 + plc_Column multilinestrings1, + plc_Column multilinestrings2 ): cdef shared_ptr[geometry_column_view] c_multilinestring_lhs = \ make_shared[geometry_column_view]( @@ -128,13 +134,13 @@ def pairwise_linestring_distance( c_multilinestring_rhs.get()[0], )) - return Column.from_unique_ptr(move(c_result)) + return Column.from_pylibcudf(plc_Column.from_libcudf(move(c_result))) def pairwise_point_linestring_distance( point_collection_type, - Column points, - Column linestrings, + plc_Column points, + plc_Column linestrings, ): cdef collection_type_id points_multi_type = collection_type_py_to_c( point_collection_type @@ -158,13 +164,13 @@ def pairwise_point_linestring_distance( c_multilinestrings.get()[0], )) - return Column.from_unique_ptr(move(c_result)) + return Column.from_pylibcudf(plc_Column.from_libcudf(move(c_result))) def pairwise_point_polygon_distance( point_collection_type, - Column multipoints, - Column multipolygons + plc_Column multipoints, + plc_Column multipolygons ): cdef collection_type_id points_multi_type = collection_type_py_to_c( point_collection_type @@ -189,12 +195,12 @@ def pairwise_point_polygon_distance( c_multipoints.get()[0], 
c_multipolygons.get()[0] )) - return Column.from_unique_ptr(move(c_result)) + return Column.from_pylibcudf(plc_Column.from_libcudf(move(c_result))) def pairwise_linestring_polygon_distance( - Column multilinestrings, - Column multipolygons + plc_Column multilinestrings, + plc_Column multipolygons ): cdef shared_ptr[geometry_column_view] c_multilinestrings = \ make_shared[geometry_column_view]( @@ -215,10 +221,10 @@ def pairwise_linestring_polygon_distance( c_multilinestrings.get()[0], c_multipolygons.get()[0] )) - return Column.from_unique_ptr(move(c_result)) + return Column.from_pylibcudf(plc_Column.from_libcudf(move(c_result))) -def pairwise_polygon_distance(Column lhs, Column rhs): +def pairwise_polygon_distance(plc_Column lhs, plc_Column rhs): cdef shared_ptr[geometry_column_view] c_lhs = \ make_shared[geometry_column_view]( lhs.view(), @@ -238,4 +244,4 @@ def pairwise_polygon_distance(Column lhs, Column rhs): c_lhs.get()[0], c_rhs.get()[0] )) - return Column.from_unique_ptr(move(c_result)) + return Column.from_pylibcudf(plc_Column.from_libcudf(move(c_result))) diff --git a/python/cuspatial/cuspatial/_lib/intersection.pyx b/python/cuspatial/cuspatial/_lib/intersection.pyx index 1e8fe9cc2..9b99cb46a 100644 --- a/python/cuspatial/cuspatial/_lib/intersection.pyx +++ b/python/cuspatial/cuspatial/_lib/intersection.pyx @@ -1,9 +1,10 @@ -# Copyright (c) 2023, NVIDIA CORPORATION. +# Copyright (c) 2023-2025, NVIDIA CORPORATION. from libcpp.memory cimport make_shared, shared_ptr from libcpp.utility cimport move -from cudf._lib.column cimport Column +from cudf.core.column.column import Column +from pylibcudf cimport Column as plc_Column from cuspatial._lib.types import CollectionType, GeometryType @@ -21,7 +22,7 @@ from cuspatial._lib.types cimport ( ) -def pairwise_linestring_intersection(Column lhs, Column rhs): +def pairwise_linestring_intersection(plc_Column lhs, plc_Column rhs): """ Compute the intersection of two (multi)linestrings. 
""" @@ -51,22 +52,34 @@ def pairwise_linestring_intersection(Column lhs, Column rhs): c_lhs.get()[0], c_rhs.get()[0] )) - geometry_collection_offset = Column.from_unique_ptr( - move(c_result.geometry_collection_offset) + geometry_collection_offset = Column.from_pylibcudf( + plc_Column.from_libcudf(move(c_result.geometry_collection_offset)) ) - types_buffer = Column.from_unique_ptr(move(c_result.types_buffer)) - offset_buffer = Column.from_unique_ptr(move(c_result.offset_buffer)) - points = Column.from_unique_ptr(move(c_result.points)) - segments = Column.from_unique_ptr(move(c_result.segments)) - lhs_linestring_id = Column.from_unique_ptr( - move(c_result.lhs_linestring_id) + types_buffer = Column.from_pylibcudf( + plc_Column.from_libcudf(move(c_result.types_buffer)) ) - lhs_segment_id = Column.from_unique_ptr(move(c_result.lhs_segment_id)) - rhs_linestring_id = Column.from_unique_ptr( - move(c_result.rhs_linestring_id) + offset_buffer = Column.from_pylibcudf( + plc_Column.from_libcudf(move(c_result.offset_buffer)) + ) + points = Column.from_pylibcudf( + plc_Column.from_libcudf(move(c_result.points)) + ) + segments = Column.from_pylibcudf( + plc_Column.from_libcudf(move(c_result.segments)) + ) + lhs_linestring_id = Column.from_pylibcudf( + plc_Column.from_libcudf(move(c_result.lhs_linestring_id)) + ) + lhs_segment_id = Column.from_pylibcudf( + plc_Column.from_libcudf(move(c_result.lhs_segment_id)) + ) + rhs_linestring_id = Column.from_pylibcudf( + plc_Column.from_libcudf(move(c_result.rhs_linestring_id)) + ) + rhs_segment_id = Column.from_pylibcudf( + plc_Column.from_libcudf(move(c_result.rhs_segment_id)) ) - rhs_segment_id = Column.from_unique_ptr(move(c_result.rhs_segment_id)) # Map linestring type codes from libcuspatial to cuspatial types_buffer[types_buffer == GeometryType.LINESTRING.value] = ( diff --git a/python/cuspatial/cuspatial/_lib/linestring_bounding_boxes.pyx b/python/cuspatial/cuspatial/_lib/linestring_bounding_boxes.pyx index 9f773f4d7..e5bc8e983 100644 --- a/python/cuspatial/cuspatial/_lib/linestring_bounding_boxes.pyx +++ b/python/cuspatial/cuspatial/_lib/linestring_bounding_boxes.pyx @@ -1,10 +1,10 @@ -# Copyright (c) 2020-2024, NVIDIA CORPORATION. +# Copyright (c) 2020-2025, NVIDIA CORPORATION. from libcpp.memory cimport unique_ptr from libcpp.utility cimport move -from cudf._lib.column cimport Column -from pylibcudf cimport Table as plc_Table +from cudf.core.column.column import Column +from pylibcudf cimport Column as plc_Column, Table as plc_Table from pylibcudf.libcudf.column.column_view cimport column_view from pylibcudf.libcudf.table.table cimport table @@ -13,8 +13,8 @@ from cuspatial._lib.cpp.linestring_bounding_boxes cimport ( ) -cpdef linestring_bounding_boxes(Column poly_offsets, - Column x, Column y, +cpdef linestring_bounding_boxes(plc_Column poly_offsets, + plc_Column x, plc_Column y, double R): cdef column_view c_poly_offsets = poly_offsets.view() cdef column_view c_x = x.view() diff --git a/python/cuspatial/cuspatial/_lib/nearest_points.pyx b/python/cuspatial/cuspatial/_lib/nearest_points.pyx index 6f64dba13..82f3931c5 100644 --- a/python/cuspatial/cuspatial/_lib/nearest_points.pyx +++ b/python/cuspatial/cuspatial/_lib/nearest_points.pyx @@ -1,8 +1,9 @@ -# Copyright (c) 2022-2024, NVIDIA CORPORATION. +# Copyright (c) 2022-2025, NVIDIA CORPORATION. 
from libcpp.utility cimport move -from cudf._lib.column cimport Column +from cudf.core.column.column import Column +from pylibcudf cimport Column as plc_Column from pylibcudf.libcudf.column.column_view cimport column_view from cuspatial._lib.cpp.nearest_points cimport ( @@ -14,9 +15,9 @@ from cuspatial._lib.utils cimport unwrap_pyoptcol def pairwise_point_linestring_nearest_points( - Column points_xy, - Column linestring_part_offsets, - Column linestring_points_xy, + plc_Column points_xy, + plc_Column linestring_part_offsets, + plc_Column linestring_points_xy, multipoint_geometry_offset=None, multilinestring_geometry_offset=None, ): @@ -41,17 +42,22 @@ def pairwise_point_linestring_nearest_points( multipoint_geometry_id = None if multipoint_geometry_offset is not None: - multipoint_geometry_id = Column.from_unique_ptr( - move(c_result.nearest_point_geometry_id.value())) + multipoint_geometry_id = Column.from_pylibcudf(plc_Column.from_libcudf( + move(c_result.nearest_point_geometry_id.value()))) multilinestring_geometry_id = None if multilinestring_geometry_offset is not None: - multilinestring_geometry_id = Column.from_unique_ptr( - move(c_result.nearest_linestring_geometry_id.value())) + multilinestring_geometry_id = Column.from_pylibcudf( + plc_Column.from_libcudf( + move(c_result.nearest_linestring_geometry_id.value()) + ) + ) - segment_id = Column.from_unique_ptr(move(c_result.nearest_segment_id)) - point_on_linestring_xy = Column.from_unique_ptr( - move(c_result.nearest_point_on_linestring_xy)) + segment_id = Column.from_pylibcudf( + plc_Column.from_libcudf(move(c_result.nearest_segment_id)) + ) + point_on_linestring_xy = Column.from_pylibcudf(plc_Column.from_libcudf( + move(c_result.nearest_point_on_linestring_xy))) return ( multipoint_geometry_id, diff --git a/python/cuspatial/cuspatial/_lib/pairwise_multipoint_equals_count.pyx b/python/cuspatial/cuspatial/_lib/pairwise_multipoint_equals_count.pyx index cca06afcc..b23eab808 100644 --- a/python/cuspatial/cuspatial/_lib/pairwise_multipoint_equals_count.pyx +++ b/python/cuspatial/cuspatial/_lib/pairwise_multipoint_equals_count.pyx @@ -1,9 +1,10 @@ -# Copyright (c) 2023-2024, NVIDIA CORPORATION. +# Copyright (c) 2023-2025, NVIDIA CORPORATION. from libcpp.memory cimport make_shared, shared_ptr, unique_ptr from libcpp.utility cimport move -from cudf._lib.column cimport Column +from cudf.core.column.column import Column +from pylibcudf cimport Column as plc_Column from pylibcudf.libcudf.column.column cimport column from cuspatial._lib.cpp.column.geometry_column_view cimport ( @@ -16,8 +17,8 @@ from cuspatial._lib.cpp.types cimport collection_type_id, geometry_type_id def pairwise_multipoint_equals_count( - Column _lhs, - Column _rhs, + plc_Column _lhs, + plc_Column _rhs, ): cdef shared_ptr[geometry_column_view] lhs = \ make_shared[geometry_column_view]( @@ -41,4 +42,4 @@ def pairwise_multipoint_equals_count( ) ) - return Column.from_unique_ptr(move(result)) + return Column.from_pylibcudf(plc_Column.from_libcudf(move(result))) diff --git a/python/cuspatial/cuspatial/_lib/pairwise_point_in_polygon.pyx b/python/cuspatial/cuspatial/_lib/pairwise_point_in_polygon.pyx index 84b442823..2d8941692 100644 --- a/python/cuspatial/cuspatial/_lib/pairwise_point_in_polygon.pyx +++ b/python/cuspatial/cuspatial/_lib/pairwise_point_in_polygon.pyx @@ -1,9 +1,10 @@ -# Copyright (c) 2022-2024, NVIDIA CORPORATION. +# Copyright (c) 2022-2025, NVIDIA CORPORATION. 
from libcpp.memory cimport unique_ptr from libcpp.utility cimport move -from cudf._lib.column cimport Column +from cudf.core.column.column import Column +from pylibcudf cimport Column as plc_Column from pylibcudf.libcudf.column.column cimport column from pylibcudf.libcudf.column.column_view cimport column_view @@ -13,12 +14,12 @@ from cuspatial._lib.cpp.pairwise_point_in_polygon cimport ( def pairwise_point_in_polygon( - Column test_points_x, - Column test_points_y, - Column poly_offsets, - Column poly_ring_offsets, - Column poly_points_x, - Column poly_points_y + plc_Column test_points_x, + plc_Column test_points_y, + plc_Column poly_offsets, + plc_Column poly_ring_offsets, + plc_Column poly_points_x, + plc_Column poly_points_y ): cdef column_view c_test_points_x = test_points_x.view() cdef column_view c_test_points_y = test_points_y.view() @@ -41,4 +42,4 @@ def pairwise_point_in_polygon( ) ) - return Column.from_unique_ptr(move(result)) + return Column.from_pylibcudf(plc_Column.from_libcudf(move(result))) diff --git a/python/cuspatial/cuspatial/_lib/point_in_polygon.pyx b/python/cuspatial/cuspatial/_lib/point_in_polygon.pyx index 833e87cc3..52e89d1e3 100644 --- a/python/cuspatial/cuspatial/_lib/point_in_polygon.pyx +++ b/python/cuspatial/cuspatial/_lib/point_in_polygon.pyx @@ -1,9 +1,10 @@ -# Copyright (c) 2020-2024, NVIDIA CORPORATION. +# Copyright (c) 2020-2025, NVIDIA CORPORATION. from libcpp.memory cimport unique_ptr from libcpp.utility cimport move -from cudf._lib.column cimport Column +from cudf.core.column.column import Column +from pylibcudf cimport Column as plc_Column from pylibcudf.libcudf.column.column cimport column from pylibcudf.libcudf.column.column_view cimport column_view @@ -13,12 +14,12 @@ from cuspatial._lib.cpp.point_in_polygon cimport ( def point_in_polygon( - Column test_points_x, - Column test_points_y, - Column poly_offsets, - Column poly_ring_offsets, - Column poly_points_x, - Column poly_points_y + plc_Column test_points_x, + plc_Column test_points_y, + plc_Column poly_offsets, + plc_Column poly_ring_offsets, + plc_Column poly_points_x, + plc_Column poly_points_y ): cdef column_view c_test_points_x = test_points_x.view() cdef column_view c_test_points_y = test_points_y.view() @@ -41,4 +42,4 @@ def point_in_polygon( ) ) - return Column.from_unique_ptr(move(result)) + return Column.from_pylibcudf(plc_Column.from_libcudf(move(result))) diff --git a/python/cuspatial/cuspatial/_lib/points_in_range.pyx b/python/cuspatial/cuspatial/_lib/points_in_range.pyx index e8ffbddb3..181525eae 100644 --- a/python/cuspatial/cuspatial/_lib/points_in_range.pyx +++ b/python/cuspatial/cuspatial/_lib/points_in_range.pyx @@ -1,10 +1,10 @@ -# Copyright (c) 2020-2024, NVIDIA CORPORATION. +# Copyright (c) 2020-2025, NVIDIA CORPORATION. 
from libcpp.memory cimport unique_ptr from libcpp.utility cimport move -from cudf._lib.column cimport Column -from pylibcudf cimport Table as plc_Table +from cudf.core.column.column import Column +from pylibcudf cimport Column as plc_Column, Table as plc_Table from pylibcudf.libcudf.column.column_view cimport column_view from pylibcudf.libcudf.table.table cimport table @@ -18,8 +18,8 @@ cpdef points_in_range( double range_max_x, double range_min_y, double range_max_y, - Column x, - Column y + plc_Column x, + plc_Column y ): cdef column_view x_v = x.view() cdef column_view y_v = y.view() diff --git a/python/cuspatial/cuspatial/_lib/polygon_bounding_boxes.pyx b/python/cuspatial/cuspatial/_lib/polygon_bounding_boxes.pyx index 38b42cc0f..d3bdf3422 100644 --- a/python/cuspatial/cuspatial/_lib/polygon_bounding_boxes.pyx +++ b/python/cuspatial/cuspatial/_lib/polygon_bounding_boxes.pyx @@ -1,10 +1,10 @@ -# Copyright (c) 2020-2024, NVIDIA CORPORATION. +# Copyright (c) 2020-2025, NVIDIA CORPORATION. from libcpp.memory cimport unique_ptr from libcpp.utility cimport move -from cudf._lib.column cimport Column -from pylibcudf cimport Table as plc_Table +from cudf.core.column.column import Column +from pylibcudf cimport Column as plc_Column, Table as plc_Table from pylibcudf.libcudf.column.column_view cimport column_view from pylibcudf.libcudf.table.table cimport table @@ -13,9 +13,9 @@ from cuspatial._lib.cpp.polygon_bounding_boxes cimport ( ) -cpdef polygon_bounding_boxes(Column poly_offsets, - Column ring_offsets, - Column x, Column y): +cpdef polygon_bounding_boxes(plc_Column poly_offsets, + plc_Column ring_offsets, + plc_Column x, plc_Column y): cdef column_view c_poly_offsets = poly_offsets.view() cdef column_view c_ring_offsets = ring_offsets.view() cdef column_view c_x = x.view() diff --git a/python/cuspatial/cuspatial/_lib/quadtree.pyx b/python/cuspatial/cuspatial/_lib/quadtree.pyx index e43d25320..0fd6e9da4 100644 --- a/python/cuspatial/cuspatial/_lib/quadtree.pyx +++ b/python/cuspatial/cuspatial/_lib/quadtree.pyx @@ -1,12 +1,12 @@ -# Copyright (c) 2020-2024, NVIDIA CORPORATION. +# Copyright (c) 2020-2025, NVIDIA CORPORATION. from libc.stdint cimport int8_t from libcpp.memory cimport unique_ptr from libcpp.pair cimport pair from libcpp.utility cimport move -from cudf._lib.column cimport Column -from pylibcudf cimport Table as plc_Table +from cudf.core.column.column import Column +from pylibcudf cimport Column as plc_Column, Table as plc_Table from pylibcudf.libcudf.column.column cimport column from pylibcudf.libcudf.column.column_view cimport column_view from pylibcudf.libcudf.table.table cimport table @@ -17,7 +17,7 @@ from cuspatial._lib.cpp.quadtree cimport ( ) -cpdef quadtree_on_points(Column x, Column y, +cpdef quadtree_on_points(plc_Column x, plc_Column y, double x_min, double x_max, double y_min, double y_max, double scale, @@ -39,7 +39,7 @@ cpdef quadtree_on_points(Column x, Column y, "offset" ] return ( - Column.from_unique_ptr(move(result.first)), + Column.from_pylibcudf(plc_Column.from_libcudf(move(result.first))), ( { name: Column.from_pylibcudf(col) diff --git a/python/cuspatial/cuspatial/_lib/spatial.pyx b/python/cuspatial/cuspatial/_lib/spatial.pyx index 0afe2484e..473d6408d 100644 --- a/python/cuspatial/cuspatial/_lib/spatial.pyx +++ b/python/cuspatial/cuspatial/_lib/spatial.pyx @@ -1,10 +1,11 @@ -# Copyright (c) 2019-2024, NVIDIA CORPORATION. +# Copyright (c) 2019-2025, NVIDIA CORPORATION. 
from libcpp.memory cimport unique_ptr from libcpp.pair cimport pair from libcpp.utility cimport move -from cudf._lib.column cimport Column +from cudf.core.column.column import Column +from pylibcudf cimport Column as plc_Column from pylibcudf.libcudf.column.column cimport column from pylibcudf.libcudf.column.column_view cimport column_view @@ -16,8 +17,8 @@ from cuspatial._lib.cpp.projection cimport ( def sinusoidal_projection( double origin_lon, double origin_lat, - Column input_lon, - Column input_lat + plc_Column input_lon, + plc_Column input_lat ): cdef column_view c_input_lon = input_lon.view() cdef column_view c_input_lat = input_lat.view() @@ -34,5 +35,7 @@ def sinusoidal_projection( ) ) - return (Column.from_unique_ptr(move(result.first)), - Column.from_unique_ptr(move(result.second))) + return ( + Column.from_pylibcudf(plc_Column.from_libcudf(move(result.first))), + Column.from_pylibcudf(plc_Column.from_libcudf(move(result.second))) + ) diff --git a/python/cuspatial/cuspatial/_lib/spatial_join.pyx b/python/cuspatial/cuspatial/_lib/spatial_join.pyx index e54adca0c..61168a10b 100644 --- a/python/cuspatial/cuspatial/_lib/spatial_join.pyx +++ b/python/cuspatial/cuspatial/_lib/spatial_join.pyx @@ -1,11 +1,11 @@ -# Copyright (c) 2020-2024, NVIDIA CORPORATION. +# Copyright (c) 2020-2025, NVIDIA CORPORATION. from libc.stdint cimport int8_t from libcpp.memory cimport unique_ptr from libcpp.utility cimport move -from cudf._lib.column cimport Column -from pylibcudf cimport Table as plc_Table +from cudf.core.column.column import Column +from pylibcudf cimport Column as plc_Column, Table as plc_Table from pylibcudf.libcudf.column.column_view cimport column_view from pylibcudf.libcudf.table.table cimport table, table_view @@ -53,13 +53,13 @@ cpdef join_quadtree_and_bounding_boxes(object quadtree, cpdef quadtree_point_in_polygon(object poly_quad_pairs, object quadtree, - Column point_indices, - Column points_x, - Column points_y, - Column poly_offsets, - Column ring_offsets, - Column poly_points_x, - Column poly_points_y): + plc_Column point_indices, + plc_Column points_x, + plc_Column points_y, + plc_Column poly_offsets, + plc_Column ring_offsets, + plc_Column poly_points_x, + plc_Column poly_points_y): cdef plc_Table plc_poly_quad_pairs = plc_Table( [col.to_pylibcudf(mode="read") for col in poly_quad_pairs._columns] ) @@ -102,12 +102,12 @@ cpdef quadtree_point_in_polygon(object poly_quad_pairs, cpdef quadtree_point_to_nearest_linestring(object linestring_quad_pairs, object quadtree, - Column point_indices, - Column points_x, - Column points_y, - Column linestring_offsets, - Column linestring_points_x, - Column linestring_points_y): + plc_Column point_indices, + plc_Column points_x, + plc_Column points_y, + plc_Column linestring_offsets, + plc_Column linestring_points_x, + plc_Column linestring_points_y): cdef plc_Table plc_quad_pairs = plc_Table( [ col.to_pylibcudf(mode="read") diff --git a/python/cuspatial/cuspatial/_lib/trajectory.pyx b/python/cuspatial/cuspatial/_lib/trajectory.pyx index dc92a508b..0b4f28ba6 100644 --- a/python/cuspatial/cuspatial/_lib/trajectory.pyx +++ b/python/cuspatial/cuspatial/_lib/trajectory.pyx @@ -1,11 +1,11 @@ -# Copyright (c) 2019-2024, NVIDIA CORPORATION. +# Copyright (c) 2019-2025, NVIDIA CORPORATION. 
from libcpp.memory cimport unique_ptr from libcpp.pair cimport pair from libcpp.utility cimport move -from cudf._lib.column cimport Column -from pylibcudf cimport Table as plc_Table +from cudf.core.column.column import Column +from pylibcudf cimport Column as plc_Column, Table as plc_Table from pylibcudf.libcudf.column.column cimport column from pylibcudf.libcudf.column.column_view cimport column_view from pylibcudf.libcudf.table.table cimport table @@ -18,8 +18,8 @@ from cuspatial._lib.cpp.trajectory cimport ( ) -cpdef derive_trajectories(Column object_id, Column x, - Column y, Column timestamp): +cpdef derive_trajectories(plc_Column object_id, plc_Column x, + plc_Column y, plc_Column timestamp): cdef column_view c_id = object_id.view() cdef column_view c_x = x.view() cdef column_view c_y = y.view() @@ -37,11 +37,18 @@ cpdef derive_trajectories(Column object_id, Column x, }, None ) - return first_result, Column.from_unique_ptr(move(result.second)) + return ( + first_result, + Column.from_pylibcudf(plc_Column.from_libcudf(move(result.second))) + ) -cpdef trajectory_bounding_boxes(size_type num_trajectories, - Column object_id, Column x, Column y): +cpdef trajectory_bounding_boxes( + size_type num_trajectories, + plc_Column object_id, + plc_Column x, + plc_Column y, +): cdef column_view c_id = object_id.view() cdef column_view c_x = x.view() cdef column_view c_y = y.view() @@ -63,8 +70,8 @@ cpdef trajectory_bounding_boxes(size_type num_trajectories, cpdef trajectory_distances_and_speeds(size_type num_trajectories, - Column object_id, Column x, - Column y, Column timestamp): + plc_Column object_id, plc_Column x, + plc_Column y, plc_Column timestamp): cdef column_view c_id = object_id.view() cdef column_view c_x = x.view() cdef column_view c_y = y.view() diff --git a/python/cuspatial/cuspatial/_lib/utils.pyx b/python/cuspatial/cuspatial/_lib/utils.pyx index 9bc0b01ab..00597c6ee 100644 --- a/python/cuspatial/cuspatial/_lib/utils.pyx +++ b/python/cuspatial/cuspatial/_lib/utils.pyx @@ -1,5 +1,5 @@ -# Copyright (c) 2022-2024, NVIDIA CORPORATION. -from cudf._lib.column cimport Column +# Copyright (c) 2022-2025, NVIDIA CORPORATION. +from pylibcudf cimport Column from pylibcudf.libcudf.column.column_view cimport column_view from cuspatial._lib.cpp.optional cimport nullopt, optional diff --git a/python/cuspatial/cuspatial/core/binops/distance_dispatch.py b/python/cuspatial/cuspatial/core/binops/distance_dispatch.py index d55822ab8..a4e9c0c17 100644 --- a/python/cuspatial/cuspatial/core/binops/distance_dispatch.py +++ b/python/cuspatial/cuspatial/core/binops/distance_dispatch.py @@ -1,4 +1,4 @@ -# Copyright (c) 2024, NVIDIA CORPORATION +# Copyright (c) 2024-2025, NVIDIA CORPORATION import cudf from cudf.core.column import as_column @@ -181,9 +181,17 @@ def __call__(self): (self._lhs_type, self._rhs_type) ] if reverse: - dist = func(*collection_types, self._rhs_column, self._lhs_column) + dist = func( + *collection_types, + self._rhs_column.to_pylibcudf(mode="read"), + self._lhs_column.to_pylibcudf(mode="read"), + ) else: - dist = func(*collection_types, self._lhs_column, self._rhs_column) + dist = func( + *collection_types, + self._lhs_column.to_pylibcudf(mode="read"), + self._rhs_column.to_pylibcudf(mode="read"), + ) # Rows with misaligned indices contains nan. Here we scatter the # distance values to the correct indices. 
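Note: every binding touched by this PR follows the same calling convention shown in the distance_dispatch.py hunk above: cuDF columns are handed to the Cython layer as pylibcudf Columns via .to_pylibcudf(mode="read"), and the Cython layer rebuilds cuDF columns from libcudf results via Column.from_pylibcudf(plc_Column.from_libcudf(...)). The following is a minimal caller-side sketch of that round trip, assuming a cuSpatial build containing the changes above; the coordinate values are illustrative only, not taken from this PR.

# Sketch of the to_pylibcudf / from_pylibcudf round trip introduced in this PR.
# Assumes cuspatial._lib.distance.haversine_distance accepts pylibcudf Columns,
# as changed in the distance.pyx hunk above; the sample coordinates are made up.
import cudf
from cudf.core.column import as_column

from cuspatial._lib.distance import haversine_distance

# Plain cuDF columns of lon/lat values (float64).
p1_lon = as_column([-74.0060, -73.9857])
p1_lat = as_column([40.7128, 40.7484])
p2_lon = as_column([-73.7781, -73.8740])
p2_lat = as_column([40.6413, 40.7769])

# Convert to pylibcudf Columns on the way into the binding ...
result = haversine_distance(
    p1_lon.to_pylibcudf(mode="read"),
    p1_lat.to_pylibcudf(mode="read"),
    p2_lon.to_pylibcudf(mode="read"),
    p2_lat.to_pylibcudf(mode="read"),
)

# ... and the binding returns a cudf Column (built internally with
# Column.from_pylibcudf(plc_Column.from_libcudf(...))), which can be wrapped
# directly in a cudf.Series.
print(cudf.Series._from_column(result))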
diff --git a/python/cuspatial/cuspatial/core/binops/equals_count.py b/python/cuspatial/cuspatial/core/binops/equals_count.py index b86358f48..b2d2dd1cd 100644 --- a/python/cuspatial/cuspatial/core/binops/equals_count.py +++ b/python/cuspatial/cuspatial/core/binops/equals_count.py @@ -1,4 +1,4 @@ -# Copyright (c) 2023-2024, NVIDIA CORPORATION. +# Copyright (c) 2023-2025, NVIDIA CORPORATION. import cudf @@ -72,8 +72,8 @@ def pairwise_multipoint_equals_count(lhs, rhs): if any(not contains_only_multipoints(s) for s in [lhs, rhs]): raise ValueError("Input GeoSeries must contain only multipoints.") - lhs_column = lhs._column.mpoints._column - rhs_column = rhs._column.mpoints._column + lhs_column = lhs._column.mpoints._column.to_pylibcudf(mode="read") + rhs_column = rhs._column.mpoints._column.to_pylibcudf(mode="read") result = c_pairwise_multipoint_equals_count(lhs_column, rhs_column) return cudf.Series._from_column(result) diff --git a/python/cuspatial/cuspatial/core/binops/intersection.py b/python/cuspatial/cuspatial/core/binops/intersection.py index 62eac4227..67e5ceed6 100644 --- a/python/cuspatial/cuspatial/core/binops/intersection.py +++ b/python/cuspatial/cuspatial/core/binops/intersection.py @@ -1,4 +1,4 @@ -# Copyright (c) 2023-2024, NVIDIA CORPORATION. +# Copyright (c) 2023-2025, NVIDIA CORPORATION. from typing import TYPE_CHECKING @@ -72,7 +72,8 @@ def pairwise_linestring_intersection( raise ValueError("Input GeoSeries must contain only linestrings.") geoms, look_back_ids = c_pairwise_linestring_intersection( - linestrings1.lines.column(), linestrings2.lines.column() + linestrings1.lines.column().to_pylibcudf(mode="read"), + linestrings2.lines.column().to_pylibcudf(mode="read"), ) ( diff --git a/python/cuspatial/cuspatial/core/binpreds/contains.py b/python/cuspatial/cuspatial/core/binpreds/contains.py index f73b2f159..ba330e98b 100644 --- a/python/cuspatial/cuspatial/core/binpreds/contains.py +++ b/python/cuspatial/cuspatial/core/binpreds/contains.py @@ -1,4 +1,4 @@ -# Copyright (c) 2022-2024, NVIDIA CORPORATION. +# Copyright (c) 2022-2025, NVIDIA CORPORATION. from math import ceil, sqrt @@ -97,12 +97,12 @@ def _brute_force_contains_properly(points, polygons): within its corresponding polygon. """ pip_result = cpp_byte_point_in_polygon( - as_column(points.points.x), - as_column(points.points.y), - as_column(polygons.polygons.part_offset), - as_column(polygons.polygons.ring_offset), - as_column(polygons.polygons.x), - as_column(polygons.polygons.y), + as_column(points.points.x).to_pylibcudf(mode="read"), + as_column(points.points.y).to_pylibcudf(mode="read"), + as_column(polygons.polygons.part_offset).to_pylibcudf(mode="read"), + as_column(polygons.polygons.ring_offset).to_pylibcudf(mode="read"), + as_column(polygons.polygons.x).to_pylibcudf(mode="read"), + as_column(polygons.polygons.y).to_pylibcudf(mode="read"), ) result = DataFrame( pip_bitmap_column_to_binary_array( @@ -144,12 +144,12 @@ def _pairwise_contains_properly(points, polygons): within its corresponding polygon. 
""" result_column = cpp_pairwise_point_in_polygon( - as_column(points.points.x), - as_column(points.points.y), - as_column(polygons.polygons.part_offset), - as_column(polygons.polygons.ring_offset), - as_column(polygons.polygons.x), - as_column(polygons.polygons.y), + as_column(points.points.x).to_pylibcudf(mode="read"), + as_column(points.points.y).to_pylibcudf(mode="read"), + as_column(polygons.polygons.part_offset).to_pylibcudf(mode="read"), + as_column(polygons.polygons.ring_offset).to_pylibcudf(mode="read"), + as_column(polygons.polygons.x).to_pylibcudf(mode="read"), + as_column(polygons.polygons.y).to_pylibcudf(mode="read"), ) # Pairwise returns a boolean column with a True value for each (polygon, # point) pair where the point is contained properly by the polygon. We can diff --git a/python/cuspatial/cuspatial/core/spatial/bounding.py b/python/cuspatial/cuspatial/core/spatial/bounding.py index 5ee813137..ab06bfd55 100644 --- a/python/cuspatial/cuspatial/core/spatial/bounding.py +++ b/python/cuspatial/cuspatial/core/spatial/bounding.py @@ -1,4 +1,4 @@ -# Copyright (c) 2022, NVIDIA CORPORATION. +# Copyright (c) 2022-2025, NVIDIA CORPORATION. from cudf import DataFrame from cudf.core.column import as_column @@ -67,10 +67,10 @@ def polygon_bounding_boxes(polygons: GeoSeries): zip( column_names, cpp_polygon_bounding_boxes( - as_column(poly_offsets), - as_column(ring_offsets), - as_column(x), - as_column(y), + as_column(poly_offsets).to_pylibcudf(mode="read"), + as_column(ring_offsets).to_pylibcudf(mode="read"), + as_column(x).to_pylibcudf(mode="read"), + as_column(y).to_pylibcudf(mode="read"), ), ) ) @@ -121,7 +121,10 @@ def linestring_bounding_boxes(linestrings: GeoSeries, expansion_radius: float): y = linestrings.lines.y results = cpp_linestring_bounding_boxes( - as_column(line_offsets), as_column(x), as_column(y), expansion_radius + as_column(line_offsets).to_pylibcudf(mode="read"), + as_column(x).to_pylibcudf(mode="read"), + as_column(y).to_pylibcudf(mode="read"), + expansion_radius, ) return DataFrame._from_data(dict(zip(column_names, results))) diff --git a/python/cuspatial/cuspatial/core/spatial/distance.py b/python/cuspatial/cuspatial/core/spatial/distance.py index 0a95baeba..23a74ec00 100644 --- a/python/cuspatial/cuspatial/core/spatial/distance.py +++ b/python/cuspatial/cuspatial/core/spatial/distance.py @@ -1,4 +1,4 @@ -# Copyright (c) 2022-2024, NVIDIA CORPORATION. +# Copyright (c) 2022-2025, NVIDIA CORPORATION. 
import cudf from cudf import DataFrame, Series @@ -86,9 +86,11 @@ def directed_hausdorff_distance(multipoints: GeoSeries): raise ValueError("Input must be a series of multipoints.") result = cpp_directed_hausdorff_distance( - multipoints.multipoints.x._column, - multipoints.multipoints.y._column, - as_column(multipoints.multipoints.geometry_offset[:-1]), + multipoints.multipoints.x._column.to_pylibcudf(mode="read"), + multipoints.multipoints.y._column.to_pylibcudf(mode="read"), + as_column(multipoints.multipoints.geometry_offset[:-1]).to_pylibcudf( + mode="read" + ), ) # the column label of each column in result should be its int position @@ -151,7 +153,14 @@ def haversine_distance(p1: GeoSeries, p2: GeoSeries): p2_lat = p2.points.y._column return cudf.Series._from_data( - {"None": cpp_haversine_distance(p1_lon, p1_lat, p2_lon, p2_lat)} + { + "None": cpp_haversine_distance( + p1_lon.to_pylibcudf(mode="read"), + p1_lat.to_pylibcudf(mode="read"), + p2_lon.to_pylibcudf(mode="read"), + p2_lat.to_pylibcudf(mode="read"), + ) + } ) @@ -221,8 +230,8 @@ def pairwise_point_distance(points1: GeoSeries, points2: GeoSeries): cpp_pairwise_point_distance( lhs_point_collection_type, rhs_point_collection_type, - lhs_column, - rhs_column, + lhs_column.to_pylibcudf(mode="read"), + rhs_column.to_pylibcudf(mode="read"), ) ) @@ -293,8 +302,8 @@ def pairwise_linestring_distance( return Series._from_column( cpp_pairwise_linestring_distance( - multilinestrings1.lines.column(), - multilinestrings2.lines.column(), + multilinestrings1.lines.column().to_pylibcudf(mode="read"), + multilinestrings2.lines.column().to_pylibcudf(mode="read"), ) ) @@ -413,8 +422,8 @@ def pairwise_point_linestring_distance( { None: c_pairwise_point_linestring_distance( point_collection_type, - point_column, - linestrings.lines.column(), + point_column.to_pylibcudf(mode="read"), + linestrings.lines.column().to_pylibcudf(mode="read"), ) } ) @@ -497,7 +506,9 @@ def pairwise_point_polygon_distance(points: GeoSeries, polygons: GeoSeries): return Series._from_data( { None: c_pairwise_point_polygon_distance( - point_collection_type, points_column, polygon_column + point_collection_type, + points_column.to_pylibcudf(mode="read"), + polygon_column.to_pylibcudf(mode="read"), ) } ) @@ -576,8 +587,8 @@ def pairwise_linestring_polygon_distance( if not contains_only_polygons(polygons): raise ValueError("`polygon` array must contain only polygons") - linestrings_column = linestrings.lines.column() - polygon_column = polygons.polygons.column() + linestrings_column = linestrings.lines.column().to_pylibcudf(mode="read") + polygon_column = polygons.polygons.column().to_pylibcudf(mode="read") return Series._from_column( c_pairwise_line_poly_dist(linestrings_column, polygon_column) @@ -649,8 +660,8 @@ def pairwise_polygon_distance(polygons1: GeoSeries, polygons2: GeoSeries): if not contains_only_polygons(polygons2): raise ValueError("`polygons2` array must contain only polygons") - polygon1_column = polygons1.polygons.column() - polygon2_column = polygons2.polygons.column() + polygon1_column = polygons1.polygons.column().to_pylibcudf(mode="read") + polygon2_column = polygons2.polygons.column().to_pylibcudf(mode="read") return Series._from_data( {None: c_pairwise_polygon_distance(polygon1_column, polygon2_column)} diff --git a/python/cuspatial/cuspatial/core/spatial/filtering.py b/python/cuspatial/cuspatial/core/spatial/filtering.py index 615784fee..2e10af0ee 100644 --- a/python/cuspatial/cuspatial/core/spatial/filtering.py +++ 
b/python/cuspatial/cuspatial/core/spatial/filtering.py @@ -1,4 +1,4 @@ -# Copyright (c) 2022-2023, NVIDIA CORPORATION. +# Copyright (c) 2022-2025, NVIDIA CORPORATION. from cudf import DataFrame from cudf.core.column import as_column @@ -48,8 +48,8 @@ def points_in_spatial_window(points: GeoSeries, min_x, max_x, min_y, max_y): if not contains_only_points(points): raise ValueError("GeoSeries must contain only points.") - xs = as_column(points.points.x) - ys = as_column(points.points.y) + xs = as_column(points.points.x).to_pylibcudf(mode="read") + ys = as_column(points.points.y).to_pylibcudf(mode="read") res_xy = DataFrame._from_data( *points_in_range.points_in_range(min_x, max_x, min_y, max_y, xs, ys) diff --git a/python/cuspatial/cuspatial/core/spatial/indexing.py b/python/cuspatial/cuspatial/core/spatial/indexing.py index a7116d435..9bab115c0 100644 --- a/python/cuspatial/cuspatial/core/spatial/indexing.py +++ b/python/cuspatial/cuspatial/core/spatial/indexing.py @@ -1,4 +1,4 @@ -# Copyright (c) 2022-2024, NVIDIA CORPORATION. +# Copyright (c) 2022-2025, NVIDIA CORPORATION. import warnings @@ -159,8 +159,8 @@ def quadtree_on_points( if not len(points) == 0 and not contains_only_points(points): raise ValueError("GeoSeries must contain only points.") - xs = as_column(points.points.x) - ys = as_column(points.points.y) + xs = as_column(points.points.x).to_pylibcudf(mode="read") + ys = as_column(points.points.y).to_pylibcudf(mode="read") x_min, x_max, y_min, y_max = ( min(x_min, x_max), diff --git a/python/cuspatial/cuspatial/core/spatial/join.py b/python/cuspatial/cuspatial/core/spatial/join.py index fdaeb8f4a..f9323d367 100644 --- a/python/cuspatial/cuspatial/core/spatial/join.py +++ b/python/cuspatial/cuspatial/core/spatial/join.py @@ -1,4 +1,4 @@ -# Copyright (c) 2022-2024, NVIDIA CORPORATION. +# Copyright (c) 2022-2025, NVIDIA CORPORATION. import warnings @@ -71,15 +71,19 @@ def point_in_polygon(points: GeoSeries, polygons: GeoSeries): ): raise ValueError("GeoSeries cannot contain multipolygon.") - x = as_column(points.points.x) - y = as_column(points.points.y) + x = as_column(points.points.x).to_pylibcudf(mode="read") + y = as_column(points.points.y).to_pylibcudf(mode="read") poly_offsets = as_column(polygons.polygons.part_offset) - ring_offsets = as_column(polygons.polygons.ring_offset) - px = as_column(polygons.polygons.x) - py = as_column(polygons.polygons.y) + ring_offsets = as_column(polygons.polygons.ring_offset).to_pylibcudf( + mode="read" + ) + px = as_column(polygons.polygons.x).to_pylibcudf(mode="read") + py = as_column(polygons.polygons.y).to_pylibcudf(mode="read") - result = cpp_point_in_polygon(x, y, poly_offsets, ring_offsets, px, py) + result = cpp_point_in_polygon( + x, y, poly_offsets.to_pylibcudf(mode="read"), ring_offsets, px, py + ) result = DataFrame( pip_bitmap_column_to_binary_array( polygon_bitmap_column=result, width=len(poly_offsets) - 1 @@ -215,20 +219,24 @@ def quadtree_point_in_polygon( "`polygons` Geoseries must contains only polygons geometries." 
) points_data = points.points - points_x = as_column(points_data.x) - points_y = as_column(points_data.y) + points_x = as_column(points_data.x).to_pylibcudf(mode="read") + points_y = as_column(points_data.y).to_pylibcudf(mode="read") polygon_data = polygons.polygons - poly_offsets = as_column(polygon_data.part_offset) - ring_offsets = as_column(polygon_data.ring_offset) - poly_points_x = as_column(polygon_data.x) - poly_points_y = as_column(polygon_data.y) + poly_offsets = as_column(polygon_data.part_offset).to_pylibcudf( + mode="read" + ) + ring_offsets = as_column(polygon_data.ring_offset).to_pylibcudf( + mode="read" + ) + poly_points_x = as_column(polygon_data.x).to_pylibcudf(mode="read") + poly_points_y = as_column(polygon_data.y).to_pylibcudf(mode="read") return DataFrame._from_data( *spatial_join.quadtree_point_in_polygon( poly_quad_pairs, quadtree, - point_indices._column, + point_indices._column.to_pylibcudf(mode="read"), points_x, points_y, poly_offsets, @@ -296,21 +304,27 @@ def quadtree_point_to_nearest_linestring( ): raise ValueError("GeoSeries cannot contain multilinestrings.") - points_x = as_column(points.points.x) - points_y = as_column(points.points.y) + points_x = as_column(points.points.x).to_pylibcudf(mode="read") + points_y = as_column(points.points.y).to_pylibcudf(mode="read") - linestring_points_x = as_column(linestrings.lines.x) - linestring_points_y = as_column(linestrings.lines.y) + linestring_points_x = as_column(linestrings.lines.x).to_pylibcudf( + mode="read" + ) + linestring_points_y = as_column(linestrings.lines.y).to_pylibcudf( + mode="read" + ) linestring_offsets = as_column(linestrings.lines.part_offset) return DataFrame._from_data( *spatial_join.quadtree_point_to_nearest_linestring( linestring_quad_pairs, quadtree, - as_column(point_indices, dtype="uint32"), + as_column(point_indices, dtype="uint32").to_pylibcudf(mode="read"), points_x, points_y, - as_column(linestring_offsets, dtype="uint32"), + as_column(linestring_offsets, dtype="uint32").to_pylibcudf( + mode="read" + ), linestring_points_x, linestring_points_y, ) diff --git a/python/cuspatial/cuspatial/core/spatial/nearest_points.py b/python/cuspatial/cuspatial/core/spatial/nearest_points.py index 289716f74..0aaf65b25 100644 --- a/python/cuspatial/cuspatial/core/spatial/nearest_points.py +++ b/python/cuspatial/cuspatial/core/spatial/nearest_points.py @@ -1,4 +1,4 @@ -# Copyright (c) 2024, NVIDIA CORPORATION. +# Copyright (c) 2024-2025, NVIDIA CORPORATION. 
import cudf from cudf.core.column import as_column @@ -81,7 +81,9 @@ def pairwise_point_linestring_nearest_points( points_geometry_offset = ( None if len(points.points.xy) > 0 - else as_column(points.multipoints.geometry_offset) + else as_column(points.multipoints.geometry_offset).to_pylibcudf( + mode="read" + ) ) ( @@ -90,11 +92,11 @@ def pairwise_point_linestring_nearest_points( segment_id, point_on_linestring_xy, ) = nearest_points.pairwise_point_linestring_nearest_points( - points_xy._column, - as_column(linestrings.lines.part_offset), - linestrings.lines.xy._column, + points_xy._column.to_pylibcudf(mode="read"), + as_column(linestrings.lines.part_offset).to_pylibcudf(mode="read"), + linestrings.lines.xy._column.to_pylibcudf(mode="read"), points_geometry_offset, - as_column(linestrings.lines.geometry_offset), + as_column(linestrings.lines.geometry_offset).to_pylibcudf(mode="read"), ) nearest_points_on_linestring = GeoColumn._from_points_xy( diff --git a/python/cuspatial/cuspatial/core/spatial/projection.py b/python/cuspatial/cuspatial/core/spatial/projection.py index 82ce8719e..6f3ccc595 100644 --- a/python/cuspatial/cuspatial/core/spatial/projection.py +++ b/python/cuspatial/cuspatial/core/spatial/projection.py @@ -1,4 +1,4 @@ -# Copyright (c) 2022, NVIDIA CORPORATION. +# Copyright (c) 2022-2025, NVIDIA CORPORATION. from cudf import DataFrame @@ -51,8 +51,8 @@ def sinusoidal_projection(origin_lon, origin_lat, lonlat: GeoSeries): result = cpp_sinusoidal_projection( origin_lon, origin_lat, - lonlat.points.x._column, - lonlat.points.y._column, + lonlat.points.x._column.to_pylibcudf(mode="read"), + lonlat.points.y._column.to_pylibcudf(mode="read"), ) lonlat_transformed = DataFrame( {"x": result[0], "y": result[1]} diff --git a/python/cuspatial/cuspatial/core/trajectory.py b/python/cuspatial/cuspatial/core/trajectory.py index 27fdb227c..cac055513 100644 --- a/python/cuspatial/cuspatial/core/trajectory.py +++ b/python/cuspatial/cuspatial/core/trajectory.py @@ -1,4 +1,4 @@ -# Copyright (c) 2019-2024, NVIDIA CORPORATION. +# Copyright (c) 2019-2025, NVIDIA CORPORATION. 
import numpy as np @@ -64,10 +64,14 @@ def derive_trajectories(object_ids, points: GeoSeries, timestamps): if len(points) > 0 and not contains_only_points(points): raise ValueError("`points` must only contain point geometries.") - object_ids = as_column(object_ids, dtype=np.int32) - xs = as_column(points.points.x) - ys = as_column(points.points.y) - timestamps = normalize_timestamp_column(as_column(timestamps)) + object_ids = as_column(object_ids, dtype=np.int32).to_pylibcudf( + mode="read" + ) + xs = as_column(points.points.x).to_pylibcudf(mode="read") + ys = as_column(points.points.y).to_pylibcudf(mode="read") + timestamps = normalize_timestamp_column( + as_column(timestamps) + ).to_pylibcudf(mode="read") objects, traj_offsets = cpp_derive_trajectories( object_ids, xs, ys, timestamps ) @@ -135,9 +139,11 @@ def trajectory_bounding_boxes(num_trajectories, object_ids, points: GeoSeries): if len(points) > 0 and not contains_only_points(points): raise ValueError("`points` must only contain point geometries.") - object_ids = as_column(object_ids, dtype=np.int32) - xs = as_column(points.points.x) - ys = as_column(points.points.y) + object_ids = as_column(object_ids, dtype=np.int32).to_pylibcudf( + mode="read" + ) + xs = as_column(points.points.x).to_pylibcudf(mode="read") + ys = as_column(points.points.y).to_pylibcudf(mode="read") return DataFrame._from_data( *cpp_trajectory_bounding_boxes(num_trajectories, object_ids, xs, ys) ) @@ -190,10 +196,14 @@ def trajectory_distances_and_speeds( if len(points) > 0 and not contains_only_points(points): raise ValueError("`points` must only contain point geometries.") - object_ids = as_column(object_ids, dtype=np.int32) - xs = as_column(points.points.x) - ys = as_column(points.points.y) - timestamps = normalize_timestamp_column(as_column(timestamps)) + object_ids = as_column(object_ids, dtype=np.int32).to_pylibcudf( + mode="read" + ) + xs = as_column(points.points.x).to_pylibcudf(mode="read") + ys = as_column(points.points.y).to_pylibcudf(mode="read") + timestamps = normalize_timestamp_column( + as_column(timestamps) + ).to_pylibcudf(mode="read") df = DataFrame._from_data( *cpp_trajectory_distances_and_speeds( num_trajectories, object_ids, xs, ys, timestamps