pyarrow.lib.ArrowNotImplementedError: Function 'and_kleene' has no kernel matching input types (bool, double) #45161

Open
mlondschien opened this issue Jan 3, 2025 · 0 comments

Describe the bug, including details regarding any error messages, version, and platform.

While trying to reproduce a different error, which occurs when filters (predicates) select none of the data, I came across this one:

import numpy as np
import polars as pl
import pyarrow.dataset as ds
from pyarrow.parquet import ParquetDataset

n = 1_000_000
rng = np.random.default_rng(seed=42)

data = pl.DataFrame(
    {
        "a": rng.uniform(low=0, high=2, size=n),
        "b": rng.choice(["a", "b"], n),
        "c": rng.normal(size=n),
    }
)

data.write_parquet("data.parquet", row_group_size=500_000)

df = pl.from_arrow(
    ParquetDataset(
        ["data.parquet"],
        filters=~ds.field("c").is_null() & ds.field("a") >= 3,
    ).read(columns=["b"])
)
print(df)

This yields

Traceback (most recent call last):
  File "test_arrow.py", line 24, in <module>
    ).read(columns=["b"])
      ^^^^^^^^^^^^^^^^^^^
  File "/cluster/home/lmalte/nobackup/micromamba/envs/psutil/lib/python3.11/site-packages/pyarrow/parquet/core.py", line 1485, in read
    table = self._dataset.to_table(
            ^^^^^^^^^^^^^^^^^^^^^^^
  File "pyarrow/_dataset.pyx", line 553, in pyarrow._dataset.Dataset.to_table
  File "pyarrow/_dataset.pyx", line 399, in pyarrow._dataset.Dataset.scanner
  File "pyarrow/_dataset.pyx", line 3557, in pyarrow._dataset.Scanner.from_dataset
  File "pyarrow/_dataset.pyx", line 3475, in pyarrow._dataset.Scanner._make_scan_options
  File "pyarrow/_dataset.pyx", line 3409, in pyarrow._dataset._populate_builder
  File "pyarrow/_compute.pyx", line 2724, in pyarrow._compute._bind
  File "pyarrow/error.pxi", line 155, in pyarrow.lib.pyarrow_internal_check_status
  File "pyarrow/error.pxi", line 92, in pyarrow.lib.check_status
pyarrow.lib.ArrowNotImplementedError: Function 'and_kleene' has no kernel matching input types (bool, double)

This is using

polars        1.14.0   py311hcc3b33b_1      conda-forge
pyarrow       18.1.0   py311h38be061_0      conda-forge
pyarrow-core  18.1.0   py311h4854187_0_cpu  conda-forge
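
A possible explanation, not confirmed in this report, is Python operator precedence rather than a missing kernel. `&` binds more tightly than `>=`, so the filter above would be parsed as `(~ds.field("c").is_null() & ds.field("a")) >= 3`. That would hand `and_kleene` a boolean on the left and the double column `a` on the right, which matches the `(bool, double)` in the message. The sketch below uses only the standard-library `ast` module to show how Python groups such an expression. The names `c_is_null` and `a` are hypothetical stand-ins for `ds.field("c").is_null()` and `ds.field("a")`.

```python
import ast

# Same operator layout as the filter in the report, with the pyarrow
# expressions replaced by plain names (hypothetical stand-ins).
source = "~c_is_null & a >= 3"
tree = ast.parse(source, mode="eval").body

# The top-level node is the comparison; the `&` sits inside its left operand,
# i.e. Python reads this as (~c_is_null & a) >= 3.
top_is_comparison = isinstance(tree, ast.Compare)
left_is_bitand = isinstance(tree.left, ast.BinOp) and isinstance(tree.left.op, ast.BitAnd)
print(top_is_comparison, left_is_bitand)

# With explicit parentheses, `&` becomes the top-level operation and the
# comparison is its right-hand operand.
fixed = ast.parse("(~c_is_null) & (a >= 3)", mode="eval").body
fixed_top_is_bitand = isinstance(fixed, ast.BinOp) and isinstance(fixed.op, ast.BitAnd)
fixed_right_is_comparison = isinstance(fixed.right, ast.Compare)
print(fixed_top_is_bitand, fixed_right_is_comparison)
```

If this is the cause, writing the filter as `(~ds.field("c").is_null()) & (ds.field("a") >= 3)` should give `and_kleene` two boolean inputs. This has not been checked here against pyarrow 18.1.0, and it may still run into the separate empty-selection error mentioned at the top.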

Component(s)

Python
