BUG: NotImplementedError: Cannot apply ufunc <ufunc 'hyp2f1'> to mixed DataFrame and Series inputs. #46138

timmy-ops · 2022-02-24T12:29:14Z

### Pandas version checks

I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
I have confirmed this bug exists on the main branch of pandas.

### Reproducible Example

#imports
!pip install -U -q PyDrive
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials
import pandas as pd
import numpy as np
import math  # needed for math.ceil below
from datetime import datetime
!pip install lifetimes
from lifetimes import ParetoNBDFitter, GammaGammaFitter

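#data
# Assumed Colab/PyDrive authentication (not shown in the original report);
# `drive` must be defined before CreateFile can be called.
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)
f_and_t = drive.CreateFile({'id': '1sXcv0SUUygyFvjVEtdV3kk8zyjp4GGQC'})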
f_and_t.GetContentFile('f_and_t.csv')
f_and_t = pd.read_csv('f_and_t.csv')

#reproducible example

time_days = 126
time_months = int(math.ceil(time_days / 30.0))   

#column-selection

summary = f_and_t[['customer_id', 'frequency_btyd', 'recency', 'T',
                 'monetary_btyd']]
                 
summary.columns = ['customer_id', 'frequency', 'recency', 'T',
                     'monetary_value']
summary = summary.set_index('customer_id')

actual_df = f_and_t[['customer_id', 'frequency_btyd', 'monetary_dnn',
                     'target_monetary']]
actual_df.columns = ['customer_id', 'train_frequency', 'train_monetary',
                       'act_target_monetary']

#PARETO/NBD fitter
paretof = ParetoNBDFitter(penalizer_coef= 0.01)
paretof.fit(summary['frequency'], summary['recency'], summary['T'])

#Gamma Gamma Fitter

ggf = GammaGammaFitter(penalizer_coef=0)
ggf.fit(summary['frequency'], summary['monetary_value'])

#pareto predict

pareto_pred = paretof.predict(time_days,
                              summary['frequency'].values,
                              summary['recency'],
                              summary['T'])

trans_pred = pareto_pred.fillna(0)

#gg predict

predicted_value = ggf.customer_lifetime_value(paretof,
                                                summary['frequency'],#.values,
                                                summary['recency'],
                                                summary['T'],
                                                summary['monetary_value'],
                                                time=time_months,
                                                discount_rate= 0.01)



### Issue Description

I was using the lifetimes library to calculate CLV for a list of customers when, from one day to the next, this issue appeared. I work on Google Colab with pandas 1.3.5 (their current version). The error below appears for both functions: paretof.predict and ggf.customer_lifetime_value.

I already found a post about this issue from half a year ago (https://stackoverflow.com/questions/69071130/lifetimes-library-issue-of-calculating-clv-when-using-function-customer-lifet). The suggested solution, using ".values", only worked for the paretof.predict function. I am still stuck at the ggf.customer_lifetime_value function.

NotImplementedError Traceback (most recent call last)
in ()
58 summary['monetary_value'],
59 time=time_months,
---> 60 discount_rate=discount_rate)
61
62

6 frames
/usr/local/lib/python3.7/dist-packages/lifetimes/fitters/gamma_gamma_fitter.py in customer_lifetime_value(self, transaction_prediction_model, frequency, recency, T, monetary_value, time, discount_rate, freq)
294
295 return _customer_lifetime_value(
--> 296 transaction_prediction_model, frequency, recency, T, adjusted_monetary_value, time, discount_rate, freq=freq
297 )

/usr/local/lib/python3.7/dist-packages/lifetimes/utils.py in _customer_lifetime_value(transaction_prediction_model, frequency, recency, T, monetary_value, time, discount_rate, freq)
496 # since the prediction of number of transactions is cumulative, we have to subtract off the previous periods
497 expected_number_of_transactions = transaction_prediction_model.predict(
--> 498 i, frequency, recency, T
499 ) - transaction_prediction_model.predict(i - factor, frequency, recency, T)
500 # sum up the CLV estimates of all of the periods and apply discounted cash flow

/usr/local/lib/python3.7/dist-packages/lifetimes/fitters/pareto_nbd_fitter.py in conditional_expected_number_of_purchases_up_to_time(self, t, frequency, recency, T)
277 r, alpha, s, beta = params
278
--> 279 likelihood = self._conditional_log_likelihood(params, x, t_x, T)
280 first_term = (
281 gammaln(r + x) - gammaln(r) + r * log(alpha) + s * log(beta) - (r + x) * log(alpha + T) - s * log(beta + T)

/usr/local/lib/python3.7/dist-packages/lifetimes/fitters/pareto_nbd_fitter.py in _conditional_log_likelihood(params, freq, rec, T)
212
213 A_1 = gammaln(r + x) - gammaln(r) + r * log(alpha) + s * log(beta)
--> 214 log_A_0 = ParetoNBDFitter._log_A_0(params, x, rec, T)
215
216 A_2 = logaddexp(-(r + x) * log(alpha + T) - s * log(beta + T), log(s) + log_A_0 - log(r_s_x))

/usr/local/lib/python3.7/dist-packages/lifetimes/fitters/pareto_nbd_fitter.py in _log_A_0(params, freq, recency, age)
179
180 rsf = r + s + freq
--> 181 p_1 = hyp2f1(rsf, t, rsf + 1.0, abs_alpha_beta / (max_of_alpha_beta + recency))
182 q_1 = max_of_alpha_beta + recency
183 p_2 = hyp2f1(rsf, t, rsf + 1.0, abs_alpha_beta / (max_of_alpha_beta + age))

/usr/local/lib/python3.7/dist-packages/pandas/core/generic.py in array_ufunc(self, ufunc, method, *inputs, **kwargs)
2030 self, ufunc: np.ufunc, method: str, *inputs: Any, **kwargs: Any
2031 ):
-> 2032 return arraylike.array_ufunc(self, ufunc, method, *inputs, **kwargs)
2033
2034 # ideally we would define this to avoid the getattr checks, but

/usr/local/lib/python3.7/dist-packages/pandas/core/arraylike.py in array_ufunc(self, ufunc, method, *inputs, **kwargs)
292 raise NotImplementedError(
293 "Cannot apply ufunc {} to mixed DataFrame and Series "
--> 294 "inputs.".format(ufunc)
295 )
296 axes = self.axes

NotImplementedError: Cannot apply ufunc <ufunc 'hyp2f1'> to mixed DataFrame and Series inputs.




### Expected Behavior

The call sometimes works, but mostly it no longer does. No error should be raised.

### Installed Versions

<details>

/usr/local/lib/python3.7/dist-packages/psycopg2/__init__.py:144: UserWarning:

The psycopg2 wheel package will be renamed from release 2.8; in order to keep installing from binary please use "pip install psycopg2-binary" instead. For details see: <http://initd.org/psycopg/docs/install.html#binary-install-from-pypi>.


INSTALLED VERSIONS
------------------
commit           : 66e3805b8cabe977f40c05259cc3fcf7ead5687d
python           : 3.7.12.final.0
python-bits      : 64
OS               : Linux
OS-release       : 5.4.144+
Version          : #1 SMP Tue Dec 7 09:58:10 PST 2021
machine          : x86_64
processor        : x86_64
byteorder        : little
LC_ALL           : None
LANG             : en_US.UTF-8
LOCALE           : en_US.UTF-8

pandas           : 1.3.5
numpy            : 1.21.5
pytz             : 2018.9
dateutil         : 2.8.2
pip              : 21.1.3
setuptools       : 57.4.0
Cython           : 0.29.28
pytest           : 3.6.4
hypothesis       : None
sphinx           : 1.8.6
blosc            : None
feather          : 0.4.1
xlsxwriter       : None
lxml.etree       : 4.2.6
html5lib         : 1.0.1
pymysql          : None
psycopg2         : 2.7.6.1 (dt dec pq3 ext lo64)
jinja2           : 2.11.3
IPython          : 5.5.0
pandas_datareader: 0.9.0
bs4              : 4.6.3
bottleneck       : 1.3.2
fsspec           : None
fastparquet      : None
gcsfs            : None
matplotlib       : 3.2.2
numexpr          : 2.8.1
odfpy            : None
openpyxl         : 3.0.9
pandas_gbq       : 0.13.3
pyarrow          : 6.0.1
pyxlsb           : None
s3fs             : None
scipy            : 1.4.1
sqlalchemy       : 1.4.31
tables           : 3.7.0
tabulate         : 0.8.9
xarray           : 0.18.2
xlrd             : 1.1.0
xlwt             : 1.3.0
numba            : 0.51.2

</details>


jreback · 2022-02-24T12:39:19Z

pls show a minimal copy pastable and reproducible example w/o any external dependencies

timmy-ops · 2022-02-24T12:43:45Z

> pls show a minimal copy pastable and reproducible example w/o any external dependencies

Hi jreback,

Yes, I am sorry. I tried to produce one, but the problem is that the whole model cannot work without this bigger dataset.

mroeschke · 2022-02-24T16:28:47Z

It will be difficult to determine whether there is a true bug here without a more minimal example: https://matthewrocklin.com/blog/work/2018/02/28/minimal-bug-reports
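
For reference, the same error message can be triggered with pandas and NumPy alone. The following is a minimal sketch, not the reporter's code path: it assumes a two-input ufunc (`np.logaddexp`, chosen here only for illustration) that receives one DataFrame and one Series, and the column names and values are made up.

```python
import numpy as np
import pandas as pd

# Toy data; the column names are illustrative only.
summary = pd.DataFrame(
    {"frequency": [1.0, 2.0, 3.0], "recency": [10.0, 20.0, 30.0]}
)

frame = summary[["frequency"]]  # double brackets give a one-column DataFrame
series = summary["recency"]     # single brackets give a Series

raised = False
message = ""
try:
    # A ufunc called with one DataFrame and one Series is rejected by pandas.
    np.logaddexp(frame, series)
except NotImplementedError as exc:
    raised = True
    message = str(exc)

print(raised, message)
```

Whether the lifetimes call in the report fails for exactly this reason cannot be concluded from this snippet; it only shows that the message comes from pandas' ufunc handling rather than from the ufunc itself.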

ColtAllen · 2022-03-05T00:01:06Z

Hey @timmy-ops ,

This is not an issue with pandas, but rather the lifetimes library. Please repost this issue in the lifetimes repository.

The scipy.special.hyp2f1 function in the final lines of your error trace is a lifetimes dependency that expects to receive NumPy arrays as inputs. When using any of the lifetimes modeling methods, it is important to always use the df['COL_NAME'].values syntax for all of the arguments; otherwise hyp2f1 receives a slice of a pandas DataFrame instead of an array, which creates the unstable behavior you are seeing.

Unfortunately, in the case of the lifetimes.GammaGammaFitter.customer_lifetime_value method, Pandas slices are being used in the internal operations. It's an easy fix, but the lifetimes project is no longer being actively maintained. Some other contributors and I are planning a Zoom meeting in a few weeks to discuss taking over development of this library. If you wish to contribute, please let us know in this issue link:

CamDavidsonPilon/lifetimes#414
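
The advice above, passing plain arrays instead of pandas objects, can be sketched as follows. This is an illustration under assumptions: `as_arrays` is a hypothetical helper, the data is made up, and `np.logaddexp` stands in for whatever ufunc the library calls internally.

```python
import numpy as np
import pandas as pd


def as_arrays(frame, columns):
    """Return each named column as a plain 1-D NumPy array (hypothetical helper)."""
    return [frame[name].to_numpy() for name in columns]


# Toy summary table indexed by customer; values are illustrative only.
summary = pd.DataFrame(
    {
        "frequency": [1.0, 2.0, 3.0],
        "recency": [10.0, 20.0, 30.0],
        "T": [40.0, 50.0, 60.0],
    },
    index=pd.Index([101, 102, 103], name="customer_id"),
)

frequency, recency, T = as_arrays(summary, ["frequency", "recency", "T"])

# With ndarray inputs the ufunc is evaluated by NumPy directly,
# so pandas' type checks on mixed inputs are never involved.
result = np.logaddexp(frequency, recency)

# Re-attach the customer index afterwards if a labeled result is needed.
labeled = pd.Series(result, index=summary.index, name="result")
```

`Series.to_numpy()` is used here in place of `.values`; for plain numeric columns both return the same kind of array.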

swasthikshettyhcl · 2024-06-28T13:29:32Z

@timmy-ops did you find any solution for this?
[error] Cannot apply ufunc <ufunc 'hyp2f1'> to mixed DataFrame and Series input
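
One possible way around the internal pandas usage is to run the discounting loop on NumPy arrays outside the library. The sketch below is loosely modeled on the two comments visible in the traceback (cumulative predictions are differenced per period, then discounted cash flow is applied). It is not the lifetimes implementation: `discounted_clv`, `toy_predict`, and the `factor` default of 30 time units per period are assumptions made for illustration.

```python
import numpy as np


def discounted_clv(predict, frequency, recency, T, monetary_value,
                   time=12, discount_rate=0.01, factor=30):
    """Sum discounted per-period value using plain NumPy arrays.

    `predict(t, frequency, recency, T)` is any callable returning the
    cumulative expected number of transactions up to time `t`.
    `factor` is the assumed number of model time units per period.
    """
    frequency, recency, T, monetary_value = (
        np.asarray(a, dtype=float)
        for a in (frequency, recency, T, monetary_value)
    )
    clv = np.zeros_like(frequency)
    for period in range(1, time + 1):
        t = period * factor
        # Predictions are cumulative, so subtract the previous period.
        expected = (predict(t, frequency, recency, T)
                    - predict(t - factor, frequency, recency, T))
        # Apply discounted cash flow and accumulate.
        clv += monetary_value * expected / (1 + discount_rate) ** period
    return clv


def toy_predict(t, frequency, recency, T):
    """Toy stand-in for a fitted model's predict method (not a real model)."""
    return frequency * t / T


clv = discounted_clv(toy_predict, [1, 2], [5, 10], [60, 60], [10, 20],
                     time=2, discount_rate=0.0)
discounted = discounted_clv(toy_predict, [1, 2], [5, 10], [60, 60], [10, 20],
                            time=2, discount_rate=0.01)
```

With a fitted model, `paretof.predict` could presumably be passed as `predict`, since the traceback shows it being called as `predict(i, frequency, recency, T)`. Note that the traceback passes an `adjusted_monetary_value` rather than the raw monetary value, so the Gamma-Gamma adjustment would have to be computed separately; that step is not covered here.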

timmy-ops added the Bug and Needs Triage labels Feb 24, 2022

mroeschke added the Needs Info label and removed the Needs Triage label Feb 24, 2022

jreback closed this as completed Mar 5, 2022

dbbnicole mentioned this issue Oct 12, 2022

BUG: RCG CLV accelerator sporadically fails with NotImplementedError: Cannot apply ufunc <ufunc 'hyp2f1'> to mixed DataFrame and Series inputs databricks-industry-solutions/customer-lifetime-value#1

Closed
