add xarray capabilities; refactor __init__; redo examples
taobrienlbl committed Nov 5, 2023
1 parent ca2b54f commit 2910bf4
Showing 10 changed files with 1,043 additions and 154 deletions.
54 changes: 18 additions & 36 deletions README.md
@@ -1,6 +1,6 @@
[![PyPI version](https://badge.fury.io/py/fastkde.svg)](https://badge.fury.io/py/fastkde)
![GitHub Workflow Status (with event)](https://img.shields.io/github/actions/workflow/status/lbl-eesa/fastkde/test.yml?event=push&label=tests)
<a target="_blank" href="https://colab.research.google.com/github/LBL-EESA/fastkde/blob/main/testing/readme_test.ipynb"> <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/> </a>
<a target="_blank" href="https://colab.research.google.com/github/LBL-EESA/fastkde/blob/main/examples/readme_test.ipynb"> <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/> </a>

# fastKDE

@@ -23,26 +23,20 @@ dimensionality).
**For a standard PDF**

```python
""" Demonstrate the first README example. """
import numpy as np
from fastkde import fastKDE
import fastkde
import matplotlib.pyplot as plt

#Generate two random variables dataset (representing 100000 pairs of datapoints)
N = int(2e5)
var1 = 50*np.random.normal(size=N) + 0.1
var2 = 0.01*np.random.normal(size=N) - 300
#Generate two random variables dataset (representing 100,000 pairs of datapoints)
N = int(1e5)
x = 50*np.random.normal(size=N) + 0.1
y = 0.01*np.random.normal(size=N) - 300

#Do the self-consistent density estimate
myPDF, values = fastKDE.pdf(var1,var2)
PDF = fastkde.pdf(x, y, var_names = ['x', 'y'])

#Extract the axes from the pdf values list
v1,v2 = values

#Plot contours of the PDF should be a set of concentric ellipsoids centered on
#(0.1, -300) Comparitively, the y axis range should be tiny and the x axis range
#should be large
plt.contour(v1,v2,myPDF)
plt.show()
PDF.plot();
```


@@ -83,7 +77,7 @@ conditional
#***************************
# Calculate the conditional
#***************************
pOfYGivenX, values = fastKDE.conditional(y,x)
cPDF = fastkde.conditional(y, x, var_names = ['y', 'x'])
```

The following plot shows the results:
@@ -92,31 +86,20 @@ The following plot shows the results:
#***************************
# Plot the conditional
#***************************
fig,axs = plt.subplots(1,2,figsize=(10,5))
fig,axs = plt.subplots(1,2,figsize=(10,5), sharex=True, sharey=True)

#Plot a scatter plot of the incoming data
axs[0].plot(x,y,'k.',alpha=0.1)
axs[0].set_title('Original (x,y) data')

#Set axis labels
for i in (0,1):
    axs[i].set_xlabel('x')
    axs[i].set_ylabel('y')
axs[0].set_xlabel('x')
axs[0].set_ylabel('y')

#Draw a contour plot of the conditional
axs[1].contourf(values[0],values[1],pOfYGivenX,64)
cPDF.plot(ax = axs[1], add_colorbar = False)
#Overplot the original underlying relationship
axs[1].plot(values[0],underlyingFunction(values[0]),linewidth=3,linestyle='--',alpha=0.5)
axs[1].plot(cPDF.x,underlyingFunction(cPDF.x),linewidth=3,linestyle='--',alpha=0.5)
axs[1].set_title('P(y|x)')

#Set axis limits to be the same (limit to the range of the data)
xlim = [x.min(),x.max()]
ylim = [y.min(),y.max()]
axs[1].set_xlim(xlim)
axs[1].set_ylim(ylim)
axs[0].set_xlim(xlim)
axs[0].set_ylim(ylim)

plt.savefig('conditional_demo.png')
plt.show()
```
@@ -128,17 +111,16 @@ plt.show()
To see the KDE values at specified points (not necessarily those that were used to generate the KDE):

```python
import numpy as np
from fastkde import fastKDE

""" Demonstrate using the pdf_at_points function. """""
import fastkde
train_x = 50*np.random.normal(size=100) + 0.1
train_y = 0.01*np.random.normal(size=100) - 300

test_x = 50*np.random.normal(size=100) + 0.1
test_y = 0.01*np.random.normal(size=100) - 300

test_points = list(zip(test_x, test_y))
test_point_pdf_values = fastKDE.pdf_at_points(train_x, train_y, list_of_points = test_points)
test_point_pdf_values = fastkde.pdf_at_points(train_x, train_y, list_of_points = test_points)
```

Note that this method can be significantly slower than calls to `fastkde.pdf()` since it does not benefit from using a fast Fourier transform during the final stage in which the PDF estimate is transformed from spectral space into data space, whereas `fastkde.pdf()` does.
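
The cost difference described in the note above can be illustrated with a minimal standard-library sketch. This is not fastKDE's implementation; the function name `direct_inverse_transform` and the coefficient values are hypothetical. It evaluates a Fourier sum directly, so every requested point visits every spectral coefficient, which is the per-point work that an FFT avoids when the output lies on a regular grid.

```python
import cmath
import math


def direct_inverse_transform(coeffs, points, period):
    """Evaluate sum_k c_k * exp(2*pi*i*k*x/period) at arbitrary x values.

    The cost is O(len(coeffs) * len(points)): each point touches every
    coefficient. An FFT produces the same values in O(K log K), but only
    on a regular grid of K points.
    """
    values = []
    for x in points:
        total = 0j
        for k, c in enumerate(coeffs):
            total += c * cmath.exp(2j * math.pi * k * x / period)
        values.append(total)
    return values


# Hypothetical spectral coefficients, for illustration only.
coeffs = [1.0, 0.5, 0.25, 0.125]
period = 1.0

# On the regular grid x_n = n * period / K, the direct sum reproduces what an
# (unnormalized) inverse DFT would return.
grid = [n * period / len(coeffs) for n in range(len(coeffs))]
grid_values = direct_inverse_transform(coeffs, grid, period)

# Off-grid points have no FFT shortcut; the direct sum is the fallback.
off_grid_values = direct_inverse_transform(coeffs, [0.1, 0.37], period)
```

With N training samples and M test points, the same reasoning suggests that cost grows with the product of the number of spectral coefficients and M, so this route is best reserved for a modest number of evaluation points.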
15 changes: 7 additions & 8 deletions examples/Plotting.ipynb

Large diffs are not rendered by default.

Binary file added examples/conditional_demo.png
653 changes: 653 additions & 0 deletions examples/fastKDE and xarray.ipynb

Large diffs are not rendered by default.

62 changes: 16 additions & 46 deletions examples/readme_test.ipynb

Large diffs are not rendered by default.

3 changes: 2 additions & 1 deletion requirements.txt
@@ -1,2 +1,3 @@
numpy
scipy
scipy
xarray
16 changes: 16 additions & 0 deletions src/fastkde/__init__.py
@@ -4,3 +4,19 @@
# if the user did `pip install -e .`, there will be no version.py file
# therefore indicate that the version is "editable"
__version__ = "editable"


# make some core functions available at the package level
import fastkde.fastKDE

pdf = fastkde.fastKDE.pdf
conditional = fastkde.fastKDE.conditional
pdf_at_points = fastkde.fastKDE.pdf_at_points

# import fastkde.plot if matplotlib is available
try:
    import fastkde.plot

    pair_plot = fastkde.plot.pair_plot
except ImportError:
    pass
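
The optional-import pattern in this hunk can be sketched in isolation as follows. This is a minimal illustration under stated assumptions, not part of the commit: the helper `optional_attr` and the module name `hypothetical_plotting_backend_xyz` are made up for the example.

```python
import importlib


def optional_attr(module_name, attr_name):
    """Return module.attr if the module imports cleanly, otherwise None."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The dependency is not installed; signal that with None.
        return None
    return getattr(module, attr_name, None)


# A standard-library module is always importable, so the attribute resolves.
sqrt = optional_attr("math", "sqrt")

# Hypothetical module name, assumed not to be installed.
missing = optional_attr("hypothetical_plotting_backend_xyz", "pair_plot")
```

One design difference is worth noting: the sketch returns `None` for a missing dependency, whereas the `pass` in the hunk above leaves `pair_plot` undefined at the package level, so accessing `fastkde.pair_plot` without the plotting dependency would presumably raise `AttributeError`.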