docs
andrewgsavage committed Dec 30, 2024
1 parent 3445aab commit 0657bfc
Showing 8 changed files with 133 additions and 112 deletions.
42 changes: 0 additions & 42 deletions docs/getting/faq.rst

This file was deleted.

2 changes: 1 addition & 1 deletion docs/getting/index.rst
Original file line number Diff line number Diff line change
@@ -53,4 +53,4 @@ That's all! You can check that Pint is correctly installed by starting up python
:hidden:

tutorial
faq
projects
57 changes: 57 additions & 0 deletions docs/getting/projects.rst
@@ -0,0 +1,57 @@
*****************************
Pint-Pandas in your projects
*****************************

Using a Shared Unit Registry
----------------------------

As described `in the documentation of the main pint package <https://pint.readthedocs.io/en/stable/getting/pint-in-your-projects.html#using-pint-in-your-projects>`_:

    If you use Pint in multiple modules within your Python package, you normally want to avoid creating multiple instances of the unit registry. The best way to do this is by instantiating the registry in a single place. For example, you can add the following code to your package ``__init__.py``.

When using ``pint_pandas``, this extends to using the same unit registry that was created by the main ``pint`` package. This is done by using the :func:`pint.get_application_registry() <pint:get_application_registry>` function.

In a sample project structure of this kind:

.. code-block:: text

    .
    └── mypackage/
        ├── __init__.py
        ├── main.py
        └── mysubmodule/
            ├── __init__.py
            └── calculations.py

After defining the registry in the ``mypackage.__init__`` module:

.. code-block:: python

    from pint import UnitRegistry, set_application_registry
    ureg = UnitRegistry()
    ureg.formatter.default_format = "P"
    set_application_registry(ureg)

In the ``mypackage.mysubmodule.calculations`` module, you should *get* the shared registry like so:

.. code-block:: python

    import pint
    ureg = pint.get_application_registry()

    @ureg.check(
        '[length]',
    )
    def multiply_value(distance):
        return distance * 2

Failure to use the application registry will result in a ``DimensionalityError`` of the kind:

    Cannot convert from '<VALUE> <UNIT>' ([<DIMENSION>]) to 'a quantity of' ([<DIMENSION>])

For example:

.. code-block:: text

    DimensionalityError: Cannot convert from '200 metric_ton' ([mass]) to 'a quantity of' ([mass])
14 changes: 14 additions & 0 deletions docs/getting/tutorial.rst
@@ -75,6 +75,17 @@ The PintArray contains a Quantity
df.power.values.quantity
DataFrame Index
-----------------------

PintArrays can be used as the DataFrame's index.

.. ipython:: python

    time = pd.Series([1, 2, 2, 3], dtype="pint[second]")
    df.index = time
    df.index

Pandas Series Accessors
-----------------------
Pandas Series accessors are provided for most Quantity properties and methods.
@@ -84,3 +95,6 @@ Methods that return arrays will be converted to Series.
df.power.pint.units
df.power.pint.to("kW")
That's the basics! More examples are given at :doc:`Reading from csv <../user/reading>`.
56 changes: 2 additions & 54 deletions docs/user/common.rst
@@ -7,7 +7,7 @@ Common Issues
Pandas support for ``ExtensionArray`` is still in development. As a result, there are some common issues that pint-pandas users may encounter.
This page provides some guidance on how to resolve these issues.

Units in Cells (Object dtype columns)
Units in Cells
-------------------------------------

The most common issue pint-pandas users encounter is that they have a DataFrame with columns that aren't PintArrays.
@@ -58,63 +58,11 @@ Creating DataFrames from Series
By default, the pandas ``pd.concat`` function performs row-wise concatenation. When given a list of Series, each backed by a PintArray, this inefficiently converts all the PintArrays to arrays of ``object`` type, concatenates the Series into a DataFrame with that many rows, and leaves it up to you to convert that DataFrame back into column-wise PintArrays. A much more efficient approach is to concatenate the Series column-wise:

.. ipython:: python
    :suppress:
    :okexcept:

    list_of_series = [pd.Series([1.0, 2.0], dtype="pint[m]") for i in range(0, 10)]
    df = pd.concat(list_of_series, axis=1)
    df

This will preserve all the PintArrays in each of the Series.
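
A minimal sketch of how ``axis`` changes the orientation of the result. It uses plain float Series rather than PintArray-backed ones (an assumption made so the snippet needs only pandas), so it illustrates the shapes only, not the dtype conversion:

```python
import pandas as pd

# Ten Series of two values each (plain floats, not PintArrays).
list_of_series = [pd.Series([1.0, 2.0]) for _ in range(10)]

# Default (axis=0): values are stacked end to end.
row_wise = pd.concat(list_of_series)

# axis=1: each Series becomes one column of a DataFrame.
column_wise = pd.concat(list_of_series, axis=1)

print(row_wise.shape)     # (20,)
print(column_wise.shape)  # (2, 10)
```

With ``axis=1`` each input Series stays intact as its own column, which is why the column-wise form suits Series that carry their own extension dtype.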


Using a Shared Unit Registry
----------------------------

As described `in the documentation of the main pint package <https://pint.readthedocs.io/en/stable/getting/pint-in-your-projects.html#using-pint-in-your-projects>`_:

    If you use Pint in multiple modules within your Python package, you normally want to avoid creating multiple instances of the unit registry. The best way to do this is by instantiating the registry in a single place. For example, you can add the following code to your package ``__init__.py``.

When using ``pint_pandas``, this extends to using the same unit registry that was created by the main ``pint`` package. This is done by using the :func:`pint.get_application_registry() <pint:get_application_registry>` function.

In a sample project structure of this kind:

.. code-block:: text

    .
    └── mypackage/
        ├── __init__.py
        ├── main.py
        └── mysubmodule/
            ├── __init__.py
            └── calculations.py

After defining the registry in the ``mypackage.__init__`` module:

.. code-block:: python

    import pint
    ureg = pint.get_application_registry()

In the ``mypackage.mysubmodule.calculations`` module, you should *get* the shared registry like so:

.. code-block:: python

    import pint
    ureg = pint.get_application_registry()

    @ureg.check(
        '[length]',
    )
    def multiply_value(distance):
        return distance * 2

Failure to do this will result in a ``DimensionalityError`` of the kind:

    Cannot convert from '<VALUE> <UNIT>' ([<DIMENSION>]) to 'a quantity of' ([<DIMENSION>])

For example:

.. code-block:: text

    DimensionalityError: Cannot convert from '200 metric_ton' ([mass]) to 'a quantity of' ([mass])
1 change: 1 addition & 0 deletions docs/user/index.rst
@@ -11,4 +11,5 @@ examples that describe many common tasks that you can accomplish with pint.

reading
initializing
numpy
common
33 changes: 18 additions & 15 deletions docs/user/initializing.rst
@@ -4,33 +4,36 @@
Initializing data
**************************

There are several ways to initialize a `PintArray` in a `DataFrame`. Here are the most common methods. We'll use `PA_` and `Q_` as shorthand for `PintArray` and `Quantity`.


There are several ways to initialize a ``PintArray`` in a ``DataFrame``. Here are the most common methods.

.. ipython:: python
    :okwarning:
    :suppress:

    import pandas as pd
    import pint
    import pint_pandas
    import numpy as np
    PA_ = pint_pandas.PintArray
    PintArray = pint_pandas.PintArray
    ureg = pint_pandas.PintType.ureg
    Q_ = ureg.Quantity
    Quantity = ureg.Quantity

.. ipython:: python
    :okwarning:

    df = pd.DataFrame(
        {
            "Ser1": pd.Series([1, 2], dtype="pint[m]"),
            "Ser2": pd.Series([1, 2]).astype("pint[m]"),
            "Ser3": pd.Series([1, 2], dtype="pint[m][Int64]"),
            "Ser4": pd.Series([1, 2]).astype("pint[m][Int64]"),
            "PArr1": PA_([1, 2], dtype="pint[m]"),
            "PArr2": PA_([1, 2], dtype="pint[m][Int64]"),
            "PArr3": PA_([1, 2], dtype="m"),
            "PArr4": PA_([1, 2], dtype=ureg.m),
            "PArr5": PA_(Q_([1, 2], ureg.m)),
            "PArr6": PA_([1, 2],"m"),
            "PArr1": PintArray([1, 2], dtype="pint[m]"),
            "PArr2": PintArray([1, 2], dtype="pint[m][Int64]"),
            "PArr3": PintArray([1, 2], dtype="m"),
            "PArr4": PintArray([1, 2], dtype=ureg.m),
            "PArr5": PintArray(Quantity([1, 2], ureg.m)),
            "PArr6": PintArray([1, 2],"m"),
        }
    )
    df
@@ -43,11 +46,11 @@ In the first two Series examples above, the data was converted to Float64.
df.dtypes
To avoid this conversion, specify the subdtype (dtype of the magnitudes) in the dtype `"pint[m][Int64]"` when constructing using a `Series`. The default data dtype that pint-pandas converts to can be changed by modifying `pint_pandas.DEFAULT_SUBDTYPE`.
To avoid this conversion, specify the subdtype (dtype of the magnitudes) in the dtype ``"pint[m][Int64]"`` when constructing using a ``Series``. The default data dtype that pint-pandas converts to can be changed by modifying ``pint_pandas.DEFAULT_SUBDTYPE``.

`PintArray` infers the subdtype from the data passed into it when there is no subdtype specified in the dtype. It also accepts a pint `Unit`` or unit string as the dtype.
``PintArray`` infers the subdtype from the data passed into it when there is no subdtype specified in the dtype. It also accepts a pint ``Unit`` or unit string as the dtype.


.. note::

    `"pint[unit]"` or `"pint[unit][subdtype]"` must be used for the Series or DataFrame constructor.
    ``"pint[unit]"`` or ``"pint[unit][subdtype]"`` must be used for the Series or DataFrame constructor.
40 changes: 40 additions & 0 deletions docs/user/numpy.rst
@@ -0,0 +1,40 @@
.. _numpy:

**************************
Numpy support
**************************

NumPy functions that work on pint ``Quantity`` objects backed by an ``ndarray`` also work on ``PintArray``.


.. ipython:: python
    :suppress:

    import pandas as pd
    import pint
    import pint_pandas
    import numpy as np
    PintArray = pint_pandas.PintArray
    ureg = pint_pandas.PintType.ureg
    Q_ = ureg.Quantity

.. ipython:: python

    pa = PintArray([1, 2, np.nan, 4, 10], dtype="pint[m]")
    np.clip(pa, 3 * ureg.m, 5 * ureg.m)

Note that this function raises an error when applied to a ``Series``.

.. ipython:: python
    :okexcept:

    df = pd.DataFrame({"A": pa})
    np.clip(df['A'], 3 * ureg.m, 5 * ureg.m)

Apply the function to the ``PintArray`` instead of the ``Series`` using ``Series.values``.

.. ipython:: python
    :okexcept:

    np.clip(df['A'].values, 3 * ureg.m, 5 * ureg.m)
