This document describes how to use pochoir
. In this section, high
level concepts are introduced. Subsequent sections provide concrete
guides to get started using pochoir
, followed by a tutorial and
finally a developer guide.
- describe the steps and data products
A domain of a problem is the space on which it is defined. Driven by
the finite-difference method (FDM), a pochoir
domain is a finite,
uniform, rectilinear (Cartesian) grid of points. In general, the
domain may be N-dimensional though for practical problems we are
restricted to 2D or 3D.
- characteristics of problem where FDM is applicable (grid vs geometry feature sizes)
- define terms
- domain
- boundary value problem
- boundary and initial values and boundary condition
- initial value problem
- paths
- get package
- venv options (
python -m venv venv ; source venv/bin/activate
andecho layout python3 > .envrc; direnv allow
) - install (
pip install -e .
orpip install [email protected]:wirecell/pochoir.git
or whatever) - testing (
pytest
)
- CLI help
- general CLI vs command-level options
- environment variables (
POCHOIR_STORE
)
- devices: “best” vs “numpy” vs torch “cpu” and “gpu” (still needs actual implementation to pick “best” or otherwise globally set)
- selection via CLI options
- when to care what device is used
- main data objects
- domain
- scalar fields
- vector fields
- path start points
- full paths
- responses
- separate input and output vs input+output store
- HDF5 vs NPZ+JSON+directory
- latter can work with snakemake or similar
- converting between the two
- export formats
- vtk
- input formats
- invent something for electrode shape description, probably JSON based
- maybe jsonnet
- do we allow import of data object from, say, external NPZ?
- invent something for electrode shape description, probably JSON based
This tutorial walks through calculating field response for a detector very similar to real detectors, if idealized for simplicity. It contains parallel ground plane and a cathode plane separated along the Z-axis which provides the nominal drift field. Near the ground plane are two “anode” planes each with a series of strips parallel in their plane and mutually perpendicular across planes. The anode planes have each a “bias voltage” applied and share through-holes so that electron drift paths may pass by the first “induction anode” plane and terminate on the second “collection anode” plane.
The tutorial will make a compromise of using a low-resolution grid in order to perform the solution quickly. After an initial solution, finer resolution will be explored in the context of comparing the use of CPU and GPU.
A number of “tricks” are used to further reduce computation. The drift field and paths are solved on a small domain exploiting periodic boundary conditions. The weighting fields far from a strip of interest do not vary in the dimension parallel to the strip which allows 2D “far fields” to be calculated on a broad 2D domain and then connected to 3D “near fields” calculated on a smaller 3D domain.
The tutorial then combines paths and weighting fields to produce
current waveforms (the field response functions) and converts them to
a format suitable for use by the Wire-Cell Toolkit detector simulation
and signal processing. Along the way, the tutorial will show how to
visualize and export intermediate results from the pochoir
data store.
The tutorial will first walk through each step manually. The last step will show how to put these steps together so they may be repeated automatically and how this can facilitate exploring “hyper parameters” such as grid density. As the automation mechanism requires datasets as individual files we will use the NPZ+JSON store instead of a monolithic HDF5 file.
Install pochoir
as described elsewhere. To avoid repeating the
pochoi -s/--store
option we also set:
export POCHOIR_STORE=$(pwd)/tutorial-store
A domain can be created simply with the pochoir domain
command:
pochoir domain --help
Usage: pochoir domain [OPTIONS] NAME Produce a "domain" and store it to the output dataset. A domain describes a finite, uniform grid in N-D space in these terms: - shape :: an N-D integer vector giving the number of grid points in each dimension. Required. - origin :: an N-D spatial vector identifying the location of the grid point with all indices zero. - spacing :: a scalar or N-D vector in same distance units as used in origin and which gives a common or a per-dimension spacing between neighboring grid points. A vector is given as a comma-separated list of numbers. Note: this description corresponds to vtk/paraview uniform rectilinear grid, aka an "image". Options: -s, --shape TEXT The number of grid points in each dimension -o, --origin TEXT The spatial location of zero index grid point (def=0's) -s, --spacing TEXT The grid spacing as scalar or vector (def=1's) --help Show this message and exit.
The origin
vector is optional but allows one to adopt a convention
other than identifying it with the “zero index” grid point. The shape
describes the number of grid points on each dimension and spacing
gives their common separation.
We will store our domains in a common group named domain
. We need
three:
drift
- a 3D domain for drift field and paths
wfar
- common 2D domain for all “far” weighting fields
wnear
- common 3D domain for all “near” weighting field
Next we describe electrode shapes to be defined in the domain. To do
this properly we must already know our models for the electrode shapes
and how we wish to stitch wfar
and wnear
. For now, we just show the
commands do create the domains and later will explain how the numbers
are chosen.
pochoir domain --shape 51,51,201 --spacing 0.1 drift
pochoir domain --shape 151,151,201 --spacing 0.1 wnear
pochoir domain --shape 1051,1051 --spacing 0.1 wfar
ls -l $POCHOIR_STORE
ls -l $POCHOIR_STORE/drift
#+RESULTS[63028b25cb770b7d413ef3a72610e8ad4f9ae97e]:
total 12 drwxrwxr-x 2 bv bv 4096 Mar 22 14:12 drift drwxrwxr-x 2 bv bv 4096 Mar 22 14:12 wfar drwxrwxr-x 2 bv bv 4096 Mar 22 14:12 wnear total 12 -rw-rw-r-- 1 bv bv 290 Mar 22 14:12 origin.npz -rw-rw-r-- 1 bv bv 288 Mar 22 14:12 shape.npz -rw-rw-r-- 1 bv bv 292 Mar 22 14:12 spacing.npz
Once defined, a domain is later referenced by its full path name for
the group, eg domain/drift
. The datasets are named according to their
interpretation by pochoir
.
An initial value array (IVA) provides a scalar field from which the
FDM solution begins. Each element holds either a known, applied
potential or an initial guess of its solution. A boundary value array
(BVA) asserts which of these two interpretations hold. A True
element
of a BVA implies that the corresponding element of an IVA is fixed,
else it may be updated by FDM.
In general, the user must provide an IVA and BVA for drift and each weighting field calculation, though typically a common IVA is needed. To provide these arrays one must map a model of the real electrode geometry to the arrays via the constraints of the domain.
A user is free to do that any way that is convenient. pochoir
provides two features that may help with the task.
First is support for “shape array painting” whereby the user provides
descriptions of 2D or 3D shapes and a set of potential values and
pochoir
will apply them to produce IVA and BVA. See details in Shape Painting.
Second is pochoir
provides high-level IVA/BVA generators for
particular electrode configurations with are governed by higher level
parameters. For this tutorial we will use the sandh
(strips and
holes) high level generator.
The Laplace equation can be solved by specifying initial and boundary value arrays, boundary conditions and convergence requirements.
pochoir fdm --help
Usage: pochoir fdm [OPTIONS] SOLUTION ERROR Solve a Laplace boundary value problem with finite difference method storing the result as named solution. The error names an output array to hold difference in last two iterations. Options: -i, --initial TEXT Name initial value array, elements include boundary values -b, --boundary TEXT Name the boundary array, zero value elemnts subject to solving -e, --edges TEXT Comma separated list of 'fixed' or 'periodic' giving domain edge conditions --precision FLOAT Finish when no changes larger than precision --epoch INTEGER Number of iterations before any check -n, --nepochs INTEGER Limit number of epochs (def: one epoch) --help Show this message and exit.
We may make a trial solution which we save it and its error to caps-solution1
and caps-error1
arrays, respectively
pochoir fdm -e periodic,periodic \
-i caps-initial -b caps-boundary \
--epoch 10 -n 1 \
caps-solution1 caps-error1
maxerr: 43.046966552734375
The maximum difference between the solution at the penultimate and
last iteration is the printed maxerr
.
We can visualize solution and error:
pochoir plot-image caps-solution1 docs/caps-solution1.png
pochoir plot-image caps-error1 docs/caps-error1.png
The error is rather high and although this domain is small which makes the solution fast, we may reuse this first solution as the initial value array for continued solution:
pochoir fdm -e periodic,periodic \
-i caps-solution1 -b caps-boundary \
--epoch 10 -n 1 \
caps-solution2 caps-error2
maxerr: 21.263214111328125
pochoir plot-image caps-solution2 docs/caps-solution2.png
pochoir plot-image caps-error2 docs/caps-error2.png
We can continue this manual, high-level iteration or take a guess for
the total number of internal iterations to reach the desired error
level. Or, we may tell pochoir fdm
to continue until either the
requested number of epochs are reached or the maxerr
falls below a
requested precision. When using a precision, it is checked only after
each epoch is complete and so the result will typically be
over-precise.
pochoir fdm -e periodic,periodic \
-i caps-solution2 -b caps-boundary \
--epoch 10 -n 100 --precision 0.1 \
caps-solution3 caps-error3
maxerr: 13.02484130859375 maxerr: 8.48162841796875 maxerr: 5.6507568359375 maxerr: 3.921722412109375 maxerr: 2.8282470703125 maxerr: 2.056884765625 maxerr: 1.50787353515625 maxerr: 1.12823486328125 maxerr: 0.855926513671875 maxerr: 0.65093994140625 maxerr: 0.496307373046875 maxerr: 0.37939453125 maxerr: 0.2906494140625 maxerr: 0.2232666015625 maxerr: 0.17201995849609375 maxerr: 0.13336181640625 maxerr: 0.10352325439453125 maxerr: 0.08056640625
The solution is reached prior to 100 epochs. Again, let’s see it:
pochoir plot-image caps-solution3 docs/caps-solution3.png
pochoir plot-image caps-error3 docs/caps-error3.png
- change in args w.r.t. 2D
- understand time/resource scaling with 2D
- visualize (matplotlib and paraview)
- derive 3D boundary values to 2D and merge with 3D boundary values
- understand precision of 2D as a function of 3D domain size
The fantasy example of caps
sets boundary values applicable for
calculating the “real”, applied potential. The overall field response
is tabulated for each sensitive electrode by calculating that
electrode’s weighting potential. Thus we must apply the pochoir fdm
command as above to a boundary value which sets the grid points on the
sensitive electrode to unity and all others to zero.
The gradient of a scalar field gives a vector field. The E-field is the gradient of the applied potential scalar field and is needed for the next step of calculating paths. The W-fields, one per sensitive electrode are needed for the step after, calculating responses to paths.
The pochoir grad
command will calculate and store the gradient
allowing for visualization and later use.
pochoir grad --help
Usage: pochoir grad [OPTIONS] SCALAR VECTOR Calculate the gradient of a scalar field. Options: --help Show this message and exit.
pochoir grad \
--domain adomain \
caps-solution3 caps-efield3
We may visualize this field with:
pochoir plot-quiver \
--domain adomain \
caps-efield3 docs/caps-efield3.png
- 2D and 3D matplotlib
- paraview
- specify problem to solve
- specify initial value
- apply solver
- store result
- visualize (matplotlib and paraview)
- combine path and fields for schockley-ramo
- exploit symmetry and equivalences
- visualize
- full chain, repeatable, performant processing
- what knobs to tune
Some major features still wanted:
- [ ] a way for a 2D potential to serve as boundary conditions for a 3D problem by defining a 2D boundary and projecting it along one of the 3D dimensions.
- [ ] normalize the collection of metadata from each command which is stored on the result datasets. Name input datasets in a uniform manner, collect CLI parameters by name, collect and store time for body of command. Goal would be to produce post-hoc visualization showing how any given result was produced or determine which rules a given result influenced.
- [ ] form “cross product” of paths and weighting fields to give responses.