Skip to content

Commit

Permalink
Adding initial C code to ramp fitting.
Browse files Browse the repository at this point in the history
Adding the setup.py necessary to install C extensions for ramp fitting.

Adding first attempt at C code.

Adding a setup.cfg file to be used for setup.

Updating calling location and C code.

Updated include ordering, as well as returning NoneType.  Can compile calling setup.py directly and running a basic script.  Still cannot run 'pip install -e .'.  This generates a failure saying it's unable to find the numpy module, raising ModuleNotFoundError.

Updating setup.

Fixing install files to properly work with C extension framework in ramp fitting.

Changing names.

Updating C code to parse input parameters.

Adding ramp handling for each pixel.

Updating ramp data structure and pixel ramp data structure.

Getting a pixel ramp and printing it to the screen.

Updating code style and adding a simple linked list to handle segment computations.

Completed median rate computation without flags and without special cases.

Cleaning up the code and comments.  The non-special case median rate computation now works.

Commenting out calls to C extension function.

Putting functions in alphabetical order to make them easier to navigate.

Alphabetizing the functions to make them easier to search.

Finding a bug in the median rate computation.

Updating setup and Nympy macro for C based on code review.

Fixed the local copy of the DQ for an integration for median rate computation.

Completed the median rate calculations that accounts for flags in the ramp.

Beginning to update type checking for ndarrays passed to C.

Figuring out endianness solution.  Still need to figure out how to detect endianness.

Checking for a computing byteswapping makes things slower than python.

Testing and comparing python to C code.

Working on the weighting of a segment for ramp fitting.

Continuing with weighted fitting.

Finished the segment computations.

Removed 'real_t' typedef to make code cleaner.  Finished pixel ramp computations but the read noise computation is different from python code.

JIC commit.

JIC commit.

Debugging the segment computation of the read noise variance.

Updated the read noise computations for normal segments.  Updated the documentation for ramp fitting to make the documentation for clear on the details of the computations.  Removed extra blank lines from CI test file.

Creating output base arrays and fix use of pixeldq in pixel_ramp.

Packaged up output results from the C computations to be passed back to the python code.

Adding a square root to the final computation of the combined error for the rate and rateints products.

Successful CI tests for basic ramps run through the C extension.

JIC commit.

Fixing segment pruner.

Adding test cases for C extension.

Adding cases.  Updated the final DQ array.  Started implementing the optional results product.

Moving debugging print statements.

Adding optional results memory managment during ramp fitting.

Outputting the optional results product.  Some of the values still need to be computed.

Updating computing the optional results product.

Updating dimensions of the pedestal array.

Adding pedestal calculation.

Updating where the group time divide happens.  Adding special case computation and testing.  Tracing bug in special case.

Working out slope computation bug.

Forcing the use of C extension to ensure CI tests use the C extensions.

Updating tests and DQ flag computations.

Adding special case for one group segment.

Working on special cases.

Updating one group ramp testing.  The variance computation is correct now, but need to investigate the slope computation, which may also be wrong in python.

Working on a new test.

Rearranging code to make it easier to read.

Refactoring module API.  Splitting ramp data getter into multiple, smaller functions.

Updating tests, as well as refactoring the module interface.

Updating the flags for suppressed ramps.

Changing the C interface to make it simpler and easier to read.

Cleaning up old code and adding one group suppression flagging.

Modifying setup.py to get it to properly install C and cython stuff.

Modifying setup to get it to work, since I am not sure how to resolve the conflicts resulting from the use of by C and cython.

Updating invalid integrations test.

ZEROFRAME test works.

Suppressed one group tests work.

Updating return code from C extension.

Updating test_2_group_cases testing.  Bugs were found in the python code, so those should be corrected first before finishing the C code.

Updating code and tests for python to account for invalid integrations and invalid groups for median rate calculations.

Updating error in median rate computation.

Investigating differences on branch with main branch.

Properly updating ols_fit.py from main branch updates.

Finishing up C translation.  Will need to further investigate two group ramp special case for rateints (see test_invalid_integrations).

Updating the setup.py file to properly install the cython and c extension modules.

Updating tests and setup.py format.

Removing unneeded comments.

Removing debugging imports.

Fixing ZEROFRAME logic bug, as well as removing debugging code.

All STCAL CI tests are passing with the C code.  Need to check the JWST tests.

Updating segment slope calculation for NaN values.

Updating computation of read noise variance for bad gain value.

Updating the debugging comments, as well as finishing the first group orphan test in JWST CI testing.

Updating how the pedestal gets computed.

Updating median rate computation for case 1 in JWST.

Updating slope fitter to pass case 4 in the JWST CI test.

Updating debugging functions.

JIC.
  • Loading branch information
kmacdonald-stsci committed Nov 8, 2023
1 parent 1889060 commit 49e2560
Show file tree
Hide file tree
Showing 8 changed files with 3,841 additions and 236 deletions.
30 changes: 19 additions & 11 deletions docs/stcal/ramp_fitting/description.rst
Original file line number Diff line number Diff line change
Expand Up @@ -158,21 +158,27 @@ the least-squares fit is calculated with the first and last samples. In most pra
cases, the data will fall somewhere in between, where the weighting is scaled between the
two extremes.

The signal-to-noise ratio :math:`S` used for weighting selection is calculated from the
last sample as:

For segment :math:`k` of length :math:`n`, which includes groups :math:`[g_{k}, ...,
g_{k+n-1}]`, the signal-to-noise ratio :math:`S` used for weighting selection is
calculated from the last sample as:

.. math::
S = \frac{data \times gain} { \sqrt{(read\_noise)^2 + (data \times gain) } } \,,
where :math:`data = g_{k+n-1} - g_{k}`.

The weighting for a sample :math:`i` is given as:

.. math::
w_i = (i - i_{midpoint})^P \,,
w_i = \frac{ [(i - i_{midpoint}) / i_{midpoint}]^P }{ (read\_noise)^2 } \,,
where :math:`i_{midpoint} = \frac{n-1}{2}` and :math:`i = 0, 1, ..., n-1`.


where :math:`i_{midpoint}` is the the sample number of the midpoint of the sequence, and
:math:`P` is the exponent applied to weights, determined by the value of :math:`S`. Fixsen
et al. 2000 found that defining a small number of P values to apply to values of S was
sufficient; they are given as:
is the the sample number of the midpoint of the sequence, and :math:`P` is the exponent
applied to weights, determined by the value of :math:`S`. Fixsen et al. 2000 found that
defining a small number of P values to apply to values of S was sufficient; they are given as:

+-------------------+------------------------+----------+
| Minimum S | Maximum S | P |
Expand All @@ -195,11 +201,13 @@ Segment-specific Computations:
The variance of the slope of a segment due to read noise is:

.. math::
var^R_{s} = \frac{12 \ R^2 }{ (ngroups_{s}^3 - ngroups_{s})(tgroup^2) } \,,
var^R_{s} = \frac{12 \ R^2 }{ (ngroups_{s}^3 - ngroups_{s})(tgroup^2)(gain^2) } \,,
where :math:`R` is the noise in the difference between 2 frames,
:math:`ngroups_{s}` is the number of groups in the segment, and :math:`tgroup` is the group
time in seconds (from the keyword TGROUP).
time in seconds (from the keyword TGROUP). The divide by gain converts to
:math:`DN`. For the special case where as segment has length 1, the
:math:`ngroups_{s}` is set to :math:`2`.

The variance of the slope in a segment due to Poisson noise is:

Expand Down Expand Up @@ -265,10 +273,10 @@ The combined variance of the slope is the sum of the variances:
The square root of the combined variance is stored in the ERR array of the primary output.

The overall slope depends on the slope and the combined variance of the slope of each integration's
segments, so is a sum over integrations and segments:
segments, so is a sum over integration values computed from the segements:

.. math::
slope_{o} = \frac{ \sum_{i,s}{ \frac{slope_{i,s}} {var^C_{i,s}}}} { \sum_{i,s}{ \frac{1} {var^C_{i,s}}}}
slope_{o} = \frac{ \sum_{i}{ \frac{slope_{i}} {var^C_{i}}}} { \sum_{i}{ \frac{1} {var^C_{i}}}}
Upon successful completion of this step, the status keyword S_RAMP will be set
Expand Down
19 changes: 18 additions & 1 deletion setup.py
Original file line number Diff line number Diff line change
@@ -1,11 +1,22 @@
from setuptools import setup, Extension
from setuptools import setup, find_packages, Extension
from Cython.Build import cythonize
from Cython.Compiler import Options
import numpy as np

Options.docstrings = True
Options.annotate = False

# package_data values are glob patterns relative to each specific subpackage.
package_data = {
"stcal.ramp_fitting.src": ["*.c"],
}

# Setup C module include directories
include_dirs = [np.get_include()]

# Setup C module macros
define_macros = [("NUMPY", "1")]

extensions = [
Extension(
'stcal.ramp_fitting.ols_cas22._core',
Expand All @@ -31,6 +42,12 @@
include_dirs=[np.get_include()],
language='c++'
),
Extension(
"stcal.ramp_fitting.slope_fitter",
["src/stcal/ramp_fitting/src/slope_fitter.c"],
include_dirs=include_dirs,
define_macros=define_macros,
),
]

setup(ext_modules=cythonize(extensions))
129 changes: 120 additions & 9 deletions src/stcal/ramp_fitting/ols_fit.py
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,7 @@

import warnings

from .slope_fitter import ols_slope_fitter # c extension
from . import ramp_fit_class
from . import utils

Expand Down Expand Up @@ -657,6 +658,84 @@ def ols_ramp_fit_single(
opt_info : tuple
The tuple of computed optional results arrays for fitting.
"""

use_c = False
# use_c = True
if use_c:
print(" ")
print("=" * 80)
c_start = time.time()

if ramp_data.drop_frames1 is None:
ramp_data.drop_frames1 = 0
print(" ---------------- Entering C Code ----------------")
image_info, integ_info, opt_info = ols_slope_fitter(
ramp_data, gain_2d, readnoise_2d, weighting, save_opt)
print(" ---------------- Return C Code ----------------")

c_end = time.time()
print("=" * 80)

return image_info, integ_info, opt_info

# XXX start python time
p_start = time.time()

print("\n");
print(" ---------------- Entering Python Code ----------------")
image_info, integ_info, opt_info = ols_ramp_fit_single_python(
ramp_data, buffsize, save_opt, readnoise_2d, gain_2d, weighting)
print(" ---------------- Return Python Code ----------------")

# XXX end python time
p_end = time.time()
if use_c:
c_python_time_comparison(c_start, c_end, p_start, p_end)

return image_info, integ_info, opt_info



def ols_ramp_fit_single_python(
ramp_data, buffsize, save_opt, readnoise_2d, gain_2d, weighting):
"""
Fit a ramp using ordinary least squares. Calculate the count rate for each
pixel in all data cube sections and all integrations, equal to the weighted
slope for all sections (intervals between cosmic rays) of the pixel's ramp
divided by the effective integration time.
Parameters
----------
ramp_data : RampData
Input data necessary for computing ramp fitting.
buffsize : int
The working buffer size
save_opt : bool
Whether to return the optional output model
readnoise_2d : ndarray
The read noise of each pixel
gain_2d : ndarray
The gain of each pixel
weighting : str
'optimal' is the only valid value
Return
------
image_info : tuple
The tuple of computed ramp fitting arrays.
integ_info : tuple
The tuple of computed integration fitting arrays.
opt_info : tuple
The tuple of computed optional results arrays for fitting.
"""
# MAIN
tstart = time.time()

if not ramp_data.suppress_one_group_ramps:
Expand All @@ -682,6 +761,8 @@ def ols_ramp_fit_single(
log.warning('will be calculated as the value of that 1 group divided by ')
log.warning('the group exposure time.')

# import ipdb; ipdb.set_trace()

# In this 'First Pass' over the data, loop over integrations and data
# sections to calculate the estimated median slopes, which will be used
# to calculate the variances. This is the same method to estimate slopes
Expand All @@ -693,6 +774,8 @@ def ols_ramp_fit_single(
if fit_slopes_ans[0] == "saturated":
return fit_slopes_ans[1:]

# import ipdb; ipdb.set_trace()

# In this 'Second Pass' over the data, loop over integrations and data
# sections to calculate the variances of the slope using the estimated
# median slopes from the 'First Pass'. These variances are due to Poisson
Expand All @@ -718,6 +801,15 @@ def ols_ramp_fit_single(
return image_info, integ_info, opt_info


def c_python_time_comparison(c_start, c_end, p_start, p_end):
c_diff = c_end - c_start
p_diff = p_end - p_start
c_div_p = c_diff / p_diff * 100.
print(f"{c_diff = }")
print(f"{p_diff = }")
print(f"{c_div_p = :.4f}%")


def discard_miri_groups(ramp_data):
"""
For MIRI datasets having >1 group, if all pixels in the final group are
Expand Down Expand Up @@ -894,6 +986,8 @@ def ramp_fit_slopes(ramp_data, gain_2d, readnoise_2d, save_opt, weighting):

opt_res = utils.OptRes(n_int, imshape, max_seg, ngroups, save_opt)

# import ipdb; ipdb.set_trace()

# Get Pixel DQ array from input file. The incoming RampModel has uint32
# PIXELDQ, but ramp fitting will update this array here by flagging
# the 2D PIXELDQ locations where the ramp data has been previously
Expand All @@ -909,7 +1003,6 @@ def ramp_fit_slopes(ramp_data, gain_2d, readnoise_2d, save_opt, weighting):
# as is done in the jump detection step, except here CR-affected and
# saturated groups have already been flagged. The actual, fit, slopes for
# each segment are also calculated here.

med_rates = utils.compute_median_rates(ramp_data)

# Loop over data integrations:
Expand Down Expand Up @@ -1082,6 +1175,8 @@ def ramp_fit_compute_variances(ramp_data, gain_2d, readnoise_2d, fit_slopes_ans)
num_seg_per_int = fit_slopes_ans[5]
med_rates = fit_slopes_ans[9]

# import ipdb; ipdb.set_trace()

var_p3, var_r3, var_p4, var_r4, var_both4, var_both3, \
inv_var_both4, s_inv_var_p3, s_inv_var_r3, s_inv_var_both3, segs_4 = \
utils.alloc_arrays_2(n_int, imshape, max_seg)
Expand Down Expand Up @@ -1113,7 +1208,9 @@ def ramp_fit_compute_variances(ramp_data, gain_2d, readnoise_2d, fit_slopes_ans)
var_p4[num_int, :, rlo:rhi, :] = den_p3 * med_rates[rlo:rhi, :]

# Find the segment variance due to read noise and convert back to DN
var_r4[num_int, :, rlo:rhi, :] = num_r3 * den_r3 / gain_sect**2
tmpgain = gain_sect**2
var_r4_tmp = num_r3 * den_r3 / tmpgain
var_r4[num_int, :, rlo:rhi, :] = var_r4_tmp

# Reset the warnings filter to its original state
warnings.resetwarnings()
Expand Down Expand Up @@ -1141,6 +1238,7 @@ def ramp_fit_compute_variances(ramp_data, gain_2d, readnoise_2d, fit_slopes_ans)
# variance calculations.
s_inv_var_p3[num_int, :, :] = (1. / var_p4[num_int, :, :, :]).sum(axis=0)
var_p3[num_int, :, :] = 1. / s_inv_var_p3[num_int, :, :]

s_inv_var_r3[num_int, :, :] = (1. / var_r4[num_int, :, :, :]).sum(axis=0)
var_r3[num_int, :, :] = 1. / s_inv_var_r3[num_int, :, :]

Expand All @@ -1152,6 +1250,7 @@ def ramp_fit_compute_variances(ramp_data, gain_2d, readnoise_2d, fit_slopes_ans)

var_p4[num_int, :, med_rates <= 0.] = 0.
var_both4[num_int, :, :, :] = var_r4[num_int, :, :, :] + var_p4[num_int, :, :, :]

inv_var_both4[num_int, :, :, :] = 1. / var_both4[num_int, :, :, :]

# Want to retain values in the 4D arrays only for the segments that each
Expand Down Expand Up @@ -1286,9 +1385,9 @@ def ramp_fit_overall(
var_p3, var_r3, var_p4, var_r4, var_both4, var_both3 = variances_ans[:6]
inv_var_both4, s_inv_var_p3, s_inv_var_r3, s_inv_var_both3 = variances_ans[6:]

slope_by_var4 = opt_res.slope_seg.copy() / var_both4
# import ipdb; ipdb.set_trace()

del var_both4
slope_by_var4 = opt_res.slope_seg.copy() / var_both4

s_slope_by_var3 = slope_by_var4.sum(axis=1) # sum over segments (not integs)
s_slope_by_var2 = s_slope_by_var3.sum(axis=0) # sum over integrations
Expand Down Expand Up @@ -1331,6 +1430,8 @@ def ramp_fit_overall(

slope_int = the_num / the_den

del var_both4

# Adjust DQ flags for NaNs.
wh_nans = np.isnan(slope_int)
dq_int[wh_nans] = np.bitwise_or(dq_int[wh_nans], ramp_data.flags_do_not_use)
Expand All @@ -1354,8 +1455,9 @@ def ramp_fit_overall(
for num_int in range(0, n_int):
dq_slice = groupdq[num_int, 0, :, :]
opt_res.ped_int[num_int, :, :] = \
utils.calc_pedestal(ramp_data, num_int, slope_int, opt_res.firstf_int,
dq_slice, nframes, groupgap, dropframes1)
utils.calc_pedestal(
ramp_data, num_int, slope_int, opt_res.firstf_int,
dq_slice, nframes, groupgap, dropframes1)

del dq_slice

Expand Down Expand Up @@ -1856,12 +1958,13 @@ def fit_next_segment(start, end_st, end_heads, pixel_done, data_sect, mask_2d,
opt_res, save_opt, mask_2d_init, ramp_mask_sum)

# CASE: Long enough (semiramp has >2 groups), at end of ramp
# XXX The comments says semiramp >2, but checks for length >1.
wh_check = np.where((l_interval > 1) & (end_locs == ngroups - 1) & (~pixel_done))
if len(wh_check[0]) > 0:
f_max_seg = fit_next_segment_long_end_of_ramp(
wh_check, start, end_st, end_heads, pixel_done, got_case, f_max_seg,
inv_var, num_seg, slope, intercept, variance, sig_intercept, sig_slope,
opt_res, save_opt)
opt_res, save_opt, ramp_data)

# CASE: Long enough (semiramp has >2 groups ), not at array end (meaning
# final group for this semiramp is not final group of the whole ramp)
Expand Down Expand Up @@ -2496,7 +2599,7 @@ def fit_next_segment_long_not_end_of_ramp(
def fit_next_segment_long_end_of_ramp(
wh_check, start, end_st, end_heads, pixel_done, got_case, f_max_seg,
inv_var, num_seg, slope, intercept, variance, sig_intercept, sig_slope,
opt_res, save_opt):
opt_res, save_opt, ramp_data):
"""
Long enough (semiramp has >2 groups), at end of ramp
- set start to -1 to designate all fitting done
Expand Down Expand Up @@ -2654,6 +2757,9 @@ def fit_short_ngroups(
ramp_mask_sum : ndarray
number of channels to fit for each pixel, 1-D int
ramp_data : RampData
The ramp data needed for processing, specifically flag values.
Returns
-------
f_max_seg : int
Expand Down Expand Up @@ -2822,6 +2928,7 @@ def fit_lines(data, mask_2d, rn_sect, gain_sect, ngroups, weighting, gdq_sect_r,
# the 0th and 1st group are good, set slope, etc
wh_pix_2r = np.where(c_mask_2d.sum(axis=0) == 2) # ramps with 2 good groups

# XXX Why step into this if wh_pix_2r has length zero?
slope_s, intercept_s, variance_s, sig_slope_s, sig_intercept_s = \
fit_double_read(c_mask_2d, wh_pix_2r, data_masked, slope_s, intercept_s,
variance_s, sig_slope_s, sig_intercept_s, rn_sect, ramp_data)
Expand All @@ -2847,6 +2954,8 @@ def fit_lines(data, mask_2d, rn_sect, gain_sect, ngroups, weighting, gdq_sect_r,
sumx, sumxx, sumxy, sumy, nreads_wtd, xvalues = calc_opt_sums(
ramp_data, rn_sect, gain_sect, data_masked, c_mask_2d, xvalues, good_pix)

# import ipdb; ipdb.set_trace()

slope, intercept, sig_slope, sig_intercept = \
calc_opt_fit(nreads_wtd, sumxx, sumx, sumxy, sumy)

Expand Down Expand Up @@ -3600,6 +3709,7 @@ def calc_opt_sums(ramp_data, rn_sect, gain_sect, data_masked, mask_2d, xvalues,
# get final valid group for each pixel for this semiramp
ind_lastnz = fnz + mask_2d_sum - 1


# get SCI value of initial good group for semiramp
data_zero = data_masked[fnz, range(data_masked.shape[1])]
fnz = 0
Expand Down Expand Up @@ -3648,7 +3758,6 @@ def calc_opt_sums(ramp_data, rn_sect, gain_sect, data_masked, mask_2d, xvalues,
num_nz = c_mask_2d.sum(0) # number of groups in segment
nrd_prime = (num_nz - 1) / 2.
num_nz = 0

# Calculate inverse read noise^2 for use in weights
# Suppress, then re-enable, harmless arithmetic warning
warnings.filterwarnings("ignore", ".*divide by zero.*", RuntimeWarning)
Expand All @@ -3675,6 +3784,8 @@ def calc_opt_sums(ramp_data, rn_sect, gain_sect, data_masked, mask_2d, xvalues,
xvalues[:, wh_m2d_f] = np.roll(xvalues[:, wh_m2d_f], -1, axis=0)
wh_m2d_f = np.logical_not(c_mask_2d[0, :])

# import ipdb; ipdb.set_trace()

# Create weighted sums for Poisson noise and read noise
nreads_wtd = (wt_h * c_mask_2d).sum(axis=0) # using optimal weights

Expand Down
Loading

0 comments on commit 49e2560

Please sign in to comment.