
Error propagation #150

Open
rugfri opened this issue Jul 16, 2023 · 23 comments · May be fixed by #155

Comments

rugfri commented Jul 16, 2023

Somewhat related to #94, I was wondering whether you think it would be possible to implement error propagation (from fitting parameters and experimental uncertainties, if present/enabled) to the calculated quantities like the dielectric function, refractive index, and such.
I recently started using https://uncertainties-python-package.readthedocs.io/en/latest/ (https://github.com/lebigot/uncertainties/),
which may be useful, though I am not sure how far complex numbers/arrays are supported.

Thanks

domna (Member) commented Jul 18, 2023

Hey @rugfri ,

It certainly is possible; @MarJMue and I have discussed it several times already, and we wanted to bring this feature in when we refactor the internal data model. Generally, it is not straightforward to obtain uncertainties from a fit. For lmfit the approach would be to scale the residuals by σ, but I'm not sure whether lmfit takes this into account for its covariance matrix / stderr values or just for the actual fit. lmfit also supports F-testing: https://lmfit.github.io/lmfit-py/confidence.html which may be a bit more robust here. If it's important/urgent for you, we could try to build an example together to do a manual fit and calculate confidence intervals from it.

rugfri (Author) commented Jul 21, 2023

Thanks @domna.
With σ you refer to the experimental errors on Psi and Delta?
It seems to me that a direct call to lmfit's minimize gives a bit more control, since you provide the residuals yourself.
Anyway, yes, I would be interested in pushing this forward. Thanks, let me know how best to proceed.
I had one example script loaded in Nomad; maybe I can try to resume that and we can build on it?

domna (Member) commented Jul 24, 2023

> Thanks @domna.
> With σ you refer to the experimental errors on Psi and Delta?

Yes.

> It seems to me that a direct call to lmfit's minimize gives a bit more control, since you provide the residuals yourself.
> Anyway, yes, I would be interested in pushing this forward. Thanks, let me know how best to proceed.
> I had one example script loaded in Nomad; maybe I can try to resume that and we can build on it?

Sounds good. I see you already added a notebook there. I'll have a more detailed look at it soon

domna (Member) commented Jul 25, 2023

Hey @rugfri

I added a scaling with the residuals in the notebook. However, if I use this as weights, the fit does not properly converge to the data. If you want, you can have a look; maybe it's just a small mistake I made.

I also removed the 680 nm point, as the error there was around 1000 (probably some device malfunction?), and tried to use normalized weights.

rugfri (Author) commented Jul 25, 2023

Thanks @domna.
I think a "-" is missing in the derivatives, since the derivative of exp(-jx) is -j*exp(-jx), which squared is -exp(-2jx).
This means the partial derivative of \rho with respect to \Delta has a "-" in front.
This, however, causes even more problems with the imaginary part.
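The sign can be verified symbolically; a quick SymPy sanity check (assuming SymPy is available; SymPy writes the imaginary unit as I):

```python
import sympy as sp

x = sp.symbols("x", real=True)

# d/dx exp(-j*x) = -j*exp(-j*x)
deriv = sp.diff(sp.exp(-sp.I * x), x)

# Squaring gives -exp(-2*j*x), i.e. the extra "-" discussed above:
squared = sp.expand(deriv ** 2)

print(deriv, squared)
```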

Otherwise everything looks correct.
I will check again tomorrow.

I tried to derive an analytical expression for the error with the real and imaginary parts separated, but I couldn't get anything usable so far.

Yes, I also noticed the large error on some points. The instrument seems to have some problems, since it is systematic.

domna (Member) commented Jul 26, 2023

> Thanks @domna.
>
> I think a "-" is missing in the derivatives, since the derivative of exp(-jx) is -j*exp(-jx), which squared is -exp(-2jx). This means the partial derivative of \rho with respect to \Delta has a "-" in front. This, however, causes even more problems with the imaginary part.

You are completely right. The problem now is that we are using NumPy's "normal" sqrt, which does not return imaginary numbers for negative inputs. I changed it to numpy.lib.scimath.sqrt.
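For reference, the difference between the two square roots (plain `np.sqrt` stays real and yields `nan` for negative inputs, while `np.lib.scimath.sqrt` promotes to a complex result):

```python
import numpy as np

a = np.array([4.0, -4.0])

with np.errstate(invalid="ignore"):
    plain = np.sqrt(a)           # real output: the negative entry becomes nan

cplx = np.lib.scimath.sqrt(a)    # complex output: [2.+0.j, 0.+2.j]

print(plain, cplx)
```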

However, the fit is still not converging to the measured data 🤔

> Otherwise everything looks correct. I will check again tomorrow.
>
> I tried to derive an analytical expression for the error with the real and imaginary parts separated, but I couldn't get anything usable so far.
>
> Yes, I also noticed the large error on some points. The instrument seems to have some problems, since it is systematic.

Edit: OK, I found the problem. We need to scale the weights by the variance and not the stderr. Hence, we need to square the errors before we scale the residuals. I added this in the notebook and the fit converges now. I think we also need to use leastsq exclusively for this, as L-BFGS-B uses a gradient-based approach which cannot really deal with weights (it also does not converge when I use the weights).
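As a sketch, the variance-scaled residual described in this edit might look like the following (hypothetical function and variable names; note that the textbook chi-square weighting divides by σ rather than σ², so this scaling is the empirical choice from the notebook and worth double-checking):

```python
import numpy as np

def weighted_residual(rho_model, rho_meas, err_rho):
    """Residuals scaled by the variance (squared stderr), as described above.

    Caution: standard chi-square weighting divides by sigma (the stderr);
    dividing by sigma**2 is the empirical choice that made the notebook fit
    converge. Treat it as a hypothesis to verify, not an established result.
    """
    weight = err_rho ** 2            # variance, not stderr
    diff = rho_model - rho_meas      # complex-valued for rho
    # leastsq needs real residuals, so stack real and imaginary parts:
    return np.concatenate([diff.real / weight, diff.imag / weight])

# Smoke test with made-up numbers (hypothetical values, not measured data):
rho_model = np.array([0.5 + 0.1j, 0.4 + 0.2j])
rho_meas = np.array([0.48 + 0.12j, 0.41 + 0.19j])
err = np.array([0.01, 0.02])
res = weighted_residual(rho_model, rho_meas, err)
print(res)
```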

rugfri (Author) commented Jul 26, 2023

Thanks!!
I am not sure I see the change; maybe this:
`err_rho = data_aoi['err_rho'].to_numpy() ** 2`?

Anyway, the fit message and other attributes confirm it. The plot too :-).
Though chi2 and redchi are quite a bit higher than I expected.

Now, my original question was also about the best way to propagate errors to the dielectric function and the refractive index.
I assume you would go along the same lines as for rho, correct?
Thanks,
Ruggero

domna (Member) commented Jul 26, 2023

> Thanks!! I am not sure I see the change; maybe this: `err_rho = data_aoi['err_rho'].to_numpy() ** 2`?

Exactly

> Anyway, the fit message and other attributes confirm it. The plot too :-). Though chi2 and redchi are quite a bit higher than I expected.

Yeah, I don't know how reliable they are, i.e., whether they scale with the absolute magnitude of the errors. If they do, they would be strongly influenced by the normalization we currently do (and I think it did not work without it). I think for really reliable results one has to calculate the confidence intervals according to https://lmfit.github.io/lmfit-py/confidence.html

> Now, my original question was also about the best way to propagate errors to the dielectric function and the refractive index. I assume you would go along the same lines as for rho, correct? Thanks, Ruggero

I'm not sure how to propagate these errors for multiple parameters. For one parameter I would just evaluate at param + error and param - error and use the difference of the resulting values w.r.t. the central values. For N parameters we would need to calculate 2^N additional psi/delta evaluations and extract the largest, smallest, or average error from them (I'm not sure what the best choice would be). I think this should be feasible for your data (4 varied parameters, thus 16 evals), but to be productively usable we might need some shortcuts here; I need to read up on the theory.
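The 2^N corner evaluation described above could be sketched like this (illustrative helper, not part of the package; all names are made up for the example):

```python
import itertools
import numpy as np

def corner_errors(model, params, errors):
    """Brute-force error estimate: evaluate `model` at every +/- error corner.

    `model` maps a parameter vector to a model output array; with N parameters
    this needs 2**N evaluations (16 for the 4 varied parameters mentioned
    above). Returns the largest absolute deviation from the central value
    per output point; the smallest or average deviation would work the same way.
    """
    params = np.asarray(params, dtype=float)
    errors = np.asarray(errors, dtype=float)
    center = model(params)
    deviations = []
    for signs in itertools.product([-1.0, 1.0], repeat=len(params)):
        corner = params + np.array(signs) * errors
        deviations.append(np.abs(model(corner) - center))
    return np.max(deviations, axis=0)

# Toy "model" standing in for a psi/delta calculation (hypothetical):
model = lambda p: np.array([p[0] * p[1], p[0] + p[1]])
dev = corner_errors(model, [2.0, 3.0], [0.1, 0.1])
print(dev)
```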

domna (Member) commented Jul 28, 2023

I found a paper which deals with this issue: https://doi.org/10.1016/j.optcom.2018.07.025
I did not read it yet but I think it's exactly what we want.

rugfri (Author) commented Aug 2, 2023

Thanks!
Yes, I haven't finished reading it yet, but it seems to be the one.
Let me know how you want to proceed and if I can do anything.

domna (Member) commented Aug 2, 2023

Hey,

I think I extracted the main parts and changes for us (from Is and Ic to rho). I think from this I could do an implementation. However, I contacted the author of the paper and asked if he is willing to share his implementation and reference data. I think this would be extremely helpful, so I'd wait until next week to see whether I hear something back (I think France also has a strong holiday season, so it might take a while...).

domna (Member) commented Aug 11, 2023

Hey,

I did not get any answer, so we might just have to do an implementation ourselves. I'll compile my notes here or in our testing notebook soonish and then we can start implementing something. (I'll be going on holidays from mid next week, however, and probably won't be able to do an implementation before then. But I'll at least try to write everything down first.)

domna (Member) commented Aug 12, 2023

Hey @rugfri,

this is what I extracted from the paper:

$\Delta n_k = \sum_{k'=0}^K \left| - J_n(\lambda_k) \cdot \text{Cov} \cdot C(\lambda_{k'})\right| \Delta \Re(\rho(\lambda_{k'}))$

where $J_n(\lambda_k)$ is the Jacobian of the optical model function w.r.t. n, i.e., the Jacobian of the function for the layer we want to extract the dispersion for. $\text{Cov}$ is the covariance matrix from the fit. $k$ and $k'$ are wavelength indices.
$C$ is the perturbation vector of chi-squared with elements

$\frac{\partial^2 \chi^2}{\partial p_i \, \partial \Re(\rho(\lambda_{k'}))} = - \frac{2}{K} \frac{\partial \Re(\rho)}{\partial p_i} (\lambda_{k'})$

where $K$ is the total number of (wavelength) points. The r.h.s. is the partial derivative of the total model function w.r.t. parameter $p_i$, so it is basically the Jacobian of the total model function, restricted to the parameters that appear in the Jacobian of the optical model function above.

The Jacobians we should be able to calculate with numdifftools, the covariance matrix we get from lmfit, and calculating C should be straightforward too (i.e., calculating the Jacobian and selecting the right elements).

I think this should be pretty straightforward to implement like this. Let me know what you think and if you can spot any mistakes here.

Edit: One thing we also need to do is reduce the covariance matrix from lmfit to the appropriate dimension, i.e., only use the parameters used by the dispersion model. Or we just add zeros to the Jacobian $J_n$ for all additional parameters. Then we can keep Cov and C as they are.
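A shape-level sketch of this propagation sum (hypothetical helper, not package code; inputs as described above, with zero columns in $J_n$ for parameters the dispersion does not depend on):

```python
import numpy as np

def delta_n(jac_n, cov, jac_total, delta_rho):
    """Sketch of the error-propagation sum above (illustrative, unverified).

    jac_n:      (K, P) Jacobian of the optical model function w.r.t. the fit
                parameters, evaluated at each wavelength index k.
    cov:        (P, P) covariance matrix from the fit.
    jac_total:  (K, P) Jacobian of the total model, d Re(rho) / d p_i.
    delta_rho:  (K,) experimental uncertainty of Re(rho).
    Returns the (K,) array of Delta n_k.
    """
    K = len(delta_rho)
    # Perturbation vector C(lambda_k') with elements -2/K * d Re(rho)/d p_i:
    C = -2.0 / K * jac_total                  # (K, P)
    # Delta n_k = sum_k' | -J_n(k) . Cov . C(k') | * Delta Re(rho(k')):
    inner = -jac_n @ cov @ C.T                # (K, K)
    return np.abs(inner) @ delta_rho          # (K,)

# Shape check with random inputs (not physical data):
rng = np.random.default_rng(0)
K, P = 5, 3
out = delta_n(rng.normal(size=(K, P)), np.eye(P),
              rng.normal(size=(K, P)), np.full(K, 0.005))
print(out.shape)
```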

rugfri (Author) commented Aug 13, 2023

Thanks @domna!
I think it is correct.
Yes, I agree that derivatives should come through numdifftools. After all, as far as I know, lmfit also uses numdifftools.
Reducing $\mathrm{Cov}$ and $C$ seems more efficient, especially if the non-relevant parameters are moved to the end.

I am just back from vacation and will double-check everything from tomorrow.
Enjoy your vacation!

domna (Member) commented Sep 7, 2023

Hey @rugfri,

I'm back. I would like to give an implementation a shot. Did you already implement something I can build on? I also heard back from the author and he sent me the data from the paper. So we can use it to check our results (but we somehow have to convert it to Is and Ic - or we do the fitting directly in Is and Ic, which might be the better choice for a start).

@domna domna linked a pull request Sep 8, 2023 that will close this issue
rugfri (Author) commented Sep 8, 2023

Hi @domna,
hope you had a good time!
No, sorry, I didn't.
Good news about Gilliot. Yes, I agree, using Is and Ic is better even though it can be more work.
Let me know if I can do anything. Maybe I can draft the error propagation on Is and Ic along the same lines as for rho? So we see if we end up with the same results :-)

domna (Member) commented Sep 8, 2023

> Hi @domna,
> hope you had a good time!

Indeed it was very nice

> No, sorry, I didn't.
> Good news about Gilliot. Yes, I agree, using Is and Ic is better even though it can be more work.

I already started to add Is/Ic fitting and could fit the data from the paper. I shared a new upload in Nomad with you, which contains the data and my fitting, so we can work together there to deduce the errors.

> Let me know if I can do anything. Maybe I can draft the error propagation on Is and Ic along the same lines as for rho? So we see if we end up with the same results :-)

This should be straightforward; we just need to calculate the Jacobian of Is and Ic in C. As soon as we are happy we can simply replace Is and Ic with rho if we want to do rho-based fitting (actually we should get the same errors in n, so it's a good check).
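If we need to go from an instrument's Psi/Delta to Is/Ic, a small conversion helper might look like this (using the common phase-modulated ellipsometry convention Is = sin(2Ψ)sin(Δ), Ic = sin(2Ψ)cos(Δ); verify this matches the instrument's configuration before relying on it):

```python
import numpy as np

def psi_delta_to_is_ic(psi, delta):
    """Convert Psi/Delta (in radians) to Is/Ic.

    Assumes the common phase-modulated ellipsometry convention
    Is = sin(2*Psi)*sin(Delta), Ic = sin(2*Psi)*cos(Delta);
    other instrument configurations use different combinations.
    """
    return np.sin(2 * psi) * np.sin(delta), np.sin(2 * psi) * np.cos(delta)

# Example: Psi = 45 deg, Delta = 90 deg gives Is = 1, Ic = 0
Is, Ic = psi_delta_to_is_ic(np.pi / 4, np.pi / 2)
print(Is, Ic)
```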

rugfri (Author) commented Sep 11, 2023

Hi @domna, thanks.
It looks correct to me.
As I understand, you also calculated the Jacobian of the model; one should now reduce the covariance matrix, calculate C, and propagate (first equation above). Correct?

domna (Member) commented Sep 11, 2023

> Hi @domna, thanks.
> It looks correct to me.
> As I understand, you also calculated the Jacobian of the model; one should now reduce the covariance matrix, calculate C, and propagate (first equation above). Correct?

Yes, I already do this in the notebook under "Error propagation" (the calc_sum function multiplies the Jacobians and the covariance matrix). However, the errors I'm getting are very small (~10^-10), so something still seems to be wrong, but I have not found the problem yet.

rugfri (Author) commented Sep 11, 2023

Oh, yes, sorry.
Nice, it all seems good to me.
Do we get the variance or sigma out of calc_sum? (Likely not fixing the issue, though, since 10^-10 is still very small.)
I am not sure: what is the 0.005 factor in the I*_err expressions?

domna (Member) commented Sep 11, 2023

> Oh, yes, sorry.

No worries :)

> Nice, it all seems good to me.
> Do we get the variance or sigma out of calc_sum? (Likely not fixing the issue, though, since 10^-10 is still very small.)

From the paper it should be $\Delta n$, so the standard error (sigma) of the quantity.

> I am not sure: what is the 0.005 factor in the I*_err expressions?

It is the error of Is/Ic in the experimental data. It's constant for every data point.

I actually think my formula is not completely right. I think all uncertainties have to be taken into account, not only the ones which are parameters of n (because otherwise a $\Delta n$ contribution from the substrate uncertainty $\Delta n_s$ makes no sense), so I think I only have to reduce the dimension along one axis of the covariance matrix. Additionally, I currently don't use any error for the substrate refractive index in the calculation.

domna (Member) commented Nov 1, 2023

Sorry @rugfri, I did not have much time to work on this in the last month(s) and will probably not have any in the next weeks either (I'll still try to sit down and find my error in the formula, though). Is this still of interest to you?

rugfri (Author) commented Nov 1, 2023

Hi @domna, no problem! Yes, it is still of interest to me.
I have also been pulled away from this topic.
I only had time to find an old code of mine where I was doing something similar, but I didn't get much beyond that.
In case I find something useful I will post it here.
In any case, take your time :-) thanks!
