feat: add Levy distribution #317

Qazalbash · 2025-01-16T08:48:43Z

This PR adds the implementation of Levy distribution. There is some numerical error in checking PDF integrate to be equal to cdf. I revisited the implementation several times and found no ambiguity. Therefore, I am pushing to seek some guidance.

…methods

YeungOnion · 2025-01-17T23:35:18Z

Hmm, while our check for consistency being an integrator is likely susceptible to the usual nuance of general forward integrators.

The error message, below, indicates that two steps into the integration it's already off by 20%. Since it checks at each step this is pretty significant.

failures:

---- distribution::levy::tests::test_continuous stdout ----
Integral of pdf doesn't equal cdf!
Integration from 1 by 0.003182932192553111 to 1.0063658643851063 = 0.006119699097797403
cdf = 0.005069737329462489

I'm not certain this is the case, but the implementations appear correct to me as well.

Options I expect could be

inappropriate step size for that part of the function, which I don't think is the case considering it's not terribly steep near the mean
the erfc implementation isn't sufficiently accurate, but we use that elsewhere, so we would have expected to see it elsewhere
my first thought, the integrator is insufficient. The current one integrates a linear approximation to the function, and perhaps we need better (but again the function isn't very steep here, so I'm unsure this is the case).

We may need to update the check for continuous distributions to be more robust integration or rely on other constraints. I can open this part as a different issue.

Qazalbash · 2025-01-18T05:27:36Z

Hi, I am also contributing a little bit to NumPyro, a Python-based PPL. Their test suites are big and quite good. Looking at them may help.

YeungOnion · 2025-01-19T18:45:53Z

That's a very cool project, along with JAX. I think I may have seen JAX before and forgot about it.

I took a pass through by skimming, searching, and with an LLM. Couldn't find one in that test suite.

I've also been thinking, perhaps this is a little unnecessary: testing values for the function will isolate precision effects to and precision checking (done by approx).

Perhaps if we had a default impl of PDF for a CDF then this may have been a stronger requirement. But instead, an alternative sanity check might be going the other way and verifying a finite difference in the derivative over some "well-chosen" interval.

@Qazalbash would you be willing to write an internal function analogous to the integrator for this and use that test instead? You could use it standalone in the test and for later development we could see if it works replacing the integrator within internal::check_continuous_distribution

Qazalbash · 2025-01-20T02:35:06Z

Perhaps if we had a default impl of PDF for a CDF then this may have been a stronger requirement. But instead, an alternative sanity check might be going the other way and verifying a finite difference in the derivative over some "well-chosen" interval.

I am in favor of checking whether the gradient of a CDF equals a PDF. Is there any good autograd we can use?

Qazalbash would you be willing to write an internal function analogous to the integrator for this and use that test instead? You could use it standalone in the test and for later development we could see if it works replacing the integrator within internal::check_continuous_distribution

I will experiment with some, Just to let you know it will not be soon.

Qazalbash · 2025-01-26T19:31:54Z

@YeungOnion check #320 for derivatives of cdf is equal to pdf.

codecov · 2025-01-29T17:12:22Z

Codecov Report

Attention: Patch coverage is 97.37991% with 12 lines in your changes missing coverage. Please review.

Project coverage is 94.40%. Comparing base (71e010b) to head (d426d4b).
Report is 2 commits behind head on master.

Files with missing lines	Patch %	Lines
src/distribution/levy.rs	97.37%	12 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master     #317      +/-   ##
==========================================
+ Coverage   94.27%   94.40%   +0.13%     
==========================================
  Files          58       59       +1     
  Lines       12974    13432     +458     
==========================================
+ Hits        12231    12681     +450     
- Misses        743      751       +8

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Qazalbash · 2025-01-29T17:27:44Z

On Windows, there is a difference at I guess, the 12th decimal place, should we ignore it?

YeungOnion · 2025-01-29T19:08:57Z

Was harder to find than I expected, but the toolchain is using MSVC per this workflow
so it's targeting a different math library on windows. In some sense, it's nice to know that things aren't exactly the same. If we switched to using gcc on windows, and there's still a difference that would be really useful to know.

I'm open to

windows workflows for both MSVC and GCC, and then we can modify the MSVC toolchain to run last of the 4 OS's and not fail fast, so if anyone wants to use MSVC then they'll be able to see how different things are.
or excluding MSVC from scope and making it clear on the README. Besides, people should be running unit tests on their machines.

I think I like the second idea better, but I want to know if anyone is using Rust with MSVC. nalgebra has it as a supported platform on i686, but on cursory glance, couldn't find where they test that.

Aside: It's honestly a little surprising that this is the first time the difference has been large enough to catch. The only clear distinctions appears to be the float power, since add, divide, and multiply are well defined by IEEE, but perhaps one is using fused multiply-add when not explicit and the other is not?

YeungOnion · 2025-01-29T19:13:19Z

What I can say is that this is not to do with a correct implementation on the Levy distribution. Also, can you tell me how you generated test cases? I've been wondering about people's different approaches.

Qazalbash · 2025-01-29T19:21:44Z

I think I like the second idea better, but I want to know if anyone is using Rust with MSVC. nalgebra has it as a supported platform on i686, but on cursory glance, couldn't find where they test that.

I also think this is right!

What I can say is that this is not to do with a correct implementation on the Levy distribution. Also, can you tell me how you generated test cases? I've been wondering about people's different approaches.

They are mathematically correct, I manually computed the equations in a separate Rust program.

YeungOnion · 2025-01-29T21:48:45Z

Oh that's good! I know we've got a fair bit against scipy. What's the other library or libraries you use?

Qazalbash · 2025-01-30T05:05:37Z

Oh that's good! I know we've got a fair bit against scipy. What's the other library or libraries you use?

For my research work, I predominantly use the JAX python framework. I also make open-source contributions to that ecosystem. As I mentioned before, I use numpyro for stats and writing a lightweight one for my personal research with almost similar APIs.

I saw this project quite a while ago and was learning rust, so I found it a sweet place to apply my rust knowledge while contributing to open-source.

Qazalbash added 2 commits January 16, 2025 13:42

feat: add Levy distribution

f82ee4c

feat: enhance Levy distribution with comprehensive documentation and …

347b4b6

…methods

YeungOnion mentioned this pull request Jan 17, 2025

More robust verifier in testing continuous PDF against CDF #318

Closed

YeungOnion mentioned this pull request Jan 28, 2025

test: add check for derivative of CDF to ensure it matches PDF #320

Merged

Merge branch 'master' into levy-dist

d426d4b

YeungOnion merged commit dd45aed into statrs-dev:master Jan 30, 2025
9 of 10 checks passed

Qazalbash deleted the levy-dist branch January 30, 2025 05:02

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add Levy distribution #317

feat: add Levy distribution #317

Qazalbash commented Jan 16, 2025

YeungOnion commented Jan 17, 2025

Qazalbash commented Jan 18, 2025

YeungOnion commented Jan 19, 2025 •

edited

Loading

Qazalbash commented Jan 20, 2025

Qazalbash commented Jan 26, 2025

codecov bot commented Jan 29, 2025 •

edited

Loading

Qazalbash commented Jan 29, 2025

YeungOnion commented Jan 29, 2025

YeungOnion commented Jan 29, 2025

Qazalbash commented Jan 29, 2025 •

edited

Loading

YeungOnion commented Jan 29, 2025

Qazalbash commented Jan 30, 2025

feat: add Levy distribution #317

feat: add Levy distribution #317

Conversation

Qazalbash commented Jan 16, 2025

YeungOnion commented Jan 17, 2025

Qazalbash commented Jan 18, 2025

YeungOnion commented Jan 19, 2025 • edited Loading

Qazalbash commented Jan 20, 2025

Qazalbash commented Jan 26, 2025

codecov bot commented Jan 29, 2025 • edited Loading

Codecov Report

Qazalbash commented Jan 29, 2025

YeungOnion commented Jan 29, 2025

YeungOnion commented Jan 29, 2025

Qazalbash commented Jan 29, 2025 • edited Loading

YeungOnion commented Jan 29, 2025

Qazalbash commented Jan 30, 2025

YeungOnion commented Jan 19, 2025 •

edited

Loading

codecov bot commented Jan 29, 2025 •

edited

Loading

Qazalbash commented Jan 29, 2025 •

edited

Loading