Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Segfault with BetaCoalescent and ploidy greater than diploid #2307

Closed
nspope opened this issue Aug 4, 2024 · 9 comments
Closed

Segfault with BetaCoalescent and ploidy greater than diploid #2307

nspope opened this issue Aug 4, 2024 · 9 comments
Labels

Comments

@nspope
Copy link
Contributor

nspope commented Aug 4, 2024

The following snippet gives me a segfault:

import msprime
print(msprime.__version__) # 1.3.2
msprime.sim_ancestry(
    samples=10,
    recombination_rate=1e-8,
    sequence_length=1e6,
    population_size=1e4,
    ploidy=3,
    random_seed=1024,
    model=msprime.BetaCoalescent(alpha=1.5),
)
@andrewkern
Copy link
Member

fwiw I can recreate a similar error on msprime 1.3.1 on my mac laptop, although it's telling me abort rather than segfault

@jeromekelleher
Copy link
Member

Confirmed also on Linux, v 1.3.1. Investigating.

@jeromekelleher
Copy link
Member

Well the problem is a pretty straighforward. We assume there are at most four "pots" to assign lineages into here:

avl_tree_t *ancestors, Q[4]; /* MSVC won't let us use num_pots here */

But a few lines later we set num_parental_copies to 2 * ploidy, which we then use the number of "pots":

num_parental_copies = 2 * self->ploidy;

@JereKoskela can you comment here please? Is the correct thing to do here just to malloc Q with num_parental_copies and not just 4 (easy). We would need to do some statistical validation as well I think, as this has clearly never been tested for e.g. triploids.

jeromekelleher added a commit to jeromekelleher/msprime that referenced this issue Aug 5, 2024
@jeromekelleher
Copy link
Member

#2308 fixes the segfault. I'd need some help in figuring out whether it's actually doing the right thing, though.

@JereKoskela
Copy link
Member

JereKoskela commented Aug 5, 2024

Making num_parental_copies the number of buckets is the right thing to do.

Statistical validation is not totally straightforward because this is the only implementation for ploidy > 2 of which I'm aware. We could estimate moments of TMRCAs of n = 2, for which the expectation and variance are easy to compute for various ploidies, say 1, ..., 6.

@jeromekelleher
Copy link
Member

Statistical validation is not totally straightforward because this is the only implementation for ploidy > 2 of which I'm aware. We could estimate moments of TMRCAs of n = 2, for which the expectation and variance are easy to compute for various ploidies, say 1, ..., 6.

That sounds great - would you be able to pick this up if I merge #2308?

I'll push out a bugfix release in the next couple of days, as we don't really want segfault bugs lying around.

@JereKoskela
Copy link
Member

JereKoskela commented Aug 5, 2024

Sure, I'll put something together. Go ahead and merge.

@jeromekelleher
Copy link
Member

Great, thanks a lot. I'll push out the bugfix once we're reasonably sure that the fix works.

@jeromekelleher
Copy link
Member

Closed in #2310 and #2308. Great work everyone!

jeromekelleher added a commit to jeromekelleher/msprime that referenced this issue Aug 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants