ValueError: Expected parameter loc (Parameter of shape (1, 13866)) of distribution Normal(loc: torch.Size([1, 13866]), scale: torch.Size([1, 13866])) to satisfy the constraint Real(), but found invalid values #397

Open
hua1991 opened this issue Jan 19, 2025 · 1 comment
Labels
question Further information is requested

Comments

hua1991 commented Jan 19, 2025

Dear cell2location team,

I encountered an error when training cell2location on Visium HD spatial transcriptomics data. We first aggregated the bins into cells using the 10x nucleus segmentation strategy, and then used the resulting h5ad file as input for the cell2location analysis. As a reference, we used 5' scRNA-seq data from the same tissue. The “mod.train” step worked fine for the scRNA-seq data, but for the spatial data it raised “ValueError: Expected parameter loc (Parameter of shape (1, 13866)) of distribution Normal(loc: torch.Size([1, 13866]), scale: torch.Size([1, 13866])) to satisfy the constraint Real(), but found invalid values”.

I checked the input spatial data and confirmed that it contains raw integer counts that have not been normalized. For the spatial model, I set “N_cells_per_location” to 1 and “detection_alpha” to 200. I don't know whether these parameters make sense for Visium HD nucleus segmentation data, or whether these values cause the error above. Could you help resolve this issue?

Best,
hua

hua1991 added the “question” (Further information is requested) label Jan 19, 2025
Contributor

vitkl commented Jan 28, 2025

Hi @hua1991

When does this happen: immediately at training step 1 (which would point to a problem with the input data format, such as NA values or 0s for all locations), or after some training (which would point to numerical instability caused by extreme values, either very high or very low)?
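A quick way to rule out the first case is to inspect the count matrix before training. The sketch below is only an illustration, not part of cell2location: `diagnose_counts` is a hypothetical helper, and it assumes the counts are available as a dense locations x genes array (for an AnnData object this would typically be something like `adata_vis.X`, densified if it is sparse).

```python
import numpy as np

def diagnose_counts(X):
    """Summarise common input problems in a locations x genes count matrix.

    Hypothetical helper for illustration; not a cell2location function.
    """
    X = np.asarray(X, dtype=float)
    finite = np.isfinite(X)
    # Replace NaN/inf by 0 so the remaining checks are well defined.
    X0 = np.where(finite, X, 0.0)
    nonzero = X0 != 0
    return {
        "n_non_finite": int((~finite).sum()),
        "n_negative": int((X0 < 0).sum()),
        "n_non_integer": int((X0 != np.round(X0)).sum()),
        "n_all_zero_genes": int((~nonzero.any(axis=0)).sum()),
        "n_all_zero_locations": int((~nonzero.any(axis=1)).sum()),
        "max_count": float(X0.max()),
    }

# Toy matrix: 3 locations (rows) x 3 genes (columns).
X = np.array([[0, 3, 0],
              [0, 1, 0],
              [0, 0, 0]])
report = diagnose_counts(X)
print(report)
```

Any non-zero entry in the first five fields would be worth fixing before calling “mod.train” again; a very large `max_count` may hint at the second case (extreme values).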

The offending parameters seem to be the gene-specific tech effect difference, m_g. I would recommend additional gene filtering based on the Visium HD data itself, not just the scRNA-seq gene filtering shown in the tutorials.
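As a rough sketch of what filtering on the spatial data could look like: the helper `spatial_gene_mask` and both thresholds below are hypothetical and would need tuning for a real dataset. The input is assumed to be a dense locations x genes count array.

```python
import numpy as np

def spatial_gene_mask(X, min_locations=10, min_total_counts=20):
    """Boolean mask of genes with enough support in the spatial data.

    Hypothetical helper; the default thresholds are placeholders.
    """
    X = np.asarray(X)
    n_detected = (X > 0).sum(axis=0)   # locations in which each gene is detected
    total = X.sum(axis=0)              # total counts per gene
    return (n_detected >= min_locations) & (total >= min_total_counts)

# Toy matrix: 3 locations (rows) x 3 genes (columns).
X = np.array([[5, 0, 1],
              [3, 0, 0],
              [4, 0, 0]])
mask = spatial_gene_mask(X, min_locations=2, min_total_counts=5)
print(mask)  # only the first gene passes both thresholds
```

The same mask would then need to be applied to both the spatial data and the reference signatures, so that the two still share an identical gene set.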

With Visium HD it could be important to normalise column and row effects before aggregating per cell (as done in this work: https://github.com/Teichlab/bin2cell). That raises the question of how to then apply count-based methods such as cell2location and scVI, which expect integer counts. You probably need to normalise for column and row effects while keeping the scale of the values similar to the original data, and then convert back to integers by sampling, using the normalised values as the Poisson mean. All of these options need to be tested.
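The last step, converting normalised values back to integers by Poisson sampling, could be sketched as follows. This is an untested idea rather than an established workflow: `poisson_integerise` is a hypothetical helper, and `X_norm` is assumed to be a non-negative array that has already been normalised (for example by bin2cell) and is on a scale similar to the original counts.

```python
import numpy as np

def poisson_integerise(X_norm, seed=0):
    """Draw integer counts with the normalised values as Poisson means.

    Hypothetical helper for illustration; not part of cell2location or bin2cell.
    """
    X_norm = np.asarray(X_norm, dtype=float)
    if not np.isfinite(X_norm).all() or (X_norm < 0).any():
        raise ValueError("normalised values must be finite and non-negative")
    rng = np.random.default_rng(seed)  # fixed seed for reproducibility
    return rng.poisson(lam=X_norm)

# Toy input: a constant normalised expression of 2.5 everywhere.
X_norm = np.full((200, 50), 2.5)
counts = poisson_integerise(X_norm)
print(counts.dtype, counts.mean())
```

Sampling keeps the expected value of each entry equal to its normalised value, whereas plain rounding would push every value below 0.5 to zero. Zero entries stay exactly zero, since a Poisson distribution with mean 0 always returns 0.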
