As it is now, you sample random fractions for each cell type. I wonder if it would be more efficient to sample based on the prior distributions in the training data, or at least to have this as a setting. Maybe just from a normal distribution around the proportions observed in the training data. In most cases the training data would be rather similar to the bulk: a large fraction of kidney cells would be tubular, liver cells would be hepatocytes, heart cells would be cardiomyocytes, and so on. Just a suggestion :)
Yes, that might make sense as an additional option. We intentionally didn't do it because it of course introduces some bias into the training set. If you have only one dataset for data simulation, and its cell type composition is somewhat unusual, that could be problematic. Also, scRNA-seq data is not the best tool for estimating cell type fractions, and sometimes cells are pre-selected, so the observed proportions need not reflect the tissue.
So it would be an interesting thing to try as an option, but I believe that the default should still be random fractions.
But if you want to cook up a PR, I would be happy to include that :)
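To make the proposal concrete, here is a minimal sketch of what such an option could look like. It is not the project's actual code: the function name `sample_fractions` and the parameters `prior` and `sd` are hypothetical, and the random default simply normalizes uniform draws, which may differ from what the tool does internally. With `prior=None` the fractions are random; with a prior, each fraction is drawn from a normal distribution around the observed proportion, clipped to stay positive, and renormalized.

```python
import numpy as np


def sample_fractions(n_samples, n_celltypes, prior=None, sd=0.1, rng=None):
    """Sample cell type fractions for simulated bulk samples (illustrative only).

    prior: optional cell type proportions observed in the training data
           (hypothetical parameter). If None, random fractions are used.
    sd:    standard deviation of the normal noise around the prior.
    """
    rng = np.random.default_rng() if rng is None else rng

    if prior is None:
        # Default behaviour: random fractions, no bias toward any composition.
        raw = rng.random((n_samples, n_celltypes))
    else:
        prior = np.asarray(prior, dtype=float)
        prior = prior / prior.sum()  # make sure the prior sums to 1
        # Normal distribution centred on the observed proportions.
        raw = rng.normal(loc=prior, scale=sd, size=(n_samples, n_celltypes))
        # Negative draws are not valid fractions; clip to a small positive value.
        raw = np.clip(raw, 1e-6, None)

    # Renormalize so that each simulated sample sums to 1.
    return raw / raw.sum(axis=1, keepdims=True)


# Random fractions (default) versus fractions centred on a kidney-like prior.
random_fracs = sample_fractions(1000, 3, rng=np.random.default_rng(0))
prior_fracs = sample_fractions(
    1000, 3, prior=[0.7, 0.2, 0.1], rng=np.random.default_rng(0)
)
```

Clipping followed by renormalization slightly distorts the distribution for rare cell types, so a Dirichlet distribution parameterized by the observed proportions might be a cleaner alternative. Either way, the bias concern raised above still applies, which is why the sketch keeps random fractions as the default.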