
Questions Regarding Reproducibility #3

Open
AuthorUnknown404 opened this issue Feb 15, 2025 · 1 comment

Comments

@AuthorUnknown404

I found your work very interesting, but when applying it in practice, the results do not align with those presented in the paper. Before drawing any conclusions, I would like to clarify whether we might be using the method incorrectly.

For example, we used the human cortex dataset from your paper as a test case. We directly fed the raw count matrix into the embedding pipeline and followed the tutorial provided on GitHub, attempting to replicate the spatial domain detection task as described in the paper. Unfortunately, as shown in the attached results, the performance was significantly worse than expected, let alone exceeding that of stLearn or SpaGCN. Could you confirm whether this is the expected performance of the method, or whether there might be an issue with our usage?

[Attached image: spatial domain detection results]
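
For reference, this is roughly the workflow we followed (a minimal sketch; the model-loading and embedding calls below are placeholders for the tutorial's actual functions, and the sample path is illustrative):

```python
import scanpy as sc
# NOTE: `spatial_model` and its functions are placeholders for the
# actual package / tutorial API, which is what we are asking about.
import spatial_model

# DLPFC human cortex Visium sample (path is illustrative)
adata = sc.read_visium("DLPFC_151673/")
adata.var_names_make_unique()

# Zero-shot embedding on the raw count matrix, per the GitHub tutorial
# (no normalization or HVG selection applied beforehand).
model = spatial_model.load_pretrained()      # placeholder call
adata.obsm["X_emb"] = model.encode(adata)    # placeholder call

# Spatial domain detection: cluster the embeddings, then compare the
# clusters against the manual layer annotations (e.g. with ARI).
sc.pp.neighbors(adata, use_rep="X_emb")
sc.tl.leiden(adata, key_added="domain")
```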

Secondly, in the paper you mention that the results were fine-tuned before comparison with unsupervised methods such as SpaGCN, which seems like an unfair comparison. Additionally, you reported results on only two LIBD samples. Were the other samples used for training, or were these two simply the only ones that performed reasonably well?

Lastly, the paper states that different experts were used to handle different platforms. However, the code does not seem to reflect this. While the model parameters include a platform dictionary, the input does not appear to specify platform settings explicitly. Could you clarify how the platform should be set in the implementation?

I appreciate your time and look forward to your response.

@ChloeXWang
Collaborator

Hi @AuthorUnknown404, thank you for using our methods!

For Visium data, the results presented in the paper were evaluated on finetuned models. We plan to release the finetuning tutorial in the next few weeks. The finetuning pipeline follows an unsupervised workflow similar to pretraining, without using any cell type or spatial domain annotations, so it does not constitute an unfair comparison. There are also some additional preprocessing steps for Visium data, such as highly variable gene selection, that were not included in the zero-shot tutorial.
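
Roughly, the extra Visium preprocessing looks like the following (a minimal scanpy sketch only; the gene count and flavor here are placeholder values, not the exact settings used in the paper):

```python
import scanpy as sc

adata = sc.read_visium("path/to/visium_sample")  # placeholder path
adata.var_names_make_unique()

# Select highly variable genes on the raw counts.
# n_top_genes and flavor are assumed values, not the paper's settings.
sc.pp.highly_variable_genes(adata, n_top_genes=3000, flavor="seurat_v3")
adata = adata[:, adata.var["highly_variable"]].copy()
```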

For zero-shot inference, the embeddings are retrieved from the encoders without passing through the decoders. Modality-specific decoding is implemented in the decoders to support the gene expression prediction objectives used in pretraining and finetuning. In short, decoding is used only during training, not at inference time, which is why the current zero-shot inference code does not take modalities as arguments.
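
Schematically, the separation looks like this (an illustrative sketch only, not our actual model code; all class, attribute, and platform names below are made up):

```python
import torch
import torch.nn as nn

class EncoderDecoderSketch(nn.Module):
    """Illustrative only: the encoder produces embeddings; the
    platform/modality-specific decoder heads are used for the gene
    expression prediction objective during pretraining/finetuning."""

    def __init__(self, n_genes: int, d_emb: int = 128):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(n_genes, 512), nn.ReLU(), nn.Linear(512, d_emb)
        )
        # One decoder head per platform/modality (names are placeholders).
        self.decoders = nn.ModuleDict({
            "visium": nn.Linear(d_emb, n_genes),
            "merfish": nn.Linear(d_emb, n_genes),
        })

    def forward(self, x, platform: str):
        # Training/finetuning: encode, then decode with the
        # platform-specific head to reconstruct expression.
        z = self.encoder(x)
        return self.decoders[platform](z)

    @torch.no_grad()
    def embed(self, x):
        # Zero-shot inference: encoder only, so no decoder and hence
        # no platform/modality argument is needed.
        return self.encoder(x)
```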

Hope this helps clarify your questions.
