Allele specific expression on multiple samples #42

Open
mbosio85 opened this issue Jan 22, 2018 · 4 comments

Comments

@mbosio85

Hello,

I ran phASER for multiple samples and extracted allele specific expression for each sample independently.
Since these samples are split into case/control groups, I would like to analyze the expression data to see if there is anything interesting.
My question is whether the allele specific expression data from multiple samples are directly comparable.
Can I compare aCount and bCount across multiple samples straight away, or is there a risk that what is measured as aCount for sample X ends up as bCount for sample Y?

Do you have a suggested protocol for this task?

Thanks a lot

Mattia

@secastel
Owner

Hi Mattia,
Specific questions about best practices for the analysis of ASE data are a bit outside the scope of phASER. It is primarily a data generation tool; after that, it is up to the user to decide what the best analysis is for their specific question of interest. There is no single general protocol for the analysis of ASE data. As a rule of thumb, though, you should not compare the raw counts across samples, as these are determined by read depth. Instead, you should compare something like the allelic fold change (log2_aFC, output by phaser_gene_ae), which is the log-transformed ratio of aCount over bCount. How you carry out and interpret this comparison is up to you. For example, you might find certain genes that have increased aFC in cases versus controls. This could indicate an increase in strong regulatory effects that might be involved in disease risk.
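To make the read-depth point concrete, here is a minimal sketch (not part of phASER) that turns per-sample haplotype counts into a log2 ratio and compares groups. The sample names, the counts, and the pseudocount of 1 are all made up for illustration; in practice you would take log2_aFC, or aCount and bCount, from your own phaser_gene_ae output. The absolute value is included only as one possible choice when the A/B labels are not matched across samples.

```python
import math
from statistics import mean

def log2_afc(a_count, b_count, pseudocount=1.0):
    """log2(aCount / bCount); the pseudocount (an assumption) avoids division by zero."""
    return math.log2((a_count + pseudocount) / (b_count + pseudocount))

# Hypothetical (aCount, bCount) pairs for one gene; note the very different depths.
cases = {"case1": (120, 40), "case2": (300, 90)}
controls = {"ctrl1": (55, 50), "ctrl2": (210, 200)}

# Use |log2 aFC| so the result does not depend on which haplotype was labelled A.
case_afc = {s: abs(log2_afc(a, b)) for s, (a, b) in cases.items()}
control_afc = {s: abs(log2_afc(a, b)) for s, (a, b) in controls.items()}

# Difference in mean allelic imbalance between the two groups.
diff = mean(case_afc.values()) - mean(control_afc.values())
print(f"mean |log2 aFC| cases - controls = {diff:.2f}")
```

With only a few samples per group this difference is descriptive at best; which statistical test to use depends on the study design.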

I would strongly suggest reading our paper on the analysis of ASE data here:

https://genomebiology.biomedcentral.com/articles/10.1186/s13059-015-0762-6

Hope this helps.

Stephane

@everestial
Contributor

Hi @mbosio85,

I have used phASER extensively and can tell you from experience that the aCount (or haplotype) for sample X may turn up as the bCount (or haplotype) for sample Y. I am doing ASE analyses in F1 hybrids, and I need a proper haplotype configuration (the correct phase state connecting each haplotype block to the next). phASER has been excellent at creating local haplotype blocks, but I had to write my own parser to update my GW haplotype.
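As a rough illustration of the label-swap problem, the sketch below re-orients each sample's counts by the allele that haplotype A carries at a shared heterozygous variant. It is only a sketch under strong assumptions: the field name hap_a_allele, the ref_allele value, and the counts are hypothetical and are not phASER's output format, and every sample is assumed to be heterozygous at the same anchor variant.

```python
def orient_counts(a_count, b_count, hap_a_allele, ref_allele):
    """Return (ref_haplotype_count, alt_haplotype_count) regardless of A/B labelling."""
    if hap_a_allele == ref_allele:
        return a_count, b_count
    # Haplotype A carries the alternate allele here, so swap the two counts.
    return b_count, a_count

# Hypothetical records: in sample Y the reference allele sits on haplotype B.
samples = {
    "X": {"aCount": 80, "bCount": 20, "hap_a_allele": "G"},
    "Y": {"aCount": 25, "bCount": 75, "hap_a_allele": "T"},
}
ref_allele = "G"  # assumed reference allele at the anchor variant

oriented = {
    name: orient_counts(r["aCount"], r["bCount"], r["hap_a_allele"], ref_allele)
    for name, r in samples.items()
}
print(oriented)
```

After re-orientation both samples show more reads on the haplotype carrying the reference allele, which the raw aCount and bCount columns would have hidden.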

Based on your question, it looks like you are interested in figuring out which haplotypes are the same or similar across different samples. I am almost finished with another Python tool that can test the allele-genotype information across several samples for overlapping haplotype blocks and then assign and extend each block. Maybe that will interest you.

Let me know if you have any questions.

@smoenga55

Hi @everestial
Did you finish this tool?

@everestial
Contributor

@smoenga55
The tools are phase stitcher https://github.com/everestial/phase-Extender

and phase extender https://github.com/everestial/phase-Extender

The way haplotype phase extension is done depends on the assumed relationships between samples, so make sure those assumptions are clear.
