calculate cis/total during balancing and use it to filter rows #134
Comments
Intra-chromosomal translocations are a problem too. They can probably be filtered similarly, but using long-range cis over total cis, or long-range cis over short-range cis.
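For illustration, here is a rough sketch of the long-range cis over total cis ratio. It assumes a dense, symmetric contact matrix and a per-bin array of chromosome labels; the function name and the `min_sep` cutoff (in bins) are made up for this example and are not part of cooler.

```python
import numpy as np

def long_range_cis_fraction(matrix, chrom_ids, min_sep=10):
    """Per-bin fraction of cis contacts at a separation of >= min_sep bins.

    matrix    : dense square contact matrix (hypothetical input)
    chrom_ids : chromosome label of every bin
    min_sep   : assumed cutoff, in bins, between short- and long-range cis
    """
    matrix = np.asarray(matrix, dtype=float)
    chrom_ids = np.asarray(chrom_ids)
    idx = np.arange(matrix.shape[0])

    same_chrom = chrom_ids[:, None] == chrom_ids[None, :]
    far = np.abs(idx[:, None] - idx[None, :]) >= min_sep

    cis = np.where(same_chrom, matrix, 0.0).sum(axis=1)
    long_cis = np.where(same_chrom & far, matrix, 0.0).sum(axis=1)

    # Bins without any cis contacts get NaN instead of a division error.
    frac = np.full(cis.shape, np.nan)
    np.divide(long_cis, cis, out=frac, where=cis > 0)
    return frac
```

Bins with an unusual value of this ratio relative to the rest of the chromosome would then be the candidates to inspect or mask.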
Since cis/total has a broad, locus-dependent range influenced by many
factors, I've found that filtering on whether a bin has any extreme
trans values (e.g. comparable to the third cis diagonal) can be useful in
addition to just filtering on cis/total.
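One possible reading of this, as a sketch only: take the median of the third cis diagonal as the reference level and flag any bin with at least one trans pixel that reaches it. The dense-matrix input, the function name and the use of the median are all assumptions made for this example.

```python
import numpy as np

def bins_with_extreme_trans(matrix, chrom_ids, diag=3):
    """Flag bins having a trans value >= the median of the diag-th cis diagonal."""
    matrix = np.asarray(matrix, dtype=float)
    chrom_ids = np.asarray(chrom_ids)
    n = matrix.shape[0]

    # Values on the diag-th diagonal, keeping only pairs on the same chromosome.
    i = np.arange(n - diag)
    on_same_chrom = chrom_ids[i] == chrom_ids[i + diag]
    diag_vals = matrix[i, i + diag][on_same_chrom]
    if diag_vals.size == 0 or np.median(diag_vals) <= 0:
        raise ValueError("no positive reference level on the chosen cis diagonal")
    reference = np.median(diag_vals)

    # Largest trans value seen by each bin (cis pixels are zeroed out).
    is_trans = chrom_ids[:, None] != chrom_ids[None, :]
    max_trans = np.where(is_trans, matrix, 0.0).max(axis=1)
    return max_trans >= reference
```

The returned boolean mask could be combined with a cis/total outlier mask using a logical OR.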
Bumping this: @golobor suggests that the cis and total sums can be stored as columns during balancing.
See #210
A major practical issue with processing Hi-C data is the presence of genomic translocations. They introduce errors into the calculated genomic distances and cause cis contacts to be confused with trans ones, thus breaking the assumptions of downstream analyses (obs/exp, eigenvectors).
Historically, @mimakaev and @gfudenberg dealt with these issues by filtering out genomic bins that form an atypically high fraction of trans contacts. Recently, @Phlya pointed out the need for such filters in cooler.
A simple suggestion would be to calculate the cis/total fraction in raw bins (cis_tot_raw), detect low-value outliers using MADmax, and filter them out (on top of the already used MADmax-coverage filter). It may also be useful to report both cis_tot_raw and cis_tot_balanced (i.e. the cis/total fraction after filtering/balancing).
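A minimal sketch of what this could look like, assuming a dense raw matrix and a per-bin array of chromosome labels. The function names, the `mad_max` default and the choice to apply the median-absolute-deviation cutoff to the untransformed fraction are illustrative assumptions, not the existing cooler implementation.

```python
import numpy as np

def cis_total_fraction(matrix, chrom_ids):
    """Per-bin cis/total fraction (cis_tot_raw when given raw counts)."""
    matrix = np.asarray(matrix, dtype=float)
    chrom_ids = np.asarray(chrom_ids)

    same_chrom = chrom_ids[:, None] == chrom_ids[None, :]
    cis = np.where(same_chrom, matrix, 0.0).sum(axis=1)
    total = matrix.sum(axis=1)

    # Empty bins get NaN instead of a division error.
    frac = np.full(total.shape, np.nan)
    np.divide(cis, total, out=frac, where=total > 0)
    return frac

def low_outlier_mask(values, mad_max=5.0):
    """True for finite values below median - mad_max * MAD."""
    values = np.asarray(values, dtype=float)
    valid = np.isfinite(values)
    med = np.median(values[valid])
    mad = np.median(np.abs(values[valid] - med))
    cutoff = med - mad_max * mad
    return valid & (values < cutoff)
```

Usage would be along the lines of `bad = low_outlier_mask(cis_total_fraction(raw, chrom_ids))`, with the flagged bins masked before balancing; running `cis_total_fraction` again on the balanced matrix would give cis_tot_balanced.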