New features for seqminer (8.0) #12

Open
WenjianBI opened this issue Jul 28, 2020 · 6 comments
Comments

@WenjianBI

Hi Xiaowei,

I am using seqminer (v8.0) and it works well across multiple operating systems. Could you add a few features to the existing functions?

  1. We usually do not need all subjects in an analysis. For readBGENToMatrixByRange() and readVCFToMatrixByRange(), could you add an argument such as 'subjIDs' or 'subjIndex' to specify which subjects to include? That can save a lot of memory.

  2. Could you add a function that splits all markers into multiple ranges, each containing a similar number of markers? In a genome-wide analysis, the genotypes of all markers cannot be held in memory at once, so such a function would be a great help. If possible, I suggest the new function look like splitRange(fileName, memoryChunk = 4GB, subjIDs, ...). The output could be a data.frame in which each row describes one range.

  3. Sometimes the PLINK bed/bim/fam files or the BGEN bgen/bgi files have different prefix names. Could you let users specify a separate name for each file? That would also be helpful.
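
To make item 2 concrete, here is a rough sketch of the kind of splitting I have in mind. It is not seqminer code: `splitRangeSketch()` and its arguments are hypothetical, it takes a data.frame of marker positions (for example, read from a bim file) rather than a genotype file name, and it assumes each genotype costs 8 bytes (one double) per subject. The `chrom:start-end` strings are meant to resemble the range strings passed to the ByRange readers, but that format is also an assumption here.

```r
# Hypothetical helper, not part of seqminer: split markers into contiguous
# ranges whose genotype matrices should each fit within a memory budget.
splitRangeSketch <- function(markers, nSubjects, memoryBytes = 4 * 1024^3) {
  stopifnot(all(c("chrom", "pos") %in% names(markers)),
            nrow(markers) > 0, nSubjects > 0)
  markers$chrom <- as.character(markers$chrom)

  # Assumption: 8 bytes per genotype, so one marker costs 8 * nSubjects bytes
  markersPerChunk <- max(1, floor(memoryBytes / (8 * nSubjects)))

  # Sort so that every chunk covers a contiguous stretch of one chromosome
  markers <- markers[order(markers$chrom, markers$pos), , drop = FALSE]

  pieces <- lapply(split(markers, markers$chrom), function(m) {
    # Assign consecutive markers on this chromosome to numbered chunks
    chunk <- ceiling(seq_len(nrow(m)) / markersPerChunk)
    data.frame(
      chrom = m$chrom[1],
      start = as.vector(tapply(m$pos, chunk, min)),
      end = as.vector(tapply(m$pos, chunk, max)),
      nMarkers = as.vector(table(chunk)),
      stringsAsFactors = FALSE
    )
  })
  ranges <- do.call(rbind, pieces)
  rownames(ranges) <- NULL

  # format() avoids scientific notation such as "1e+05" for large positions
  ranges$range <- paste0(
    ranges$chrom, ":",
    format(ranges$start, scientific = FALSE, trim = TRUE), "-",
    format(ranges$end, scientific = FALSE, trim = TRUE)
  )
  ranges
}

# Toy example: 13 markers, 1000 subjects, budget for 4 markers per range
markers <- data.frame(
  chrom = c(rep("1", 10), rep("2", 3)),
  pos = c(seq(100, 1000, by = 100), 50, 150, 250)
)
res <- splitRangeSketch(markers, nSubjects = 1000, memoryBytes = 8 * 1000 * 4)
print(res)
```

Each row of the result describes one range, so a genome-wide analysis could loop over `res$range` and read one chunk at a time. Markers that share a position across a chunk boundary would fall into two overlapping ranges, which a real implementation would need to handle.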

Thanks,
Wenjian

@zhanxw
Owner

zhanxw commented Jul 28, 2020 via email

@WenjianBI
Author

Thank you for the swift reply. BGEN files are becoming more and more popular, and I think your package can be a very important tool for R users.

@garyzhubc

I think it would be great to have an option to load a matrix from readBGENToMatrixByRange() indexed by rsID instead of position.

@zhanxw
Owner

zhanxw commented Feb 11, 2021 via email

@garyzhubc

Missing-data imputation would be an important feature to have. I wonder how missing genotypes are handled in the current version.

@zhanxw
Owner

zhanxw commented Feb 12, 2021 via email
