loading package taking too much memory #44
This observation needs to be spelled out more. What kinds of high-performance applications are suffering from the size/load time you are commenting on? Can you restructure the data access tasks so that the R processes operating on the data don't need the package to be attached? Selective importing of symbols can also reduce memory footprint.
Since it has been 3.5 years since this issue was raised, without a solution, I wrote some guidance for package developers who have noticed it and want to minimize their memory footprint. In short, do not import SummarizedExperiment (SE) in your NAMESPACE file, but do list it under Imports in the DESCRIPTION file. This ensures that users install SE as a dependency when installing your package, while its namespace is loaded only when it is actually needed. Also, do not use
The SE container depends on other (heavy) packages like IRanges, GenomeInfoDb, SparseArray, etc. These packages all depend on each other, so each of them would need an update to its namespace to solve this at a fundamental level. In my case, it is convenient to use an SE container in a late, optional stage of my pipeline. However, if SE is in the package namespace, it adds about 500 MB per core in a multicore cluster (on Windows). This is because R loads the entire namespace on every worker when creating a (SNOW) cluster, even if it is never used. While it may be a niche use case, keeping SE out of the NAMESPACE saves me from loading 4 GB of unused dependencies when using 8 cores. Finally, if you do list SummarizedExperiment in your NAMESPACE, any other package that depends on yours will suffer from the same issue, so removing it from the NAMESPACE file could prevent similar issues in the future. Hope this helps!
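A minimal sketch of the approach described above, assuming the package's DESCRIPTION lists `Imports: SummarizedExperiment` and its NAMESPACE contains no `import()` or `importFrom()` directive for it. The function name `as_se()` and its `counts` argument are hypothetical and only serve as an illustration:

```r
# Hypothetical package function; as_se() is an illustrative name, not from this thread.
# Assumed setup: DESCRIPTION has "Imports: SummarizedExperiment",
# and NAMESPACE has no import(SummarizedExperiment) line.
as_se <- function(counts) {
  # The SE namespace (and its dependencies) is loaded here, on first use,
  # rather than when the package itself is loaded on every worker.
  if (!requireNamespace("SummarizedExperiment", quietly = TRUE)) {
    stop("Package 'SummarizedExperiment' is required for as_se().")
  }
  # Fully qualified call, so no NAMESPACE import is needed.
  SummarizedExperiment::SummarizedExperiment(assays = list(counts = counts))
}
```

With this layout, workers that never call `as_se()` should not pay for the SE namespace; only the process that runs the optional final stage loads it.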
SummarizedExperiment is using too much memory and time to load. The code below and its results indicate that it takes 328 MB of memory. For a single use this is not a big issue. However, using
library(SummarizedExperiment)
on high-performance computing systems causes the code to crash due to this high memory use.
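The measurement code from the original report is not preserved here. As a rough sketch of one way to estimate the footprint, the snippet below compares the memory reported by base R's `gc()` before and after loading the namespace. This is an assumption about method, not the reporter's code; `gc()` only counts R's own heap, so the figure seen by the operating system may be higher. The helper name `mem_used_mb()` is hypothetical.

```r
# Hypothetical helper: total memory currently used by R objects, in Mb.
# Column 2 of the gc() matrix is the "used" amount in Mb (Ncells + Vcells).
mem_used_mb <- function() sum(gc()[, 2])

pkg <- "SummarizedExperiment"

before <- mem_used_mb()
# system.file() checks whether the package is installed without loading it.
if (nzchar(system.file(package = pkg))) {
  suppressPackageStartupMessages(loadNamespace(pkg))
}
after <- mem_used_mb()

cat(sprintf("Loading %s changed R heap usage by about %.0f Mb\n",
            pkg, after - before))
```

Running this in a fresh R session gives the cleanest comparison, since any dependency that is already loaded will not be counted again.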