Add information on recommended method of installing Bioconductor packages #292
Comments
Thanks, this mostly sounds accurate. The key distinction here is that, in my understanding, BioC packages are already frozen to the annual R version, much like Ubuntu and other Linux distros do with their default repositories. I believe the bioc installer selects the appropriate repository based on the R version, so the rocker-versioned approach here is basically to leave well enough alone. As we already freeze the R version, the corresponding BioC repo should be determined from that. Let me know if that makes sense or if I'm missing something. (I'm not an active user of many BioC packages, so I could easily be missing something in my understanding here!) Agree 💯 that we ought to improve the docs about this in any case |
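For readers less familiar with the pairing described above: each R x.y series is matched by two Bioconductor releases (spring and autumn), which is why a frozen R version already narrows down the BioC repository. A minimal shell sketch of that mapping follows. The function name bioc_releases_for_r is hypothetical, and the table is hand-copied for R 4.0 through 4.3 only, so BiocManager itself remains the authority.

```shell
#!/bin/sh
# Print the Bioconductor releases built against a given R x.y series.
# Hypothetical helper: the table is hand-written for illustration and
# covers only R 4.0 to 4.3.
bioc_releases_for_r() {
    case "$1" in
        4.0) echo "3.11 3.12" ;;
        4.1) echo "3.13 3.14" ;;
        4.2) echo "3.15 3.16" ;;
        4.3) echo "3.17 3.18" ;;
        *)   echo "unknown R series: $1" >&2; return 1 ;;
    esac
}
```

Calling bioc_releases_for_r 4.2 prints the two releases that pair with R 4.2; an unlisted series returns a non-zero status instead of guessing.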
How about modifying the rocker-versioned2/scripts/bin/install2.r Lines 81 to 84 in 889a33b
|
Thanks both for your replies!
If that is possible, I'd be stoked! It would solve the major problem I'm facing and also make the behaviour of the script more consistent with not just
That does indeed make sense! I'm quite new to bioconductor myself (or rather, I've never had the need to delve into the way it manages packages), so here's what I've gathered just now:
In any case, from what I can tell, BiocManager (and BiocVersion) seem to work just fine regardless of whether the bioconductor or the RSPM repository is being used. I.e., users can install a desired version of bioconductor (and will be warned when they try to use a version that is incompatible with the available version of R), and the different repository URLs (BioCSoft, BioCAnn, etc.) will be adjusted automatically (using the repository URL prefix that is set by So all of that seems to work as intended and I agree with your "leave well enough alone" assessment ;) Apologies for writing out this wall of text, but at the very least it helped me get a better grip on things. Since these specific peculiarities are pretty much unique to bioconductor, I understand that it's a bit difficult to gauge how much of it needs to be documented by the rocker project as opposed to by bioconductor though... Perhaps, the fact that BioC manager is installed in tidyverse, but that the default repository is retained, alongside a warning on how best to install BioC packages could be worthwhile additions? |
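To illustrate how the sub-repository URLs follow from a mirror prefix and a Bioconductor version, here is a small shell sketch. It assumes the usual layout of packages/VERSION/ followed by bioc, data/annotation, data/experiment and workflows; the function name bioc_repo_urls is made up for this example, and BiocManager::repositories() is what actually decides the URLs inside R.

```shell
#!/bin/sh
# Print the repository URLs for one Bioconductor release, one per line.
# $1: Bioconductor version (e.g. 3.16)
# $2: optional mirror prefix (defaults to https://bioconductor.org)
# Hypothetical helper; the path layout is an assumption stated in the text.
bioc_repo_urls() {
    version="$1"
    mirror="${2:-https://bioconductor.org}"
    for path in bioc data/annotation data/experiment workflows; do
        printf '%s/packages/%s/%s\n' "$mirror" "$version" "$path"
    done
}
```

Passing https://packagemanager.rstudio.com/bioconductor as the second argument yields the same four paths under that prefix, which mirrors what changing the mirror option does to the repository list.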
The README currently states:
It would be helpful to add info here on Bioconductor package installation (e.g., It would also be helpful to include information on how to install Bioconductor packages when |
Thanks @nick-youngblut ! PR's always welcome, we're a community-driven project. |
I can see why you'd like help, given how much of a pain writing documentation can be, but asking for help with documentation from those that are currently looking for the documentation seems like it will lead to documentation edits that do not incorporate best-practices, as defined by the software developers. For instance, I'm currently trying the following:
...but I don't know if it will work (the build is still running) or if it follows best-practices. If it does work, I can create a PR with an updated README, but I'm guessing the person(s) reviewing the PR will just have to heavily edit the changes. |
Hey @nick-youngblut , thanks! Yup, a PR is a great way for us as a community to discuss these things! This is not just because I am too lazy to update the readme, but because that discussion process of issues and PRs usually gets us to a better point that meets the needs of other users, and is also easier for other developers and community members to chime in. I agree with you that Like you note, that's not so helpful since unlike |
So I was intrigued to see how far r2u could help, given its partial Bioconductor support (and of course famously complete CRAN support). I fired up the
# first command an echo of yours, installs in a few (single) seconds
install.r argparse ape dplyr tidyr BiocManager
# then I tried this, which came back with a loooong list of packages, so I Ctrl-C'ed out
#Rscript -e 'bspm::disable(); BiocManager::install("sangeranalyseR")'
# instead this installed all available build-deps
# (I had edited the '' and , out of the return from the stopped attempt)
install.r sys bitops bit colorspace askpass zlibbioc RCurl GenomeInfoDbData bit64 blob memoise plogr isoband farver labeling munsell curl openssl BH fs rappdirs pixmap sp RcppArmadillo BiocGenerics S4Vectors IRanges XVector GenomeInfoDb crayon RSQLite DBI plyr fastmatch igraph quadprog gtable httpuv mime xtable fontawesome htmltools sourcetools later promises fastmap commonmark bslib cachem ellipsis ggplot2 scales httr viridisLite base64enc htmlwidgets RColorBrewer lazyeval crosstalk jquerylib anytime sass zip evaluate tinytex xfun yaml highr ade4 segmented bookdown Biostrings DECIPHER reshape2 phangorn sangerseqR gridExtra shiny shinydashboard shinyjs data.table plotly DT zeallot excelR shinycssloaders ggdendro shinyWidgets openxlsx rmarkdown knitr BiocStyle logger
# then I could just do -- which was quick
Rscript -e 'bspm::disable(); BiocManager::install("sangeranalyseR")'
Now all is good:
> library(sangeranalyseR)
Loading required package: stringr
Loading required package: ape
Loading required package: Biostrings
Loading required package: BiocGenerics
[.... lots and lots omitted ...]
Loading required package: logger
Welcome to sangeranalyseR
>
It uses current packages, not the 'versioned' stack, so it may not be of interest to you. But we can get a lot of BioC quickly installed, which is still of interest to some. |
Apparently, my attempt above does not work. I was able to install the
...so it appears that the bioconductor package is not installed in the correct libPath. My libPaths when calling the R script:
I cannot find the "installed" sangeranalyseR package anywhere in the docker image. The following returns nothing:
...and the package is definitely not in The entire docker file that I'm using:
|
🤷♂️ What I showed you was real. I just used (I also tried to throw a quick demo Dockerfile together (just as I had already done once today) but that balked as @Enchufa2 and I currently have an issue with |
Well sure if you use |
@nick-youngblut I suspect your installation isn't succeeding due to missing system libraries (might be Recall that R does not throw an error when |
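Since a failed install only produces a warning, one way to surface it is to check afterwards that the package can actually be loaded. A minimal sketch, assuming Rscript is on the PATH; the function name assert_r_packages and the RSCRIPT override variable are hypothetical and only exist for this example.

```shell
#!/bin/sh
# Return non-zero if any named R package cannot be found by requireNamespace().
# RSCRIPT may be set to override the interpreter; it defaults to Rscript.
# Hypothetical helper, not part of rocker or littler.
assert_r_packages() {
    for pkg in "$@"; do
        "${RSCRIPT:-Rscript}" -e "if (!requireNamespace('$pkg', quietly = TRUE)) quit(status = 1)" || {
            echo "R package '$pkg' is not installed" >&2
            return 1
        }
    done
}
```

In a Dockerfile the same check can be chained after the install step within one RUN instruction, so that a silently failed install stops the build instead of producing an image without the package.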
For instance, this Dockerfile works for me: (though it does take 330 seconds to build)
|
Thanks @eddelbuettel and @cboettig for all of the help! ...and thanks @cboettig for test-building a dockerfile that works 🚀 @cboettig , is your use of
FYI: it took 1384 sec to build the |
Not really. There are also some BioC folks already using / poking at r2u so you could ask on the BioC slack or lists too for best practices. As for |
So for completeness, now after dinner, with the following Dockerfile
we install in 64 seconds. |
Arm64 platform does not support binary installation of CRAN packages, so installation takes longer. |
I ran the build for
|
I'm having some difficulties modifying the rocker tidyverse base image with Bioconductor packages. I've written up my goal, approach and problems more extensively in an issue on littler's github page (eddelbuettel/littler#93), because I thought there was something strange going on with the --repository flag of the install2.r script, although that turned out to be a more low-level issue that can happen when mixing repos and thus has nothing to do with littler itself.
I'm posting this issue here however because I hope that the rocker community can provide some guidance on how to tackle the things I'd like to do, i.e. install bioconductor in (rocker) docker and have the build fail when something goes wrong. I believe that this information could be useful for other users and could be included on the Rocker Project's guide on extending the images.
Very briefly, here are my findings and struggles:
R -e 'BiocManager::install("package")' and /usr/local/lib/R/site-library/littler/examples/installBioc.r do not raise a non-zero exit code and thus do not cause Docker builds to fail when something goes wrong (e.g. a package that is unavailable in the repo, a misspelled name, or missing dependencies).
install2.r script (which does raise this error), but you have to pass all the specific BioCsoft, BioCann, BioCexp URLs explicitly, as well as the default CRAN repo (because when any -r flag is added, install2.r seemingly forgets about the standard CRAN repo defined by options("repos")). However, as I've shown in the littler issue I linked to above, the order in which these repos are given seems to affect the outcome.
rocker-versioned2/scripts/install_tidyverse.sh
Line 25 in 6d5eed8
options("repos") to frozen RStudio Package Manager URLs (like "https://packagemanager.rstudio.com/all/__linux__/focal/latest" for the most recent release and "https://packagemanager.rstudio.com/cran/__linux__/focal/296" for version 4.0.1).
options("BioC_mirror"), so running BiocManager::repositories() shows the default bioconductor repositories, which are tied to the specific version of BiocManager that is installed.
options(BioC_mirror = "https://packagemanager.rstudio.com/bioconductor") and to use a compatible CRAN snapshot (they list appropriate snapshots for given versions of bioconductor here: https://packagemanager.rstudio.com/client/#/repos/4/overview).
install2.r script for installing most of its packages. This is good, because unlike using other methods such as RUN R -e "install.packages('tidyverse')", install2.r returns a non-zero exit code to the shell when it fails, which stops docker builds. Otherwise, the build would continue just fine and you would end up with a Docker image that is missing your package, without any way of knowing (except for scrolling through the very long and verbose output of the R install process or trying to load the package while running the container).
options(repos) in Rprofile.site (rocker-versioned2/scripts/install_R.sh
Line 135 in 6d5eed8
install2.r automatically?
I believe that this issue is not tied to which specific repositories are being used (RSPM or the default bioconductor ones) and that it could be worthwhile to highlight it somewhere in rocker's guide on modifying and extending the images. E.g. warning users about potential silently failing bioconductor installs by calling BiocManager::install() and warning about verifying whether using install2.r with all the individual sub-repositories for bioconductor does what they intend it to do.
Am I going about this the wrong way perhaps? I guess I can just forget about pinning a specific repository and just keep track of my images as the unit I need to store for reproducibility? But then again, rocker images for previous versions of R also pin repo URLs, so that seems to be the intended approach. Any other advice or insight into how I can better handle these installations is highly appreciated!
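To make the install2.r approach described above more concrete, here is a small shell sketch that assembles such a command line with every repository passed explicitly. It assumes install2.r accepts --error and a repeatable -r flag, as used in this thread and in rocker's scripts; the function name install2_command is hypothetical, and the function only prints the command rather than running it.

```shell
#!/bin/sh
# Print an install2.r command line that names every repository explicitly.
# $1: space-separated package names
# remaining arguments: repository URLs, in the order they should be passed
# Hypothetical helper; --error and -r are assumed install2.r flags.
install2_command() {
    pkgs="$1"
    shift
    cmd="install2.r --error"
    for repo in "$@"; do
        cmd="$cmd -r $repo"
    done
    printf '%s %s\n' "$cmd" "$pkgs"
}
```

Because the order of repositories was reported to affect the outcome, the argument order is preserved as given; it is worth confirming the result after the build, for example by loading the installed package.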
EDIT: I've cross-posted this to the Bioconductor repository as well, since the same kind of addition to their documentation would be useful imo: Bioconductor/bioconductor_docker#38