Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rsamtools::indexTabix: Capture console output #32

Closed
bschilder opened this issue Mar 17, 2022 · 3 comments
Closed

Rsamtools::indexTabix: Capture console output #32

bschilder opened this issue Mar 17, 2022 · 3 comments

Comments

@bschilder
Copy link

bschilder commented Mar 17, 2022

I recently posted an error I've been encountering with seqminer (here) but realized it seems to trace all the way back to tabix itself.

Regardless, in cases where I get the following kind of error...

[E::hts_idx_push] Unsorted positions on sequence #2: 71999802 followed by 7

...I'd like to wrap the function in atryCatch and capture the messages so that I rerun it with the offending lines in the skip argument. Something like:

out <- tryCatch({  
               Rsamtools::indexTabix(file = bgz_file,
                                     seq = 2,
                                     start = 3,
                                     end = 3, 
                                     comment ="SNP"
               ) 
           })

However, I can't seem to get the output into R. I've tried:

  • tryCatch (above)
  • capture.output
  • sink()

Do you know a way to extract this info?

Thanks,
Brian

Session info

R version 4.1.0 (2021-05-18)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur 11.4

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] arrow_7.0.0      ggimage_0.3.0    ggplot2_3.3.5    dplyr_1.0.8      hexSticker_0.4.9
[6] echotabix_0.99.3

loaded via a namespace (and not attached):
  [1] AnnotationHub_3.2.2           BiocFileCache_2.2.1           systemfonts_1.0.4            
  [4] igraph_1.2.11                 BiocParallel_1.28.3           GenomeInfoDb_1.30.1          
  [7] digest_0.6.29                 yulab.utils_0.0.4             htmltools_0.5.2              
 [10] magick_2.7.3                  fansi_1.0.2                   magrittr_2.0.2               
 [13] memoise_2.0.1                 BSgenome_1.62.0               echoverseTemplate_0.99.0     
 [16] ontologyPlot_1.6              openxlsx_4.2.5                Biostrings_2.62.0            
 [19] matrixStats_0.61.0            R.utils_2.11.0                sysfonts_0.8.5               
 [22] prettyunits_1.1.1             colorspace_2.0-3              blob_1.2.2                   
 [25] rappdirs_0.3.3                textshaping_0.3.6             xfun_0.30                    
 [28] crayon_1.5.0                  RCurl_1.98-1.6                echodata_0.99.6              
 [31] jsonlite_1.8.0                hexbin_1.28.2                 graph_1.72.0                 
 [34] VariantAnnotation_1.40.0      glue_1.6.2                    gtable_0.3.0                 
 [37] zlibbioc_1.40.0               XVector_0.34.0                DelayedArray_0.20.0          
 [40] Rgraphviz_2.38.0              BiocGenerics_0.40.0           scales_1.1.1                 
 [43] DBI_1.1.2                     Rcpp_1.0.8.2                  showtextdb_3.0               
 [46] xtable_1.8-4                  progress_1.2.2                gridGraphics_0.5-1           
 [49] bit_4.0.4                     clisymbols_1.2.0              stats4_4.1.0                 
 [52] DT_0.21                       htmlwidgets_1.5.4             httr_1.4.2                   
 [55] ontologyIndex_2.7             ellipsis_0.3.2                pkgconfig_2.0.3              
 [58] XML_3.99-0.9                  R.methodsS3_1.8.1             farver_2.1.0                 
 [61] seqminer_8.4                  dbplyr_2.1.1                  utf8_1.2.2                   
 [64] here_1.0.1                    ggplotify_0.1.0               tidyselect_1.1.2             
 [67] labeling_0.4.2                rlang_1.0.2                   later_1.3.0                  
 [70] AnnotationDbi_1.56.2          BiocVersion_3.14.0            munsell_0.5.0                
 [73] tools_4.1.0                   cachem_1.0.6                  cli_3.2.0                    
 [76] generics_0.1.2                RSQLite_2.2.10                evaluate_0.15                
 [79] stringr_1.4.0                 fastmap_1.1.0                 yaml_2.3.5                   
 [82] ragg_1.2.2                    knitr_1.37                    bit64_4.0.5                  
 [85] fs_1.5.2                      zip_2.2.0                     purrr_0.3.4                  
 [88] KEGGREST_1.34.0               gh_1.3.0                      showtext_0.9-5               
 [91] mime_0.12                     R.oo_1.24.0                   xml2_1.3.3                   
 [94] biomaRt_2.50.3                brio_1.1.3                    compiler_4.1.0               
 [97] rstudioapi_0.13               interactiveDisplayBase_1.32.0 filelock_1.0.2               
[100] curl_4.3.2                    png_0.1-7                     testthat_3.1.2               
[103] paintmap_1.0                  tibble_3.1.6                  stringi_1.7.6                
[106] GenomicFeatures_1.46.5        desc_1.4.1                    lattice_0.20-45              
[109] Matrix_1.4-0                  vctrs_0.3.8                   pillar_1.7.0                 
[112] lifecycle_1.0.1               BiocManager_1.30.16           data.table_1.14.2            
[115] bitops_1.0-7                  httpuv_1.6.5                  rtracklayer_1.54.0           
[118] GenomicRanges_1.46.1          R6_2.5.1                      BiocIO_1.4.0                 
[121] promises_1.2.0.1              IRanges_2.28.0                ontoProc_1.16.0              
[124] assertthat_0.2.1              pkgload_1.2.4                 SummarizedExperiment_1.24.0  
[127] rprojroot_2.0.2               rjson_0.2.21                  withr_2.5.0                  
[130] GenomicAlignments_1.30.0      Rsamtools_2.10.0              S4Vectors_0.32.3             
[133] GenomeInfoDbData_1.2.7        parallel_4.1.0                hms_1.1.1                    
[136] grid_4.1.0                    ggfun_0.0.5                   tidyr_1.2.0                  
[139] rmarkdown_2.13                MatrixGenerics_1.6.0          piggyback_0.1.1              
[142] Biobase_2.54.0                shiny_1.7.1                   lubridate_1.8.0              
[145] restfulr_0.0.13          
@bschilder bschilder changed the title Capture console output Rsamtools::indexTabix: Capture console output Mar 17, 2022
@mtmorgan
Copy link
Collaborator

These errors are written to C language 'stderr' but unfortunately this is not directly available within R. From the command line, on Linux, one could run a script script.R and redirect stderr to a file

Rscript -f script.R 2> stderr.log

but probably your intention is something different.

Previously, Rsamtools used a 'hack' to remap fprintf(stderr, ...) to R's error stream, with something like

DFLAGS = \
  -Dfprintf=_samtools_fprintf \
  -Dexit=_samtools_exit \
  -Dabort=_samtools_abort

This would have captured samtools errors as R warnings, but unfortunately this functionality was lost when we shifted to using Rhtslib quite some time ago; perhaps @hpages has additional comment.

I don't think there's an easy way to capture what you'd like to do.

@hpages
Copy link
Contributor

hpages commented Mar 19, 2022

Hi Martin,

HTSlib uses a more sophisticated mechanism than Samtools for logging events: now everything goes thru the hts_log() function defined in header file hts_log.h. The function handles various levels of event severity: error, warning, debug, and trace. There's a macro defined for each level (hts_log_error(), hts_log_warning(), etc...). Each macro is a simple wrapper around hts_log().

Note that hts_log() itself is not a macro, it's a real function, and it's not an inline function either, so I'm not sure why it's defined in a header file.

Anyways, this new mechanism and its implementation differ substantially from the plain fprintf(stderr, ...) calls used all over the place in Samtools. I didn't investigate so I don't know how easy/hard it would be to come up with a hack similar to yours where all the events logged thru hts_log_error() get captured as R warnings.

Ideas, suggestions, patches are welcome.

Thanks,
H.

@hpages
Copy link
Contributor

hpages commented May 19, 2022

Personally I'm not planning to tweak Rhtslib to make the low-level C code in htslib spit output that is capturable from R but maybe someone else wants to give it a shot? If not, we can close this.

Thanks,
H.

@mtmorgan mtmorgan closed this as not planned Won't fix, can't repro, duplicate, stale Mar 17, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants