Michael,

For easier testing on Mac OS X, I can dial down the limit on the number of open files with the shell command `ulimit -n 30`, but YMMV depending on OS support.
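A minimal sketch of that approach, assuming a POSIX-style shell in which `ulimit -n` may lower the soft limit (the variable name `lowered` is only illustrative). Doing it in a subshell keeps the lowered limit from sticking to the login shell:

```shell
#!/bin/sh
# Lower the soft limit on open file descriptors inside a subshell only;
# the command substitution runs in its own process, so the parent shell
# keeps its original limit.
lowered=$(
  ulimit -n 30   # assumption: the OS allows lowering the soft limit
  ulimit -n      # print the limit now in effect in the subshell
)
echo "limit inside subshell: $lowered"
echo "limit in parent shell: $(ulimit -n)"
```

Any test session started inside the same subshell would inherit the lowered limit, so a descriptor leak should surface after far fewer iterations.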
In any case, your suspicions were on target. The R function bgzip seems to be the culprit, so I am changing the subject and cc'ing Martin and Herve accordingly.

Martin xor Herve,

The problem can be reproduced, depending on your value for `ulimit -n`, by just calling bgzip repeatedly:

library(Rsamtools)
bed <- system.file("doc", "example.bed", package="rtracklayer")
replicate(2000, bgzip(bed, 'delme.now', TRUE))

My workaround for now is to perform system calls to do the zipping and tabix indexing. So there is no urgency, but sessionInfo() is as below.

Thanks,

~Malcolm

From: Michael Lawrence [mailto:lawrence.mich...@gene.com]
Sent: Friday, November 09, 2012 5:52 AM
To: Cook, Malcolm
Cc: bioc-devel@r-project.org; Michael Lawrence (lawrence.mich...@gene.com); Vincent Carey (st...@channing.harvard.edu)
Subject: Re: rtracklayer BUG: `export(x, path, index=TRUE)` appears not to close filehandle on tabix files produced

Hi Malcolm,

I am not sure why this is happening. I haven't been able to reproduce it on my system (which I think has a limit of 1024, so I had to increase your test case to exceed that). Does this happen when calling bgzip + indexTabix on a file 256 times? That would help to eliminate the complicated wrappers.

Thanks,
Michael

On Thu, Nov 8, 2012 at 2:32 PM, Cook, Malcolm <m...@stowers.org> wrote:

rtracklayer developers (Michael/Vincent/Robert),

I find that exporting too many tabix-indexed bed files causes an error.

The session following my signature reproduces the error. It provides sessionInfo() details prior to the code causing the error because sessionInfo() FAILS with 'too many open files' after running this code (as does anything that opens files).

The error does NOT occur when index=FALSE. Only when index=TRUE.
I expect that the tabix calls are not cleaning up open file handles correctly.

`ulimit -a` tells me that on my Mac OS X machine I can have 256 files open. The bug happens while exporting the 253rd bed file.

openConnections() returns nothing.

closeAllConnections() does not clean them up.

Running lsof at the command line to list open files does NOT show them.

Michael(?), you resolved a similar issue I once reported with rtracklayer when creating bigBed files: https://lists.soe.ucsc.edu/pipermail/genome/2012-February/028343.html

Any suggestions for workarounds?

Any possibility of a quick patch to the released rtracklayer?

Thanks for rtracklayer!

~Malcolm Cook

-----------------------------------------------------------

bash-3.2$ R

R version 2.15.1 (2012-06-22) -- "Roasted Marshmallows"
Copyright (C) 2012 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> library(rtracklayer)
Loading required package: GenomicRanges
Loading required package: BiocGenerics

Attaching package: 'BiocGenerics'

The following object(s) are masked from 'package:stats':

    xtabs

The following object(s) are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, cbind,
    colnames, duplicated, eval, get, intersect, lapply, mapply, mget,
    order, paste, pmax, pmax.int, pmin, pmin.int, rbind, rep.int,
    rownames, sapply, setdiff, table, tapply, union, unique

Loading required package: IRanges
Warning message:
package 'GenomicRanges' was built under R version 2.15.2
> sessionInfo()
R version 2.15.1 (2012-06-22)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] rtracklayer_1.18.0 GenomicRanges_1.10.4 IRanges_1.16.4 BiocGenerics_0.4.0

loaded via a namespace (and not attached):
[1] BSgenome_1.26.1 Biostrings_2.26.2 RCurl_1.95-3 Rsamtools_1.10.1 XML_3.95-0.1 bitops_1.0-4.2 parallel_2.15.1 stats4_2.15.1 tools_2.15.1 zlibbioc_1.4.0
> x<-sapply(sprintf('deleteme_%s.bed',1:1000), function(conn)
+ {export(GRanges('X',IRanges(1,2)),conn,index=TRUE);1})
Error in value[[3L]](cond) : index build failed
  file: /Volumes/SAN1/Users/mec/deleteme/253.bed.gz
In addition: Warning message:
In doTryCatch(return(expr), name, parentenv, handler) :
  [ti_index_build2] fail to create the index file.
> sessionInfo()
Error in gzfile(file, "rb") : cannot open the connection
In addition: Warning message:
In gzfile(file, "rb") :
  cannot open compressed file '/Library/Frameworks/R.framework/Versions/2.15/Resources/library/rtracklayer/Meta/package.rds', probable reason 'Too many open files'

_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel