Hi Val,

I think it would help us understand the motivation behind this proposal if you could give an example of a method where the user cannot supply a file name but has to create a 'File' (or 'FileList') object first, and show how the file registry proposal below would help. It looks like you have such an example in the GenomicFileViews package. Do you think you could give more details?
Thanks,
H.

On 03/10/2014 08:46 PM, Valerie Obenchain wrote:
Hi all,

I'm soliciting feedback on the idea of a general file 'registry' that would identify file types by their extensions. This is similar in spirit to FileForFormat() in rtracklayer but a more general abstraction that could be used across packages. The goal is to allow a user to supply only file name(s) to a method instead of first creating an instance of a 'File' class such as BamFile, FaFile, BigWigFile etc.

A first attempt at this is in the GenomicFileViews package (https://github.com/Bioconductor/GenomicFileViews). A registry (lookup) is created as an environment at load time:

    .fileTypeRegistry <- new.env(parent=emptyenv())

File types are registered with a triplet consisting of class, package and a regular expression that identifies the extension. In GenomicFileViews we register FaFileList, BamFileList and BigWigFileList, but any 'File' class that has a constructor of the same name can be registered.

    .onLoad <- function(libname, pkgname)
    {
        registerFileType("FaFileList", "Rsamtools", "\\.fa$")
        registerFileType("FaFileList", "Rsamtools", "\\.fasta$")
        registerFileType("BamFileList", "Rsamtools", "\\.bam$")
        registerFileType("BigWigFileList", "rtracklayer", "\\.bw$")
    }

The makeFileType() helper creates an object of the appropriate class. This function is used behind the scenes to do the lookup and coerce to the correct 'File' class.

    > makeFileType(c("foo.bam", "bar.bam"))
    BamFileList of length 2
    names(2): foo.bam bar.bam

New types can be added at any time with registerFileType():

    registerFileType(NewClass, NewPackage, "\\.NewExtension$")

Thoughts:

(1) If this sounds generally useful, where should it live? rtracklayer, GenomicFileViews or other? Alternatively it could be its own lightweight package (FileRegister) that creates the registry and provides the helpers. It would be up to the authors of packages that depend on FileRegister to register their own file types at load time.

(2) To avoid potential ambiguities, maybe searching should be by regex and package name. Still a work in progress.
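For illustration only, here is a minimal sketch of how such a registry and its lookup might behave. The names registerFileType() and makeFileType() follow the proposal above, but the function bodies and the findFileType() helper are assumptions made for this sketch, not the GenomicFileViews code. The sketch keys entries by regex and stops when the supplied file names match no pattern or more than one, which is one possible way of handling the ambiguity raised in (2).

```r
# Hypothetical sketch of an extension-based registry. Function bodies and
# findFileType() are assumptions, not the GenomicFileViews implementation.
.fileTypeRegistry <- new.env(parent = emptyenv())

registerFileType <- function(class, package, regex) {
    # Key each entry by its regular expression; registering the same
    # regex again overwrites the earlier entry.
    assign(regex, list(class = class, package = package),
           envir = .fileTypeRegistry)
    invisible(regex)
}

findFileType <- function(fnames) {
    patterns <- ls(.fileTypeRegistry, all.names = TRUE)
    # A pattern counts as a hit only if it matches every file name.
    hits <- Filter(function(p) all(grepl(p, fnames)), patterns)
    if (length(hits) == 0L)
        stop("no registered file type matches: ",
             paste(fnames, collapse = ", "))
    if (length(hits) > 1L)
        stop("ambiguous file type; matching patterns: ",
             paste(hits, collapse = ", "))
    get(hits, envir = .fileTypeRegistry)
}

makeFileType <- function(fnames) {
    type <- findFileType(fnames)
    # Assumes the package exports a constructor named after the class.
    constructor <- getExportedValue(type$package, type$class)
    constructor(fnames)
}

registerFileType("FaFileList", "Rsamtools", "\\.fa$")
registerFileType("FaFileList", "Rsamtools", "\\.fasta$")
registerFileType("BamFileList", "Rsamtools", "\\.bam$")
registerFileType("BigWigFileList", "rtracklayer", "\\.bw$")

# Lookup only; does not require Rsamtools to be installed.
findFileType(c("foo.bam", "bar.bam"))$class
```

Only makeFileType() needs the registered package to be installed; the lookup itself works on the registry alone, which is why a standalone lightweight package as in (1) seems feasible.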
Valerie

_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel
-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319