rtracklayer essentially has this already, although registration there is implicit: a class registers itself by extending RTLFile or RsamtoolsFile, and the file extension is taken from the class name. There is a BigWigFile class, corresponding to ".bigwig", and BWFile extends it to support the ".bw" extension. The expectation is that other packages would extend RTLFile to implicitly register handlers. I'm not sure there is a use case for generalization, but this proposal makes registration more explicit, which is probably a good thing. rtracklayer was just piggybacking on S4 registration.
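To make the implicit scheme concrete, here is a minimal sketch of the idea. It is not rtracklayer's actual code: the class names (SketchFile, BigWigSketchFile, BWSketchFile) and the helpers extensionOf() and fileForPath() are hypothetical. It only assumes that the extension is the lower-cased class name with the base-class suffix removed, and that handlers are found by walking the S4 subclasses of the base class.

```r
library(methods)

# Hypothetical virtual base class standing in for RTLFile.
setClass("SketchFile", representation("VIRTUAL", path = "character"))

# Defining a subclass is the only "registration" step.
setClass("BigWigSketchFile", contains = "SketchFile")    # handles ".bigwig"
setClass("BWSketchFile", contains = "BigWigSketchFile")  # handles ".bw"

# Derive the extension from the class name, e.g. "BWSketchFile" -> "bw".
# (Hypothetical helper; vectorized over class names.)
extensionOf <- function(className) {
    tolower(sub("SketchFile$", "", className))
}

# Find the subclass whose derived extension matches the file name and
# construct an instance of it. (Hypothetical helper.)
fileForPath <- function(path) {
    candidates <- names(getClass("SketchFile")@subclasses)
    ext <- tolower(tools::file_ext(path))
    hit <- candidates[extensionOf(candidates) == ext]
    if (length(hit) != 1L)
        stop("no unique handler for extension '", ext, "'")
    new(hit, path = path)
}

x <- fileForPath("scores.bw")
is(x, "BigWigSketchFile")  # a BWSketchFile is also a BigWigSketchFile
```

Nothing outside the class definitions has to change when a new subclass is added, which is why the registration is easy to overlook; an explicit registerFileType() call makes the same mapping visible.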
I'm a little confused by the use of Lists rather than individual File objects. Are you also proposing that every RTLFile would need a corresponding List, and that there would need to be an RTLFileList method for the various generics?

It may not be necessary to specify the package name. There should be an environment (where) argument that defaults to topenv(parent.frame()), and that should suffice.

Michael

On Mon, Mar 10, 2014 at 8:46 PM, Valerie Obenchain <voben...@fhcrc.org> wrote:

> Hi all,
>
> I'm soliciting feedback on the idea of a general file 'registry' that
> would identify file types by their extensions. This is similar in spirit to
> FileForFormat() in rtracklayer but a more general abstraction that could be
> used across packages. The goal is to allow a user to supply only file
> name(s) to a method instead of first creating a 'File' class such as
> BamFile, FaFile, BigWigFile etc.
>
> A first attempt at this is in the GenomicFileViews package (
> https://github.com/Bioconductor/GenomicFileViews). A registry (lookup) is
> created as an environment at load time:
>
>     .fileTypeRegistry <- new.env(parent=emptyenv())
>
> Files are registered with an information triplet consisting of class,
> package and regular expression to identify the extension. In
> GenomicFileViews we register FaFileList, BamFileList and BigWigFileList but
> any 'File' class can be registered that has a constructor of the same name.
>
>     .onLoad <- function(libname, pkgname)
>     {
>         registerFileType("FaFileList", "Rsamtools", "\\.fa$")
>         registerFileType("FaFileList", "Rsamtools", "\\.fasta$")
>         registerFileType("BamFileList", "Rsamtools", "\\.bam$")
>         registerFileType("BigWigFileList", "rtracklayer", "\\.bw$")
>     }
>
> The makeFileType() helper creates the appropriate class. This function is
> used behind the scenes to do the lookup and coerce to the correct 'File'
> class.
>
>     > makeFileType(c("foo.bam", "bar.bam"))
>     BamFileList of length 2
>     names(2): foo.bam bar.bam
>
> New types can be added at any time with registerFileType():
>
>     registerFileType(NewClass, NewPackage, "\\.NewExtension$")
>
>
> Thoughts:
>
> (1) If this sounds generally useful where should it live? rtracklayer,
> GenomicFileViews or other? Alternatively it could be its own lightweight
> package (FileRegister) that creates the registry and provides the helpers.
> It would be up to the package authors that depend on FileRegister to
> register their own file types at load time.
>
> (2) To avoid potential ambiguities maybe searching should be by regex and
> package name. Still a work in progress.
>
>
> Valerie

_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel