On Thu, Oct 19, 2017 at 5:25 PM, Chunlei Wu <c...@scripps.edu> wrote:
> Hello BioC-dev group,
>
> We are working on a new R package right now and plan to submit
> it to Bioconductor soon. It's a unified R client for the collection of
> BioThings APIs (http://biothings.io). Using R6 classes makes a lot of
> sense to me, as I'm coming from Python's OOP experience. It will be used
> like this:
>
> library(biothings)
>
> gene_client <- BioThingsR6$new("gene")
> gene_client$query("CDK2")
>
> variant_client <- BioThingsR6$new("variant")
> variant_client$query("dbsnp.rsid:rs1000")
>
> Each "client" above corresponds to a specific BioThings API, e.g. one
> for gene, and one for variant. And we will have more "clients" as we
> expand the number of BioThings APIs. The same R code should work with the
> future APIs.
>
> But if we use the traditional S4 classes, it will be awkward, as all
> functions/methods are not "namespaced"; we will need to define new
> functions for each additional API. Something like this:
>
> library(biothings)
> geneQuery("CDK2")
> variantQuery("dbsnp.rsid:rs1000")
>

If we ignore the mutability aspect, the difference here is only syntax.

gene_client$query("CDK2") <-> query(gene_client, "CDK2")
variant_client$query("dbsnp.rsid:rs1000") <-> query(variant_client, "dbsnp.rsid:rs1000")

The problem in both APIs is that "query" is too generic; it's semantically
poor. You're depending on the user choosing an informative name for the
client in order to know what type of thing is being returned. Using
explicitly named functions helps prevent this. Presumably BioThings
already has some sort of schema for each "thing", and the interface should
correspond.

It's true that the functional syntax has the potential for symbol
collisions, but that's what namespaces are for. If at all possible though,
use the collision to your advantage and set methods on existing generics.
For example, genes() and transcripts() from GenomicFeatures are probably
relevant.
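To make the functional form concrete, here is a minimal sketch of one S4 generic serving every client, so no geneQuery()/variantQuery() pair is needed. Everything named here is an assumption for illustration only: the classes BioThingsClient, GeneClient and VariantClient, the "entity" slot, and the query() generic are hypothetical and not taken from any released biothings package. The method just returns a description of the request instead of contacting a server.

```r
library(methods)

# Hypothetical base class: one slot naming the BioThings entity type.
setClass("BioThingsClient", slots = c(entity = "character"))

# Hypothetical subclasses, one per API, so methods can dispatch on the type.
setClass("GeneClient", contains = "BioThingsClient",
         prototype = list(entity = "gene"))
setClass("VariantClient", contains = "BioThingsClient",
         prototype = list(entity = "variant"))

# A single generic shared by all clients.
setGeneric("query", function(client, q, ...) standardGeneric("query"))

# Inherited by every subclass; only describes the request (no network call).
setMethod("query", "BioThingsClient", function(client, q, ...) {
    list(entity = client@entity, q = q)
})

gene_client <- new("GeneClient")
variant_client <- new("VariantClient")

res_gene <- query(gene_client, "CDK2")
res_variant <- query(variant_client, "dbsnp.rsid:rs1000")
```

A new API would then only need a new subclass; a client-specific method can be added with setMethod() where the default behaviour is not enough.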
But the mutable/reference semantics do matter, unless this is a read-only
API. Even if it were, the message-passing syntax (in R anyway) is
unfamiliar to virtually every R user.

But even if you're not convinced by all of that, at least use S4 reference
classes, not R6, so that there is some level of integration, and you can
take advantage of all the other S4 features.

> I also want to mention that "query" is not the only method for each API
> client; there will be several other methods for each client. It will
> quickly make the function names messy if we go with the S4 option.
>
> Anyway, we think we like R6 classes better, but just want to get some
> feedback here on whether the usage pattern with R6 classes is well
> accepted in the R community. Will users find it cumbersome if they have
> to instantiate the class first and then make the function calls? The
> majority of the existing BioC packages are indeed S4 class based, which
> makes us hesitant.
>
> Thanks,
>
> Chunlei

_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel