There's also "?coverage(ga)", so the user can see what happens specifically for their object, without worrying about typing out the class.
At some point it would be neat to generate S4 documentation at run-time. Just popup a browser and see every method that might be dispatched with a given object. In theory, the R help server would support this. On Thu, Feb 20, 2014 at 3:13 PM, Hervé Pagès <hpa...@fhcrc.org> wrote: > > > On 02/20/2014 02:55 PM, Martin Morgan wrote: > >> On 02/20/2014 02:32 PM, Hervé Pagès wrote: >> >>> Hi Jesper, >>> >>> On 02/20/2014 02:13 PM, Jesper Gådin wrote: >>> >>>> Very true that it is quite difficult to find the documentation when one >>>> is not aware of its existence :P >>>> >>> >>> Yeah, this has been a source of frustration for many people. And >>> always a source of embarrassment (for us) when teaching our software >>> to beginners. >>> >>> I've started to change this. In the upcoming version of BioC (2.14, >>> scheduled for mid-April), when you'll do ?coverage, you'll get to >>> choose between the 3 man pages that document coverage methods (there >>> is one in IRanges, one in GenomicRanges, and one in GenomicAlignments). >>> >>> I want to generalize this to other generics that have methods spread >>> across several packages (e.g. findOverlaps, the intra- and inter-range >>> methods, etc...). >>> >>> Having to choose between several man pages every time you do e.g. >>> ?findOverlaps is a minor annoyance compared to not being able to >>> find the man page at all. (Note that if you already know where is >>> your favorite man page, you'll be able to direct access it with >>> e.g. ?GenomicRanges::findOverlaps). Nobody will ever need to use >>> the insane ?`findOverlaps,GenomicRanges,GenomicRanges-method` to >>> >> >> tab completion helps, so that you don't need to be totally insane, just >> insane enough to know to start with >> >> ?"cover<tab> >> > > and also insane enough to know which method you need to pick up amongst > the 30+ methods listed by ?"findOverlaps<tab> > Overwhelming! > > H. > > > >> Martin >> >> open that man page again. Ever! (it will still work though...) >>> >>> Cheers, >>> H. >>> >>> >>>> coverage() is fast and beautiful. Thanks! >>>> >>>> /Jesper >>>> >>>> >>>> On Wed, Feb 19, 2014 at 9:21 PM, Hervé Pagès <hpa...@fhcrc.org >>>> <mailto:hpa...@fhcrc.org>> wrote: >>>> >>>> Hi Jesper, >>>> >>>> >>>> On 02/19/2014 08:44 AM, Michael Lawrence wrote: >>>> >>>> On Wed, Feb 19, 2014 at 8:39 AM, Jesper Gådin >>>> <jesper.ga...@gmail.com <mailto:jesper.ga...@gmail.com>>wrote: >>>> >>>> Hi Michael, >>>> >>>> Herves suggestion will probably work for my use case, but if >>>> there are any >>>> smoother ways it would be preferable. >>>> >>>> The use case is as follows: >>>> >>>> 1) calculate strand specific coverage over a region from >>>> GAlignments object (or file) >>>> >>>> At the moment I read a file using readGAlignmentsFromBam() >>>> with tag="XS", >>>> then filter it on "flag" and "mapq". Then I subset the >>>> resulting GAlignments in a minus and a plus -strand object. >>>> Then I calculate coverage by my own coverage function which >>>> uses the cigar >>>> information in the GAlignments object. This function is the >>>> one using >>>> cigarToRleList() at the moment. As I understand the >>>> coverage() function >>>> from the GenomicAlignments/IRanges package does not take >>>> into account >>>> cigars, or does it? >>>> >>>> >>>> It does take the coverage into account; specifically to exclude >>>> the introns >>>> from coverage. I think there's also an option to exclude >>>> deletions. >>>> >>>> >>>> Unfortunately the man page is not easy to access (you need to do >>>> ?`coverage,GAlignments-method`__), but it says: >>>> >>>> The methods for GAlignments and GAlignmentPairs objects do: >>>> >>>> coverage(grglist(x), ...) >>>> >>>> And if you do grglist() on a GAlignments or GAlignmentPairs >>>> objects, the >>>> ranges you get in the returned GRangesList object are calculated >>>> based >>>> on the CIGAR. >>>> >>>> Trust but verify. Here is how you can actually verify that >>>> coverage() >>>> does take the CIGAR into account: >>>> >>>> library(RNAseqData.HNRNPC.bam.__chr14) >>>> gal <- readGAlignments(RNAseqData.__ >>>> HNRNPC.bam.chr14_BAMFILES[1]) >>>> cig_op_table <- cigarOpTable(cigar(gal)) >>>> >>>> First we pick up an alignment with an N in its CIGAR and do >>>> coverage() >>>> on it: >>>> >>>> > gal_with_N <- gal[which(cig_op_table[ , "N"] != 0)[1]] >>>> >>>> > gal_with_N >>>> GAlignments with 1 alignment and 0 metadata columns: >>>> seqnames strand cigar qwidth start end >>>> width >>>> <Rle> <Rle> <character> <integer> <integer> <integer> >>>> <integer> >>>> [1] chr14 + 55M2117N17M 72 19650072 19652260 >>>> 2189 >>>> ngap >>>> <integer> >>>> [1] 1 >>>> --- >>>> seqlengths: >>>> chr1 chr10 ... >>>> chrY >>>> 249250621 135534747 ... >>>> 59373566 >>>> >>>> > coverage(gal_with_N)$chr14 >>>> integer-Rle of length 107349540 with 5 runs >>>> Lengths: 19650071 55 2117 17 87697280 >>>> Values : 0 1 0 1 0 >>>> >>>> Same thing with an alignment with an I in its CIGAR: >>>> >>>> > gal_with_I <- gal[which(cig_op_table[ , "I"] != 0)[1]] >>>> >>>> > gal_with_I >>>> GAlignments with 1 alignment and 0 metadata columns: >>>> seqnames strand cigar qwidth start end >>>> width >>>> <Rle> <Rle> <character> <integer> <integer> <integer> >>>> <integer> >>>> [1] chr14 - 64M1I7M 72 19411677 19411747 >>>> 71 >>>> ngap >>>> <integer> >>>> [1] 0 >>>> --- >>>> seqlengths: >>>> chr1 chr10 ... >>>> chrY >>>> 249250621 135534747 ... >>>> 59373566 >>>> >>>> > coverage(gal_with_I)$chr14 >>>> integer-Rle of length 107349540 with 3 runs >>>> Lengths: 19411676 71 87937793 <tel:71%2087937793> >>>> Values : 0 1 0 >>>> >>>> Same thing with an alignment with a D in its CIGAR: >>>> >>>> > gal_with_D <- gal[which(cig_op_table[ , "D"] != 0)[1]] >>>> >>>> > gal_with_D >>>> GAlignments with 1 alignment and 0 metadata columns: >>>> seqnames strand cigar qwidth start end >>>> width >>>> <Rle> <Rle> <character> <integer> <integer> <integer> >>>> <integer> >>>> [1] chr14 + 38M1D34M 72 19659063 19659135 >>>> 73 >>>> ngap >>>> <integer> >>>> [1] 0 >>>> --- >>>> seqlengths: >>>> chr1 chr10 ... >>>> chrY >>>> 249250621 135534747 ... >>>> 59373566 >>>> >>>> > coverage(gal_with_D)$chr14 >>>> integer-Rle of length 107349540 with 3 runs >>>> Lengths: 19659062 73 87690405 >>>> Values : 0 1 0 >>>> >>>> Seeing is believing, >>>> >>>> Cheers, >>>> H. >>>> >>>> >>>> >>>> I started to look at the applyPileups() in Rsamtools which I >>>> can get to >>>> calculate coverage using cigars, but not using the strand or >>>> flag >>>> information for filtering. That solution would start from a >>>> bam-file >>>> instead of a GAlignments object, and sure I can do the >>>> filtering outside R. >>>> But it would be very convenient to do it all from within R. >>>> >>>> If there are nice solutions starting from both a GAlignments >>>> and a >>>> bam-file it would be great! =) >>>> >>>> /Jesper >>>> >>>> >>>> >>>> On Tue, Feb 18, 2014 at 10:52 PM, Michael Lawrence < >>>> lawrence.mich...@gene.com >>>> <mailto:lawrence.mich...@gene.com>> wrote: >>>> >>>> Hi Jesper, >>>> >>>> Would you be willing to volunteer your use case? As >>>> Herve hinted, >>>> cigarToRleList and friends are low-level helpers. There >>>> may be an easier >>>> way to achieve what you want, or an opportunity to >>>> improve things. >>>> >>>> Michael >>>> >>>> >>>> On Mon, Feb 17, 2014 at 1:10 AM, Jesper Gådin >>>> <jesper.ga...@gmail.com >>>> <mailto:jesper.ga...@gmail.com>>wrote: >>>> >>>> Hi, >>>> >>>> Have come across a cigar-vector that is problematic >>>> to process. >>>> >>>> #load package >>>> >>>> library(GenomicAlignments) >>>> >>>> >>>> #load data (see attached file) >>>> >>>> load("2014-02-17-cigarExample.__rdata") >>>> >>>> >>>> #run function cigarToRleList >>>> >>>> cigarRle <- cigarToRleList(cigarExample) >>>> >>>> Error in .Call2("Rle_constructor", values, lengths, >>>> check, 0L, PACKAGE = >>>> "IRanges") : >>>> integer overflow while summing elements in >>>> 'lengths' >>>> >>>> sessionInfo() >>>> >>>> R Under development (unstable) (2013-11-13 r64209) >>>> Platform: x86_64-unknown-linux-gnu (64-bit) >>>> >>>> locale: >>>> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C >>>> [3] LC_TIME=en_US.UTF-8 >>>> LC_COLLATE=en_US.UTF-8 >>>> [5] LC_MONETARY=en_US.UTF-8 >>>> LC_MESSAGES=en_US.UTF-8 >>>> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C >>>> [9] LC_ADDRESS=C LC_TELEPHONE=C >>>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C >>>> >>>> attached base packages: >>>> [1] parallel stats graphics grDevices utils >>>> datasets methods >>>> [8] base >>>> >>>> other attached packages: >>>> [1] GenomicAlignments_0.99.18 Rsamtools_1.15.26 >>>> [3] Biostrings_2.31.12 XVector_0.3.6 >>>> [5] GenomicRanges_1.15.26 IRanges_1.21.23 >>>> [7] BiocGenerics_0.9.3 >>>> >>>> loaded via a namespace (and not attached): >>>> [1] BatchJobs_1.1-1135 BBmisc_1.5 >>>> BiocParallel_0.5.8 >>>> bitops_1.0-6 >>>> >>>> [5] brew_1.0-6 codetools_0.2-8 >>>> DBI_0.2-7 >>>> digest_0.6.4 >>>> >>>> [9] fail_1.2 foreach_1.4.1 >>>> iterators_1.0.6 plyr_1.8 >>>> >>>> [13] RSQLite_0.11.4 sendmailR_1.1-2 >>>> stats4_3.1.0 tools_3.1.0 >>>> >>>> [17] zlibbioc_1.9.0 >>>> >>>> _________________________________________________ >>>> Bioc-devel@r-project.org >>>> <mailto:Bioc-devel@r-project.org> mailing list >>>> https://stat.ethz.ch/mailman/__listinfo/bioc-devel >>>> <https://stat.ethz.ch/mailman/listinfo/bioc-devel> >>>> >>>> >>>> >>>> >>>> >>>> [[alternative HTML version deleted]] >>>> >>>> >>>> >>>> >>>> _________________________________________________ >>>> Bioc-devel@r-project.org <mailto:Bioc-devel@r-project.org> >>>> mailing list >>>> https://stat.ethz.ch/mailman/__listinfo/bioc-devel >>>> <https://stat.ethz.ch/mailman/listinfo/bioc-devel> >>>> >>>> >>>> -- >>>> Hervé Pagès >>>> >>>> Program in Computational Biology >>>> Division of Public Health Sciences >>>> Fred Hutchinson Cancer Research Center >>>> 1100 Fairview Ave. N, M1-B514 >>>> P.O. Box 19024 >>>> Seattle, WA 98109-1024 >>>> >>>> E-mail: hpa...@fhcrc.org <mailto:hpa...@fhcrc.org> >>>> Phone: (206) 667-5791 <tel:%28206%29%20667-5791> >>>> Fax: (206) 667-1319 <tel:%28206%29%20667-1319> >>>> >>>> >>>> >>> >> >> > -- > Hervé Pagès > > Program in Computational Biology > Division of Public Health Sciences > Fred Hutchinson Cancer Research Center > 1100 Fairview Ave. N, M1-B514 > P.O. Box 19024 > Seattle, WA 98109-1024 > > E-mail: hpa...@fhcrc.org > Phone: (206) 667-5791 > Fax: (206) 667-1319 > [[alternative HTML version deleted]]
_______________________________________________ Bioc-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/bioc-devel