On 02/20/2014 04:16 PM, Michael Lawrence wrote:
There's also "?coverage(ga)", so the user can see what happens
specifically for their object, without worrying about typing out the class.

I was never lucky with this:

  > library(rtracklayer)
  > path_to_gff <- system.file("tests", "v1.gff", package = "rtracklayer")
  > ?import(path_to_gff)
  Error in .helpForCall(topicExpr, parent.frame()) :
no documentation for function ‘import’ and signature ‘con = "character", format = "ANY", text = "ANY"’
  In addition: Warning message:
  In .helpForCall(topicExpr, parent.frame()) :
no method defined for function ‘import’ and signature ‘con = "character", format = "ANY", text = "ANY"’

H.


At some point it would be neat to generate S4 documentation at run-time.
Just popup a browser and see every method that might be dispatched with
a given object. In theory, the R help server would support this.


On Thu, Feb 20, 2014 at 3:13 PM, Hervé Pagès <hpa...@fhcrc.org
<mailto:hpa...@fhcrc.org>> wrote:



    On 02/20/2014 02:55 PM, Martin Morgan wrote:

        On 02/20/2014 02:32 PM, Hervé Pagès wrote:

            Hi Jesper,

            On 02/20/2014 02:13 PM, Jesper Gådin wrote:

                Very true that it is quite difficult to find the
                documentation when one
                is not aware of its existence :P


            Yeah, this has been a source of frustration for many people. And
            always a source of embarrassment (for us) when teaching our
            software
            to beginners.

            I've started to change this. In the upcoming version of BioC
            (2.14,
            scheduled for mid-April), when you'll do ?coverage, you'll
            get to
            choose between the 3 man pages that document coverage
            methods (there
            is one in IRanges, one in GenomicRanges, and one in
            GenomicAlignments).

            I want to generalize this to other generics that have
            methods spread
            across several packages (e.g. findOverlaps, the intra- and
            inter-range
            methods, etc...).

            Having to choose between several man pages every time you do
            e.g.
            ?findOverlaps is a minor annoyance compared to not being able to
            find the man page at all. (Note that if you already know
            where is
            your favorite man page, you'll be able to direct access it with
            e.g. ?GenomicRanges::findOverlaps). Nobody will ever need to use
            the insane
            ?`findOverlaps,GenomicRanges,__GenomicRanges-method` to


        tab completion helps, so that you don't need to be totally
        insane, just
        insane enough to know to start with

        ?"cover<tab>


    and also insane enough to know which method you need to pick up amongst
    the 30+ methods listed by ?"findOverlaps<tab>
    Overwhelming!

    H.



        Martin

            open that man page again. Ever! (it will still work though...)

            Cheers,
            H.


                coverage() is fast and beautiful. Thanks!

                /Jesper


                On Wed, Feb 19, 2014 at 9:21 PM, Hervé Pagès
                <hpa...@fhcrc.org <mailto:hpa...@fhcrc.org>
                <mailto:hpa...@fhcrc.org <mailto:hpa...@fhcrc.org>>> wrote:

                     Hi Jesper,


                     On 02/19/2014 08:44 AM, Michael Lawrence wrote:

                         On Wed, Feb 19, 2014 at 8:39 AM, Jesper Gådin
                         <jesper.ga...@gmail.com
                <mailto:jesper.ga...@gmail.com>
                <mailto:jesper.ga...@gmail.com
                <mailto:jesper.ga...@gmail.com>__>>wrote:

                             Hi Michael,

                             Herves suggestion will probably work for my
                use case, but if
                             there are any
                             smoother ways it would be preferable.

                             The use case is as follows:

                             1) calculate strand specific coverage over
                a region from
                             GAlignments object (or file)

                             At the moment I read a file using
                readGAlignmentsFromBam()
                             with tag="XS",
                             then filter it on "flag" and "mapq". Then I
                subset the
                             resulting GAlignments in a minus and a plus
                -strand object.
                             Then I calculate coverage by my own
                coverage function which
                             uses the cigar
                             information in the GAlignments object. This
                function is the
                             one using
                             cigarToRleList() at the moment. As I
                understand the
                             coverage() function
                             from the GenomicAlignments/IRanges package
                does not take
                             into account
                             cigars, or does it?


                         It does take the coverage into account;
                specifically to exclude
                         the introns
                         from coverage. I think there's also an option
                to exclude
                deletions.


                     Unfortunately the man page is not easy to access
                (you need to do
                     ?`coverage,GAlignments-method`____), but it says:

                         The methods for GAlignments and GAlignmentPairs
                objects do:

                           coverage(grglist(x), ...)

                     And if you do grglist() on a GAlignments or
                GAlignmentPairs
                objects, the
                     ranges you get in the returned GRangesList object
                are calculated
                based
                     on the CIGAR.

                     Trust but verify. Here is how you can actually
                verify that
                coverage()
                     does take the CIGAR into account:

                        library(RNAseqData.HNRNPC.bam.____chr14)
                        gal <-
                readGAlignments(RNAseqData.____HNRNPC.bam.chr14_BAMFILES[1])
                        cig_op_table <- cigarOpTable(cigar(gal))

                     First we pick up an alignment with an N in its
                CIGAR and do
                coverage()
                     on it:

                        > gal_with_N <- gal[which(cig_op_table[ , "N"]
                != 0)[1]]

                        > gal_with_N
                        GAlignments with 1 alignment and 0 metadata columns:
                              seqnames strand       cigar    qwidth
                start       end
                        width
                                 <Rle>  <Rle> <character> <integer>
                <integer> <integer>
                     <integer>
                          [1]    chr14      + 55M2117N17M        72
                  19650072  19652260
                         2189
                                   ngap
                              <integer>
                          [1]         1
                          ---
                          seqlengths:
                                          chr1                 chr10 ...
                       chrY
                                     249250621             135534747 ...
                       59373566

                        > coverage(gal_with_N)$chr14
                        integer-Rle of length 107349540 with 5 runs
                          Lengths: 19650071       55     2117       17
                87697280
                          Values :        0        1        0        1
                      0

                     Same thing with an alignment with an I in its CIGAR:

                        > gal_with_I <- gal[which(cig_op_table[ , "I"]
                != 0)[1]]

                        > gal_with_I
                        GAlignments with 1 alignment and 0 metadata columns:
                              seqnames strand       cigar    qwidth
                start       end
                        width
                                 <Rle>  <Rle> <character> <integer>
                <integer> <integer>
                     <integer>
                          [1]    chr14      -     64M1I7M        72
                  19411677  19411747
                           71
                                   ngap
                              <integer>
                          [1]         0
                          ---
                          seqlengths:
                                          chr1                 chr10 ...
                       chrY
                                     249250621             135534747 ...
                       59373566

                        > coverage(gal_with_I)$chr14
                        integer-Rle of length 107349540 with 3 runs
                          Lengths: 19411676 71 87937793 <tel:71%2087937793>
                          Values :        0        1        0

                     Same thing with an alignment with a D in its CIGAR:

                        > gal_with_D <- gal[which(cig_op_table[ , "D"]
                != 0)[1]]

                        > gal_with_D
                        GAlignments with 1 alignment and 0 metadata columns:
                              seqnames strand       cigar    qwidth
                start       end
                        width
                                 <Rle>  <Rle> <character> <integer>
                <integer> <integer>
                     <integer>
                          [1]    chr14      +    38M1D34M        72
                  19659063  19659135
                           73
                                   ngap
                              <integer>
                          [1]         0
                          ---
                          seqlengths:
                                          chr1                 chr10 ...
                       chrY
                                     249250621             135534747 ...
                       59373566

                        > coverage(gal_with_D)$chr14
                        integer-Rle of length 107349540 with 3 runs
                          Lengths: 19659062       73 87690405
                          Values :        0        1        0

                     Seeing is believing,

                     Cheers,
                     H.



                             I started to look at the applyPileups() in
                Rsamtools which I
                             can get to
                             calculate coverage using cigars, but not
                using the strand or
                             flag
                             information for filtering. That solution
                would start from a
                             bam-file
                             instead of a GAlignments object, and sure I
                can do the
                             filtering outside R.
                             But it would be very convenient to do it
                all from within R.

                             If there are nice solutions starting from
                both a GAlignments
                             and a
                             bam-file it would be great! =)

                             /Jesper



                             On Tue, Feb 18, 2014 at 10:52 PM, Michael
                Lawrence <
                lawrence.mich...@gene.com <mailto:lawrence.mich...@gene.com>
                             <mailto:lawrence.michael@gene.__com
                <mailto:lawrence.mich...@gene.com>>> wrote:

                                 Hi Jesper,

                                 Would you be willing to volunteer your
                use case? As
                                 Herve hinted,
                                 cigarToRleList and friends are
                low-level helpers. There
                                 may be an easier
                                 way to achieve what you want, or an
                opportunity to
                                 improve things.

                                 Michael


                                 On Mon, Feb 17, 2014 at 1:10 AM, Jesper
                Gådin
                                 <jesper.ga...@gmail.com
                <mailto:jesper.ga...@gmail.com>
                                 <mailto:jesper.ga...@gmail.com
                <mailto:jesper.ga...@gmail.com>__>>wrote:

                                     Hi,

                                     Have come across a cigar-vector
                that is problematic
                                     to process.

                                     #load package

                                         library(GenomicAlignments)


                                     #load data (see attached file)


                load("2014-02-17-cigarExample.____rdata")


                                     #run function cigarToRleList

                                         cigarRle <-
                cigarToRleList(cigarExample)

                                     Error in .Call2("Rle_constructor",
                values, lengths,
                                     check, 0L, PACKAGE =
                                     "IRanges") :
                                         integer overflow while summing
                elements in
                'lengths'

                                         sessionInfo()

                                     R Under development (unstable)
                (2013-11-13 r64209)
                                     Platform: x86_64-unknown-linux-gnu
                (64-bit)

                                     locale:
                                        [1] LC_CTYPE=en_US.UTF-8
                LC_NUMERIC=C
                                        [3] LC_TIME=en_US.UTF-8
                LC_COLLATE=en_US.UTF-8
                                        [5] LC_MONETARY=en_US.UTF-8
                                       LC_MESSAGES=en_US.UTF-8
                                        [7] LC_PAPER=en_US.UTF-8
                LC_NAME=C
                                        [9] LC_ADDRESS=C
                LC_TELEPHONE=C
                                     [11] LC_MEASUREMENT=en_US.UTF-8
                LC_IDENTIFICATION=C

                                     attached base packages:
                                     [1] parallel  stats     graphics
                  grDevices utils
                                        datasets  methods
                                     [8] base

                                     other attached packages:
                                     [1] GenomicAlignments_0.99.18
                Rsamtools_1.15.26
                                     [3] Biostrings_2.31.12
                  XVector_0.3.6
                                     [5] GenomicRanges_1.15.26
                IRanges_1.21.23
                                     [7] BiocGenerics_0.9.3

                                     loaded via a namespace (and not
                attached):
                                        [1] BatchJobs_1.1-1135 BBmisc_1.5
                                     BiocParallel_0.5.8
                                     bitops_1.0-6

                                        [5] brew_1.0-6
                codetools_0.2-8
                DBI_0.2-7
                                        digest_0.6.4

                                        [9] fail_1.2           foreach_1.4.1
                                       iterators_1.0.6    plyr_1.8

                                     [13] RSQLite_0.11.4     sendmailR_1.1-2
                                       stats4_3.1.0       tools_3.1.0

                                     [17] zlibbioc_1.9.0


                ___________________________________________________
                Bioc-devel@r-project.org <mailto:Bioc-devel@r-project.org>
                                     <mailto:Bioc-devel@r-project.__org
                <mailto:Bioc-devel@r-project.org>> mailing list
                https://stat.ethz.ch/mailman/____listinfo/bioc-devel
                <https://stat.ethz.ch/mailman/__listinfo/bioc-devel>

                <https://stat.ethz.ch/mailman/__listinfo/bioc-devel
                <https://stat.ethz.ch/mailman/listinfo/bioc-devel>>





                                  [[alternative HTML version deleted]]




                         ___________________________________________________
                Bioc-devel@r-project.org
                <mailto:Bioc-devel@r-project.org>
                <mailto:Bioc-devel@r-project.__org
                <mailto:Bioc-devel@r-project.org>>
                         mailing list
                https://stat.ethz.ch/mailman/____listinfo/bioc-devel
                <https://stat.ethz.ch/mailman/__listinfo/bioc-devel>

                <https://stat.ethz.ch/mailman/__listinfo/bioc-devel
                <https://stat.ethz.ch/mailman/listinfo/bioc-devel>>


                     --
                     Hervé Pagès

                     Program in Computational Biology
                     Division of Public Health Sciences
                     Fred Hutchinson Cancer Research Center
                     1100 Fairview Ave. N, M1-B514
                     P.O. Box 19024
                     Seattle, WA 98109-1024

                     E-mail: hpa...@fhcrc.org <mailto:hpa...@fhcrc.org>
                <mailto:hpa...@fhcrc.org <mailto:hpa...@fhcrc.org>>
                     Phone: (206) 667-5791 <tel:%28206%29%20667-5791>
                <tel:%28206%29%20667-5791>
                     Fax: (206) 667-1319 <tel:%28206%29%20667-1319>
                <tel:%28206%29%20667-1319>






    --
    Hervé Pagès

    Program in Computational Biology
    Division of Public Health Sciences
    Fred Hutchinson Cancer Research Center
    1100 Fairview Ave. N, M1-B514
    P.O. Box 19024
    Seattle, WA 98109-1024

    E-mail: hpa...@fhcrc.org <mailto:hpa...@fhcrc.org>
    Phone: (206) 667-5791 <tel:%28206%29%20667-5791>
    Fax: (206) 667-1319 <tel:%28206%29%20667-1319>



--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319

_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel

Reply via email to