hi Valerie,

If the Bam is not sorted by name, isn't it possible that readGAlignment*
will load > yieldSize number of reads in order to find the mate?

Mike


On Wed, Mar 19, 2014 at 1:04 PM, Valerie Obenchain <voben...@fhcrc.org>wrote:

> Hi Mike,
>
> You no longer need to sort Bam files to use the pairing algo or yieldSize.
> The readGAlignment* functions now work with both constraints out of the box.
>
> Create a BamFile with yieldSize and indicate you want mates.
> bf <- BamFile(fl, yieldSize=10000, asMates=TRUE)
>
> Maybe set some specifications in a param:
> param <- ScanBamParam(what = c("qname", "flag"))
>
> Then call either readGAlignment* method that handles pairs:
> readGAlignmentsList(bf, param=param)
> readGAlignmentPairs(bf, param=param)
>
> For summarizeOverlaps():
> summarizeOverlaps(annotation, bf, param=param, singleEnd=FALSE)
>
> We've considered removing the 'obeyQname' arg and documentation but
> thought the concept may be useful in another application. I'll revisit the
> summarizeOverlaps() documentation to make sure 'obeyQname' is downplayed
> and 'asMates' is encouraged.
>
> Valerie
>
>
>
>
> On 03/19/14 07:39, Michael Love wrote:
>
>> hi,
>>
>>  From last year, in order to use yieldSize with paired-end BAMs, I
>>>
>> should sort the BAMs by qname and then use the following call to
>> BamFile:
>>
>> library(pasillaBamSubset)
>> fl <- sortBam(untreated3_chr4(), tempfile(), byQname=TRUE)
>> bf <- BamFile(fl, index=character(0), yieldSize=3, obeyQname=TRUE)
>>
>> https://stat.ethz.ch/pipermail/bioconductor/2013-March/051490.html
>>
>> If I want to use GenomicAlignments::readGAlignmentsList with
>> asMates=TRUE and respecting the yieldSize, what is the proper
>> construction? (in the end, I want to use summarizeOverlaps on
>> paired-end BAMs while respecting the yieldSize)
>>
>> library(pasillaBamSubset)
>> fl <- sortBam(untreated3_chr4(), tempfile(), byQname=TRUE)
>> bf <- BamFile(fl, index=character(0), yieldSize=3, obeyQname=TRUE,
>> asMates=TRUE)
>> x <- readGAlignmentsList(bf)
>> Warning message:
>>   In scanBam(bamfile, ..., param = param) :
>>     'obeyQname=TRUE' ignored when 'asMates=TRUE'
>>   Calls: readGAlignmentsList ... .matesFromBam ->
>> .load_bamcols_from_bamfile -> scanBam -> scanBam
>>
>> I see in the man pages for summarizeOverlaps it has:
>>
>> "In Bioconductor > 2.12 it is not
>> necessary to sort paired-end BAM files by ‘qname’. When
>> counting with ‘summarizeOverlaps’, setting ‘singleEnd=FALSE’
>> will trigger paired-end reading and counting."
>>
>> but I don't see how this can respect the specified yieldSize, because
>> readGAlignmentsList has to read in as many reads as necessary to find
>> the mate.
>>
>> Sorry in advance if I am missing something in the documentation!
>>
>> Mike
>>
>> _______________________________________________
>> Bioc-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>
>>
>

        [[alternative HTML version deleted]]

_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel

Reply via email to