On 03/19/14 07:39, Michael Love wrote:
hi,

From last year, in order to use yieldSize with paired-end BAMs, I
should sort the BAMs by qname and then use the following call to
BamFile:

library(pasillaBamSubset)
fl <- sortBam(untreated3_chr4(), tempfile(), byQname=TRUE)
bf <- BamFile(fl, index=character(0), yieldSize=3, obeyQname=TRUE)

https://stat.ethz.ch/pipermail/bioconductor/2013-March/051490.html

If I want to use GenomicAlignments::readGAlignmentsList with
asMates=TRUE and respecting the yieldSize, what is the proper
construction? (in the end, I want to use summarizeOverlaps on
paired-end BAMs while respecting the yieldSize)

library(pasillaBamSubset)
fl <- sortBam(untreated3_chr4(), tempfile(), byQname=TRUE)
bf <- BamFile(fl, index=character(0), yieldSize=3, obeyQname=TRUE, asMates=TRUE)
x <- readGAlignmentsList(bf)
Warning message:
  In scanBam(bamfile, ..., param = param) :
    'obeyQname=TRUE' ignored when 'asMates=TRUE'
  Calls: readGAlignmentsList ... .matesFromBam ->
.load_bamcols_from_bamfile -> scanBam -> scanBam

I see in the man pages for summarizeOverlaps it has:

"In Bioconductor > 2.12 it is not
necessary to sort paired-end BAM files by ‘qname’. When
counting with ‘summarizeOverlaps’, setting ‘singleEnd=FALSE’
will trigger paired-end reading and counting."

but I don't see how this can respect the specified yieldSize, because
readGAlignmentsList has to read in as many reads as necessary to find
the mate.


I didn't specifically answer this in my last email. The reason yieldSize is no longer a problem is that we've rewritten the mate pairing algorithm in C so pairing is no longer done in R. Both readGAlignmentsList() and readGAlignmentPairs() use the same code. The *List container is more flexible in what it can hold (singletons, reads with unmapped mates etc.) and the *Pairs container holds mated-pairs in a left-right organization.

Valerie


Sorry in advance if I am missing something in the documentation!

Mike

_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel

Reply via email to