Hi,

From a thread last year, I understood that to use yieldSize with paired-end BAMs, I should sort the BAMs by qname and then use the following call to BamFile:

  library(Rsamtools)
  library(pasillaBamSubset)
  fl <- sortBam(untreated3_chr4(), tempfile(), byQname=TRUE)
  bf <- BamFile(fl, index=character(0), yieldSize=3, obeyQname=TRUE)

https://stat.ethz.ch/pipermail/bioconductor/2013-March/051490.html

If I want to use GenomicAlignments::readGAlignmentsList with asMates=TRUE while still respecting the yieldSize, what is the proper construction? (In the end, I want to use summarizeOverlaps on paired-end BAMs while respecting the yieldSize.)

  library(GenomicAlignments)
  library(pasillaBamSubset)
  fl <- sortBam(untreated3_chr4(), tempfile(), byQname=TRUE)
  bf <- BamFile(fl, index=character(0), yieldSize=3, obeyQname=TRUE, asMates=TRUE)
  x <- readGAlignmentsList(bf)

  Warning message:
  In scanBam(bamfile, ..., param = param) :
    'obeyQname=TRUE' ignored when 'asMates=TRUE'
  Calls: readGAlignmentsList ... .matesFromBam -> .load_bamcols_from_bamfile -> scanBam -> scanBam

I see that the man page for summarizeOverlaps says:

  "In Bioconductor > 2.12 it is not necessary to sort paired-end BAM
  files by ‘qname’. When counting with ‘summarizeOverlaps’, setting
  ‘singleEnd=FALSE’ will trigger paired-end reading and counting."

but I don't see how this can respect the specified yieldSize, because readGAlignmentsList has to read in as many reads as necessary to find each mate.

Sorry in advance if I am missing something in the documentation!

Mike