Retrieving the data for a single genomic range is efficient. Doing this for thousands of samples might get tricky, but could probably be vectorized through clever use of matrices. Millions of regions by thousands of samples, however, might need some support in native code, along the lines of viewSums, etc., but iterating over the bigWigs directly. Maybe you guys could implement something in R, and then we could profile and optimize it.
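As a rough illustration of the "regions by samples" computation described above, here is a minimal sketch. It is written in Python rather than R purely so it is self-contained, and it does not use rtracklayer or read real bigWig files. It assumes the per-base coverage for each sample is already in memory as equal-length lists. The names `prefix_sums` and `region_sums` and the toy data are hypothetical. It uses cumulative sums so that each region sum costs two lookups per sample, which is one plausible way to get viewSums-like results, not necessarily how a native implementation would do it.

```python
from itertools import accumulate


def prefix_sums(coverage):
    """Cumulative sums with a leading 0, so sum(coverage[a:b]) == p[b] - p[a]."""
    return [0] + list(accumulate(coverage))


def region_sums(samples, regions):
    """Sum coverage over each region for each sample.

    samples: list of equal-length per-base coverage lists, one per sample
             (assumed to be already loaded; no file I/O happens here).
    regions: list of (start, end) pairs, 0-based and end-exclusive.
    Returns a regions x samples matrix as a list of rows.
    """
    prefixes = [prefix_sums(cov) for cov in samples]
    return [[p[end] - p[start] for p in prefixes] for start, end in regions]


# Toy data: two samples, six bases each.
samples = [
    [0, 1, 1, 2, 2, 0],  # sample A
    [3, 3, 0, 0, 1, 1],  # sample B
]
regions = [(0, 3), (2, 6)]

result = region_sums(samples, regions)
print(result)  # [[2, 6], [5, 2]]
```

Each row of the result is one region and each column one sample. The prefix sums are built once per sample, so adding more regions is cheap. The expensive part at real scale would be getting the coverage out of the files, which is where native code iterating over the bigWigs directly would presumably help.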

On Mon, Nov 18, 2013 at 4:33 PM, Kasper Daniel Hansen <kasperdanielhan...@gmail.com> wrote:

> (Michael Love and I had some discussion on this Friday)
>
> I also think it would be a very convenient class/method. A lot of data
> these days are naturally represented (and are available from say GEO) as
> bigWig files (essentially coverage tracks), for example ChIP-seq. This
> would be much more efficient than converting BAM to coverage on the fly.
>
> It seems to me that bigWig ought to be efficient for this, but I am not
> very familiar with its performance. What we want is really to be able to
> chunk multiple coverage profiles over the genome, and do computations on
> each of the chunks. Any idea on efficiency? I am happy to contribute a
> bit, at least with design.
>
> Best,
> Kasper
>
>
> On Mon, Nov 18, 2013 at 6:11 PM, Michael Lawrence <lawrence.mich...@gene.com> wrote:
>
>> Aggregating coverage over multiple samples is a popular request recently.
>> I'm happy to support this effort, but I think someone in Seattle is going
>> to have to take the lead on it.
>>
>>
>> On Mon, Nov 18, 2013 at 2:36 PM, Michael Love <michaelisaiahl...@gmail.com> wrote:
>>
>> > a discussion came up on devel last year about looking at a genomic range
>> > over multiple samples and multiple experiments (
>> > https://stat.ethz.ch/pipermail/bioc-devel/attachments/20120920/93a4fb61/attachment.pl
>> > )
>> >
>> > stepping aside the multiple experiment part, I'm interested in
>> > BigWigViews() with fixed ranges across samples. Has there been any more
>> > thoughts in this direction?
>> >
>> > BigWigViews would be incredibly useful for genomics applications where we
>> > want to scan along the genome looking at lots of samples. BigWig offers a
>> > concise representation of the information compared to BAM files.
>> >
>> > What I am trying now is using import(BigWigFile, which=gr) on files one by
>> > one, and then binding the coverage together.
>> >
>> > best,
>> >
>> > Mike
>> >
>> > [[alternative HTML version deleted]]
>> >
>> > _______________________________________________
>> > Bioc-devel@r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/bioc-devel