my current reason to prefer a CompressedVRangesList object over a
SimpleVRangesList object is that i see an order-of-magnitude
difference in creation time between the two classes:
library(VariantAnnotation)
fl <- system.file("extdata", "CEUtrio.vcf.bgz", package="VariantFiltering")
vcf <- readVcf(fl, genome="hg19")
vr <- as(vcf, "VRanges")
length(vr)
[1] 15000

## create a VRangesList object
system.time(vrl <- do.call("VRangesList", split(vr, sampleNames(vr))))
   user  system elapsed
  0.247   0.004   0.252

## create a CompressedVRangesList object
system.time(cvrl <- new("CompressedVRangesList", split(vr, sampleNames(vr))))
   user  system elapsed
  0.019   0.000   0.019

0.252/0.019
[1] 13.26316
with a larger VCF file the difference increases:
[... load vcf, coerce to VRanges ...]
length(vr)
[1] 25916
system.time(vrl <- do.call("VRangesList", split(vr, sampleNames(vr))))
   user  system elapsed
  2.672   0.000   2.676

system.time(cvrl <- new("CompressedVRangesList", split(vr, sampleNames(vr))))
   user  system elapsed
  0.014   0.000   0.014

2.676 / 0.014
[1] 191.1429
so maybe i'm using the wrong way to construct a VRangesList object, but
according to our last conversation about this, there was no obvious
default fast way to do it, starting from a VRanges object:
https://stat.ethz.ch/pipermail/bioc-devel/2015-January/006905.html
it would be great if there's a fast way to do this kind of construction.
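
A single system.time() call on something that takes ~0.02 seconds is noisy, so a repeated measurement may make the comparison above more robust. The following is only a minimal base-R sketch: median_elapsed() is a hypothetical helper (not part of VariantAnnotation or any other package), and the toy split() call merely stands in for the constructors timed above.

```r
## Hypothetical helper (an assumption, not an existing API): run a
## zero-argument function `reps` times and return the median elapsed
## time in seconds.
median_elapsed <- function(f, reps = 5L) {
  stopifnot(is.function(f), reps >= 1L)
  times <- vapply(seq_len(reps),
                  function(i) system.time(f())[["elapsed"]],
                  numeric(1))
  median(times)
}

## Toy workload so the sketch runs without Bioconductor installed.
toy <- function() split(seq_len(1e5), rep(1:10, length.out = 1e5))
median_elapsed(toy, reps = 3L)

## With the 'vr' object from above, the two constructions could be
## compared along these lines (left commented out because it needs
## VariantAnnotation and the data loaded earlier):
## median_elapsed(function() do.call("VRangesList", split(vr, sampleNames(vr))))
## median_elapsed(function() new("CompressedVRangesList", split(vr, sampleNames(vr))))
```

The median is used rather than the mean so that a single slow run (for instance one that triggers garbage collection) does not dominate the result.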
thanks,
robert.
On 02/25/2015 04:42 PM, Michael Lawrence wrote:
If you're storing data on a relatively small number of individuals (say,
hundreds), you should use SimpleVRangesList, not CompressedVRangesList.
On Wed, Feb 25, 2015 at 7:10 AM, Robert Castelo <robert.cast...@upf.edu> wrote:
i see your point, the logic i was thinking about is to use a list of
VRanges objects to hold separately the variants of multiple
individuals, with one VRanges object per individual.
if i type the name of such a list object on the R shell, having the
GRangesList show method, i feel i do not see much information
because the screen just scrolls up tens or hundreds of lines
specifying variants per individual. however, the concise appearance
of something like a VRangesList:
> vrl
VRangesList of length 10
names(10): S1 S2 S3 S4 ... S7 S8 S9 S10
at least suggests to the user that the object holding the variants has
information for 10 samples and belongs to the class 'VRangesList'.
i thought this made general sense but i'm fine if you feel this
interpretation does not warrant such a change.
cheers,
robert.
On 02/25/2015 01:25 AM, Michael Lawrence wrote:
Why not have the SimpleVRangesList be shown like CompressedVRangesList,
for consistency with GRangesList? In other words, the opposite of what
you propose. A strong argument could also be made that a
SimpleGenomicRangesList should be shown like a GRangesList. Unless there
is some aversion to the more verbose output....
On Tue, Feb 24, 2015 at 2:36 PM, Robert Castelo <robert.cast...@upf.edu> wrote:
so, yes, but IMO rather than inheriting the show method from a
GRangesList, i think that the show method for CompressedVRangesList
objects should be inherited from a VRangesList object. right now
this is the situation:
library(VariantAnnotation)
example(VRangesList)
vrl
VRangesList of length 2
names(2): sampleA sampleB
cvrl <- new("CompressedVRangesList", split(vr, sampleNames(vr)))
cvrl
CompressedVRangesList object of length 2:
$a
VRanges object with 1 range and 1 metadata column:
      seqnames    ranges strand         ref              alt     totalDepth       refDepth       altDepth
         <Rle> <IRanges>  <Rle> <character> <characterOrRle> <integerOrRle> <integerOrRle> <integerOrRle>
  [1]     chr1    [1, 5]      +           T                C             12              5              7
        sampleNames softFilterMatrix | tumorSpecific
      <factorOrRle>         <matrix> |     <logical>
  [1]             a             TRUE |         FALSE

$b
VRanges object with 1 range and 1 metadata column:
      seqnames    ranges strand ref alt totalDepth refDepth altDepth sampleNames softFilterMatrix | tumorSpecific
  [1]     chr2  [10, 20]      +   A   T         17       10        6           b            FALSE |          TRUE

-------
seqinfo: 2 sequences from an unspecified genome; no seqlengths
would it be possible to have the VRangesList show method for
CompressedVRangesList objects?
robert.
On 2/24/15 7:24 PM, Michael Lawrence wrote:
I think you might be missing an import. It should inherit the
method for GRangesList.
On Tue, Feb 24, 2015 at 9:53 AM, Robert Castelo <robert.cast...@upf.edu> wrote:
hi,
i'm using the CompressedVRangesList class in VariantFiltering
to hold variants and their annotations across multiple samples
and found that there was no show method for this class (unless
i'm missing the right import here) so i made one within
VariantFiltering by copying and pasting from other similar classes:
setMethod("show", signature(object="CompressedVRangesList"),
          function(object) {
            lo <- length(object)
            cat(classNameForDisplay(object), " of length ", lo, "\n",
                sep = "")
            if (!is.null(names(object)))
              cat(BiocGenerics:::labeledLine("names", names(object)))
          })
i guess, however, that the right home for this would be
VariantAnnotation. let me know if you consider adding it there
(or somewhere else) and i'll remove it from VariantFiltering.
thanks,
robert.
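
For readers without the Bioconductor stack at hand, the idea behind the method above (print only the class name, the length and the element names) can be sketched with the methods package alone. Everything here is hypothetical and for illustration only: ToySampleList and conciseSummary() are made-up names, and the real method relies on classNameForDisplay() and the internal labeledLine() helper instead.

```r
library(methods)

## Hypothetical toy class: one list element per sample.
setClass("ToySampleList", representation(samples = "list"))

## Build the concise summary as character strings, so that it can be
## inspected separately from printing.
conciseSummary <- function(object) {
  cls <- as.character(class(object))[1L]
  out <- sprintf("%s of length %d", cls, length(object@samples))
  nms <- names(object@samples)
  if (!is.null(nms))
    out <- c(out, sprintf("names(%d): %s", length(nms),
                          paste(nms, collapse = " ")))
  out
}

## The show method only prints the summary lines.
setMethod("show", "ToySampleList", function(object) {
  cat(conciseSummary(object), sep = "\n")
})

x <- new("ToySampleList", samples = list(sampleA = 1:3, sampleB = 4:6))
x
```

Unlike labeledLine(), this sketch does not abbreviate long name vectors with "...", so it is only meant to show the shape of the output, not to replace the method above.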
_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel
--
Robert Castelo, PhD
Associate Professor
Dept. of Experimental and Health Sciences
Universitat Pompeu Fabra (UPF)
Barcelona Biomedical Research Park (PRBB)
Dr Aiguader 88
E-08003 Barcelona, Spain
telf: +34.933.160.514
fax: +34.933.160.550