On 04/01/2014 02:43 PM, Michael Lawrence wrote:
Thanks Herve. It might not be so bad to rep out the index in the unnamed case
(think of NULL names as meaning wildcard). If we had:
i <- IntegerList(1:5)
x[i]
The 'i' does not really identify any one element in 'x'. If both 'i' and
'x' had names, then there would be a matching, but otherwise, truncating
'x' to length(i) is surprising, and it's hard to imagine a use-case for
it. In some ways, this is analogous to logical indexing, which is recycled.
But that said, my use case is really more of a pluckHead/Tail. Don't
worry about this release.
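As a point of comparison for the recycling analogy above, here is a small base R sketch of how a short logical subscript is recycled along the object being subset. It uses a plain integer vector only, not the IRanges list classes, and the variable names are purely illustrative.

```r
x <- 1:6
# A length-2 logical subscript is recycled along 'x':
# TRUE FALSE TRUE FALSE TRUE FALSE
recycled <- x[c(TRUE, FALSE)]
print(recycled)  # 1 3 5
```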
The pseq_len() utility I sent previously solves your pluckHead()
problem:
pluckHead <- function(x, n=6)
{
    x[pseq_len(pmin(elementLengths(x), n))]
}
or, using the non-exported utility IRanges:::fancy_mseq():
pluckHead <- function(x, n=6)
{
    x_eltlens <- unname(elementLengths(x))
    i_eltlens <- pmin(x_eltlens, n)
    i_skeleton <- PartitioningByEnd(cumsum(i_eltlens), names=names(x))
    unlisted_i <- IRanges:::fancy_mseq(i_eltlens)
    i <- relist(unlisted_i, i_skeleton)
    x[i]
}
For pluckTail():
pluckTail <- function(x, n=6)
{
    x_eltlens <- unname(elementLengths(x))
    i_eltlens <- pmin(x_eltlens, n)
    i_skeleton <- PartitioningByEnd(cumsum(i_eltlens), names=names(x))
    offset <- x_eltlens - i_eltlens
    unlisted_i <- IRanges:::fancy_mseq(i_eltlens, offset)
    i <- relist(unlisted_i, i_skeleton)
    x[i]
}
For both, 'n' can be of length > 1 and is recycled to the length of 'x'.
Negative values in 'n' are not supported but that should be easy to
add.
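For checking results, the intended behavior can be written down with plain lists in base R. This is only a slow reference sketch, assuming 'x' is an ordinary list; ref_head() and ref_tail() are made-up names and not IRanges functions.

```r
# Reference semantics only: one head()/tail() call per list element.
ref_head <- function(x, n=6)
{
    n <- rep_len(n, length(x))  # recycle 'n' along 'x'
    Map(head, x, n)
}

ref_tail <- function(x, n=6)
{
    n <- rep_len(n, length(x))  # recycle 'n' along 'x'
    Map(tail, x, n)
}

x <- list(a=1:10, b=1:3, c=integer(0))
h <- ref_head(x, 2)            # at most the first 2 of each element
tl <- ref_tail(x, c(1, 5, 2))  # a different 'n' for each element
```

Elements shorter than 'n' are returned whole, which is the same effect the pmin() call has in the functions above.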
So I could add these 2 functions to IRanges. However, I'm not totally
convinced by the names. What about phead() and ptail() ("p" for
"parallel"), or vhead() and vtail() ("v" for "vectorized"), or mhead()
and mtail() (they're just fast equivalents of 'mapply(head, x, n)' and
'mapply(tail, x, n)'), or...?
Thanks,
H.
Michael
On Tue, Apr 1, 2014 at 12:06 PM, Hervé Pagès <hpa...@fhcrc.org> wrote:
On 04/01/2014 10:17 AM, Ryan wrote:
That won't work if any vector has fewer than 5 elements. Maybe
lapply(x, head, n=5)
would work?
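A quick base R illustration of the problem with short elements, using a plain integer vector (other vector classes may behave differently, for example by raising an error instead of padding with NA):

```r
short <- 1:3
by_index <- short[1:5]       # out-of-range positions become NA
by_head <- head(short, n=5)  # stops at the end of the vector
print(by_index)  # 1 2 3 NA NA
print(by_head)   # 1 2 3
```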
Yes. Note that you can use endoapply() to preserve the class of the
original object:
> endoapply(cvg, head, n=5)
RleList of length 3
$chr1
integer-Rle of length 5 with 2 runs
Lengths: 4 1
Values : 1 2
$chr2
integer-Rle of length 5 with 4 runs
Lengths: 1 1 1 2
Values : 0 1 2 3
$chr3
integer-Rle of length 5 with 1 run
Lengths: 5
Values : 0
But lapply- or endoapply-based solutions are slower than a [ based
solution. Unfortunately the latter requires too much munging to get
the subscript right:
## parallel seq_len()
pseq_len <- function(eltlens)
{
    ans_skeleton <- PartitioningByWidth(eltlens)
    tmp <- relist(seq_len(sum(eltlens)), ans_skeleton)
    tmp - start(ans_skeleton) + 1L
}
Then:
> pseq_len(c(5, 1, 0, 2))
IntegerList of length 4
[[1]] 1 2 3 4 5
[[2]] 1
[[3]] integer(0)
[[4]] 1 2
> cvg[pseq_len(pmin(elementLengths(cvg), 5))]
RleList of length 3
$chr1
integer-Rle of length 5 with 2 runs
Lengths: 4 1
Values : 1 2
$chr2
integer-Rle of length 5 with 4 runs
Lengths: 1 1 1 2
Values : 0 1 2 3
$chr3
integer-Rle of length 5 with 1 run
Lengths: 5
Values : 0
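The arithmetic in pseq_len() can also be followed with plain integer vectors. The sketch below uses only base R and a hypothetical helper name, flat_pseq_len(). It numbers all positions consecutively and then subtracts each group's 0-based offset, which corresponds to 'tmp - start(ans_skeleton) + 1L' above. It is an illustration of the idea, not the IRanges implementation.

```r
# Hypothetical base R helper: the unlisted form of a "parallel seq_len()".
flat_pseq_len <- function(eltlens)
{
    eltlens <- as.integer(eltlens)
    offsets <- cumsum(eltlens) - eltlens  # 0-based start of each group
    seq_len(sum(eltlens)) - rep(offsets, eltlens)
}

eltlens <- c(5, 1, 0, 2)
flat <- flat_pseq_len(eltlens)
print(flat)  # 1 2 3 4 5 1 1 2

# Regroup into an ordinary list; the factor levels keep the empty group.
groups <- factor(rep(seq_along(eltlens), eltlens),
                 levels=seq_along(eltlens))
as_list <- unname(split(flat, groups))
```

Base R's sequence() produces the same flat vector for these inputs, so it can serve as a cross-check.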
H.
On Tue Apr 1 09:24:51 2014, Cook, Malcolm wrote:
In the meantime,
lapply(x, `[`, 1:5)
??
>-----Original Message-----
>From: bioc-devel-bounces@r-project.org [mailto:bioc-devel-bounces@r-project.org] On Behalf Of Michael Lawrence
>Sent: Tuesday, April 01, 2014 9:21 AM
>To: bioc-devel@r-project.org
>Subject: [Bioc-devel] Subsetting Lists by Lists
>
>Mostly to Herve:
>
>Sometimes we want to pluck the first 1, or 10, or whatever elements from
>each element of a list. If I had a list 'x', I thought I could do this with:
>
>x[IntegerList(1:5)]
>
>But it only gives elements 1:5 from x[[1]], not each element of 'x'. In
>other words, I thought the index would be repped out. Instead, 'x' is
>subset to the length of 'i', and I'm not sure if that makes sense?
>
>But maybe what we really want are pluckHead/Tail, which would be robust to
>the case that < n elements are in an element. And of course a more general
>pluck(x, i) to select 'i' from each element, but I wanted the line above to
>do that.
>
>Michael
>
>_______________________________________________
>Bioc-devel@r-project.org mailing list
>https://stat.ethz.ch/mailman/listinfo/bioc-devel
--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpa...@fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319
_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel