On 04/01/2014 02:43 PM, Michael Lawrence wrote:
Thanks Herve. It might not be so bad to have it rep out in the unnamed case
(think of NULL names meaning wildcard). If we had:

i <- IntegerList(1:5)
x[i]

The 'i' does not really identify any one element in 'x'. If both 'i' and
'x' had names, then there would be a matching, but otherwise, truncating
'x' to length(i) is surprising, and it's hard to imagine a use-case for
it. In some ways, this is analogous to logical indexing, which is recycled.
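
To make the logical-indexing analogy concrete, here is a minimal base R sketch. It uses plain vectors only, not List objects, and the variable names are made up for the illustration:

```r
# Base R only: a logical subscript shorter than 'x' is recycled along 'x'.
x <- 1:6
kept_by_logical <- x[c(TRUE, FALSE)]   # selects positions 1, 3, 5

# An integer subscript is used as-is; nothing is recycled.
kept_by_integer <- x[1:2]              # selects positions 1, 2
```

Whether list-by-list subsetting should behave like the first case is exactly the open question here.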

But that said, my use case is really more of a pluckHead/Tail. Don't
worry about this release.

The pseq_len() utility I sent previously solves your pluckHead()
problem:

  pluckHead <- function(x, n=6)
  {
    x[pseq_len(pmin(elementLengths(x), n))]
  }

or, using the non-exported utility IRanges:::fancy_mseq():

  pluckHead <- function(x, n=6)
  {
    x_eltlens <- unname(elementLengths(x))
    i_eltlens <- pmin(x_eltlens, n)
    i_skeleton <- PartitioningByEnd(cumsum(i_eltlens), names=names(x))
    unlisted_i <- IRanges:::fancy_mseq(i_eltlens)
    i <- relist(unlisted_i, i_skeleton)
    x[i]
  }

For pluckTail():

  pluckTail <- function(x, n=6)
  {
    x_eltlens <- unname(elementLengths(x))
    i_eltlens <- pmin(x_eltlens, n)
    i_skeleton <- PartitioningByEnd(cumsum(i_eltlens), names=names(x))
    offset <- x_eltlens - i_eltlens
    unlisted_i <- IRanges:::fancy_mseq(i_eltlens, offset)
    i <- relist(unlisted_i, i_skeleton)
    x[i]
  }

For both, 'n' can be of length > 1 and is recycled to the length of 'x'.
Negative values in 'n' are not supported but that should be easy to
add.
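
The index arithmetic behind the two functions can be sketched in base R. This is only an illustration of the head and tail positions being computed, not the IRanges implementation; it assumes the offset is simply added to each per-element sequence, and it works on a plain vector of element lengths:

```r
# Hypothetical element lengths of a 3-element list, and a common 'n'.
x_eltlens <- c(10, 3, 0)
n <- 4

i_eltlens <- pmin(x_eltlens, n)      # how many items to take: 4 3 0
offset <- x_eltlens - i_eltlens      # items to skip for the tail: 6 0 0

# Head positions: 1..k within each element, concatenated.
head_idx <- sequence(i_eltlens)

# Tail positions: the same sequences shifted by each element's offset.
tail_idx <- sequence(i_eltlens) + rep(offset, i_eltlens)
```

Here head_idx is 1 2 3 4 1 2 3 and tail_idx is 7 8 9 10 1 2 3, in flat (unlisted) form.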

So I could add these 2 functions to IRanges; however, I'm not totally
convinced by the names. What about phead() and ptail() ("p" for
"parallel"), or vhead() and vtail() ("v" for "vectorized"), or mhead()
and mtail() (they're just fast equivalents of 'mapply(head, x, n)' and
'mapply(tail, x, n)'), or...?
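
For reference, the mapply() semantics can be seen on an ordinary list in base R. This is a small sketch with made-up data, not a List object:

```r
# Reference semantics on a plain list: one 'n' per list element.
x <- list(a = 1:10, b = 1:3, c = integer(0))
n <- c(2, 5, 4)

# SIMPLIFY = FALSE always returns a list, even if all results have equal length.
heads <- mapply(head, x, n, SIMPLIFY = FALSE)
tails <- mapply(tail, x, n, SIMPLIFY = FALSE)
```

Elements shorter than their 'n' (here 'b' and 'c') are returned whole, which is the robustness asked for in the original post.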

Thanks,
H.


Michael



On Tue, Apr 1, 2014 at 12:06 PM, Hervé Pagès <hpa...@fhcrc.org> wrote:

    On 04/01/2014 10:17 AM, Ryan wrote:

        That won't work if any vector has fewer than 5 elements. Maybe

        lapply(x, head, n=5)

        would work?


    Yes. Note that you can use endoapply() to preserve the class of the
    original object:

       > endoapply(cvg, head, n=5)

       RleList of length 3
       $chr1
       integer-Rle of length 5 with 2 runs
         Lengths: 4 1
         Values : 1 2


       $chr2
       integer-Rle of length 5 with 4 runs
         Lengths: 1 1 1 2
         Values : 0 1 2 3

       $chr3
       integer-Rle of length 5 with 1 run
         Lengths: 5
         Values : 0

    But lapply()- or endoapply()-based solutions are slower than a
    '['-based solution. Unfortunately the latter requires too much
    munging to get the subscript right:

       ## parallel seq_len()
       pseq_len <- function(eltlens)
       {
         ans_skeleton <- PartitioningByWidth(eltlens)
         tmp <- relist(seq_len(sum(eltlens)), ans_skeleton)
         tmp - start(ans_skeleton) + 1L
       }

    Then:

       > pseq_len(c(5, 1, 0, 2))
       IntegerList of length 4
       [[1]] 1 2 3 4 5
       [[2]] 1
       [[3]] integer(0)
       [[4]] 1 2

       > cvg[pseq_len(pmin(elementLengths(cvg), 5))]

       RleList of length 3
       $chr1
       integer-Rle of length 5 with 2 runs
         Lengths: 4 1
         Values : 1 2


       $chr2
       integer-Rle of length 5 with 4 runs
         Lengths: 1 1 1 2
         Values : 0 1 2 3

       $chr3
       integer-Rle of length 5 with 1 run
         Lengths: 5
         Values : 0
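
The idea behind pseq_len() can also be sketched in base R, without any IRanges classes. This is an illustration on a plain list, assuming only base functions; the variable names are made up:

```r
# Base R analogue: sequence() concatenates seq_len() of each length.
eltlens <- c(5, 1, 0, 2)
flat <- sequence(eltlens)

# Split back into one piece per element; explicit factor levels keep
# the zero-length element instead of dropping it.
grp <- factor(rep(seq_along(eltlens), eltlens), levels = seq_along(eltlens))
pieces <- unname(split(flat, grp))
```

The result has the same shape as the pseq_len() output shown above: 1:5, 1, integer(0), 1:2.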

    H.



        On Tue Apr  1 09:24:51 2014, Cook, Malcolm wrote:

            In the meantime,

            lapply(x, `[`, 1:5)

            ??

               >-----Original Message-----
               >From: bioc-devel-bounces@r-project.org [mailto:bioc-devel-bounces@r-project.org] On Behalf Of Michael Lawrence
               >Sent: Tuesday, April 01, 2014 9:21 AM
               >To: bioc-devel@r-project.org
               >Subject: [Bioc-devel] Subsetting Lists by Lists
               >
               >Mostly to Herve:
               >
               >Sometimes we want to pluck the first 1, or 10, or whatever elements from
               >each element of a list. If I had a list 'x', I thought I could do this with:
               >
               >x[IntegerList(1:5)]
               >
               >But it only gives elements 1:5 from x[[1]], not each element of 'x'. In
               >other words, I thought the index would be repped out. Instead, 'x' is
               >subset to the length of 'i', and I'm not sure if that makes sense?
               >
               >But maybe what we really want are pluckHead/Tail, which would be robust to
               >the case that < n elements are in an element. And of course a more general
               >pluck(x, i) to select 'i' from each element, but I wanted the line above to
               >do that.
               >
               >Michael
               >
               >_________________________________________________
               >Bioc-devel@r-project.org mailing list
               >https://stat.ethz.ch/mailman/listinfo/bioc-devel


    --
    Hervé Pagès

    Program in Computational Biology
    Division of Public Health Sciences
    Fred Hutchinson Cancer Research Center
    1100 Fairview Ave. N, M1-B514
    P.O. Box 19024
    Seattle, WA 98109-1024

    E-mail: hpa...@fhcrc.org
    Phone: (206) 667-5791
    Fax: (206) 667-1319



_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel