On 09/10/2011 08:08 AM, André Rossi wrote:
Hi everybody!

I'm creating an object of an S4 class that has two slots: Listexamples, which
is a list, and idx, which is an integer (see the code below).

Then I read a file into a data.frame with 10000 (ten thousand) rows and 10
columns, do some pre-processing and, basically, store each row as an
element of the list in the Listexamples slot of the S4 object. However, many
operations after this take a considerable time.

Can anyone explain why this happens? Is it possible to speed up a
script that deals with a large amount of data (whether in a data.frame or a
list)?

Thank you,

André Rossi

setClass("Buffer",
     representation=representation(
         Listexamples = "list",
         idx = "integer"
     )
)

Hi André,

Can you provide a simpler and more reproducible example, for instance

> setClass("Buf", representation=representation(lst="list"))
[1] "Buf"
> b=new("Buf", lst=replicate(10000, list(10), simplify=FALSE))
> system.time({ b@lst[[1]][[1]] = 2 })
   user  system elapsed
  0.005   0.000   0.005

Generally it sounds like you're modeling the rows as elements of
Listexamples, but you're better served by modeling the columns
(lst = replicate(10, integer(10000), simplify=FALSE), if all of your 10
columns were integer-valued, for instance). Also, S4 is providing some
measure of type safety, and you're undermining that by having your class
contain a 'list'. I'd go for something like

setClass("Buffer",
         representation=representation(
           col1="integer",
           col2="character",
           col3="numeric"
           ## etc.
           ),
         validity=function(object) {
             nms <- slotNames(object)
             len <- sapply(nms, function(nm) length(slot(object, nm)))
             if (1L != length(unique(len)))
                 "slots must all be of same length"
             else TRUE
         })

Buffer <-
    function(col1, col2, col3, ...)
{
    new("Buffer", col1=col1, col2=col2, col3=col3, ...)
}
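
As a rough illustration of how the class and constructor above might be used: the column values below are made up for the example, and the class definition is repeated only so the snippet runs on its own. The second call shows the validity method rejecting slots of unequal length.

```r
library(methods)

## same definition as above, repeated so this snippet is self-contained
setClass("Buffer",
         representation=representation(
           col1="integer",
           col2="character",
           col3="numeric"),
         validity=function(object) {
             nms <- slotNames(object)
             len <- sapply(nms, function(nm) length(slot(object, nm)))
             if (1L != length(unique(len)))
                 "slots must all be of same length"
             else TRUE
         })

Buffer <- function(col1, col2, col3, ...)
    new("Buffer", col1=col1, col2=col2, col3=col3, ...)

## made-up data: three "rows" stored as three equal-length columns
b <- Buffer(col1=1:3, col2=c("a", "b", "c"), col3=c(0.1, 0.2, 0.3))
n_rows <- length(b@col1)

## unequal lengths: the validity method is run when slots are supplied
## to new(), so this signals an error; capture its message
msg <- tryCatch(Buffer(col1=1:3, col2="a", col3=c(0.1, 0.2)),
                error=function(e) conditionMessage(e))
```

Here the type of each column is checked by the slot declarations, and the validity method only has to check that the lengths agree.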

Let's see where the inefficiencies are before deciding that this is an S4 issue.
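
One possible way to start looking, sketched here under the assumption that the slow step is a loop that updates the list one element at a time through the slot (the original post does not say this): time that loop with system.time() and compare it with a single vectorized update of a column. The "Buf" class is the one from the example above; n and the values assigned are arbitrary.

```r
library(methods)

n <- 10000L

## row-oriented: one list element per row, stored in an S4 slot
setClass("Buf", representation=representation(lst="list"))
b <- new("Buf", lst=replicate(n, list(10), simplify=FALSE))

## update the first field of every row, one element at a time
t_rows <- system.time(
    for (i in seq_len(n)) b@lst[[i]][[1]] <- 2
)

## column-oriented: 10 integer vectors of length n in a plain list
cols <- replicate(10, integer(n), simplify=FALSE)

## the same kind of update, done on a whole column at once
t_cols <- system.time(
    cols[[1]] <- rep(2L, n)
)

## compare elapsed times; actual numbers depend on machine and R version
rbind(rows=t_rows, cols=t_cols)[, "elapsed"]
```

If the element-wise loop dominates, that points at the access pattern rather than at S4 itself.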

Martin


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793
