I have data for different items (ID) in a database.
For each ID I have to get:

-          the timestamp of the observation (timestamp);

-          a numerical value (val) that will be my response variable in some 
kind of model;

-          a variable number of variables from a known set (if the value for 
a specific variable is not present in the DB, it is 0).

To get the values mentioned above I have to loop over the IDs, do some 
calculations and store the results to build a huge data.frame for subsequent 
estimation. The number of rows for each ID varies (typically 14 to 200).

My current approach is to construct a matrix like this:

vars <- c('A', 'B', 'C', 'D')
out <- matrix(-1, 5000, 3 + length(vars),
              dimnames = list(1:5000, c('ID', 'timestamp', 'val', vars)))

I access the out matrix by numerical index to fill in values 
( out[1:n,1] <- k ).
When the matrix is full I add 5000 more rows and go on.
Afterwards I remove the rows where ID is still -1 and then replace all other 
-1 values with 0.
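
The procedure just described could be sketched roughly as follows. This is 
only an illustration under assumptions: the helper new_block, the counter 
filled, the toy loop over three IDs and the fixed 14 rows per ID are 
hypothetical and stand in for the real database calculations.

```r
vars  <- c('A', 'B', 'C', 'D')
chunk <- 5000

# Hypothetical helper: a block of n rows pre-filled with the -1 sentinel.
new_block <- function(n) {
  matrix(-1, n, 3 + length(vars),
         dimnames = list(NULL, c('ID', 'timestamp', 'val', vars)))
}

out    <- new_block(chunk)
filled <- 0                      # number of rows already used

for (k in 1:3) {                 # toy loop; the real one runs over all IDs
  n <- 14                        # rows for this ID (varies in practice)
  if (filled + n > nrow(out)) {  # matrix full: append another chunk
    out <- rbind(out, new_block(chunk))
  }
  rows <- filled + seq_len(n)
  out[rows, 'ID']        <- k
  out[rows, 'timestamp'] <- seq_len(n)
  out[rows, 'val']       <- rnorm(n)
  out[rows, 'A']         <- 1    # only variable A is present in this toy case
  filled <- filled + n
}

# Clean-up: drop unused rows, then turn the remaining sentinels into 0.
# Note that this would also overwrite any genuine value equal to -1.
out <- out[out[, 'ID'] != -1, , drop = FALSE]
out[out == -1] <- 0
```

Every rbind() that adds a chunk copies the whole matrix, and each indexed 
assignment may copy as well, which is one plausible source of the slowdown.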

In my application an ID typically has between 14 and 200 observations (mean 
around 50), but I have 15000 IDs ...
After profiling I realized that accessing the out matrix this way is too slow.

Do you have any idea how to speed up this kind of process?
I think something could be done by creating a data.frame for each ID and 
binding them at the end. Is that a good idea? How can I implement it? A list 
of data.frames? And then?

Below is some code that may be useful if someone would like to experiment ...

# one data.frame per ID; each holds only the variables present for that ID
alist <- vector('list', 3)
alist[[1]] <- data.frame( ID = 1, timestamp = 1:14, val = rnorm(14),
                          A = 1, B = 2, C = 3 )
alist[[2]] <- data.frame( ID = 2, timestamp = 2:15, val = rnorm(14),
                          B = 2, C = 3, D = 4 )
alist[[3]] <- data.frame( ID = 3, timestamp = 3:30, val = rnorm(28),
                          C = 1, D = 2 )
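
One possible way to implement the list-of-data.frames idea is sketched below 
in base R. It is a minimal sketch, not a benchmarked solution: the names 
all_cols, pad and res are hypothetical, and it assumes the full set of 
variables (A to D) is known in advance. Each data.frame is padded with the 
missing variables set to 0, and everything is bound once at the end.

```r
vars <- c('A', 'B', 'C', 'D')
all_cols <- c('ID', 'timestamp', 'val', vars)

# Same sample data as in the post, one data.frame per ID.
alist <- list(
  data.frame(ID = 1, timestamp = 1:14, val = rnorm(14), A = 1, B = 2, C = 3),
  data.frame(ID = 2, timestamp = 2:15, val = rnorm(14), B = 2, C = 3, D = 4),
  data.frame(ID = 3, timestamp = 3:30, val = rnorm(28), C = 1, D = 2)
)

# Hypothetical helper: add absent variables as 0 and fix the column order.
pad <- function(df) {
  for (v in setdiff(all_cols, names(df))) {
    df[[v]] <- 0
  }
  df[all_cols]
}

# Bind all per-ID data.frames in a single call instead of growing row by row.
res <- do.call(rbind, lapply(alist, pad))
```

With 15000 data.frames, do.call(rbind, ...) may itself become slow. If an 
extra package is acceptable, data.table::rbindlist(alist, fill = TRUE) is an 
alternative worth timing; it fills absent columns with NA, which would then 
have to be replaced by 0.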


Thanks in advance for your valuable help.
Daniele

________________________________
ORS Srl

Via Agostino Morando 1/3 12060 Roddi (Cn) - Italy
Tel. +39 0173 620211
Fax. +39 0173 620299 / +39 0173 433111
Web Site www.ors.it
