Re: [R] for loop; lm() regressions; list of vectors - lapply - accolades and square brackets??

Driss Agramelal Wed, 31 Mar 2010 11:33:41 -0700

Peter, it works!!! Thanks a lot to all of you!

Yes I mistook a data.frame for a matrix....


Now both David's and Dennis' solutions work!
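
To illustrate the difference with made-up numbers, here is a minimal sketch.
The objects a_demo and b_demo are hypothetical stand-ins, not the real data;
the point is only to show why a[i] fails where a[, i] and as.matrix(a) work:

```r
# Hypothetical toy data standing in for the real returns
set.seed(1)
b_demo <- rnorm(20)
a_demo <- data.frame(a1 = 0.8 * b_demo + rnorm(20, sd = 0.1),
                     a2 = 0.9 * b_demo + rnorm(20, sd = 0.1))

is.list(a_demo)      # a data.frame is a list of columns
class(a_demo[1])     # single brackets keep the data.frame (list) structure
class(a_demo[, 1])   # the comma form extracts the column as a numeric vector
class(a_demo[[1]])   # double brackets also extract the column itself

# lm() rejects a list as the response ...
res <- try(lm(a_demo[1] ~ b_demo), silent = TRUE)
inherits(res, "try-error")

# ... but accepts a numeric vector or a numeric matrix
fit_vec <- lm(a_demo[, 1] ~ b_demo)
fit_mat <- lm(as.matrix(a_demo) ~ b_demo)
```

The matrix form fits all columns in one call and should give the same
coefficients for a1 as the single-column fit.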

I am really a beginner with R, and I think it's a wonderful tool, but it is
sometimes a little difficult to use when learning only from the manuals and
help files...

I hope I didn't pollute the help list! I am really sorry I didn't provide
more info from the beginning. So that you can fully understand my question,
please find my code below, with a 'copy-pastable' sample of my data:

What I did:

## Step 1: Import the data (simplified for example)

a <- read.table("clipboard", header = TRUE)
a
         a1          a2          a3          a4          a5
1   0.01181  0.01181  0.01181  0.01181  0.01181
2   0.11266  0.10878  0.11217  0.11255  0.11258
3   0.00655  0.00655  0.00655  0.00655  0.00655
4  -0.02956 -0.03098 -0.03006 -0.03243 -0.03244
5  -0.01788 -0.01788 -0.01788 -0.01788 -0.01788
6  -0.01106  0.00083 -0.01172 -0.00328 -0.00328
7   0.00668  0.00668  0.00668  0.00668  0.00668
8   0.06130  0.05779  0.06310  0.06144  0.06146
9   0.01300  0.01300  0.01300  0.01300  0.01300
10 -0.05749 -0.05764 -0.05771 -0.05970 -0.05972
11 -0.04019 -0.04019 -0.04019 -0.04019 -0.04019
12 -0.01223 -0.01235 -0.01229 -0.00866 -0.00867
13  0.01422  0.01422  0.01422  0.01422  0.01422
14  0.00636  0.00603  0.00633  0.00813  0.00813
15 -0.00715 -0.00715 -0.00715 -0.00715 -0.00715
16 -0.02773 -0.03089 -0.02802 -0.02729 -0.02730
17 -0.02534 -0.02534 -0.02534 -0.02534 -0.02534
18  0.01151  0.00634  0.01120  0.01509  0.01509
19 -0.01103 -0.01314 -0.01112 -0.00894 -0.00895
20 -0.01889 -0.01964 -0.01860 -0.01813 -0.01813
21  0.01498  0.01438  0.01526  0.01548  0.01549
22  0.00731  0.00853  0.00746  0.00699  0.00699
23 -0.02229 -0.02069 -0.02247 -0.02223 -0.02224
24  0.00824  0.00817  0.00786  0.00783  0.00783
25  0.00596  0.00699  0.00613  0.00621  0.00622
26 -0.04201 -0.04241 -0.04212 -0.04271 -0.04272
27  0.03012  0.02972  0.03041  0.03056  0.03056
28 -0.00972 -0.01009 -0.00957 -0.01037 -0.01037
29 -0.02637 -0.02657 -0.02598 -0.02614 -0.02615
30 -0.05003 -0.05134 -0.05025 -0.05054 -0.05055
31 -0.02514 -0.02483 -0.02523 -0.02424 -0.02424
32  0.02797  0.02732  0.02802  0.02663  0.02685
33 -0.00777 -0.00791 -0.00766 -0.00716 -0.00716
34 -0.01836 -0.01830 -0.01855 -0.01820 -0.01842

b <- c(read.table("clipboard", header = TRUE))
b
$b
 [1]  0.01181  0.02764  0.00655  0.02275 -0.01788 -0.03519  0.00668  0.00541  0.01300  0.01528 -0.04019
[12] -0.00838  0.01422 -0.02362 -0.00715 -0.00862 -0.02534 -0.02361 -0.01210  0.00135  0.00101 -0.00017
[23] -0.02670  0.00799  0.01333 -0.03874  0.04817 -0.00102 -0.02142 -0.04965 -0.02773  0.02451 -0.00953
[34] -0.01985

b <- b$b

## Step 2: perform the regressions (David's way)

David <- for(i in seq_along(a)) print(r <- lm(a[, i] ~ b))

Call:
lm(formula = a[, i] ~ b)

Coefficients:
(Intercept)            b
  0.0008459    0.8484054


Call:
lm(formula = a[, i] ~ b)

Coefficients:
(Intercept)            b
  0.0004523    0.8215883


Call:
lm(formula = a[, i] ~ b)

Coefficients:
(Intercept)            b
  0.0008573    0.8513867


Call:
lm(formula = a[, i] ~ b)

Coefficients:
(Intercept)            b
   0.001105     0.818247


Call:
lm(formula = a[, i] ~ b)

Coefficients:
(Intercept)            b
   0.001108     0.818983
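
Since I also asked about lapply(): a possible lapply() version of the same
loop, as a minimal sketch. The data below are simulated (hypothetical), and
the names fits and betas are only for illustration. It keeps the fitted
models in a named list instead of just printing them:

```r
# Hypothetical toy data; replace with the real 'a' and 'b'
set.seed(42)
b <- rnorm(30)
a <- data.frame(a1 = 0.85 * b + rnorm(30, sd = 0.05),
                a2 = 0.82 * b + rnorm(30, sd = 0.05))

# One lm fit per column; lapply() iterates over the columns of a data.frame
fits <- lapply(a, function(y) lm(y ~ b))

# Collect the coefficients into a matrix: one column per response
betas <- sapply(fits, coef)
betas
```

Each element of fits is an ordinary lm object, so fits$a1 can be passed to
summary() or plot() like any single regression.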

## Step 2: Perform the regressions (Dennis' way)

Dennis <- lm(as.matrix(a) ~ b)
Dennis

Call:
lm(formula = as.matrix(a) ~ b)

Coefficients:
                    a1         a2         a3         a4         a5
(Intercept)  0.0008459  0.0004523  0.0008573  0.0011052  0.0011081
b            0.8484054  0.8215883  0.8513867  0.8182470  0.8189826
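
A sketch of how the pieces of such a multi-response fit might be pulled out
afterwards. The data are simulated (hypothetical) and fit is an illustrative
name; the object returned by lm() with a matrix response has class "mlm":

```r
# Hypothetical toy data; replace with the real 'a' and 'b'
set.seed(7)
b <- rnorm(30)
a <- data.frame(a1 = 0.85 * b + rnorm(30, sd = 0.05),
                a2 = 0.82 * b + rnorm(30, sd = 0.05))

fit <- lm(as.matrix(a) ~ b)     # one fit, several responses

coef(fit)["b", ]                # the slopes, one per column of a
dim(residuals(fit))             # one column of residuals per response
summary(fit)[["Response a1"]]   # the usual lm summary for a single response
```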

## Further steps to come... I am still learning to manipulate my new results:
to plot.lm a chosen response, to test for autocorrelation, maybe lag or use
the gls or robust method, etc... I already did most of the analyses in a test
phase with a single regression, but it will be another story on the entire
sample...! Maybe I will post some more questions if appropriate...(?)

For better comprehension, imagine "a1" to "a5" are companies, and b is
a benchmark index, and I calculate CAPM "betas"...
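
As a sanity check on the "beta" interpretation: the slope of a simple
regression of a company on the benchmark equals cov(a, b) / var(b). A
minimal sketch with simulated (hypothetical) returns; the names beta_lm and
beta_cov are only for illustration:

```r
set.seed(3)
b  <- rnorm(34, sd = 0.02)                        # hypothetical benchmark returns
a1 <- 0.001 + 0.85 * b + rnorm(34, sd = 0.005)    # hypothetical company returns

beta_lm  <- unname(coef(lm(a1 ~ b))["b"])   # slope from the regression
beta_cov <- cov(a1, b) / var(b)             # covariance over benchmark variance

all.equal(beta_lm, beta_cov)                # the two agree up to rounding
```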

Many thanks! I have pasted Dennis's answer below so that it is on the forum
as well, since it is a valuable contribution!!

Driss Agramelal
-----------------------
Student writing his master's thesis
University of St-Gallen






Hi:

I don't know if this will take care of your problem, but here are several
approaches. I'm using the melt() function in package reshape2 to vectorize
the response matrix and create a factor variable from the variable names.

Approach 1: lmList() in nlme.

library(reshape2)   # provides melt(); the older 'reshape' package also works
library(nlme)
df <- data.frame(x = 1:10, y1 = rnorm(10), y2 = rnorm(10), y3 = rnorm(10))
# melt() creates a grouping 'variable' to distinguish the responses; the
# responses themselves are put into a variable 'value'
df1 <- melt(df, id.vars = 'x')
# Run lmList using variable as the grouping factor...
mods <- lmList(value ~ x | variable, data = df1)
# for example...
> coef(mods)
   (Intercept)           x
y1  0.05917904 -0.02963703
y2 -1.37155019  0.14582574
y3 -0.52451519  0.09575826

It appears that the data are pooled when this function is run, so it's as
though one is fitting something analogous to an ANCOVA with separate fits
for each response but a pooled error variance estimate. This is probably not
what you want.

Approach 2: Split the data by the grouping factor and run separate
regressions on each. I'm going to be inelegant and use a loop (until I
remember how to do the non-loop version or somebody else shows us first).

l <- split(df1, df1$variable)     # split data by the y-variable indicator
m <- vector('list', 3)            # create a list of three components, initially empty
for(i in seq_along(m)) m[[i]] <- lm(value ~ x, data = l[[i]])
str(m)         # tells you which components of the model are saved

# For example, to extract the coefficients, use
do.call(rbind, lapply(m, '[[', 1))
     (Intercept)           x
[1,]  0.05917904 -0.02963703
[2,] -1.37155019  0.14582574
[3,] -0.52451519  0.09575826

# For the residuals (component 2),
do.call(rbind, lapply(m, '[[', 2))

The lapply function takes the list of model objects in m and applies '[['
to each one; the final argument (1 or 2 above) indicates which component to
extract.
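
One possible non-loop version of Approach 2, as a minimal sketch. To keep it
self-contained it uses base R's stack() as a stand-in for melt() (an
assumption made for this example only) and the accessor coef() instead of
indexing components by position:

```r
set.seed(10)
df <- data.frame(x = 1:10, y1 = rnorm(10), y2 = rnorm(10), y3 = rnorm(10))

# Base-R stand-in for melt(): stack the responses into long format
df1 <- data.frame(x = rep(df$x, 3), stack(df[c("y1", "y2", "y3")]))
names(df1) <- c("x", "value", "variable")

l <- split(df1, df1$variable)                         # one data frame per response
m <- lapply(l, function(d) lm(value ~ x, data = d))   # named list: y1, y2, y3

t(sapply(m, coef))   # one row of coefficients per response
```

Because split() names the pieces after the factor levels, the resulting list
of models carries the response names along, which the loop with
vector('list', 3) does not.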

A third approach would be to assign each model fit to a separate object.
There are some advantages to doing this, but with a large number of
responses this might become a little unwieldy unless you like to iterate.
But here goes...

Starting from the list l,
for(i in seq_along(l)) assign(paste('m', i, sep = ""), lm(value ~ x, data = l[[i]]))

In my example, this should create three new model objects: m1, m2, m3, one
for each response.

There are other ways to go as well, I'm sure, but you haven't really
specified what it is that you want.

HTH,
Dennis












2010/3/31 Peter Ehlers <ehl...@ucalgary.ca>

> Driss,
>
> David is right - you should include data/code.
>
> Nevertheless, the error you get suggests that what you
> call a "matrix" is in fact a data.frame; that's usually
> a crucial distinction. There are two things you might try:
>
> Dennis' way:
>
>  lm(as.matrix(a) ~ b)
>
> David's way:
>
>
>  for(i in seq_along(a)) print(r<- lm(a[ ,i] ~ b) )
>
> #Note the comma!
>
>  -Peter Ehlers
>
>
> On 2010-03-31 7:42, David Winsemius wrote:
>
>>
>> On Mar 31, 2010, at 9:13 AM, Driss Agramelal wrote:
>>
>>
>>> Hello and thank you both for your answers!
>>>
>>> Dennis, I tried to simply run
>>>
>>> lm(a ~ b)
>>>
>>> after re-importing "a" as a matrix, but I get the following error
>>> message:
>>>
>>> Error in model.frame.default(formula = a ~ b, drop.unused.levels =
>>> TRUE) :
>>>   invalid type (list) for variable 'a'
>>>
>>> so maybe I have to specify something in the arguments? What do you
>>> think?
>>>
>>> David,
>>>
>>> I tried your syntax as well, and received almost the same error
>>> statement:
>>>
>>>  for(i in seq_along(a)) print(r<- lm(a[i] ~ b) )
>>>>
>>> Error in model.frame.default(formula = a[i] ~ b, drop.unused.levels
>>> = TRUE) :
>>>   invalid type (list) for variable 'a[i]'
>>>
>>> I am not too familiar with the use of accolades, square brackets and
>>> parentheses, the order in which they have to come
>>> in the function and the role they play, but I think they might be
>>> important...
>>>
>>> I also tried to use "lapply"; it works wonderfully for a basic
>>> function like:
>>>
>>> lapply(a, mean)
>>>
>>> I get a list of results with names and values..perfect! But with the
>>> lm() function... I just
>>> don't know how to write the arguments... tried several options
>>> without success...
>>>
>>> Any idea that could help me solve this only seemingly easy task
>>> would be most welcome!!
>>>
>>
>> Do you think we can figure this out when you have provided no sample
>> data and have not provided even the results of str on the data object
>> you are working with? Generally one gains insight by paring the
>> problem down to smaller test cases and working with them. Perhaps the
>> first 5 elements of "a" rather than all 150?
>>
>>
> --
> Peter Ehlers
> University of Calgary
>


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
