Re: [R] for loop

Matevž Pavlič Sun, 31 Oct 2010 02:00:05 -0700

Hi Dennis,

Thank you for your extensive explanations. Yes, I guess I did not explain what 
I would like to do.

Basically I would like to conduct a linear regression for each of the 15 
classes. Your answers gave me a new perspective on how R works. 

Thanks again for the help, 

m

From: Dennis Murphy [mailto:djmu...@gmail.com] 
Sent: Sunday, October 31, 2010 4:40 AM
To: Matevž Pavlič
Cc: r-help@r-project.org
Subject: Re: [R] for loop

Hi:

If your objective is to make 15 plots, one for each level of razred, then you 
don't need to make 15 individual data frames first. The lattice and ggplot2 
packages allow conditioning plots. You haven't mentioned what types of plots 
you're interested in getting, but if it's something simple like a scatterplot 
of y vs. x for each level of razred, it's not that hard to do. Let's fake some 
data:

# factor() is explicit because R >= 4.0.0 no longer converts strings to factors
d <- data.frame(razred = factor(rep(LETTERS[1:15], each = 10)),
                   x = rep(1:10, 15),
                   y = rep(2 + 0.5 * 1:10, 15) + rnorm(150))

d has 15 levels of razred with 10 observations at each level. razred is a 
factor, the other variables are either integer or numeric.

Produce scatterplots of y vs. x for each level of razred, using both the 
lattice and ggplot2 packages:

library(lattice)
# each plot adds a new feature - run one plot at a time.
xyplot(y ~ x | razred, data = d, type = c('p', 'r'))
xyplot(y ~ x | razred, data = d, type = c('p', 'r'), layout = c(3, 5))
xyplot(y ~ x | razred, data = d, type = c('p', 'r'), layout = c(3, 5),
       as.table = TRUE)

library(ggplot2)
ggplot(d, aes(x, y)) + geom_point() + geom_smooth(method = 'lm') +
    facet_wrap( ~ razred, ncol = 3)
ggplot(d, aes(x, y)) + geom_point() + geom_smooth(method = 'lm', se = FALSE) +
   facet_wrap( ~ razred, ncol = 3)

If instead you want something like a scatterplot matrix for each data subset 
defined by level of razred, then maybe something like this (?):

# add a new variable to the data frame
# splom is the scatterplot matrix function in lattice
d$z1 <- rnorm(150)
splom(~ d[, -1] | razred, data = d, layout = c(2, 2, 4))

Just guessing here since you didn't make your objective explicit. 

It's entirely possible that you can conduct a significant part of your data 
analysis without having to split the data into subsets. Several functions, for 
example, can compute summary statistics by group with a one-line call. Here 
are a couple of examples, one using aggregate() from the base distribution and 
another using ddply() from the plyr package:

aggregate(y ~ razred, data = d, FUN = mean)
   razred        y
1       A 4.816841
2       B 4.520804
3       C 5.196329
4       D 4.615575
5       E 3.982240
6       F 4.466559
7       G 4.938669
8       H 4.539541
9       I 4.354991
10      J 4.573654
11      K 4.450624
12      L 5.138087
13      M 4.931111
14      N 4.879493
15      O 5.087452

library(plyr)
ddply(d, 'razred', summarise, mx = mean(x), my = mean(y), mz1 = mean(z1))
   razred  mx       my         mz1
1       A 5.5 4.816841 -0.01745305
2       B 5.5 4.520804  0.24724069
3       C 5.5 5.196329  0.18717750
4       D 5.5 4.615575  0.18885590
5       E 5.5 3.982240 -0.91284339
6       F 5.5 4.466559  0.36479266
7       G 5.5 4.938669 -0.36359562
8       H 5.5 4.539541  0.06061162
9       I 5.5 4.354991  0.05138409
10      J 5.5 4.573654  0.31160018
11      K 5.5 4.450624  0.17458712
12      L 5.5 5.138087 -0.26482357
13      M 5.5 4.931111 -0.39194953
14      N 5.5 4.879493  0.33154075
15      O 5.5 5.087452  0.32816931

There are a number of functions and packages that will do this sort of thing 
quite well - I'll mention doBy, data.table, Hmisc and sqldf as excellent 
options, noting that there are other packages and functions in the apply family 
that can perform groupwise processing seamlessly. The point of mentioning this 
is so that you don't automatically think you have to split the data in myriad 
ways before you can apply a function to it. The good folks that designed this 
language, and the many people who have contributed code to the R project, are 
pretty smart, and have devised fairly simple ways to process data, even if it's 
large. 
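
As one illustration of the apply-family route mentioned above, here is a minimal sketch in base R that computes group means with tapply() and fits a separate regression of y on x within each level of razred using by(), without creating 15 data frames by hand. It rebuilds the simulated d from earlier in this message; set.seed() and the names group_means, fits and coefs are only illustrative assumptions.

```r
# Simulated data as above; set.seed() is added only for reproducibility
set.seed(123)
d <- data.frame(razred = factor(rep(LETTERS[1:15], each = 10)),
                x = rep(1:10, 15),
                y = rep(2 + 0.5 * 1:10, 15) + rnorm(150))

# Mean of y within each level of razred (one value per level)
group_means <- tapply(d$y, d$razred, mean)

# by() hands each subset of rows to the function in turn;
# here the function returns the intercept and slope of a per-group regression
fits <- by(d, d$razred, function(g) coef(lm(y ~ x, data = g)))

# Stack the 15 coefficient vectors into a 15 x 2 matrix
coefs <- do.call(rbind, fits)
head(coefs)
```

The same pattern should carry over to real data by swapping in your own data frame, grouping column and model formula.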

Of course, it's always possible that splitting is necessary; if you're willing 
to be a little more forthcoming about your analysis goals, you might get a 
better-targeted response.
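
If splitting does turn out to be necessary, the pieces are easy to reach afterwards. The following is a rough sketch, again on simulated data like d above, of how the elements of a split() result can be picked out by name or position and how one plot per element might be drawn in a loop. The object name pieces, the output file name and the choice of pdf() as the graphics device are assumptions made for the illustration, not something taken from your data.

```r
# Simulated data as above; set.seed() is added only for reproducibility
set.seed(123)
d <- data.frame(razred = factor(rep(LETTERS[1:15], each = 10)),
                x = rep(1:10, 15),
                y = rep(2 + 0.5 * 1:10, 15) + rnorm(150))

# One data frame per level of razred, collected in a named list
pieces <- split(d, d$razred)
names(pieces)            # the levels of razred

# Three equivalent ways to pull out the first piece
a1 <- pieces[["A"]]
a2 <- pieces$A
a3 <- pieces[[1]]

# One scatterplot per piece, each on its own page of a PDF file
out_file <- file.path(tempdir(), "razred_plots.pdf")
pdf(out_file)
for (nm in names(pieces)) {
  plot(y ~ x, data = pieces[[nm]], main = paste("razred", nm))
  abline(lm(y ~ x, data = pieces[[nm]]))   # least-squares line for this piece
}
dev.off()
```

Looping over names(pieces) rather than over positions keeps the level label available for the plot title.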

HTH,
Dennis

On Sat, Oct 30, 2010 at 12:00 PM, Matevž Pavlič <matevz.pav...@gi-zrmk.si> 
wrote:

Just one more thing...
I get a list with 15 data.frames :

List of 15
 $ 1:'data.frame':      7 obs. of  9 variables:
 ..$ vrtina         : Factor w/ 6 levels "T1A-1","T1A-2",..: 1 1 2 2 5 5 5
 ..$ globina.meritve: num [1:7] 7.6 8.5 10.4 17.4 12.5 15.5 16.5
 ..$ E0             : num [1:7] 4109 2533 491 810 2374 ...
 ..$ Eur1           : num [1:7] 6194 4713 605 1473 NA ...
 ..$ Eur2           : num [1:7] 3665 7216 266 4794 7387 ...
 ..$ Eur3           : num [1:7] 3221 3545 920 3347 6768 ...
 ..$ H              : num [1:7] 8 5.9 5.9 6.9 9.3 10.9 10
 ..$ Mpl            : num [1:7] 61.9 136.7 19.9 96.4 178.5 ...
 ..$ class          : int [1:7] 1 1 1 1 1 1 1
.
.
.

But how would I access them (i.e. to draw a plot for each data.frame in the 
list)?

Thanks,m

-----Original Message-----
From: David Winsemius [mailto:dwinsem...@comcast.net]
Sent: Saturday, October 30, 2010 8:24 PM
To: Matevž Pavlič
Cc: r-help@r-project.org
Subject: Re: [R] for loop

On Oct 30, 2010, at 2:07 PM, Matevž Pavlič wrote:

> Hi,
>
> I know this is probably a very trivial thing to do for most of the R
> users, but since I just started using it I have some problems....
>
> I have a data.frame with a field called "razred". This field has
> values from 1 up to 15.
>
> Is it possible to create a for loop that would create a new data frame
> for each of the "razred" values?

The R way would be to use the split function and leave the result in a list, 
to which the same operation can then be applied repeatedly using lapply.

?split

And take a look at the fourth example applying split to the built-in 
airquality data frame.

The plyr package also provides functions that operate on data frames.
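
A minimal sketch of the split/lapply idiom described above, using the built-in airquality data frame and splitting on Month. The regression of Ozone on Temp is only a stand-in chosen for illustration, and the names by_month, fits and slopes are hypothetical; substitute your own data frame, your "razred" column and your own model formula.

```r
# Split the data frame into a list with one data frame per month
by_month <- split(airquality, airquality$Month)
sapply(by_month, nrow)      # number of rows in each piece

# Apply the same operation to every piece; lm() drops rows with missing
# Ozone values under R's default na.action setting
fits <- lapply(by_month, function(df) lm(Ozone ~ Temp, data = df))

# Extract one number per fitted model: the slope on Temp
slopes <- sapply(fits, function(fit) coef(fit)[["Temp"]])
slopes
```

Because the result of split stays in a list, any further step (summaries, plots, predictions) can be applied with another lapply or sapply call over the same list.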

--

David Winsemius, MD
West Hartford, CT

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
