Hi: It's also simpler to use transform() or within(), especially if you want to create and/or modify multiple variables in a data frame. For example,
df <- data.frame(weight=round(runif(10, 10, 100)), sex=round(runif(100, 0, 1)))
df <- transform(df, pct = 100 * weight/ave(weight, sex, FUN = sum))
> head(df, 3)
  weight sex      pct
1     87   0 2.425425
2     31   1 1.025471
3     71   0 1.979370

HTH,
Dennis

On Sat, Jun 18, 2011 at 2:44 AM, Albert-Jan Roskam <fo...@yahoo.com> wrote:
> Thanks a lot to all who responded. This is a little less confusing now,
> although it's hard for me to fathom the (practical) use of a dataframe
> within a dataframe. If one mixes different notations, or, put in a
> different way, different underlying classes (data.frame vs. numeric),
> these rather unintuitive results appear.
>
> So I'll use any of these:
> df$pct <- df$weight / ave(df$weight, df$sex, FUN=sum)*100
> df["pct"] <- df["weight"] / ave(df["weight"], df["sex"], FUN=sum)*100
>
> using str() is very insightful, as is using class()
>
> I'd prefer it if R simply generated an error when one attempts to nest a
> data.frame within a data.frame.
>
> Thanks again!
>
> Cheers!!
> Albert-Jan
>
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> All right, but apart from the sanitation, the medicine, education, wine,
> public order, irrigation, roads, a fresh water system, and public health,
> what have the Romans ever done for us?
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> ________________________________
> From: Brian Diggs <dig...@ohsu.edu>
> To: R-help@r-project.org
> Sent: Fri, June 17, 2011 11:58:44 PM
> Subject: Re: [R] is this a bug?
>
> On 6/17/2011 2:24 PM, (Ted Harding) wrote:
>> And the extra twist in the tale is exemplified by this
>> mini-version of Albert-Jan's first example:
>>
>>   DF<- data.frame(A=c(1,2,3))
>>   DF$B<- c(4,5,6)
>>   DF$C<- c(7,8,9)
>>   DF
>>   #   A B C
>>   # 1 1 4 7
>>   # 2 2 5 8
>>   # 3 3 6 9
>>
>>   DF$D<- DF["A"]/DF["B"]
>>   DF
>>   #   A B C    A
>>   # 1 1 4 7 0.25
>>   # 2 2 5 8 0.40
>>   # 3 3 6 9 0.50
>>
>> ##And why:
>>
>>   DF["A"]/DF["B"]
>>   #      A
>>   # 1 0.25
>>   # 2 0.40
>>   # 3 0.50
>>
>> ##So the ratio DF["A"]/DF["B"] comes out with the name of
>> ##the numerator, "A". This is then the name given to DF$D
>
> It's even slightly weirder than that:
>
> str(DF)
> #'data.frame': 3 obs. of 4 variables:
> # $ A: num 1 2 3
> # $ B: num 4 5 6
> # $ C: num 7 8 9
> # $ D:'data.frame': 3 obs. of 1 variable:
> #  ..$ A: num 0.25 0.4 0.5
>
> There is a column D in DF which is itself a data frame with a single
> column whose name is A (because of what Ted said). When formatted for
> printing out, the column name of the inner data frame is used (as a
> result of how data.frame() itself handles named arguments when the
> argument is itself a data.frame: "If a list or data frame or matrix is
> passed to data.frame it is as if each component or column had been
> passed as a separate argument...").
>
> So not a bug, but a convoluted set of circumstances that can happen when
> non-atomic vectors are assigned to columns of a data.frame. That's one
> of those /you shouldn't do that even though it is technically legal or
> at least you shouldn't be surprised when things don't work the way you
> thought they would/ things.
>
>> Thus Albert-Jan's
>>   df["weight"] / ave(df["weight"], df["sex"], FUN=sum)*100
>> comes through with name "weight".
>>
>> Ted.
>>
>>
>> On 17-Jun-11 21:06:42, William Dunlap wrote:
>>> df$varname is a column of df.
>>>
>>> df["varname"] is a one-column df containing that column.
>>>
>>> df[["varname"]] is a column of df (same as df$varname).
>>>
>>> df[,"varname"] is a column of df (same as df$varname).
>>>
>>> df[,"varname",drop=FALSE] is a one-column df (same as df["varname"]).
>>>
>>> df$newVarname<- df["varname"] inserts a new component
>>> into df, the component being a one-column data.frame,
>>> not the column in that data.frame.
>>>
>>> Bill Dunlap
>>> Spotfire, TIBCO Software
>>> wdunlap tibco.com
>>>
>>>> -----Original Message-----
>>>> From: r-help-boun...@r-project.org
>>>> [mailto:r-help-boun...@r-project.org] On Behalf Of Albert-Jan Roskam
>>>> Sent: Friday, June 17, 2011 1:49 PM
>>>> To: R Mailing List
>>>> Subject: [R] is this a bug?
>>>>
>>>> Hello,
>>>>
>>>> Is the following a bug? I always thought that df$varname<-
>>>> does the same as df["varname"]<-
>>>>
>>>> > df<- data.frame(weight=round(runif(10, 10, 100)), sex=round(runif(100, 0, 1)))
>>>> > df$pct<- df["weight"] / ave(df["weight"], df["sex"], FUN=sum)*100
>>>> > names(df)
>>>> [1] "weight" "sex"    "pct"    ### ----------> ok
>>>> > head(df)
> [[elided Yahoo spam]]
>>>> 1     86   0 2.4002233
>>>> 2     19   1 0.5643006
>>>> 3     32   0 0.8931063
>>>> 4     87   0 2.4281328
>>>> 5     45   0 1.2559308
>>>> 6     95   0 2.6514094
>>>> > rm(df)
>>>> > df<- data.frame(weight=round(runif(10, 10, 100)), sex=round(runif(100, 0, 1)))
>>>> > df["pct"]<- df["weight"] / ave(df["weight"], df["sex"], FUN=sum)*100  ### -----> this does work
>>>> > names(df)
>>>> [1] "weight" "sex"    "pct"
>>>> > head(df)
>>>>   weight sex       pct
>>>> 1     15   0 0.5246590
>>>> 2     43   0 1.5040224
>>>> 3     17   1 0.9284544
>>>> 4     44   1 2.4030584
>>>> 5     76   1 4.1507373
>>>> 6     59   0 2.0636586
>>>> > do.call(c, R.Version())
>>>>                        platform                            arch
>>>>             "i686-pc-linux-gnu"                          "i686"
>>>>                              os                          system
>>>>                     "linux-gnu"               "i686, linux-gnu"
>>>>                          status                           major
>>>>                              ""                             "2"
>>>>                           minor                            year
>>>>                          "11.1"                          "2010"
>>>>                           month                             day
>>>>                            "05"                            "31"
>>>>                         svn rev                        language
>>>>                         "52157"                             "R"
>>>>                  version.string
>>>> "R version 2.11.1 (2010-05-31)"
>>>> > # Thanks!
>>>>
>>>> Cheers!!
>>>> Albert-Jan
>>>>
>>>>
>>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>> All right, but apart from the sanitation, the medicine,
>>>> education, wine, public order, irrigation, roads, a fresh water
>>>> system, and public health, what have the Romans ever done for us?
>>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>>
>>>> [[alternative HTML version deleted]]
>>>>
>>>> ______________________________________________
>>>> R-help@r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>
>> --------------------------------------------------------------------
>> E-Mail: (Ted Harding)<ted.hard...@wlandres.net>
>> Fax-to-email: +44 (0)870 094 0861
>> Date: 17-Jun-11 Time: 22:24:41
>> ------------------------------ XFMail ------------------------------
>>
>
> --
> Brian S. Diggs, PhD
> Senior Research Associate, Department of Surgery
> Oregon Health & Science University
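For anyone reading this thread later, here is a minimal sketch of the distinction discussed above. The four-row data frame and the object names (nested, fixed, via_within) are made up for illustration and are not from the original posts; the behaviour shown is that of base R's `$<-` on data frames as described by Bill Dunlap and Brian Diggs.

```r
# Hypothetical toy data (fixed values so the result is reproducible)
df <- data.frame(weight = c(10, 20, 30, 40), sex = c(0, 1, 0, 1))

# These forms return the column itself (a numeric vector) ...
class(df$weight)
class(df[["weight"]])
class(df[, "weight"])

# ... while these return a one-column data frame
class(df["weight"])
class(df[, "weight", drop = FALSE])

# Assigning a one-column data frame via $<- nests it inside the parent
nested <- df
nested$pct <- nested["weight"] / ave(nested$weight, nested$sex, FUN = sum) * 100
str(nested)                 # pct is itself a data.frame with a column "weight"

# Working with plain vectors avoids the nesting
fixed <- df
fixed$pct <- 100 * fixed$weight / ave(fixed$weight, fixed$sex, FUN = sum)

# The same computation with within(), which evaluates inside the data frame
via_within <- within(df, pct <- 100 * weight / ave(weight, sex, FUN = sum))

str(fixed)                  # pct is an ordinary numeric column
```

Checking str() after adding a column, as Albert-Jan suggests, is a quick way to catch an accidentally nested data frame.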