Dear Bert and Sarah,

Thank you very much for your clarifications on this matter. I will have to study more closely how subsets of data structures are extracted, and I will change my programming habits accordingly.
Best regards,

Paulo Barata

---------------------------------------------------------------------

---------- Original Message -----------
From: Bert Gunter <gunter.ber...@gene.com>
To: Paulo Barata <paulo.bar...@ensp.fiocruz.br>
Cc: Frans Marcelissen <frans.marcelis...@digipsy.nl>, r-help@r-project.org, ehl...@ucalgary.ca
Sent: Tue, 17 Jul 2012 08:06:57 -0700
Subject: Re: [R] variable (column) in a data frame

> Inline below.
>
> -- Bert
>
> On Tue, Jul 17, 2012 at 7:40 AM, Paulo Barata
> <paulo.bar...@ensp.fiocruz.br> wrote:
>
> >
> > Dear Frans and Peter,
> >
> > Yes, the notation df[,'var'] is able to catch a non-existent
> > variable var inside a data frame df. But the notation df$var
> > isn't.
> >
> > So we have this situation, where two different notations, which
> > (as far as I understand) perform the same action, have different
> > kinds of response.
> >
>
> You don't understand far enough. Your assumption is simply not true. For
> example, from ?"[" :
>
> "The most important distinction between [, [[ and $ is that the [ can
> select more than one element whereas the other two select a single element.
>
> The default methods work somewhat differently for atomic vectors,
> matrices/arrays and for recursive (list-like, see is.recursive)
> objects. $ is only valid for recursive objects, and is only
> discussed in the section below on recursive objects."
>
> So the Help page already notes that there are differences among them.
>
> Nevertheless, your discomfort is, imo, understandable.
> Extraction/replacement for data structures is a complex business,
> and R's approach to the issues has "evolved" over time, with
> "inconsistencies," especially for edge cases, baked in.
> Because these issues are at the very core of R's behavior, I think it likely
> that except for egregious inconsistencies and outright bugs -- which
> at this point are most unlikely to exist -- it is well nigh
> impossible to change them. I see no recourse but to always check
> such edge cases carefully and to be as consistent as possible in
> your own programming usage (e.g. always using [,".."] for extracting
> columns). As Peter has pointed out several times, the $ extractor is
> convenient syntactic sugar that can get one into a lot of trouble,
> and is probably best avoided.
>
> Cheers,
>
> Bert
>
> > Couldn't this situation be fixed? Isn't it possible to make the
> > df$var notation issue an error when referring to a non-existent
> > variable inside the data frame?
> >
> > Thank you very much.
> >
> > Paulo Barata
> >
> > ---------------------------------------------------------------------
> >
> > ---------- Original Message -----------
> > From: "Frans Marcelissen" <frans.marcelis...@digipsy.nl>
> > To: "'Paulo Barata'" <paulo.bar...@ensp.fiocruz.br>, <r-help@r-project.org>
> > Sent: Mon, 16 Jul 2012 14:25:21 +0200
> > Subject: RE: [R] variable (column) in a data frame
> >
> > > Hi Paulo,
> > > There is a difference between two ways of accessing columns in a data frame:
> > > > df$aaa
> > > NULL
> > > > df["AAA"]
> > > Error in `[.data.frame`(df, "AAA") : undefined columns selected
> > > So df["AAA"] or df[,"AAA"] gives the error message you expect.
> > > -------------------
> > > Frans
> > >
> > > -----Original Message-----
> > > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
> > > On Behalf Of Paulo Barata
> > > Sent: Sunday, 15 July 2012 16:31
> > > To: r-help@r-project.org
> > > Subject: [R] variable (column) in a data frame
> > >
> > > To the R help list,
> > >
> > > When using a data frame, there is no warning or error message when I
> > > refer to a non-existent variable inside the data frame.
> > >
> > > Example:
> > >
> > > ##----------------------------------------------
> > >
> > > a <- c(1,2,3)
> > > b <- c(11,22,33)
> > > df <- data.frame(a,b)
> > > df
> > >
> > > ## correct: there is a column in df named 'a'
> > > ## the sum is correctly performed
> > > sum(df$a==2)
> > >
> > > ## incorrect: there is no column in df named 'aaa',
> > > ## but the sum is performed anyway without either warning or error
> > > sum(df$aaa==2)
> > >
> > > ##----------------------------------------------
> > >
> > > Is there some way to make R issue either a warning or an error
> > > message in such a situation?
> > >
> > > I am using R version 2.15.1 64-bit on Windows 7 Professional.
> > >
> > > Thank you very much.
> > >
> > > Paulo Barata
> > >
> > > ---------------------------------------------------------------------
> > > Paulo Barata
> > >
> > > ENSP - Fundação Oswaldo Cruz
> > > Rua Leopoldo Bulhões 1480 - 8A
> > > 21041-210 Rio de Janeiro - RJ
> > > Brazil
> > > E-mail: paulo.bar...@ensp.fiocruz.br
> > >
> > > ______________________________________________
> > > R-help@r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> > >
> > > --
> > > This message has been scanned for viruses and
> > > dangerous content by MailScanner, and is
> > > believed to be clean.
> > ------- End of Original Message -------
>
> --
>
> Bert Gunter
> Genentech Nonclinical Biostatistics
>
> Internal Contact Info:
> Phone: 467-7374
> Website:
> http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
------- End of Original Message -------
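
For readers following the thread, here is a minimal sketch in R of the behaviour discussed above, using the data frame from the original post. It compares `$`, `[[` and `[` on a non-existent column, and then shows one possible way to get the error Paulo asked for. The helper name `strict_col` is hypothetical and is not part of base R; it is only one of several ways such a check could be written.

```r
# Data frame from the original post.
df <- data.frame(a = c(1, 2, 3), b = c(11, 22, 33))

# `$` returns NULL for a missing column, so the comparison yields
# logical(0) and sum() quietly returns 0.
missing_dollar <- df$aaa
silent_sum <- sum(df$aaa == 2)

# `[[` also returns NULL for a missing column name.
missing_double <- df[["aaa"]]

# `[` with a column name raises an error; capture its message here.
bracket_result <- tryCatch(df[, "aaa"],
                           error = function(e) conditionMessage(e))

# Hypothetical helper (an assumption, not a base R function):
# stop with an explicit message when the column does not exist.
strict_col <- function(data, name) {
  if (!name %in% names(data)) {
    stop("no column named '", name, "' in data frame", call. = FALSE)
  }
  data[[name]]
}

# Existing column: works as usual.
ok_sum <- sum(strict_col(df, "a") == 2)

# Missing column: the helper fails loudly instead of returning NULL.
helper_result <- tryCatch(strict_col(df, "aaa"),
                          error = function(e) conditionMessage(e))
```

Whether to wrap extraction in a helper like this or simply use df[, "name"] consistently, as Bert suggests, is a matter of taste; the helper only makes the check explicit.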