This should be a version that does what you want. Because you named the variable lvarname, I assumed you were already passing "lvar" instead of trying to pass lvar (without the quotes), which is in no way a 'name'.
myfunct.better <- function(subgroup, lvarname, xvarname, yvarname, dataframe) {
  #enter the subgroup, the variable names to be used and the dataframe
  #in which they are found
  #note: index the dataframe that was passed in, not the global Fulldf
  Data.tmp <- dataframe[dataframe[,lvarname]==subgroup, c(xvarname,yvarname)]
  Data.tmp <- na.omit(Data.tmp)
  # create the contingency table on the basis of the entered variables
  indextable <- table(Data.tmp[,xvarname], Data.tmp[,yvarname])
  #actually, if I remember well, you could simply use indextable<-table(Data.tmp) here
  #that would allow for some more simplifications (replace xvarname and yvarname by
  #columnsOfInterest or similar, and pass that instead of c(xvarname, yvarname) )
  return(indextable)
}

myfunct.better("yes", lvarname="lvar", xvarname="xvar", yvarname="yvar", dataframe=Fulldf)

HTH,

Nick Sabbe
--
ping: nick.sa...@ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36

-- Do Not Disapprove

-----Original Message-----
From: irene.p...@googlemail.com [mailto:irene.p...@googlemail.com] On Behalf Of E Hofstadler
Sent: Friday 1 April 2011 14:28
To: Nick Sabbe
Cc: r-help@r-project.org
Subject: Re: [R] programming: telling a function where to look for the entered variables

Thanks Nick and Juan for your replies.

Nick, thanks for pointing out the warning in subset(). I'm not sure though I understand the example you provided -- because despite using subset() rather than bracket notation, the original function (myfunct) does what is expected of it.

The problem I have is with the second function (myfunct.better), where variable names + dataframe are not fixed within the function but passed to the function when calling it -- and even with bracket notation I don't quite manage to tell R where to look for the columns that relate to the entered column names.
(but then perhaps I misunderstood you)

This is what I tried (using bracket notation):

myfunct.better(dataframe, subgroup, lvarname,yvarname){
  Data.tmp <- dataframe[dataframe[,deparse(substitute(lvarname))]==subgroup, c("xvar",deparse(substitute(yvarname)))]
}

but this creates an empty contingency table only -- perhaps because my use of deparse() is flawed (I think what is converted into a string is "lvarname" and "yvarname", rather than the column names that these two function-variables represent in the dataframe)?

2011/4/1 Nick Sabbe <nick.sa...@ugent.be>:
> See the warning in ?subset.
> Passing the column name of lvar is not the same as passing the 'contextual
> column' (as I coin it in these circumstances).
> You can solve it by indeed using [] instead.
>
> For my own comfort, here is the relevant line from your original function:
> Data.tmp <- subset(Fulldf, lvar==subgroup, select=c("xvar","yvar"))
> Which should become something like (untested but should be close):
> Data.tmp <- Fulldf[Fulldf[,"lvar"]==subgroup, c("xvar","yvar")]
>
> This should be a lot easier to translate based on column names, as the
> column names are now used as such.
>
> HTH,
>
> Nick Sabbe
> --
> ping: nick.sa...@ugent.be
> link: http://biomath.ugent.be
> wink: A1.056, Coupure Links 653, 9000 Gent
> ring: 09/264.59.36
>
> -- Do Not Disapprove
>
> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
> Behalf Of E Hofstadler
> Sent: Friday 1 April 2011 13:09
> To: r-help@r-project.org
> Subject: [R] programming: telling a function where to look for the entered
> variables
>
> Hi there,
>
> Could someone help me with the following programming problem..?
>
> I have written a function that works for my intended purpose, but it
> is quite closely tied to a particular dataframe and the names of the
> variables in this dataframe. However, I'd like to use the same
> function for different dataframes and variables.
> My problem is that
> I'm not quite sure how to tell my function in which dataframe the
> entered variables are located.
>
> Here's some reproducible data and the function:
>
> # create reproducible data
> set.seed(124)
> xvar <- sample(0:3, 1000, replace = T)
> yvar <- sample(0:1, 1000, replace=T)
> zvar <- rnorm(100)
> lvar <- sample(0:1, 1000, replace=T)
> Fulldf <- as.data.frame(cbind(xvar,yvar,zvar,lvar))
> Fulldf$xvar <- factor(xvar, labels=c("blue","green","red","yellow"))
> Fulldf$yvar <- factor(yvar, labels=c("area1","area2"))
> Fulldf$lvar <- factor(lvar, labels=c("yes","no"))
>
> and here's the function in the form that it currently works: from a
> subset of the dataframe Fulldf, a contingency table is created (in my
> actual data, several other operations are then performed on that
> contingency table, but these are not relevant for the problem in
> question, therefore I've deleted it).
>
> # function as it currently works: tailored to a particular dataframe
> # (Fulldf)
>
> myfunct <- function(subgroup){
>   # enter a particular subgroup for which the contingency table should
>   # be calculated (i.e. a particular value of the factor lvar)
>   # restrict dataframe to given subgroup and two columns of the original
>   # dataframe
>   Data.tmp <- subset(Fulldf, lvar==subgroup, select=c("xvar","yvar"))
>   Data.tmp <- na.omit(Data.tmp) # exclude missing values
>   indextable <- table(Data.tmp$xvar, Data.tmp$yvar) # make contingency table
>   return(indextable)
> }
>
> # Since I need to use the function with different dataframes and
> # variable names, I'd like to be able to tell my function the name of
> # the dataframe and variables it should use for calculating the index.
> # This is how I tried to modify the first part of the function, but it
> # didn't work:
>
> # function as I would like it to work: independent of any particular
> # dataframe or variable names (doesn't work)
>
> myfunct.better <- function(subgroup, lvarname, yvarname, dataframe){
>   # enter the subgroup, the variable names to be used and the dataframe
>   # in which they are found
>   # trying to subset the given dataframe for the given subgroup of the
>   # given variable. The variable "xvar" happens to have the same name in
>   # all dataframes, but the variable yvarname has different names in the
>   # different dataframes
>   Data.tmp <- subset(dataframe, lvarname==subgroup, select=c("xvar",
>     deparse(substitute(yvarname))))
>   Data.tmp <- na.omit(Data.tmp)
>   # create the contingency table on the basis of the entered variables
>   indextable <- table(Data.tmp$xvar, Data.tmp$yvarname)
>   return(indextable)
> }
>
> calling
>
> myfunct.better("yes", lvarname=lvar, yvarname=yvar, dataframe=Fulldf)
>
> results in the following error:
>
> Error in `[.data.frame`(x, r, vars, drop = drop) :
>   undefined columns selected
>
> My feeling is that R doesn't know where to look for the entered
> variables (lvar, yvar), but I'm not sure how to solve this problem. I
> tried using with() and even attach() within the function, but that
> didn't work.
>
> Any help is greatly appreciated.
>
> Best,
> Esther
>
> P.S.:
> Are there books that elaborate programming in R for beginners -- and I
> mean things like how to best use vectorization instead of loops and
> general "best practice" tips for programming. Most of the books I've
> been looking at focus on applying R for particular statistical
> analyses, and only comparably briefly deal with more general
> programming aspects. I was wondering if there's any books or tutorials
> out there that cover the latter aspects in a more elaborate and
> systematic way...?
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
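
For anyone reading this thread later, here is a small self-contained sketch of the approach discussed above: the column names are passed as strings and used with bracket indexing, so nothing depends on a particular data frame. The toy data frame `toy`, the function name `subgroup_table` and the helper `show_name` are made up for illustration and do not appear in the thread. The last lines show what deparse(substitute()) returns for an unquoted and for a quoted argument, which may explain why mixing the two styles fails to find a column.

```r
# Illustrative sketch only: `toy`, `subgroup_table` and `show_name`
# are hypothetical names, not taken from the thread.
toy <- data.frame(
  colour = c("blue", "red", "blue", "red", "blue", NA),
  area   = c("area1", "area1", "area2", "area2", "area1", "area2"),
  keep   = c("yes", "yes", "no", "yes", "yes", "yes"),
  stringsAsFactors = TRUE
)

subgroup_table <- function(dataframe, subgroup, lvarname, xvarname, yvarname) {
  # Column names arrive as strings, so [[ ]] and [ , ] can use them directly.
  # which() also drops rows where the comparison yields NA.
  rows <- which(dataframe[[lvarname]] == subgroup)
  tmp <- na.omit(dataframe[rows, c(xvarname, yvarname)])
  table(tmp[[xvarname]], tmp[[yvarname]])
}

res <- subgroup_table(toy, "yes", lvarname = "keep",
                      xvarname = "colour", yvarname = "area")
print(res)

# deparse(substitute()) converts the argument expression into a string.
show_name <- function(arg) deparse(substitute(arg))
name_unquoted <- show_name(keep)    # "keep"
name_quoted   <- show_name("keep")  # "\"keep\"" (the quotes become part of the string)
```

With string arguments there is no need for deparse(substitute()) at all. It is only useful if the caller is meant to type the column name without quotes, and in that case a quoted argument ends up with literal quote characters in the name, which matches no column.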