In context. Sent from my iPhone
> On Apr 23, 2017, at 2:38 PM, Jeff Newmiller <jdnew...@dcn.davis.ca.us> wrote:
>
> Coming from an Excel background, copying and pasting seems attractive, but it
> does not create a reproducible record of what you did so it becomes quite
> tiring and frustrating after some time has passed and you return to your
> analysis.
>
> Nitpick: you put the setdiff function in the row selection position, an error
> I am sure Hadley did not recommend.

That was not how my wetware interpreter read that code. I saw it as a single argument to "[".

Best;
David

>
> Since R is programmable, there are far more ways to select columns than just
> setdiff. Since your description of desired features is vague, you are
> unlikely to get the answer you would really like from your email. Some
> possibilities to think about:
>
> a) use regular expressions and grep or grepl to select by similar character
> patterns. E.g. all columns including the substring "value" or "key":
> grep( "key|value", names( dta ) ). Possible to specify very complex selection
> patterns, but there are whole books on regular expressions, so you can't
> expect to learn all about them on this R-specific mailing list.
>
> b) use a separate csv file with a column listing each column name, and then
> one column for each subset you want to define, using TRUE/FALSE values to
> include or not include the column name identified. E.g.
>
> # typically easier to manage in an external data file, inline for example only
> colsets <- read.csv( text=
> "Colname,set1,set2
> key,TRUE,TRUE
> value1,TRUE,FALSE
> value2,TRUE,FALSE
> factor1,FALSE,TRUE
> ",header=TRUE,as.is=TRUE)
> dta[ , colsets$set1 ]
>
> Also your criteria of "clean listing" and "copy-pasteable" are likely
> mutually exclusive, depending how you interpret them. You might be able to
> use dput to export a set of column names that can be re-imported accurately,
> but you might not regard it as "clean" if you are thinking "readable".
> --
> Sent from my phone. Please excuse my brevity.
>
>> On April 23, 2017 12:07:19 PM PDT, Bruce Ratner PhD <b...@dmstat1.com> wrote:
>> R-helpers:
>> I'm reading "Advanced R" (Wickham), which provides his way, quoted
>> below, of keeping variables. This cherry-picking approach clearly is
>> not practical with a large dataset.
>>
>> "If you know the columns you don’t want, use set operations to work out
>> which columns to keep: df[setdiff(names(df), "z")]"
>>
>> I'm looking for a way of producing an output of 1000 plus variables,
>> such that I can get a clean listing of variables, not like from st(),
>> that are easily copy-pastable for selecting the variables I want to
>> keep.
>>
>> Any suggestion is appreciated.
>> Thanks.
>> Bruce
>>
>> ______________________________________________
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
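
For anyone following the thread, here is a minimal sketch of the column-selection options discussed above. It is only an illustration under stated assumptions: the data frame `dta` and its four columns are made up to match the names in the csv example, and the object names `keep_a`, `keep_b` and `keep_c` are hypothetical. In option b) the columns are looked up by name rather than by the raw logical vector, so the result should not depend on the lookup table listing the columns in the same order as the data frame.

```r
# Hypothetical example data; 'dta' and its columns are assumptions made
# up to match the column names used in the csv example in this thread.
dta <- data.frame(
  key     = 1:3,
  value1  = c(1.5, 2.5, 3.5),
  value2  = c(10, 20, 30),
  factor1 = factor(c("a", "b", "a"))
)

# setdiff() as the single argument to "[" selects columns (list-style
# indexing), so this drops 'factor1' and keeps everything else.
keep_a <- dta[setdiff(names(dta), "factor1")]

# a) Select columns whose names match a regular expression.
keep_b <- dta[, grep("key|value", names(dta)), drop = FALSE]

# b) Select columns through a lookup table of named subsets.
colsets <- read.csv(text = "Colname,set1,set2
key,TRUE,TRUE
value1,TRUE,FALSE
value2,TRUE,FALSE
factor1,FALSE,TRUE
", header = TRUE, as.is = TRUE)
keep_c <- dta[, colsets$Colname[colsets$set2], drop = FALSE]

# dput() prints the names as an R expression that can be pasted back in.
dput(names(dta))
```

With 1000 plus variables, the `dput(names(dta))` output can be pasted into a script and trimmed by hand, which keeps a reproducible record of the selection.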