In context.

Sent from my iPhone

> On Apr 23, 2017, at 2:38 PM, Jeff Newmiller <jdnew...@dcn.davis.ca.us> wrote:
> 
> Coming from an Excel background, copying and pasting seems attractive, but it 
> does not create a reproducible record of what you did so it becomes quite 
> tiring and frustrating after some time has passed and you return to your 
> analysis. 
> 
> Nitpick: you put the setdiff function in the row selection position, an error 
> I am sure Hadley did not recommend. 

That was not how my wetware interpreter read that code. I saw it as a single 
argument to "[".
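
A minimal sketch of that distinction, assuming a made-up three-column data frame (`df` and its columns are illustrative, not from the original question). With a single argument, "[" indexes a data frame like a list and so selects columns; adding a trailing comma moves the same vector into the row position.

```r
# Hypothetical data frame for illustration only
df <- data.frame(x = 1:3, y = letters[1:3], z = c(TRUE, FALSE, TRUE))

# One argument to "[": list-style indexing, so the names select columns
keep <- df[setdiff(names(df), "z")]
names(keep)

# The same vector followed by a comma sits in the row position instead.
# No row is named "x" or "y", so the result is rows of NA, and column z
# is still present.
rows <- df[setdiff(names(df), "z"), ]
```

Writing `df[, setdiff(names(df), "z")]` also selects columns, but the one-argument form never collapses a single remaining column to a vector.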

Best;
David
> 
> Since R is programmable, there are far more ways to select columns than just 
> setdiff. Since your description of desired features is vague, you are 
> unlikely to get the answer you would really like from your email. Some 
> possibilities to think about:
> 
> a) use regular expressions and grep or grepl to select by similar character 
> patterns. E.g. all columns including the substring "value" or "key": 
> grep( "key|value", names( dta ) ). It is possible to specify very complex 
> selection patterns, but there are whole books on regular expressions, so you 
> can't expect to learn all about them on this R-specific mailing list. 
> 
> b) use a separate csv file with a column listing each column name, and then 
> one column for each subset you want to define, using TRUE/FALSE values to 
> include or not include the column name identified. E.g.
> 
> # typically easier to manage in an external data file, inline for example only
> colsets <- read.csv( text=
> "Colname,set1,set2
> key,TRUE,TRUE
> value1,TRUE,FALSE
> value2,TRUE,FALSE
> factor1,FALSE,TRUE
> ",header=TRUE,as.is=TRUE)
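> # look up the names flagged TRUE, so the row order of colsets need not match dta
> dta[ , colsets$Colname[ colsets$set1 ] ]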
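
As a rough sketch of option (a), assuming a small made-up data frame `dta` whose column names match the example pattern (the data are hypothetical), pattern-based selection might look like this:

```r
# Hypothetical data frame; column names chosen to match the example pattern
dta <- data.frame(key = 1:2, value1 = c(0.5, 1.5), value2 = c(2, 3),
                  factor1 = c("a", "b"))

# grepl() returns one TRUE/FALSE per column name
sel <- grepl("key|value", names(dta))
subset1 <- dta[, sel, drop = FALSE]

# grep(value = TRUE) returns the matching names themselves, which is
# handy for inspecting the selection before applying it
matched <- grep("key|value", names(dta), value = TRUE)

# Negate the logical vector to drop those columns instead of keeping them
subset2 <- dta[, !sel, drop = FALSE]
```

The `drop = FALSE` argument keeps the result a data frame even when only one column matches.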
> 
> Also, your criteria of "clean listing" and "copy-pasteable" are likely 
> mutually exclusive, depending on how you interpret them. You might be able 
> to use dput to export a set of column names that can be re-imported 
> accurately, but you might not regard it as "clean" if you are thinking 
> "readable".
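
To illustrate that trade-off, here is a small sketch assuming a hypothetical vector of column names `nms` (in practice this would be something like `names(dta)`):

```r
# Hypothetical column names standing in for a wide data set
nms <- c("key", "value1", "value2", "factor1")

# dput() prints an R expression that can be pasted back into a script as-is
dput(nms)

# writeLines() prints one bare name per line: easier to read, but the
# quotes and commas must be added back by hand before reuse
writeLines(nms)

# Round trip: the deparsed text evaluates back to the original vector
txt <- paste(deparse(nms), collapse = "")
restored <- eval(parse(text = txt))
```

The `dput` form is the "copy-pasteable" one and the `writeLines` form is the "clean" one. Which is preferable depends on how the listing will be used.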
> -- 
> Sent from my phone. Please excuse my brevity.
> 
>> On April 23, 2017 12:07:19 PM PDT, Bruce Ratner PhD <b...@dmstat1.com> wrote:
>> R-helpers:
>> I'm reading "Advanced R" (Wickham), which provides his way, quoted
>> below, of keeping variables. This cherry-picking approach clearly is
>> not practical with a large dataset. 
>> 
>> "If you know the columns you don’t want, use set operations to work out
>> which columns to keep: df[setdiff(names(df), "z")]"
>> 
>> I'm looking for a way of producing an output of 1000 plus variables,
>> such that I can get a clean listing of variables, not like from str(),
>> that are easily copy-pastable for selecting the variables I want to
>> keep. 
>> 
>> Any suggestion is appreciated.
>> Thanks. 
>> Bruce
>> 
>> ______________________________________________
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
