Re: [R] Why points() is defined specially for a 1 by 2 matrix?

2009-10-19 Thread hadley wickham
>  To answer one of your other questions: ggplot (and lattice) is/are > very powerful, but base graphics are (a) easier to get your head around > and (b) easier to adjust if you don't like the defaults.  Changing things > just a little bit in ggplot can be difficult (as an example, the answer to >

Re: [R] Putting names on a ggplot

2009-10-20 Thread hadley wickham
On Sun, Oct 18, 2009 at 10:29 AM, John Kane wrote: > Thanks Stefan, the annotate approach works beautifully.  I had not got that > far in Hadley's book apparently :( > > I'm not convinced though that the explanation > >> you shouldn't use aes in this case since nampost, >> temprange, ... are not

Re: [R] Sandard deviation calculation

2009-10-26 Thread hadley wickham
> What are the values of > >  length((Ht_cm[type=='SD'][from_treeline=='above'])[1]) I suspect the error is in the subsetting - the following seems more plausible: Ht_cm[type=='SD' & from_treeline=='above'] Hadley -- http://had.co.nz/ __ R-help@r-p
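A minimal sketch of combining two conditions in a single subset. The values below are made up for illustration; only the names `Ht_cm`, `type` and `from_treeline` come from the thread. Element-wise subsetting needs the vectorised `&`, since `&&` is meant for single values.

```r
# Hypothetical data; only the variable names come from the thread.
Ht_cm         <- c(10, 12, 15, 9, 20, 18)
type          <- c("SD", "SD", "SD", "XX", "SD", "XX")
from_treeline <- c("above", "below", "above", "above", "above", "below")

# Chained subsetting applies the second index to an already shortened
# vector, so positions no longer line up. A single logical index built
# with `&` keeps everything aligned.
sel     <- type == "SD" & from_treeline == "above"
heights <- Ht_cm[sel]
sd(heights)
```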

Re: [R] GGPLOT2 Different Layers Different X Values

2009-10-28 Thread Hadley Wickham
Hi John, Could you please provide a small reproducible example? Thanks, Hadley Sent from my iPhone On 26/10/2009, at 6:50 PM, Jonathan Bleyhl > wrote: I'm trying to plot values based on a date and then overlay a histogram also by date. The problem is that both data sets don't have exact

Re: [R] ggplot2: stat_bin ..count.. with geom_text when NA is present

2009-10-28 Thread hadley wickham
Hi Bryan, Thanks for the reproducible example. The problem is actually in your code, not mine ;) You probably want: y = min(res, na.rm = TRUE) - 0.1 * diff(range(res, na.rm = TRUE)) Hadley (drop = TRUE solves a different problem - it controls whether or not to remove bins with zero count) On

Re: [R] "The system cannot find the file specified"

2009-10-29 Thread hadley wickham
> Do you have write permission in C:\Program Files\R\R-2.9.2\library?  It > could be that the installer just tried to create the QRMlib subdir, and > failed, and that's why it doesn't exist. One possible reason for failure is that your virus checker prevented the R installer from creating a new di

Re: [R] ggplot2: Histogram with negative values on x-axis doesn't work

2009-10-29 Thread hadley wickham
> I can reproduce it with for example > x=c(-9.23, -9.56, -1.40) > > But adding a single positive number, even .001, fixes it, while > adding a similar negative number introduces a new error message, so it > really looks like a bug in ggplot2 when all the values are negative. > > Report it to t

Re: [R] multiple pages with ggplot2 facet_wrap?

2009-10-30 Thread hadley wickham
On Wed, Oct 28, 2009 at 8:19 PM, Bill Gillespie wrote: > I currently use lattice functions to produce multiple pages of plots using > the "layout" argument to specify the number of rows and columns of panels, > e.g., > > xyplot(price ~ carat | clarity, diamonds, layout = c(2, 2)) > > This results

Re: [R] "Safe" way to automatically install required packages...

2009-11-02 Thread hadley wickham
If your package "depends" on another package, it will be automatically installed. Hadley On Mon, Nov 2, 2009 at 12:56 PM, Jonathan Greenberg wrote: > R-helpers: > >   I'm working on an r-package that I want to make as easy-to-use as possible > for a novice R-user, which includes automatically ins

Re: [R] Patterned shading in ggplot

2009-11-04 Thread hadley wickham
Hi Paul, You might want to try the gray colour scale - scale_fill_grey() Unfortunately grid (the underlying graphics library that ggplot2 uses) does not currently support patterns. Hadley On Wed, Nov 4, 2009 at 4:17 AM, Paul Chatfield wrote: > > Am trying to produce a graph which prints out w

Re: [R] map of a country and its different geographical levels

2009-11-07 Thread hadley wickham
> If readShapePoly() (deprecated - use readShapeSpatial() instead) says that > the data are not polygons, then they are not. If you want to fill > administrative boundaries polygons, you need polygons, not lines. The source > you are using is based on OpenStreetMaps, so more likely to be lines, and

[R] Extracting matched expressions

2009-11-08 Thread Hadley Wickham
Hi all, Is there a tool in base R to extract matched expressions from a regular expression? i.e. given the regular expression "(.*?) (.*?) ([ehtr]{5})" is there a way to extract the character vector c("one", "two", "three") from the string "one two three" ? Thanks, Hadley -- http://had.co.nz/
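In current versions of R (these functions postdate the thread), `regexec()` combined with `regmatches()` returns the capture groups directly. A short sketch using the pattern and string from the question:

```r
x       <- "one two three"
pattern <- "(.*?) (.*?) ([ehtr]{5})"

# regexec() records the position of the full match and of each group;
# regmatches() turns those positions into substrings.
m <- regmatches(x, regexec(pattern, x, perl = TRUE))

# The first element is the full match, the rest are the groups.
groups <- m[[1]][-1]
groups
```

The result is a list with one element per input string, so the same call works unchanged on a character vector of several strings.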

Re: [R] Extracting matched expressions

2009-11-09 Thread Hadley Wickham
an wrote: > Is this what you want: > >> x <- ' one two three ' >> y <- >> sub(".*?([^[:space:]]+)[[:space:]]+([^[:space:]]+)[[:space:]]+([ehrt]{5}).*", > +     "\\1 \\2 \\3", x, perl=TRUE) >> unlist(strspl

Re: [R] Complicated For Loop (to me)

2009-11-09 Thread hadley wickham
Have a look at ?split Hadley On Mon, Nov 9, 2009 at 10:41 AM, agm. wrote: > > Hello, > > I'm trying to run a loop that will subset my data into specific sets by > regions and by race/ethnicity.  I'm trying to do this fairly compactly, and > I cannot get this to work. > > A "simple" version of th
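As an illustration of the `split()` suggestion, here is a small sketch. The data frame and its column names (`region`, `race`, `value`) are assumptions, since the original data is not shown in the snippet.

```r
# Hypothetical data frame; column names are assumptions for illustration.
dat <- data.frame(
  region = c("N", "N", "S", "S", "S"),
  race   = c("a", "b", "a", "a", "b"),
  value  = 1:5
)

# One data frame per region/race combination, no explicit loop needed.
# drop = TRUE omits combinations that do not occur in the data.
pieces <- split(dat, list(dat$region, dat$race), drop = TRUE)
names(pieces)
sapply(pieces, nrow)
```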

Re: [R] drop unused levels in subset.data.frame

2009-11-10 Thread Hadley Wickham
If you don't want to preserve factor levels when subsetting use characters. There are very few other differences in behavior. Hadley On Tuesday, November 10, 2009, baptiste auguie wrote: > Dear list, > > subset has a 'drop' argument that I had often mistaken for the one in > [.factor which remov
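The difference between factors and characters when subsetting can be seen in a small sketch (the vectors are made up for illustration):

```r
f <- factor(c("a", "b", "c"))
s <- c("a", "b", "c")

# Subsetting a factor keeps all original levels by default.
kept <- levels(f[f != "c"])

# drop = TRUE in `[.factor` removes the levels that are no longer used.
dropped <- levels(f[f != "c", drop = TRUE])

# A character vector has no levels to carry around.
chars <- unique(s[s != "c"])

kept
dropped
chars
```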

Re: [R] Data transformation

2009-11-11 Thread hadley wickham
>> (x.n <- cast(x.m, id ~ var, function(.dat){ > +     if (length(.dat) == 0) return(0)  # test for no data; return > zero if that is the case > +     mean(.dat) > + })) Or fill = 0. Hadley -- http://had.co.nz/

Re: [R] Is there a way to specify drop=FALSE as the global default?

2009-11-11 Thread hadley wickham
> See the above example. Is there a way to make 'drop=FALSE' as global > default, so that when I say 'tmp[,1]', R will treat it as > 'tmp[,1,drop=FALSE]'? The following code won't change the defaults, but it would at least let you know when you're making the mistake: trace_all <- function(fs, tra
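For reference, the behaviour being discussed looks like this. The matrix `tmp` is a stand-in, since the poster's example is not shown.

```r
tmp <- matrix(1:6, nrow = 3)

v <- tmp[, 1]                # dimensions dropped: a plain integer vector
m <- tmp[, 1, drop = FALSE]  # dimensions kept: still a 3 x 1 matrix

is.null(dim(v))
dim(m)
```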

Re: [R] By processing on two variables at once?

2009-11-12 Thread hadley wickham
>> aggregate(dat$value, list(dat$x, dat$y), mean) >  Group.1 Group.2  x > 1       1       1 15 > 2       1       2 30 >> newdat <-aggregate(dat$value, list(dat$x, dat$y), mean) >> names(newdat) <- c("x","y",bquote("mean(value)") ) That bquote isn't necessary because you're already working with str

[R] Escaping regular expressions

2009-11-13 Thread Hadley Wickham
Hi all, Is there a method for escaping strings to be used in regular expressions? i.e. if I have a user supplied string that I'd like to use as a fixed component is there a method that will turn (e.g.) ".$^" into "\\.\\$\\^" ? Thanks, Hadley -- http://had.co.nz/
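One possible approach, sketched below, is a small helper that backslash-escapes common metacharacters. The name `escape_regex` is hypothetical and this is not a base R function; when the whole pattern is literal, `fixed = TRUE` avoids the need to escape at all.

```r
# A sketch, not a built-in: backslash-escape common regex metacharacters.
escape_regex <- function(x) {
  gsub("([.|()\\^{}+$*?]|\\[|\\])", "\\\\\\1", x)
}

escaped <- escape_regex(".$^")
escaped

# The escaped string now matches only the literal text.
literal_hits <- grepl(escape_regex("a.b"), c("a.b", "axb"))

# If the whole pattern is literal, fixed = TRUE avoids escaping entirely.
fixed_hits <- grepl("a.b", c("a.b", "axb"), fixed = TRUE)
```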

Re: [R] Escaping regular expressions

2009-11-13 Thread Hadley Wickham
Test.$^", "Test"), fixed = TRUE) > > On Fri, Nov 13, 2009 at 11:33 AM, Hadley Wickham wrote: >> Hi all, >> >> Is there a method for escaping strings to be used regular expressions? >>  i.e. if I have a user supplied string that I'd like to use

Re: [R] How to show all the functions and classes that are defined in a library?

2009-11-13 Thread hadley wickham
On Fri, Nov 13, 2009 at 9:53 AM, Peng Yu wrote: > library(some_library_name) Try help(package = some_library_name) Hadley -- http://had.co.nz/

Re: [R] Presentation of data in Graphical format

2009-11-18 Thread hadley wickham
> Yes I tried all the basic ones like box plot, pie chart, etc but the data > representation isnt that clear. Given that you have neither provided your data, nor explained what you are trying to uncover from it, what sort of advice do you expect to get? Hadley -- http://had.co.nz/

Re: [R] Presentation of data in Graphical format

2009-11-18 Thread hadley wickham
of > duties performed by the Secretary from HR dept and 4th column is time > required to perform the duty > > so there are many such posts and dept with varied duties and times resp > > > Regards > > Our Thoughts have the Power to Change our Destiny. > Sunita > > >

Re: [R] formatting dates in axis labels (ggplot2)

2009-11-20 Thread hadley wickham
Hi Michael, > I'm having trouble figuring out how to format Date variables when used as > axis labels in graphs. > The particular case here is an attempt to re-create Nightingale's coxcomb > graph with ggplot2, > where I'd like the months to be labeled as "Mar 1885", "Apr 1885", using a > date for

Re: [R] how to tell if its better to standardize your data matrix first when you do principal

2009-11-22 Thread hadley wickham
You've asked the same question on stackoverflow.com and received the same answer. This is rude because it duplicates effort. If you urgently need a response to a question, perhaps you should consider paying for it. Hadley On Sun, Nov 22, 2009 at 12:04 PM, masterinex wrote: > > so under which c

Re: [R] recognizing variable-names of cast() (package: reshape)

2009-11-22 Thread hadley wickham
Hi David, The variable names of mma are sp, dentro, variable and value - area is not a variable in that data frame. Hadley On Sun, Nov 22, 2009 at 9:48 PM, David Douterlungne wrote: > > > > Dear R Users, > > > > > Reshape seems to be very useful for data-manipulation, but I am struggling > wit

Re: [R] Check if string has all alphabets or numbers

2009-11-23 Thread hadley wickham
>> mywords<- c("harry","met","sally","subway10","1800Movies","12345", "not >> correct 123") >> all.letters <- grep("^[[:alpha:]]*$", mywords) >> all.numbers <- grep("^[[:digit:]]*$", mywords)  # numbers >> mixed <- grep("^[[:digit:][:alpha:]]*$", mywords) mywords<- c("harry","met","sally","subway
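A variant of the quoted approach, as a sketch: `grepl()` returns logical vectors that can be combined, and `+` is used instead of `*` so that an empty string does not count as a match. The word list is the one from the thread.

```r
mywords <- c("harry", "met", "sally", "subway10", "1800Movies", "12345",
             "not correct 123")

all_letters <- grepl("^[[:alpha:]]+$", mywords)  # letters only
all_numbers <- grepl("^[[:digit:]]+$", mywords)  # digits only
alnum       <- grepl("^[[:alnum:]]+$", mywords)  # letters and/or digits

# Words that contain both letters and digits, and nothing else.
mixed <- alnum & !all_letters & !all_numbers

mywords[all_letters]
mywords[all_numbers]
mywords[mixed]
```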

Re: [R] reshape question

2009-11-24 Thread hadley wickham
> I don't really understand what you want and the example solution throws away > quite a lot of data, so consider this alternative: > > data.out2 <- read.table(textConnection("id   rater.1 n.1   rater.2 n.2 > rater.3 n.3   rater.4 n.4 > 11   11 0.118  79        NA  NA        NA  NA        NA  N

Re: [R] ggplot legend for multiple time series

2009-12-01 Thread hadley wickham
Because of the combinatorial nature of ggplot2, it is simply not possible to provide an example that illustrates every single combination of options. There are already over 600 example graphics in the package - if you can't find one that exactly meets your need, you need to buy the book and learn

Re: [R] Data Manipulation Question

2009-12-03 Thread hadley wickham
On Thu, Dec 3, 2009 at 3:52 PM, John Filben wrote: > Can R support data manipulation programming that is available in the SAS > datastep?  Specifically, can R support the following: > -  Read multiple dataset one record at a time and compare values from > each; then base on if-then logic

Re: [R] Scraping a web page

2009-12-03 Thread hadley wickham
> If you're after text, then it's probably a matter of locating the element > that encloses the data you want-- perhaps by using getNodeSet along with an > XPath[1] that specifies the element you are interest with.  The text can > then be recovered using the xmlValue() function. And rather than tr

Re: [R] [ggplot2] Wind rose orientation

2009-12-03 Thread hadley wickham
Hi Thomas, I suspect you want geom_bar(stat = "identity", width = 1), but it's hard to be sure without a reproducible example. Hadley On Thu, Dec 3, 2009 at 8:18 PM, Thomas S. Dye wrote: > Aloha all, > > I love using ggplot.  It took a while to get used to the grammar of > graphics, but it is

Re: [R] How to get a matrix by sapply (with strsplit)?

2009-12-03 Thread hadley wickham
> I'm not sure what the other lines(but the last) have to do anything, but are > you looking for something like this: > > do.call(rbind, sapply(paste(1:10, 1:10), strsplit, split=' ')) strsplit is already vectorised wrt its first argument, so all you need is: do.call(rbind, strsplit(paste(1:10,
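Spelled out step by step, the vectorised version looks like this (a sketch using the same `paste(1:10, 1:10)` input as the thread):

```r
x     <- paste(1:10, 1:10)     # "1 1", "2 2", ..., "10 10"
parts <- strsplit(x, " ")      # list of ten length-2 character vectors
m     <- do.call(rbind, parts) # each list element becomes one row

dim(m)
m[3, ]
```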

Re: [R] [ggplot2] Wind rose orientation

2009-12-03 Thread hadley wickham
t;  | > | ""    |    90 |   0 | "13 - 24" | > | ""    |  67.5 |   0 | "13 - 24" | > | ""    |    45 |   0 | "13 - 24" | > | ""    |  22.5 |   0 | "13 - 24" | > | ""    |   360 |   1 | "13 - 24

Re: [R] [ggplot2] Wind rose orientation

2009-12-06 Thread hadley wickham
ce > all my graphs with ggplot2.  If there is a bug that swallows the 16th bar, > though, then I'll make my wind rose with another package and wait patiently > until ggplot2 plots the full compass. > Thanks again for a terrific software package. > All the best, > Tom > Begi

Re: [R] [ggplot2] Wind rose orientation

2009-12-07 Thread hadley wickham
> The idea of plotting a wind rose must be fairly common.  I wonder if it > would make sense to have a switch that would wrap data around the ends of a > continuous scale? Probably - but it requires a lot of work, because ggplot2 doesn't currently support circular scales, which is what you really

Re: [R] [GGPLOT] Legends at different layers

2009-12-07 Thread hadley wickham
You mean: dat <- data.frame(x = rnorm(100)) dat1 <- data.frame( x = c(0,0), y = c(1,0), Label = c("Point1", "Point2") ) ggplot(dat, aes(x)) + geom_histogram(aes(fill = ..count..)) + geom_point(aes(x, y, colour = Label), data = dat1, size = 4) + scale_fill_gradient("Count", low = "gre

Re: [R] problem with split eating giga-bytes of memory

2009-12-08 Thread hadley wickham
Hi Mark, Why are you using factors? I think for this case you might find characters are faster and more space efficient. Alternatively, you can have a look at the plyr package which uses some tricks to keep memory usage down. Hadley On Tue, Dec 8, 2009 at 9:46 PM, Mark Kimpel wrote: > Charles

Re: [R] Assigning variables into an environment.

2009-12-09 Thread hadley wickham
On Wed, Dec 9, 2009 at 9:43 PM, Rolf Turner wrote: > > I am working with a somewhat complicated structure in which > I need to deal with a function that takes ``basic'' arguments > and also depends on a number of parameters which change depending > on circumstances. > > I thought that a sexy way o

Re: [R] Fwd: Evaluating a function within a pre-defined environment?

2009-12-10 Thread hadley wickham
On Wed, Dec 9, 2009 at 4:48 PM, David Reiss wrote: > Ideally I would like to be able to use the function f (in my example) > as-is, without having to designate the environment as an argument, or > to otherwise have to use "e$x" in the function body. e <- new.env() e$x <- 3 f <- function(xx) x <<-
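The reply above is cut off, so the following is only one possible approach and not necessarily the one in the original message: setting the function's enclosing environment to `e` lets the body refer to `x` without writing `e$x`. The function body here is an assumption made for illustration.

```r
e   <- new.env()
e$x <- 3

# Hypothetical function body: reads x and writes it back with <<-.
f <- function(xx) x <<- x + xx

# Make e the enclosing environment of f, so that free variables such
# as x are looked up (and, via <<-, assigned) there.
environment(f) <- e

f(1)
e$x
```

Because `<<-` starts its search in the enclosing environment, the global workspace is left untouched as long as `x` exists in `e`.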

Re: [R] About R memory management?

2009-12-10 Thread hadley wickham
For the case below, you don't need to know anything about how R manages memory, but you do need to understand basic concepts of algorithmic complexity. You might find "The Algorithm Design Manual", http://www.amazon.com/dp/1848000693, a good start. Hadley On Thu, Dec 10, 2009 at 10:26 AM, Peng Yu

Re: [R] barplot and cumulative curve using ggplot2 layers

2009-12-10 Thread hadley wickham
Hi Sunita, To get the bars, you want: ggplot(mydata, aes(x = factor(jobno), y = recruits)) + geom_bar() and to add the cumulative sum, first add it to the data: mydata$cum_recruits <- cumsum(mydata$recruits) and then add another layer: + geom_line(aes(y = cum_recruits, group = 1)) Hadley O

[R] [R-pkgs] ggplot2: version 0.8.4

2009-12-10 Thread Hadley Wickham
ggplot2 ggplot2 is a plotting system for R, based on the grammar of graphics, which tries to take the good parts of base and lattice graphics and avoid bad parts. It takes care of many of the fiddly details that make plotting a hassle (l

Re: [R] How to get rid of NULL in the result of apply()?

2009-12-10 Thread hadley wickham
> Is there a version of apply that returns a list without NULL's? > > I try to remove NULL elements in the following example, but neither > for loops work. Would you please let me know what the correct way is? Try this function: compact <- function(x) Filter(Negate(is.null), x) compact(x) Hadley
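A quick usage sketch for the `compact()` helper quoted above; the list `x` is made up for illustration.

```r
compact <- function(x) Filter(Negate(is.null), x)

# list() keeps NULL elements, which is what apply-style code often returns.
x <- list(a = 1, b = NULL, c = "z", d = NULL)

res <- compact(x)
names(res)
length(res)
```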

Re: [R] Sources for open sourced homework questions for R?

2009-12-11 Thread hadley wickham
Hi Dave, I have a few drills available from http://had.co.nz/stat405 - see the right hand column, about half way down. They seem similar in spirit to what you're thinking of. You might want to look at the "Little Schemer" for a similar approach with a different programming language. However, I'm

Re: [R] Why a list of NULL's are reduced to NULL?

2009-12-11 Thread hadley wickham
> A very common situation is that the users don't know all the possible > return types of 'some_third_party_function()'. If the users don't know > all the return types, he/she can not make sure the return type of > function(x) {...} be always the same. How do you deal with this case? It's not that

Re: [R] ggplot: Problem with legend background

2009-12-11 Thread hadley wickham
Hi Luc, You want: legend.title=theme_text(size=20, hjust = 0) So the legend title is left aligned, not centred. Hadley On Fri, Dec 11, 2009 at 9:26 AM, MUHC_Research wrote: > > Dear R-users, > > I am preparing graphs for an upcoming article using the different functions > of the ggplot2 pack

Re: [R] Is there lazy copy in R?

2009-12-14 Thread hadley wickham
> I don't understand what these addresses mean. Would you please help me > understand it? Did you try reading the documentation? When an object is traced any copying of the object by the C function ‘duplicate’ or by arithmetic or mathematical operations produces a message to standa

Re: [R] expand.grid game

2009-12-19 Thread hadley wickham
> I hope I have missed a better way to do this in R. Otherwise, I > believe what I'm after is some kind of C or C++ macro expansion, > because the number of loops should not be hard coded. Why not generate the list of integers that sum to 17, and then mix with 0s as appropriate? Hadley -- http:

[R] [R-pkgs] ggplot2 version 0.8.5

2009-12-22 Thread Hadley Wickham
ggplot2 ggplot2 is a plotting system for R, based on the grammar of graphics, which tries to take the good parts of base and lattice graphics and avoid bad parts. It takes care of many of the fiddly details that make plotting a hassle (l

Re: [R] by-group processing

2009-05-08 Thread hadley wickham
On Wed, May 6, 2009 at 8:12 PM, jim holtman wrote: > This should do it: > >> do.call(rbind, lapply(split(x, x$ID), tail, 1)) >         ID Type N > 45900 45900    I 7 > 46550 46550    I 7 > 49270 49270    E 3 Or with plyr: library(plyr) ddply(x, "ID", tail, 1) plyr encapsulates the common split-
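The base R version quoted above can be read as three separate steps: split, apply, combine. A sketch on a small made-up data frame (the real data is not shown in the snippet):

```r
# Hypothetical data with the same column names as in the thread.
x <- data.frame(
  ID   = c(1, 1, 2, 2, 2),
  Type = c("I", "I", "E", "E", "E"),
  N    = 1:5
)

pieces    <- split(x, x$ID)          # split: one data frame per ID
lasts     <- lapply(pieces, tail, 1) # apply: keep the last row of each
last_rows <- do.call(rbind, lasts)   # combine: back into one data frame

last_rows
```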

Re: [R] ggplot2: recommended workaround for broken legend.position="top"

2009-05-10 Thread hadley wickham
On Sun, May 10, 2009 at 10:32 AM, Zeljko Vrba wrote: > Searching the mail archives I found that using legend.position as in > p.ring.3 + opts(legend.position="top") > > is a known bug.  I tried doing > p.ring.3 + opts(legend.position=c(0.8, 0.2)) > > which works, but the legend background is trans

Re: [R] Help with reshape/reShape and indexing

2009-05-13 Thread hadley wickham
> This does it more or less your way: > > ds <- split(df, df$Name) > ds <- lapply(ds, function(x){x$Index <- seq_along(x[,1]); x}) > df2 <- unsplit(ds, df$Name) > tapply(df2$X1, df2[,c("Name", "Index")], function(x) x) > > athough there may exist much easier ways ... Here's one way with the plyr a

Re: [R] Function to read a string as the variables as opposed to taking the string name as the variable

2009-05-14 Thread hadley wickham
On Thu, May 14, 2009 at 12:16 PM, Lori Simpson wrote: > I am writing a custom function that uses an R-function from the > reshape package: cast.  However, my question could be applicable to > any R function. > > Normally one writes the arguments directly into a function, e.g.: > > result=cast(tabl

Re: [R] memory usage grows too fast

2009-05-14 Thread hadley wickham
On Thu, May 14, 2009 at 6:21 PM, Ping-Hsun Hsieh wrote: > Hi All, > > I have a 1000x100 matrix. > The calculation I would like to do is actually very simple: for each row, > calculate the frequency of a given pattern. For example, a toy dataset is as > follows. > > Col1    Col2    Col3    Co

Re: [R] assign unique size of point in xyplot

2009-05-15 Thread hadley wickham
On Thu, May 14, 2009 at 2:14 PM, Garritt Page wrote: > Hello,I am using xyplot to try and create a conditional plot.  Below is a > toy example of the type of data I am working with > > slevel <- rep(rep(c(0.5,0.9), each=2, times=2), times=2) > > tlevel <- rep(rep(c(0.5,0.9), each=4), times=2) > >

Re: [R] ggplot2: annotating plot with mathematical formulae

2009-05-16 Thread hadley wickham
Hi Paul, Unfortunately that's not something that's currently possible with ggplot2, but I am thinking about how to make it possible. Hadley On Sat, May 16, 2009 at 7:48 AM, Paul Emberson wrote: > Hi Stephen, > > The problem is that the label on the graph doesn't get rendered with a > superscrip

Re: [R] ggplot2 and Date class

2009-06-01 Thread hadley wickham
You might have an out-of-date version of the plyr package - try install.packages("plyr") Hadley On Mon, Jun 1, 2009 at 10:20 AM, Matt Frost wrote: > I'm trying to plot a time series in ggplot, but a date column in my > data frame is causing errors. Rather than provide my own data, I'll > just re

Re: [R] [ANN] ggplot2 + rggobi course. July 30-31, Washington DC

2009-06-02 Thread hadley wickham
. All proceeds go to the GGobi Foundation to support graphics research. Find out more, and book your tickets online at http://lookingatdata.com Regards, Hadley Wickham Dianne Cook

Re: [R] Still can't find missing data - How do I get NA in xtabs with factors?

2009-06-02 Thread hadley wickham
>> Let's see if I understand this.  Do I iterate through >>    x <- factor(x, levels(c(levels(x), NA), exclude=NULL) >> for each of the few hundred variables (x) in my data frame? > > > Yes, for all being factors. Wouldn't addNA() be the preferred method? To do it for all variables is pretty simp
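A sketch of the `addNA()` idea, including one way to apply it to every factor column of a data frame. The data frame `df` and its columns are hypothetical.

```r
f <- factor(c("a", "b", NA, "a"))

table(f)                    # the NA is silently left out
counts <- table(addNA(f))   # the NA gets its own cell
counts

# Apply addNA() to all factor columns of a (hypothetical) data frame.
df <- data.frame(g = factor(c("x", NA, "y")), n = 1:3)
is_fac <- sapply(df, is.factor)
df[is_fac] <- lapply(df[is_fac], addNA)

levels(df$g)
```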

Re: [R] Minor tick marks for date/time ggplot2 (this is better, but not exactly what I want)

2009-06-04 Thread hadley wickham
On Mon, Jun 1, 2009 at 2:18 PM, stephen sefick wrote: > library(ggplot2) > > melt.updn <- (structure(list(date = structure(c(11808, 11869, 11961, 11992, > 12084, 12173, 12265, 12418, 12600, 12631, 12753, 12996, 13057, > 13149, 11808, 11869, 11961, 11992, 12084, 12173, 12265, 12418, > 12600, 12631,

Re: [R] OT: Inference for R - Interview

2009-06-04 Thread hadley wickham
Is it really necessary to further advertise this company which already spams R-help subscribers? Hadley On Thu, Jun 4, 2009 at 10:41 PM, Ajay ohri wrote: > Dear All, > > Slightly off -non technical topic ( but hey it is Friday) > > Following last week's interview with REvolution Computing which m

Re: [R] A very frustrating read.table error message

2009-06-06 Thread hadley wickham
On Sat, Jun 6, 2009 at 5:02 PM, Adam D. I. Kramer wrote: > Dear Colleagues, > >        Occasionally I deal with computer-generated (i.e., websurvey) data > files that haven't quite worked correctly. When I try to read the data into > R, I get something like this: > > Error in scan(file, what, nmax,

Re: [R] Looking for easy way to normalize data by groups

2009-06-08 Thread hadley wickham
On Mon, Jun 8, 2009 at 10:29 AM, Herbert Jägle wrote: > Hi, > > i do have a dataframe representing data from a repeated experiment. PID is a > subject identifier, Time are timepoints in an experiment which was repeated > twice. For each subject and all three timepoints there are 2 sets of four > va

Re: [R] how to substitute missing values (NAs) by the group means

2009-06-08 Thread hadley wickham
On Mon, Jun 8, 2009 at 8:56 PM, Mao Jianfeng wrote: > Dear Ruser's > > I ask for helps on how to substitute missing values (NAs) by mean of the > group it is belonging to. > > my dummy dataframe is: > >> df >       group traits > 1  BSPy01-10     NA > 2  BSPy01-10    7.3 > 3  BSPy01-10    7.3 > 4  

[R] Programmatically copying a graphic to the clipboard

2009-06-12 Thread Hadley Wickham
Hi all, Is there a cross-platform way to do this? On the mac, I can do this by saving an eps file, and then using pbcopy. Is it possible on other platforms? Hadley -- http://had.co.nz/

[R] [OT] VBA to save excel as csv

2009-06-15 Thread Hadley Wickham
Hi all, This is a little off-topic, but it is on the general topic of getting data in R. I'm looking for an Excel macro / VBA script that will export all spreadsheets in a directory (with one file per tab) into csv. Does anyone have anything like this? Thanks, Hadley -- http://had.co.nz/

[R] Learning S3

2009-06-18 Thread Hadley Wickham
Hi all, Do you know of any good resources for learning how S3 works? I've somehow become familiar with it by reading many small pieces, but now that I'm teaching it to students I'm wondering if there are any good resources that describe it completely, especially in a reader-friendly way. So far

Re: [R] Learning S3

2009-06-18 Thread Hadley Wickham
ck > Sent: Thursday, June 18, 2009 9:17 AM > To: Hadley Wickham > Cc: r-help > Subject: Re: [R] Learning S3 > > There is a section on Object Orientation in MASS (I have 2nd ed). > > On Thu, Jun 18, 2009 at 12:06 PM, Hadley Wickham wrote: >> Hi all, >> >> Do you k

Re: [R] Dataset suggestion sought

2009-06-18 Thread hadley wickham
> In revising my book Regression Modeling Strategies for a second edition, I > am seeking a dataset for exemplifying multiple regression using least > squares.  Ideally the dataset would have 5-40 variables and 40-1 > independent observations, and would generate significant interest for a wide

Re: [R] Roxygen vs Sweave for S4 documentation

2009-06-21 Thread hadley wickham
> I have been using R for a while.  Recently, I have begun converting my > package into S4 classes.  I was previously using Rdoc for documentation. > Now, I am looking to use the best tool for S4 documentation.  It seems that > the best choices for me are Roxygen and Sweave (I am fine with tex). >

[R] [R-pkgs] plyr 0.1.9

2009-06-23 Thread Hadley Wickham
plyr is a set of tools for a common set of problems: you need to break down a big data structure into manageable pieces, operate on each piece and then put all the pieces back together. For example, you might want to: * fit the same model to subsets of a data frame * quickly calculate summary

Re: [R] Apply as.factor (or as.numeric etc) to multiple columns

2009-06-23 Thread hadley wickham
Hi Mark, Have a look at colwise (and numcolwise and catcolwise) in the plyr package. Hadley On Tue, Jun 23, 2009 at 4:23 PM, Mark Na wrote: > Hi R-helpers, > > I have a dataframe with 60columns and I would like to convert several > columns to factor, others to numeric, and yet others to dates. R

Re: [R] "by" question

2009-06-24 Thread hadley wickham
You might also want to look at the plyr package, http://had.co.nz/plyr. In particular, ddply + transform makes these tasks very easy. library(plyr) ddply(mtcars, "cyl", transform, pos = seq_along(cyl), mpg_avg = mean(mpg)) Hadley On Wed, Jun 24, 2009 at 11:48 AM, David Hugh-Jones wrote: > That
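For readers without plyr installed, roughly the same per-group columns can be computed in base R with `ave()`. This is a swapped-in technique, not the method from the message, and unlike `ddply()` it leaves the rows in their original order.

```r
cars <- mtcars

# Position of each row within its cyl group.
cars$pos <- ave(cars$mpg, cars$cyl, FUN = seq_along)

# Group mean of mpg, repeated for every row of the group
# (mean is the default FUN of ave()).
cars$mpg_avg <- ave(cars$mpg, cars$cyl)

head(cars[order(cars$cyl), c("cyl", "mpg", "pos", "mpg_avg")])
```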

Re: [R] Following progress in a lapply() function

2009-03-22 Thread hadley wickham
On Sun, Mar 22, 2009 at 5:06 PM, Blanchette, Marco wrote: > Dear all, > > I am processing a very long and complicated list using lapply through a > custom function and I would like to generate some sort of progress report. > For instance, print a dot on the screen every time 1000 item have been

Re: [R] histogram plots with many different samples

2009-03-25 Thread hadley wickham
Or use frequency polygons, if you want to stay with the interpretability of a histogram. Hadley On Wed, Mar 25, 2009 at 12:07 PM, Greg Snow wrote: > Personally I find those types of plots difficult to interpret.  Much easier > to create, view, and interpret is to simply plot the lines from densi

Re: [R] how to input multiple .txt files

2009-03-30 Thread hadley wickham
On Mon, Mar 30, 2009 at 10:33 AM, Mike Lawrence wrote: > To repent for my sins, I'll also suggest that Hadley Wickham's "plyr" > package (http://had.co.nz/plyr/) is also useful/parsimonious in this > context: > > a <- ldply(cust1_files,read.table) You might also want to do names(cust1_files) <-

Re: [R] Calculating First Occurance by a factor

2009-03-30 Thread hadley wickham
On Mon, Mar 30, 2009 at 2:58 PM, Mike Lawrence wrote: > I discovered Hadley Wickham's "plyr" package last week and have found > it very useful in circumstances like this: > > library(plyr) > > firstfixtime = ddply( >       .data = data >       , .variables = c('Sub','Tr','IA') >       , .fun <- fu

[R] Bug in col2rgb?

2009-03-31 Thread hadley wickham
> col2rgb("#0079", TRUE) [,1] red 0 green0 blue 0 alpha 121 > col2rgb("#0080", TRUE) [,1] red255 green 255 blue 255 alpha0 > col2rgb("#0081", TRUE) [,1] red 0 green0 blue 0 alpha 129 Any ideas? Thanks, Hadley -- http://had.co

Re: [R] Using apply to get group means

2009-03-31 Thread hadley wickham
On Tue, Mar 31, 2009 at 11:31 AM, baptiste auguie wrote: > Not exactly the output you asked for, but perhaps you can consider, > > library(doBy) >> summaryBy(x3~x2+x1,data=x,FUN=mean) >> >>  x2 x1 x3.mean >> 1  1  A     1.5 >> 2  1  B     2.0 >> 3  1  C     3.5 >> 4  2  A     4.0 >> 5  2  B     5.

Re: [R] Reshape: 'melt' numerous objects

2009-03-31 Thread hadley wickham
On Tue, Mar 31, 2009 at 11:12 AM, Steve Murray wrote: > > Dear R Users, > > I'm trying to use the reshape package to 'melt' my gridded data into column > format. I've done this before on individual files, but this time I'm trying > to do it on a directory of files (with variable file names) - th

Re: [R] ggplot: order of numeric factor levels?

2009-03-31 Thread hadley wickham
On Tue, Mar 31, 2009 at 5:01 PM, Marianne Promberger wrote: > Hi, > > I'm having problems with qplot and the order of numeric factor levels. > > Factors with numeric levels show up in the order in which they appear > in the data, not in the order of the levels (as far as I understand > factors!) >

Re: [R] Calculating First Occurance by a factor

2009-04-01 Thread hadley wickham
> I tried messing with the line df$FixTime[which.min(df$FixInx)] changing it > to df[which.min(df$FixInx)] or adding new lines with the additional columns > that I want to include, but nothing seemed to work. I'll admit I only have a > mild understanding of what is going on with the function .fun.

Re: [R] Calculating First Occurance by a factor

2009-04-01 Thread hadley wickham
On Wed, Apr 1, 2009 at 11:00 AM, hadley wickham wrote: >> I tried messing with the line df$FixTime[which.min(df$FixInx)] changing it >> to df[which.min(df$FixInx)] or adding new lines with the additional columns >> that I want to include, but nothing seemed to work. I'

Re: [R] Public R servers?

2009-04-01 Thread hadley wickham
> Earlier I posted a question about memory usage, and the community's input was > very helpful.  However, I'm now extending my dataset (which I use when > running a regression using lm).  As a result, I am continuing to run into > problems with memory usage, and I believe I need to shift to impl

Re: [R] Deleting rows based on identity variable

2009-04-02 Thread hadley wickham
On Thu, Apr 2, 2009 at 3:37 PM, Rowe, Brian Lee Yung (Portfolio Analytics) wrote: > Is this what you want: >> d1[which(id != 4),] Or just d1[id != 4, ] Hadley -- http://had.co.nz/
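The two forms give the same result unless the identity variable contains missing values. A sketch with a made-up `d1` (the thread refers to `id` directly, presumably as an attached or free-standing vector; here it is a column):

```r
# Hypothetical data; the NA is included to show where the forms differ.
d1 <- data.frame(id = c(1, 4, 2, NA, 4), y = letters[1:5])

with_which   <- d1[which(d1$id != 4), ]  # which() drops the NA comparison
with_logical <- d1[d1$id != 4, ]         # NA in the index yields an all-NA row

nrow(with_which)
nrow(with_logical)
```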

Re: [R] Selecting all rows of factors which have at least one positive value?

2009-04-02 Thread hadley wickham
>   X1 X2 > 1  11  0 > 2  11  0 > 3  11  0 > 4  11  1 > 5  12  0 > 6  12  0 > 7  12  0 > 8  13  0 > 9  13  1 > 10 13  1 > > > and I want to select all rows pertaining to factor levels of X1 for > which exists at least one "1" for X2. To be clear, I want rows 1:4 > (since there exists at least one o

Re: [R] plyr and table question

2009-04-03 Thread hadley wickham
On Fri, Apr 3, 2009 at 4:43 AM, baptiste auguie wrote: > Dear all, > > I'm puzzled by the following example inspired by a recent question on > R-help, > > > cc <- textConnection("user_id  website          time > 20        google            0930 > 21        yahoo            0935 > 20        faceboo

Re: [R] plyr and table question

2009-04-03 Thread hadley wickham
On Fri, Apr 3, 2009 at 8:43 AM, baptiste auguie wrote: > That makes sense, so I can do something like, > > count <- function(x){ >        as.integer(unclass(table(x))) > } > > count(d$user_id) > > ddply(d, .(user_id), transform, count = count(user_id)) > >>  user_id  website time count >> 1      2

Re: [R] data.frame to array?

2009-04-03 Thread hadley wickham
On Fri, Apr 3, 2009 at 1:45 PM, wrote: > I have a list of data.frames > >> str(bins) > > List of 19217 >  $ 100026:'data.frame': 1 obs. of  6 variables: >  ..$ Sku  : chr "100026" >  ..$ Bin  : chr "T149C" >  ..$ Count: int 108 >  ..$ X    : int 20 >  ..$ Y    : int 149 >  ..$ Z    : chr "3" >  $

Re: [R] data.frame, converting row data to columns

2009-04-04 Thread hadley wickham
On Sat, Apr 4, 2009 at 12:09 PM, ds wrote: > > I have a data frame something like: >                      name         wrist > nLevel            emot > 1                    4094          3.34                    1 > frustrated > 2                    4094          3.94                    1 > frustra

Re: [R] data.frame, converting row data to columns

2009-04-04 Thread hadley wickham
On Sat, Apr 4, 2009 at 12:28 PM, jim holtman wrote: > Does this do what you want: > >> x <- read.table(textConnection("name         wrist nLevel            emot > + 1                    4094          3.34                    1   frustrated > + 2                    4094          3.94                

Re: [R] package: maps and spatstat question

2009-04-06 Thread hadley wickham
Hi Laura, You might find the map_data function from the ggplot2 package helpful: library(ggplot2) library(maps) head(map_data("state", "iowa")) It formats the output of the map command into a self-documenting data frame. Hadley On Mon, Apr 6, 2009 at 7:00 AM, Laura Chihara wrote: > > I would

Re: [R] Best way to turn a list into a data.frame

2009-04-06 Thread hadley wickham
On Mon, Apr 6, 2009 at 8:49 AM, Daniel Brewer wrote: > Hello, > > What is the best way to turn a list into a data.frame? > > I have a list with something like: > $`3845` >  [1] "04010" "04012" "04360" > > $`1029` > [1] "04110" "04115" > > And I would like to get a data frame like the following: >

Re: [R] SUM,COUNT,AVG

2009-04-06 Thread hadley wickham
On Mon, Apr 6, 2009 at 9:34 AM, Stavros Macrakis wrote: > There are various ways to do this in R. > > # sample data > dd <- data.frame(a=1:10,b=sample(3,10,replace=T),c=sample(3,10,replace=T)) > > Using the standard built-in functions, you can use: > > *** aggregate *** > > aggregate(dd,list(b=dd$

Re: [R] Collapse data matrix with extra info separated by commas

2009-04-06 Thread hadley wickham
On Mon, Apr 6, 2009 at 10:40 AM, baptiste auguie wrote: > Here's one attempt with plyr, hopefully Hadley will give you a better > solution ( I could not get cast() to do it either) > > test <- > data.frame(a=c("A","A","A","A","B","B","B"),b=c(1,1,2,2,1,1,1),c=sample(1:7)) > ddply(test,.(a,b),.fun=

Re: [R] SUM,COUNT,AVG

2009-04-06 Thread hadley wickham
On Mon, Apr 6, 2009 at 5:31 PM, Jun Shen wrote: > This is a good example to compare different approaches. My understanding is > > aggregate() can apply one function to multiple columns > summarize() can apply multiple functions to one column > I am not sure if ddply() can actually apply multiple f

Re: [R] change inter-line spacing in grid graphics - how to?

2009-04-07 Thread hadley wickham
Have a look at ?gpar - it will tell you about lineheight. Hadley On Tue, Apr 7, 2009 at 3:28 AM, Mark Heckmann wrote: > I am trying to change the inter-line spacing in grid.text(), but I just > don't find how to do it. > > pushViewport(viewport()) > grid.text("The inter-line spacing\n is too big
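A short sketch of what ?gpar describes, assuming the goal is simply to tighten the spacing between the two lines of text. The value 0.8 and the label text are arbitrary choices for illustration:

```r
library(grid)

# lineheight is a multiplier applied to the font size to get the line spacing;
# a smaller value pulls the lines closer together
tight <- gpar(lineheight = 0.8)

grid.newpage()
pushViewport(viewport())
# Pass the graphical parameters through the gp argument of grid.text()
grid.text("The inter-line spacing\nis now tighter", gp = tight)
popViewport()
```

The same gp argument can be given to viewport() instead, in which case the setting is inherited by everything drawn inside that viewport.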

Re: [R] Using as.formula() with the reshape package cast

2009-04-07 Thread hadley wickham
On Tue, Apr 7, 2009 at 8:44 AM, wrote: > > I am trying to use the "cast" function from the reshape package, where the > formula is not passed in directly, but as the result of the as.formula() > function. > > Using reshape v. 0.7.2 > > I am able to properly melt() by data with: > >> molten <- mel

Re: [R] newbie query: simple crosstabs

2009-04-07 Thread hadley wickham
On Tue, Apr 7, 2009 at 4:41 PM, Jorge Ivan Velez wrote: > Hi Eik, > You're absolutely right. My bad. > > Here is the correction of the code I sent: > > apply(mydata[,-1], 2, tapply, mydata[,1], function(x) sum(x)/length(x)) Or more simply: apply(mydata[,-1], 2, tapply, mydata[,1], mean) Hadley
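To see what the simpler call returns, here is a small sketch with an invented mydata. It assumes, as the indexing in the thread suggests, that the first column holds the grouping variable and the remaining columns are numeric:

```r
# Hypothetical data: first column is the group, the rest are numeric
mydata <- data.frame(group = c("a", "a", "b", "b"),
                     x = c(1, 3, 5, 7),
                     y = c(2, 4, 6, 8))

# For each numeric column, compute the mean within each group
res <- apply(mydata[, -1], 2, tapply, mydata[, 1], mean)

print(res)
```

The result is a matrix with one row per group and one column per numeric variable, so res["a", "x"] is the mean of x within group a.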
