[R] Plotting survival curves from a Cox model with time dependent covariates
Dear all,

Let's assume I have a clinical trial with two treatments and a time to event outcome. I am trying to fit a Cox model with a time dependent treatment effect and then plot the predicted survival curve for one treatment (or both).

library(survival)
test <- list(time = runif(100, 0, 10),
             event = sample(0:1, 100, replace = T),
             trmt = sample(0:1, 100, replace = T))
model1 <- coxph(Surv(time, event) ~ tt(trmt), data = test,
                tt = function(x, t, ...) pspline(x + t))
newdat1 <- data.frame(trmt = 1, time = list(0, 1, 2, 3, 4, 5))
plot(survfit(model1, newdata = newdat1, individual = T),
     xlab = "Years", ylab = "Survival")

Where I think I am failing is with how to correctly specify what I want the survfit function to do. My understanding on reading the documentation for the survival package is that I should use newdata to not only specify the treatment, but also the timepoints for which I want survival estimates, and that this is the scenario for which the individual=T option can be appropriate. However, I just seem to fail to figure out exactly how I should specify this.

It would be greatly appreciated if someone who has done this before or knows how to do it could give me a quick (or extensive, of course) hint.

Many thanks,
Björn

PS: Yes, I realise that a Kaplan-Meier plot would do something like the above very nicely, but once I get this to work, I am actually looking at something a bit more complicated where a KM plot would not help me.

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Add a density line to a cumulative histogram - second try
Thanks, I found the function ecdf() which does the job. plot( ecdf( nvtpoints), col="BLUE", lwd=1, add=TRUE ) -- View this message in context: http://r.789695.n4.nabble.com/Add-a-density-line-to-a-cumulative-histogram-second-try-tp3666969p3669310.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
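For readers landing on this thread, a minimal sketch of the same idea, using simulated data in place of nvtpoints (which is not shown in the thread):

set.seed(1)
nvtpoints <- rnorm(200)                        # stand-in for the real data

h <- hist(nvtpoints, plot = FALSE)
h$counts <- cumsum(h$counts) / sum(h$counts)   # counts -> cumulative proportions
plot(h, freq = TRUE, main = "Cumulative histogram with ECDF",
     ylab = "Cumulative proportion")
plot(ecdf(nvtpoints), col = "blue", lwd = 1, add = TRUE)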
Re: [R] Stacked bar plot of frequency vs time
Thank you for the solutions! I have the first one working and it does exactly what I am looking for. Unfortunately I have to put the plot in a common figure alongside other plots made in the basic environment (challenging!). With the second method, I was unable to make the stacked bars locate to the appropriate positions along the X axis (i.e. the appropriate time), which, though unconventional, is required for my figure. So I am still looking for a complete solution in the basic plotting environment. I have boiled my problem down to this minimal example:

# Made-up data
tC <- textConnection("
Time Type1 Type2 Type3
1.3 .50 .25 .25
4.5 .55 .25 .20
5.2 .65 .20 .15
")
data1 <- read.table(header=TRUE, tC)
data2 <- data.frame(Time=rep(data1$Time, 3), stack(data1[,2:4]))
close.connection(tC)

# PLOT1 Scatterplot
attach(data1)
par(mar=c(1,1,1,1))
plot(Time, Type1, frame=T, ylab="Divergence",
     col=rgb(0,100,0,50,maxColorValue=255), main="plot 1",
     xlim=c(0,6), ylim=c(0, 1), axes=FALSE, xlab=" ")
detach(data1)

# PLOT2 barplot
require(lattice)
attach(data2)
barchart(values ~ Time, group=ind, data=data2, stack=TRUE,
         horizontal=FALSE, main="not there yet")
plot2 <- xyplot(values ~ Time, group=ind, data=data2, stack=TRUE,
                horizontal=FALSE, panel=panel.barchart,
                ylim=c(-0.05,1.05), xlim=c(0,6),
                main="Plot 2 - how can I plot below plot1?")
print(plot2)
detach(data2)

The only thing left is to get both plots to be vertically aligned, one above the other on the same figure. Is this possible? Thanks for all of your thoughts.

Marcel

-- View this message in context: http://r.789695.n4.nabble.com/Stacked-bar-plot-of-frequency-vs-time-tp3659715p3669311.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
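For anyone picking this thread up, here is a rough base-graphics sketch of both open points: bars placed at the observed Time values (drawn with rect()) and the two panels stacked with par(mfrow). The bar half-width and colours are arbitrary choices; the data block is the made-up data from the post.

tC <- textConnection("
Time Type1 Type2 Type3
1.3 .50 .25 .25
4.5 .55 .25 .20
5.2 .65 .20 .15")
data1 <- read.table(header = TRUE, tC); close(tC)

op <- par(mfrow = c(2, 1), mar = c(4, 4, 2, 1))

# Panel 1: the scatterplot
plot(data1$Time, data1$Type1, xlim = c(0, 6), ylim = c(0, 1),
     xlab = "", ylab = "Divergence", main = "plot 1")

# Panel 2: stacked bars located at the observed Time values
plot(NA, xlim = c(0, 6), ylim = c(0, 1), xlab = "Time",
     ylab = "Proportion", main = "plot 2")
half <- 0.15                          # half the bar width (arbitrary)
cols <- c("darkgreen", "orange", "steelblue")
for (i in seq_len(nrow(data1))) {
  tops <- cumsum(unlist(data1[i, 2:4]))
  rect(data1$Time[i] - half, c(0, head(tops, -1)),
       data1$Time[i] + half, tops, col = cols)
}
par(op)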
Re: [R] combining elements in a data frame
Hi: Q1: Try something like # Formula interface works for R-2.11.0 and later versions RTavg <- aggregate(RT ~ word, data = alldat, FUN = mean) merge(CCCW, RTavg, by.x = row.names(CCCW), by.y = 'word', all = TRUE) If the merge doesn't work (which is entirely possible), you might want to define a variable word in CCCW first and then try again with something like merge(CCCW, RTavg, by = 'word', all = TRUE) IIRC, all = TRUE keeps words from both data frames filling missing values with NA, all.x keeps everything from the first argument of merge() and all.y keeps everything from the second argument. If you omit the option, it returns only the words that occur in both data frames. From your description, it appears you want all.x = TRUE where x = CCCW. See ?merge for specific details. Q2: See ?table and ?ftable Utterly untested code in the absence of a reproducible example, so caveat emptor. Dennis On Thu, Jul 14, 2011 at 9:17 PM, Lee Averell wrote: > Hi all, > I have 2 data frames the first contains a list with repeats of words > and an associated response time (RT) measure for each word. The second is a > tabulation of each unique word and other information such as the amount and > of responses for each word. I need to determine the mean RT for each word and > add that as a column in the second data frame. > Any help would be appreciated > Cheers > Lee > > Data frame 1 > >> head(alldat) > s expt session cycle trial left.right freq concr word rt resp > Response correct corrResp > 121 1a a 1 C1 1 1 lf hc pianist 1529 old > hi FALSE new > 122 1a a 1 C1 2 1 hf hc sweat 1518 new > hi TRUE new > 123 1a a 1 C1 3 1 lf lc carnage 1046 old > hi TRUE old > 124 1a a 1 C1 4 1 lf hc nymph 1142 old > hi TRUE old > 125 1a a 1 C1 5 1 hf lc hank 1487 new > hi TRUE new > 126 1a a 1 C1 6 1 lf hc waist 1199 new > hi TRUE new > respType > 121 s > 122 s > 123 s > 124 s > 125 s > 126 s >> > > Data frame 2 > >> head(CCCW) > FALSE TRUE CC propCC lo hi > abode 2 11 TRUE 0.8461538 4 9 > abyss 1 12 TRUE 0.9230769 2 11 > accord 2 11 TRUE 0.8461538 2 11 > account 0 0 FALSE NaN 0 0 > acre 4 9 TRUE 0.6923077 4 9 > adage 0 0 FALSE NaN 0 0 > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
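A small self-contained illustration of the aggregate() + merge() idea, with invented data that only mimics the structure of alldat and CCCW:

alldat <- data.frame(word = c("abode", "abyss", "abode", "acre"),
                     RT   = c(1100, 950, 1300, 1200))
CCCW <- data.frame(propCC = c(0.85, 0.92, 0.69),
                   row.names = c("abode", "abyss", "acre"))

RTavg <- aggregate(RT ~ word, data = alldat, FUN = mean)  # mean RT per word
CCCW$word <- rownames(CCCW)                               # promote row names to a column
merge(CCCW, RTavg, by = "word", all.x = TRUE)             # keep every row of CCCW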
Re: [R] LME and overall treatment effects
Dear Mark,

Interpreting one of the main effects when they are part of an interaction is, AFAIK, not possible. Your statement about comparing treatments when Year is continuous is not correct. The parameters of treatment assume that Year == 0! Which might lead to very strange effects when Year is not centered to a year close to the observed years. Have a look at the example below:

set.seed(123456)
dataset <- expand.grid(cYear = 0:20, Treatment = factor(c("A", "B", "C")), Obs = 1:3)
dataset$Year <- dataset$cYear + 2000
Trend <- c(A = 1, B = 0.1, C = -0.5)
TreatmentEffect <- c(A = 2, B = -1, C = 0.5)
sdNoise <- 1
dataset$Value <- with(dataset, TreatmentEffect[Treatment] + Trend[Treatment] * cYear) +
  rnorm(nrow(dataset), sd = sdNoise)
lm(Value ~ Year * Treatment, data = dataset)
lm(Value ~ cYear * Treatment, data = dataset)

If you want to focus on the treatment effect alone but take the year effect into account, then add Year as a random effect.

library(lme4)
lmer(Value ~ 0 + Treatment + (0 + Treatment|Year), data = dataset)

In your case you want to cross the random effect of year with those of plot. Crossed random effects are hard to do with the nlme package but easy with the lme4 package.

Model <- lmer(Species ~ 0 + Treatment + (0 + Treatment|Year) + (1|Plot/Quadrat),
              na.action = na.omit, data = UDD)

Best regards,

Thierry

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Gaverstraat 4, 9500 Geraardsbergen, Belgium
tel. + 32 54/436 185
thierry.onkel...@inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey

> -----Oorspronkelijk bericht-----
> Van: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
> Namens Mark Bilton
> Verzonden: donderdag 14 juli 2011 22:05
> Aan: Bert Gunter
> CC: r-help@r-project.org
> Onderwerp: Re: [R] LME and overall treatment effects
>
> Ok... lets try again with some code...
>
> Hello fellow R users,
>
> I am having a problem finding the estimates for some overall treatment effects
> for my mixed models using 'lme' (package nlme). I hope someone can help.
>
> Firstly then, the model:
> The data: Plant biomass (log transformed)
> Fixed Factors: Treatment (x3: Dry, Wet, Control), Year (x8: 2002-2009)
> Random Factors: 5 plots per treatment, 10 quadrats per plot (N = 1200 (3*5*10*8)).
>
> I am modelling this in two ways, firstly with year as a continuous variable
> (interested in the difference in estimated slope over time in each treatment,
> 'year*treatment'), and secondly with year as a categorical variable
> (interested in differences between 'treatments').
>
> ie: (with Year as either numeric or factor)
>
> Model <- lme(Species ~ Year*Treatment, random = ~1|Plot/Quadrat,
>              na.action = na.omit, data = UDD)
>
> When using Year as a continuous variable, the output of the lme means that I
> can compare the 3 treatments within my model...
> i.e. it takes one of the Treatment*year interactions as the baseline and
> compares (contrasts) the other two to that.
> ie
>
> Fixed effects: Species ~ Year * Treatment
>                      Value Std.Error   DF   t-value p-value
> (Intercept)      1514.3700  352.7552 1047  4.292978  0.0000
> Year               -0.7519    0.1759 1047 -4.274786  0.0000
> Treatment0       -461.9500  498.8711   12 -0.925991  0.3727
> Treatment1      -1355.0450  498.8711   12 -2.716222  0.0187
> Year:Treatment0     0.2305    0.2488 1047  0.926537  0.3544
> Year:Treatment1     0.6776    0.2488 1047  2.724094  0.0066
>
> so Year:Treatment0 differs from baseline Year:Treatment-1 by 0.2305 and
> Year:Treatment1 is significantly different (p=0.0066) from Year:Treatment-1
>
> I can then calculate the overall treatment*year effect using 'anova.lme(Model)'.
>
> anova.lme(Model1)
>                numDF denDF   F-value p-value
> (Intercept)        1  1047 143.15245  <.0001
> Year               1  1047  19.56663  <.0001
> Treatment          2    12   3.73890  0.0547
> Year:Treatment     2  1047   3.83679  0.0219
>
> so there is an overall difference in slope between treatments (Year:Treatment
> interaction) p=0.0219
>
> However, the problem comes when I use Year as a categorical variable.
> Here, I am interested
[R] searching and replacing in a data frame.
Dear R helpers,

Please have a look at the following. Note: My goal is to find and replace all Inf's in a data array with 0.

> t <- data.frame(A=c(Inf,0,0), B=c(1,2,3))
> t
    A B
1 Inf 1
2   0 2
3   0 3
> str(t)
'data.frame':   3 obs. of  2 variables:
 $ A: num Inf 0 0
 $ B: num 1 2 3
> t[which(t==Inf, arr.ind=T)]
[1] Inf
> t[which(t==Inf, arr.ind=T)] <- 0
Error in `[<-.data.frame`(`*tmp*`, which(t == Inf, arr.ind = T), value = 0) :
  only logical matrix subscripts are allowed in replacement

Query: Why does the search work but the replace not work?

Many thanks for your time and efforts.

Ashim

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to order each element according to alphabet
Hi: Is this what you're looking for? Lines <- " ASG,UXW,AFODJEL E,TDIWE,ROFD" # Read in the above lines (for purposes of this example only) # Note the stringsAs Factors = FALSE option! df <- read.csv(textConnection(Lines), header = FALSE, stringsAsFactors = FALSE) closeAllConnections() dm <- as.matrix(df) # convert to a character matrix # Function to sort a character string in alphabetical (lexical) order sortfun <- function(x) paste(sort(unlist(strsplit(x, ''))), collapse = '') # Apply to the rows of the matrix t(apply(df, 1, function(x) sapply(x, sortfun))) Result: V1V2 V3 [1,] "AGS" "UWX" "ADEFJLO" [2,] "E" "DEITW" "DFOR" If you need to do this for only a subset of your variables, create a character submatrix and follow the script above on that, after which you would need to do some post-processing on your own. HTH, Dennis On Thu, Jul 14, 2011 at 6:18 PM, onthetopo wrote: > Hi there, > > I have a large amino acid csv file like this: > > input.txt: > P,LV,Q,Z > P,VL,Q,Z > P,ML,QL,Z > > There is a problem with this file, since LV and VL are in fact the same > thing. > How do I order each element according to alphabetical order so that the > desired output would look like: > > output.txt: > P,LV,Q,Z > P,LV,Q,Z > P,LM,LQ,Z > > > > > -- > View this message in context: > http://r.789695.n4.nabble.com/how-to-order-each-element-according-to-alphabet-tp3668997p3668997.html > Sent from the R help mailing list archive at Nabble.com. > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Smart legend ???
On 07/14/2011 05:20 AM, JIA Pei wrote: Hi, all: Is there an automatic smart legend for R? Since my R code is running in a row, which will produce a bunch of R plots in a single run, some of the produced plots are really "ridiculous". Because my legend is fixed to "topleft", sometimes, which occludes the key parts of the figure/plots, but most of the time, the legend works just fine. I'm wondering is there a smart legend in R? Whenever I set "topleft" but occlude the actual plots, the smart legend may reset from "topleft" to "topright". Or, just try "topleft", "topright", "bottomleft" and "bottomright" in a particular sequence, and calculate the occlusion ratio. Pick up either the legend with the least occlusion, or the first priority legend when some legends are of the same occlusion? Hi JIA Pei, The "emptyspace" function in the plotrix package may be helpful. This tries to find the largest empty rectangle on a plot and returns the coordinates of the center of that rectangle. Jim __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
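A rough sketch of how that might be used (assuming plotrix is installed; unlist() is used defensively since the returned coordinates may be a vector or a list, and xjust/yjust centre the legend box on the returned point):

library(plotrix)
set.seed(42)
x <- rnorm(50); y <- rnorm(50)
plot(x, y, pch = 19)
es <- unlist(emptyspace(x, y))     # centre of the largest empty rectangle
legend(es[1], es[2], legend = "series A", pch = 19, xjust = 0.5, yjust = 0.5)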
[R] plot a vertical column of colored rectangles
Hi, I've been really struggling with this. If I have a vector like

dat <- c(0,0,0,0,1,1,1,0,0,0,1,1,0,0,0,1,0,0,0)

I want to plot each element as a colored rectangle (red = 1, blue = 0), in the right order, so they all stack up forming a vertical column on the graph. Sort of like a building, with each floor in the appropriate color. Any ideas? I've tried using ggplot and geom_tile, but my data has a million elements and the plots take forever to generate. I've also tried using a heatmap, but I need at least 2 columns, and I only have 1.

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
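One possible base-graphics approach, sketched below: treat the vector as a one-column image, which scales to very long vectors far better than drawing individual rectangles. The colour coding assumes red = 1 and blue = 0, and useRaster requires R >= 2.13.0.

dat <- c(0,0,0,0,1,1,1,0,0,0,1,1,0,0,0,1,0,0,0)   # swap in the real vector here
image(x = c(0, 1), y = 0:length(dat), z = matrix(dat, nrow = 1),
      col = c("blue", "red"), axes = FALSE, xlab = "", ylab = "Element",
      useRaster = TRUE)                           # fast raster drawing
axis(2)
box()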
Re: [R] Correct behavior of Hmisc::capitalize()?
On 14.07.2011 23:32, Henrik Bengtsson wrote: Hi, from example(capitalize) of the Hmisc package (v 0.8.3) you get: capitalize(c("Hello", "bob", "daN")) [1] "Hello" "Bob" "daN" Is that "daN" correct? If so, then this behavior that only *all lowercase strings*, which the code indicates, will be capitalized is not documented. Hmisc::capitalize function (string) { capped<- grep("^[^A-Z]*$", string, perl = TRUE) substr(string[capped], 1, 1)<- toupper(substr(string[capped], 1, 1)) return(string) } There are also some misspelled words in help("capitalize"). sessionInfo() R version 2.13.1 Patched (2011-07-09 r56344) Platform: x86_64-pc-mingw32/x64 (64-bit) locale: [1] LC_COLLATE=English_United States.1252 [2] LC_CTYPE=English_United States.1252 [3] LC_MONETARY=English_United States.1252 [4] LC_NUMERIC=C [5] LC_TIME=English_United States.1252 attached base packages: [1] splines stats graphics grDevices utils datasets methods [8] base other attached packages: [1] Hmisc_3.8-3 survival_2.36-9 loaded via a namespace (and not attached): [1] cluster_1.14.0 grid_2.13.1 lattice_0.19-30 tools_2.13.1 /Henrik (Hmisc maintainer cc:ed) I don't see you CCed. The Hmisc maintainer is the only one who can answer your message appropriately. Best, Uwe __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
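If the goal is simply to capitalize the first character regardless of the case of the rest (so "daN" becomes "DaN"), a one-line alternative outside Hmisc is:

cap1 <- function(x) sub("^(.)", "\\U\\1", x, perl = TRUE)
cap1(c("Hello", "bob", "daN"))
## [1] "Hello" "Bob"   "DaN"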
Re: [R] crr - computationally singular
Hi!

I guess that you have solved this error by now, but I figured I'd post the result of my 12 hour debugging session in case anyone else has the same issue. Let's start with a more intuitive example than the one crr offers:

# CODE START #
# Define a set size
my_set_size <- 1000

# Create the covariables
cov <- cbind(rbinom(my_set_size, 1, .5),
             rbinom(my_set_size, 1, .05),
             rbinom(my_set_size, 1, .1))
dimnames(cov)[[2]] <- c('gender', 'risk factor 1', 'risk factor 2')

# Create random time to failure/cens periods
ftime <- rexp(my_set_size)

# Create events
my_event1 <- rbinom(my_set_size, 1, .04)
my_event2 <- rbinom(my_set_size, 1, .20)

# The competing event can't happen if 1 has already occurred
my_event2[my_event1 > 0] <- 0
fstatus <- my_event1 + my_event2*2

# Factor the censor variable
fstatus <- factor(fstatus, levels=c(0,1,2),
                  labels=c("censored", "re-operation", "death"))

# Check that it seems Ok
table(fstatus)

# Do the test
test_results <- crr(ftime, fstatus, cov,
                    failcode="re-operation", cencode="censored")

# Output the results
summary(test_results)
# CODE END #

Ok, so the error occurs in the .Fortran call to "crrval" (I think it was called), which returns an empty variable if you forget to specify the failcode as a factor level, in other words if you exchange the crr call above for:

test_results <- crr(ftime, fstatus, cov, failcode=1, cencode="censored")

And you get:

Error in solve.default(v[[1]]) :
  Lapack routine dgesv: system is exactly singular

Another way to get a singular error is to have a covariate that is all 0. Try exchanging the covariates for this:

cov <- cbind(rbinom(my_set_size, 1, .5),
             rbinom(my_set_size, 1, .05),
             rbinom(my_set_size, 1, .1)*0)

And you get:

Error in drop(.Call("La_dgesv", a, as.matrix(b), tol, PACKAGE = "base")) :
  Lapack routine dgesv: system is exactly singular

This code has been checked with R 2.13.1 and cmprsk ver. 2.2.2.

I'm not so familiar with R, but I believe that this is actually a bug in the cmprsk package, which should check whether the variables are factors and then handle them as expected. I've noticed similar issues with the cuminc function, which doesn't behave as expected when provided with factored censoring variables. I haven't seen any issues with factoring the covariates, although I've used Scrucca's factor2ind function (http://www.stat.unipg.it/~luca/R/crr-addson.R) when I've had non-binomial factors.

I hope someone out there will be able to avoid my 12 hours of debugging with this post.

Max Gordon

-- View this message in context: http://r.789695.n4.nabble.com/crr-computationally-singular-tp891659p3669639.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to order each element according to alphabet
On Jul 15, 2011, at 12:23 AM, onthetopo wrote:

dd
     [,1] [,2]
[1,] "OP" "SU"
[2,] "XA" "YQ"

sapply(lapply(strsplit(dd, split=""), sort), paste, collapse="")
[1] "OP" "AX" "SU" "QY"

The result is not what I intended since it is a single line. It should be:
     [,1] [,2]
[1,] "OP" "SU"
[2,] "AX" "QY"

sortvec <- function(x) paste(sapply(strsplit(x, split=""), sort), collapse="")
apply(dd, 1:2, sortvec)
     [,1] [,2]
[1,] "OP" "SU"
[2,] "AX" "QY"

-- David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Drawing a histogram from a massive dataset
Dear All, I have a massive dataset from which I would like to draw a histogram. Any ideas on how to accomplish this? Thanks in advance, Paul __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Out of Sample Prediction Interval/Point Estimate
Hi All,

I have been requested to come up with an out-of-sample prediction interval and point estimate. I have never done this, and I am hoping for help from you all. First, can R do this? If so, what are the steps? What do I need? I have a data file that I can include, if that would help. I'm between a beginner and intermediate user of R, so if it is complicated, I may be asking for a lot of help.

Thanks so much,
Zeda

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
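If "out of sample" here means predicting new observations from a fitted regression model, then R can do this with predict() and interval = "prediction". A minimal sketch with invented data (the real model and data will of course differ):

set.seed(1)
d <- data.frame(x = 1:30)
d$y <- 3 + 2 * d$x + rnorm(30, sd = 4)
fit <- lm(y ~ x, data = d)

newdata <- data.frame(x = c(31, 35))            # out-of-sample x values
predict(fit, newdata, interval = "prediction")  # point estimate (fit) plus lwr/upr bounds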
Re: [R] Export Unicode characters from R
On 11-07-14 7:11 PM, Sverre Stausland wrote: > funny.g<- "\u1E21" > funny.g [1] "ḡ" > data.frame (funny.g) -> funny.g > funny.g$funny.g [1] ḡ Levels: I think the problem is in the data.frame code, not in writing. Data.frames try to display things in a readable way, and since you're on Windows where UTF-8 is not really supported, the code helpfully changes that character to the "" string. for display. You should be able to write the Unicode character to file if you use lower level methods such as cat(), on a connection opened using the file() function with the encoding set explicitly. Duncan Murdoch > write.table (funny.g, file = "C:/~funny.g.txt", col.names = FALSE, row.names = FALSE, quote = FALSE, fileEncoding = "UTF-8") __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
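A sketch of that suggestion: open the connection with an explicit encoding and write the character directly, bypassing the data.frame display code. writeLines() is used here; cat() on the same connection should behave similarly, and the file name is just an example.

funny.g <- "\u1E21"
con <- file("funny_g.txt", open = "w", encoding = "UTF-8")  # example file name
writeLines(funny.g, con)
close(con)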
Re: [R] searching and replacing in a data frame.
On Jul 15, 2011, at 5:20 AM, Ashim Kapoor wrote:

> Dear R helpers,
> Please have a look at the following. Note: My goal is to find and replace all Inf's in a data array with 0.
>
> t <- data.frame(A=c(Inf,0,0), B=c(1,2,3))
> t
>     A B
> 1 Inf 1
> 2   0 2
> 3   0 3
> str(t)
> 'data.frame':   3 obs. of  2 variables:
>  $ A: num Inf 0 0
>  $ B: num 1 2 3
> t[which(t==Inf, arr.ind=T)]
> [1] Inf

Several problems here. `t` is a perfectly good function name, so using it as an object name is confusing.

> t[which(t==Inf, arr.ind=T)] <- 0
> Error in `[<-.data.frame`(`*tmp*`, which(t == Inf, arr.ind = T), value = 0) :
>   only logical matrix subscripts are allowed in replacement
>
> Query: Why does the search work but the replace not work?

Because you gave a numeric matrix as an argument to `[<-.data.frame` and it wanted a different mode. I think it would have worked if `t` were a matrix.

> Many thanks for your time and efforts.

Two methods that would accomplish the task:

ttt <- data.frame(A=c(Inf,0,0), B=c(1,2,3))
ttt[is.infinite(as.matrix(ttt))] <- 0

Or:

apply(ttt, 1:2, function(x) { x[is.infinite(x)] <- 0; x })

> Ashim

David Winsemius, MD West Hartford, CT

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
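A third, column-wise option (not from the original reply) that keeps the object as a data frame throughout:

ttt <- data.frame(A = c(Inf, 0, 0), B = c(1, 2, 3))
ttt[] <- lapply(ttt, function(col) replace(col, is.infinite(col), 0))
ttt
##   A B
## 1 0 1
## 2 0 2
## 3 0 3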
Re: [R] fixed effects Tobit, Honore style?
A cleaner and slightly more tested version is at http://davidhughjones.blogspot.com/2011/07/honore-style-fixed-effects-estimators.html David Hugh-Jones Research Associate CAGE, Department of Economics University of Warwick http://davidhughjones.googlepages.com On 13 July 2011 15:33, David Hugh-Jones wrote: > True! Here's my attempt -- use at your own risk. > > honore <- function (b, dataset, x1, x2) { > dxb <- (x2 - x1) %*% b > y1 <- # insert your y variable here > y2 <- # insert your y variable here > sum( > (pmax(y1, dxb) - pmax(y2, dxb) - dxb)^2 + > 2*(y1 < dxb)*(dxb-y1)*y2 + > 2*(y2 < -dxb)* (-dxb-y2)*y1 > ) > } > > fetobit <- function (dataset, form) { > x2 <- model.matrix(form, dataset[,T=2]) > x1 <- model.matrix(form, dataset[,T=1]) ># could maybe set initial values to something different > res <- optim(rep(0, ncol(x1)), fn=honore, x1=x1, x2=x2, > dataset=dataset, method="BFGS", control=list(maxit=1000)) > if (res$convergence != 0) warning("Didn't converge") > res$par > } > > For standard errors, bootstrap. > > > David Hugh-Jones > Research Associate > CAGE, Department of Economics > University of Warwick > http://davidhughjones.googlepages.com > > > > On 12 July 2011 21:38, Daniel Malter wrote: > >> Not that I know of, but the paper says that they are easy to compute. If >> you >> did, you could contribute the code. >> >> Best, >> Daniel >> >> >> David Hugh-Jones-3 wrote: >> > >> > Hi all, >> > >> > Is there any code to run fixed effects Tobit models in the style of >> Honore >> > (1992) in R? >> > (The original Honore article is here: >> > >> http://www.jstor.org/sici?sici=0012-9682%28199205%2960%3A3%3C533%3ATLALSE%3E2.0.CO%3B2-2 >> ) >> > >> > Cheers >> > David >> > >> > [[alternative HTML version deleted]] >> > >> > __ >> > R-help@r-project.org mailing list >> > https://stat.ethz.ch/mailman/listinfo/r-help >> > PLEASE do read the posting guide >> > http://www.R-project.org/posting-guide.html >> > and provide commented, minimal, self-contained, reproducible code. >> > >> >> -- >> View this message in context: >> http://r.789695.n4.nabble.com/fixed-effects-Tobit-Honore-style-tp3662246p3663464.html >> Sent from the R help mailing list archive at Nabble.com. >> >> __ >> R-help@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Writing Complex Formulas
On Jul 14, 2011, at 20:19 , Duncan Murdoch wrote: > On 14/07/2011 12:46 PM, warmstron1 wrote: >> I resolved this issue. It appears that "^" won't work for this case, but >> "**" worked. I can't find any reference to this, but where "^" seems to be >> used to raise a value to a numerical function, "**" is used for a y raised >> to the power of x where x it a computation. > > Those should be equivalent. Can you post the code that wasn't working, and > describe what "not working" meant? More easily, demonstrate that code that _is_ working stops working if you replace "**" with "^". Or stop spreading misinformation! -- Peter Dalgaard Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
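For the record, the parser translates "**" into "^" before evaluation, so the two operators cannot differ in behaviour:

quote(y ** x)
## y^x
identical(quote(y ** x), quote(y ^ x))
## [1] TRUE
all((1:10) ** 2.5 == (1:10) ^ 2.5)
## [1] TRUE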
Re: [R] cbind in aggregate formula - based on an existing object (vector)
For a little lateral thinking, consider the use of "." on the LHS. That could play out as follows: > myvars <- c("Ozone","Wind") > f <- . ~ Month > j <- union(all.vars(f[[3]]), myvars) > aggregate(. ~ Month, data=airquality[j], mean, na.rm=T) MonthOzone Wind 1 5 23.61538 11.457692 2 6 29.4 12.18 3 7 59.11538 8.523077 4 8 59.96154 8.565385 5 9 31.44828 10.075862 (and of course, when you play with something unusual, a buglet pops up: it doesn't work with f instead of the explicit formula in the call to aggregate.) On Jul 15, 2011, at 00:10 , Dennis Murphy wrote: > Hi: > > I think Bill's got the right idea for your problem, but for the fun of > it, here's how Bert's suggestion would play out: > > # Kind of works, but only for the first variable in myvars... >> aggregate(get(myvars) ~ group + mydate, FUN = sum, data = example) > group mydate get(myvars) > 1 group1 2008-12-01 4 > 2 group2 2008-12-01 6 > 3 group1 2009-01-01 40 > 4 group2 2009-01-01 60 > 5 group1 2009-02-01 400 > 6 group2 2009-02-01 600 > > # Maybe sapply() with get as the function will work... >> aggregate(sapply(myvars, get) ~ group + mydate, FUN = sum, data = example) > group mydate myvars get > 1 group1 2008-12-01 4 4.2 > 2 group2 2008-12-01 6 6.2 > 3 group1 2009-01-01 40 40.2 > 4 group2 2009-01-01 60 60.2 > 5 group1 2009-02-01400 400.2 > 6 group2 2009-02-01600 600.2 > > Apart from the variable names, it matches example.agg1. OTOH, Bill's > suggestion matches example.agg1 exactly and has an advantage in terms > of code clarity: > > byVars <- c('group', 'mydate') >> aggregate(example[myvars], by = example[byVars], FUN = sum) > group mydate value1 value2 > 1 group1 2008-12-01 44.2 > 2 group2 2008-12-01 66.2 > 3 group1 2009-01-01 40 40.2 > 4 group2 2009-01-01 60 60.2 > 5 group1 2009-02-01400 400.2 > 6 group2 2009-02-01600 600.2 > > FWIW, > Dennis > > On Thu, Jul 14, 2011 at 12:05 PM, Dimitri Liakhovitski > wrote: >> Hello! >> >> I am aggregating using a formula in aggregate - of the type: >> aggregate(cbind(var1,var2,var3)~factor1+factor2,sum,data=mydata) >> >> However, I actually have an object (vector of my variables to be aggregated): >> myvars<-c("var1","var2","var3") >> >> I'd like my aggregate formula (its "cbind" part) to be able to use my >> "myvars" object. Is it possible? >> Thanks for your help! >> >> Dimitri >> >> Reproducible example: >> >> mydate = rep(seq(as.Date("2008-12-01"), length = 3, by = "month"),4) >> value1=c(1,10,100,2,20,200,3,30,300,4,40,400) >> value2=c(1.1,10.1,100.1,2.1,20.1,200.1,3.1,30.1,300.1,4.1,40.1,400.1) >> >> example<-data.frame(mydate=mydate,value1=value1,value2=value2) >> example$group<-c(rep("group1",3),rep("group2",3),rep("group1",3),rep("group2",3)) >> example$group<-as.factor(example$group) >> (example);str(example) >> >> example.agg1<-aggregate(cbind(value1,value2)~group+mydate,sum,data=example) >> # this works >> (example.agg1) >> >> ### Building my object (vector of 2 names - in reality, many more): >> myvars<-c("value1","value2") >> example.agg1<-aggregate(cbind(myvars)~group+mydate,sum,data=example) >> ### does not work >> >> >> -- >> Dimitri Liakhovitski >> Ninah Consulting >> www.ninah.com >> >> __ >> R-help@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. 
>> > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. -- Peter Dalgaard Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] cbind in aggregate formula - based on an existing object (vector)
THAT'S IT, Bill - exactly what I was looking for! Thanks a lot for the input, everyone. I find the "by" method the most straigtfoward and clear. Dimitri On Thu, Jul 14, 2011 at 5:12 PM, William Dunlap wrote: > You may find it easier to use the data.frame method for aggregate > instead of the formula method when you are using vectors of column > names. E.g., > > responseVars <- c("mpg", "wt") > byVars <- c("cyl", "gear") > aggregate(mtcars[responseVars], by=mtcars[byVars], FUN=median) > > gives the same result as > > aggregate(cbind(mpg, wt) ~ cyl + gear, FUN=median, data=mtcars) > > Bill Dunlap > Spotfire, TIBCO Software > wdunlap tibco.com > > -Original Message- > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On > Behalf Of Dimitri Liakhovitski > Sent: Thursday, July 14, 2011 1:45 PM > To: David Winsemius > Cc: r-help > Subject: Re: [R] cbind in aggregate formula - based on an existing object > (vector) > > Thanks a lot! > > actually, what I tried to do is very simple - just passing tons of > variable names into the formula. Maybe that "get" thing suggested by > Bert would work... > > Dimitri > > > On Thu, Jul 14, 2011 at 4:01 PM, David Winsemius > wrote: >> Dmitri: >> >> as.matrix makes a matrix out of the dataframe that is passed to it. >> >> As a further note I attempted and failed for reasons that are unclear to me >> to construct a formula that would (I hoped) preserve the column names which >> are being mangle in the posted effort: >> >> form <- as.formula(paste( >> "cbind(", >> paste( myvars, collapse=","), >> ") ~ group+mydate", >> sep=" ") ) >>> myvars<-c("value1","value2") >>> example.agg1<-aggregate(formula=form,data=example, FUN=sum) >> Error in m[[2L]][[2L]] : object of type 'symbol' is not subsettable >>> traceback() >> 2: aggregate.formula(formula = form, data = example, FUN = sum) >> 1: aggregate(formula = form, data = example, FUN = sum) >> >>> form >> cbind(value1, value2) ~ group + mydate >>> parse(text=form) >> expression(~ >> cbind(value1, value2), group + mydate) >> >> So it seems to be correctly dispatched to aggregate.formula but not passing >> some check or another. Also tried with formula() rather than as.formula with >> identical error message. Also tried including without naming the argument. >> >> -- >> David >> >> >> On Jul 14, 2011, at 3:32 PM, Dimitri Liakhovitski wrote: >> >>> Thank you, David, it does work. >>> Could you please explain why? What exactly does changing it to "as matrix" >>> do? >>> Thank you! >>> Dimitri >>> >>> On Thu, Jul 14, 2011 at 3:25 PM, David Winsemius >>> wrote: On Jul 14, 2011, at 3:05 PM, Dimitri Liakhovitski wrote: > Hello! > > I am aggregating using a formula in aggregate - of the type: > aggregate(cbind(var1,var2,var3)~factor1+factor2,sum,data=mydata) > > However, I actually have an object (vector of my variables to be > aggregated): > myvars<-c("var1","var2","var3") > > I'd like my aggregate formula (its "cbind" part) to be able to use my > "myvars" object. Is it possible? > Thanks for your help! 
> Not sure I have gotten all the way there, but this does work: example.agg1<-aggregate(as.matrix(example[myvars])~group+mydate,sum,data=example) > example.agg1 group mydate example[myvars] NA 1 group1 2008-12-01 4 4.2 2 group2 2008-12-01 6 6.2 3 group1 2009-01-01 40 40.2 4 group2 2009-01-01 60 60.2 5 group1 2009-02-01 400 400.2 6 group2 2009-02-01 600 600.2 > Dimitri > > Reproducible example: > > mydate = rep(seq(as.Date("2008-12-01"), length = 3, by = "month"),4) > value1=c(1,10,100,2,20,200,3,30,300,4,40,400) > value2=c(1.1,10.1,100.1,2.1,20.1,200.1,3.1,30.1,300.1,4.1,40.1,400.1) > > example<-data.frame(mydate=mydate,value1=value1,value2=value2) > > > example$group<-c(rep("group1",3),rep("group2",3),rep("group1",3),rep("group2",3)) > example$group<-as.factor(example$group) > (example);str(example) > > > > example.agg1<-aggregate(cbind(value1,value2)~group+mydate,sum,data=example) > # this works > (example.agg1) > > ### Building my object (vector of 2 names - in reality, many more): > myvars<-c("value1","value2") > example.agg1<-aggregate(cbind(myvars)~group+mydate,sum,data=example) > ### does not work > > > -- > Dimitri Liakhovitski > Ninah Consulting > www.ninah.com > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproduc
Re: [R] Very slow optim()
As I'm at least partly responsible for CG in optim, and packager of Rcgmin, I'll recommend the latter based on experience since it was introduced. I've so far seen no example where CG does better than Rcgmin, though I'm sure there are cases to be found. However, Ben is right that if ADMB does so well (it uses effectively analytic derivatives), then use it. Rcgmin really wants you to provide gradient code, and that is work. JN On 07/14/2011 06:00 AM, r-help-requ...@r-project.org wrote: > Message: 85 Date: Wed, 13 Jul 2011 20:20:47 + From: Ben Bolker > To: > Subject: Re: [R] Very slow optim() Message-ID: > Content-Type: text/plain; > charset="utf-8" > Hamazaki, Hamachan (DFG alaska.gov> writes: >> > >> > Dear list, >> > >> > I am using optim() function to MLE ~55 parameters, but it is very slow to > converge (~ 25 min), whereas I can do >> > the same in ~1 sec. using ADMB, and ~10 sec using MS EXCEL Solver. >> > >> > Are there any tricks to speed up? >> > >> > Are there better optimization functions? >> > > There's absolutely no way to tell without knowing more about your code. You > might try method="CG": > > Method ?"CG"? is a conjugate gradients method based on that by > Fletcher and Reeves (1964) (but with the option of Polak-Ribiere > or Beale-Sorenson updates). Conjugate gradient methods will > generally be more fragile than the BFGS method, but as they do not > store a matrix they may be successful in much larger optimization > problems. > > If ADMB works better, why not use it? You can use the R2admb > package (on R forge) to wrap your ADMB calls in R code, if you > prefer that workflow. > > Ben > > > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
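As a small illustration of supplying gradient code, here is the Rosenbrock test function with its analytic gradient passed to optim(method = "CG"); the commented Rcgmin call assumes its interface mirrors optim's fn/gr arguments.

fr <- function(p) 100 * (p[2] - p[1]^2)^2 + (1 - p[1])^2          # Rosenbrock function
grr <- function(p) c(-400 * p[1] * (p[2] - p[1]^2) - 2 * (1 - p[1]),
                      200 * (p[2] - p[1]^2))                       # analytic gradient

optim(c(-1.2, 1), fr, grr, method = "CG")$par

## library(Rcgmin)                  # if installed
## Rcgmin(c(-1.2, 1), fr, grr)$par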
Re: [R] Plotting survival curves from a Cox model with time dependent covariates
The time-transform (tt() arguments) feature is the most recent addition to coxph. Most of the follow-up functions, in particular survfit(fit) have not yet been updated to deal with such models. Your message points out that I need to at least update them to add a "not yet available" error message. I'm still learning what can be done with the tt() option. If I assume that your trmt variable is 0/1, then the code below is a clever way to look at time dependent treatment effects. I had not thought of it. Terry Therneau --- begin included message --- Let's assume I have a clinical trial with two treatments and a time to event outcome. I am trying to fit a Cox model with a time dependent treatment effect and then plot the predicted survival curve for one treatment (or both). library(survival) test <- list(time=runif(100,0,10),event=sample(0:1,100,replace=T),trmt=sample(0:1,100,replace=T)) model1 <- coxph(Surv(time, event) ~ tt(trmt), data=test, tt=function(x, t, ...) pspline(x + t)) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] cbind in aggregate formula - based on an existing object (vector)
On Jul 15, 2011, at 15:06 , peter dalgaard wrote: > For a little lateral thinking, consider the use of "." on the LHS. That could > play out as follows: > >> myvars <- c("Ozone","Wind") >> f <- . ~ Month >> j <- union(all.vars(f[[3]]), myvars) >> aggregate(. ~ Month, data=airquality[j], mean, na.rm=T) > MonthOzone Wind > 1 5 23.61538 11.457692 > 2 6 29.4 12.18 > 3 7 59.11538 8.523077 > 4 8 59.96154 8.565385 > 5 9 31.44828 10.075862 > > (and of course, when you play with something unusual, a buglet pops up: it > doesn't work with f instead of the explicit formula in the call to aggregate.) > ...however, once you go down that road, you might as well construct the LHS directly: > lhs <- as.call(lapply(c("cbind", myvars), as.name)) > eval(bquote(aggregate(.(lhs) ~ Month, data=airquality, mean, na.rm=T))) MonthOzone Wind 1 5 23.61538 11.457692 2 6 29.4 12.18 3 7 59.11538 8.523077 4 8 59.96154 8.565385 5 9 31.44828 10.075862 -- Peter Dalgaard Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Reorganize data fram
Hi, thanks for your reply. I didn't get cast() to work and didn't know how to find information about it either. I used reshape but then I had to subset only those columns (actually I have 28 columns of other data) Could cast or reshape work also with more columns? Angelica -- View this message in context: http://r.789695.n4.nabble.com/Reorganize-data-fram-tp3662123p3669899.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Reorganize data fram
Thank you! I used this one and it worked really great. /Angelica -- View this message in context: http://r.789695.n4.nabble.com/Reorganize-data-fram-tp3662123p3669782.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] use pcls to solve least square fitting with constraints
Hi, I need help with imposing constraints on GAM parameters, maybe through pcls.. I have a GAM model without intercept with several strictly parametric and smooth parameters. I need to set a linear constraint such that sum of parametric coefficients and first derivatives of the smoothes is equal to 1. I saw examples with monotonicity and inequality constraints, but can't figure out how to adapt them for my case.. appreciate any help. Rgrds, -- View this message in context: http://r.789695.n4.nabble.com/use-pcls-to-solve-least-square-fitting-with-constraints-tp3074869p3669806.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] grey colored lines and overwriting labels i qqplot2
Okay, seems like ddply is not the right method to add my model. That is okay, though. I already calculated the slopes and intercepts fore each for the treatments and country. How can I add those 14 lines? -- View this message in context: http://r.789695.n4.nabble.com/grey-colored-lines-and-overwriting-labels-i-qqplot2-tp3657119p3669823.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
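One way to draw pre-computed lines in ggplot2 is to put the intercepts and slopes in a small data frame (one row per treatment-by-country combination) and hand it to geom_abline(). The variable names and numbers below are made up; substitute your own 14 rows of coefficients:

library(ggplot2)
set.seed(1)
dat <- data.frame(country = rep(c("NO", "SE"), each = 20),
                  treatment = rep(c("A", "B"), times = 20),
                  x = runif(40, 0, 10))
dat$y <- 2 + 0.5 * dat$x + rnorm(40)

# pre-computed intercepts and slopes, one row per treatment within country
coefs <- data.frame(country   = rep(c("NO", "SE"), each = 2),
                    treatment = rep(c("A", "B"), 2),
                    intercept = c(1.8, 2.2, 2.0, 1.9),
                    slope     = c(0.45, 0.55, 0.50, 0.48))

ggplot(dat, aes(x, y, colour = treatment)) +
  geom_point() +
  geom_abline(data = coefs,
              aes(intercept = intercept, slope = slope, colour = treatment)) +
  facet_wrap(~ country)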
[R] Querying RData Files, SQL style?
Hello. Is there a package or functionality available somewhere which will allow for complex searches (such as what SQL can do) of collections of RData files? Search capability within a given RData file at a time (which could be put in a loop) would be good, but the capability to perform joins to data across multiple RData files would be great. These queries might be ad-hoc, so writing an R program to get(load(...)) each file and customize the search in home-grown R code isn't feasible. This shouldn't be dependent on environment details, but just in case: I'm running version 2.13.0 in a Unix environment (but could easily run in Windows too). Thanks very much. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Querying RData Files, SQL style?
http://cran.r-project.org/web/packages/RMySQL/ On 15.07.2011 17:29, Bhushan, Vipul wrote: Hello. Is there a package or functionality available somewhere which will allow for complex searches (such as what SQL can do) of collections of RData files? Search capability within a given RData file at a time (which could be put in a loop) would be good, but the capability to perform joins to data across multiple RData files would be great. These queries might be ad-hoc, so writing an R program to get(load(...)) each file and customize the search in home-grown R code isn't feasible. This shouldn't be dependent on environment details, but just in case: I'm running version 2.13.0 in a Unix environment (but could easily run in Windows too). Thanks very much. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
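If the data frames do get loaded into R, the sqldf package lets you write the joins in SQL; a sketch in which the file names, data frame names and columns are all hypothetical:

library(sqldf)
load("file1.RData")    # hypothetical file containing a data frame df1
load("file2.RData")    # hypothetical file containing a data frame df2

sqldf("SELECT a.id, a.x, b.y
         FROM df1 a
         JOIN df2 b ON a.id = b.id")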
Re: [R] WLS regression, lm() with weights as a matrix
Dear All,

Now I am thinking to use a for loop:

for (i in 1:200) {
  Results <- lm(R[,i] ~ F, weights=W[,i])
}

The thing is, I can get WLS regression coefficients and residuals for each company, each with a unique weight, but I am wondering how to easily combine all coefficients and residuals for ALL companies? Any suggestions would be greatly appreciated.

-- View this message in context: http://r.789695.n4.nabble.com/WLS-regression-lm-with-weights-as-a-matrix-tp3668577p3670176.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
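One possible way to keep all 200 fits and extract the pieces afterwards, assuming R and W are matrices with one column per company and F is the common regressor, as in the loop above (residuals only bind into a matrix cleanly if every fit keeps the same number of observations):

fits <- lapply(seq_len(ncol(R)),
               function(i) lm(R[, i] ~ F, weights = W[, i]))

all_coefs  <- sapply(fits, coef)         # one column of coefficients per company
all_resids <- sapply(fits, residuals)    # one column of residuals per company
colnames(all_coefs) <- colnames(all_resids) <- colnames(R)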
Re: [R] Drawing a histogram from a massive dataset
Hi:

I would suggest that you avoid the histogram and make a density plot instead. It would be more informative and probably require a lot less time and ink. If you're married to the histogram concept, try taking a random sample of about 1,000 observations and get a histogram of that instead. The result shouldn't be much different from that of the entire sample - to test out this hypothesis, take several random samples of size 1,000 and compare the histograms. If they're not much different in shape, it's likely that the full sample is close to the same. If there are noticeable differences, try 5,000 or 10,000 instead (rinse and repeat).

HTH,
Dennis

On Fri, Jul 15, 2011 at 4:21 AM, Paul Smith wrote:
> Dear All,
> I have a massive dataset from which I would like to draw a histogram.
> Any ideas on how to accomplish this?
> Thanks in advance,
> Paul
> __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
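In code, with x standing in for the full (large) vector and the sample size following the suggestion above:

x <- rnorm(1e6)                   # placeholder for the real data vector

plot(density(x))                  # density estimate of the whole data set

idx <- sample(length(x), 1000)    # histogram of a random subsample
hist(x[idx], breaks = 50)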
[R] Drawing a histogram from a massive dataset
Hello,

I assume you have imported the dataset. You can use hist() from the graphics package in base R. A tricky part is that freq=TRUE (the default) plots frequencies and freq=FALSE plots probability densities, not the percent in each histogram cell. You can sum the counts and calculate the percent before plotting:

hist1 <- hist(varname, plot=FALSE)
sum <- sum(hist1$counts)
hist1$counts <- hist1$counts/sum*100
plot(hist1,
     main=paste("Histogram of", deparse(substitute(varname))),
     xlab=deparse(substitute(varname)),
     ylab="Percent")

Also, if you are new to R, there are very useful manuals and guides at http://cran.r-project.org/manuals.html . You can look up documentation in R, such as the ?hist command for documentation for the hist function.

Regards,
Kyaw Sint (Joe)

> Dear All,
> I have a massive dataset from which I would like to draw a histogram.
> Any ideas on how to accomplish this?
> Thanks in advance,
> Paul

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Querying RData Files, SQL style?
On Jul 15, 2011, at 10:29 AM, Bhushan, Vipul wrote:

> Hello. Is there a package or functionality available somewhere which will allow for complex searches (such as what SQL can do) of collections of RData files? Search capability within a given RData file at a time (which could be put in a loop) would be good, but the capability to perform joins to data across multiple RData files would be great. These queries might be ad-hoc, so writing an R program to get(load(...)) each file and customize the search in home-grown R code isn't feasible.

As I read the question, it appears that you are not expecting to load the data into R and are rather asking for a program other than R (or Rscript or littler) to read .RData files and perform database joins. As I understand it, that is not available. As I understand it, there is not even a package that can look at .RData files for their object names and structure without actually loading them. Hoping to be corrected on either of these points.

> This shouldn't be dependent on environment details, but just in case: I'm running version 2.13.0 in a Unix environment (but could easily run in Windows too). Thanks very much.

-- David Winsemius, MD West Hartford, CT

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Writing Complex Formulas
Forgive me. I had a legitimate problem that I found resolvable using "**" instead of "^". I can't seem to recreate the problem to obtain the error message that I was receiving. "Incomplete information" is perhaps more appropriate than "*mis*information." Here is the exact code I used (still not correct to my original question, but a step along the way): > J <- 3 > r_ <- 1.959 > q_ <- 1.45 > scale_ <- 0.3 > N <- 4 > fq <- seq(0, N-1, 1) > fq [1] 0 1 2 3 > center_frequencies <- function(J = 3, r_ = 1.959, q_ = 1.45, scale_ = 0.3){ + j <- seq(0,J-1,1) + fc <- (q_ + j)^r_/scale_ + } > fc <- center_frequencies(3,r_,q_,scale_) > fc [1] 6.902377 19.286575 37.710853 > cf <- t(fc) > cf [,1] [,2] [,3] [1,] 6.902377 19.28657 37.71085 > lambda <- function(cf, J = 3, scale_ = 0.3){ + B <- cf*scale_ + } > B <- lambda(cf, 3, 0.3) > B [,1] [,2] [,3] [1,] 2.070713 5.785972 11.31326 > Fc <- 1/cf > Fc [,1] [,2] [,3] [1,] 0.1448776 0.05184954 0.02651757 > dummy <- fq%*%Fc > dummy [,1] [,2] [,3] [1,] 0.000 0. 0. [2,] 0.1448776 0.05184954 0.02651757 [3,] 0.2897553 0.10369908 0.05303513 [4,] 0.4346329 0.15554862 0.07955270 > U <- -dummy+1 > for(j in 1:J) + { + Z <- dummy**B[j] + U <- (-dummy+1)**B[j] + } > Z [,1] [,2] [,3] [1,] 0.00e+00 0.00e+00 0.00e+00 [2,] 3.222504e-10 2.881245e-15 1.462288e-18 [3,] 8.200170e-07 7.331782e-12 3.721022e-15 [4,] 8.053568e-05 7.200705e-10 3.654498e-13 > U [,1] [,2] [,3] [1,] 1.0 1.000 1.000 [2,] 0.170223061 0.5475283 0.7378244 [3,] 0.020842075 0.2897999 0.5398325 [4,] 0.001577800 0.1476795 0.3914810 > for(i in 1:4) + { + for(j in 1:3){ + U[i,j]<-ifelse( U[i,j]>30,30,U[i,j]) + } + U <- exp(U) > W <- Z*U > U [,1] [,2] [,3] [1,] 2.718282 2.718282 2.718282 [2,] 1.185569 1.728974 2.091381 [3,] 1.021061 1.336160 1.715719 [4,] 1.001579 1.159141 1.479170 > W [,1] [,2] [,3] [1,] 0.00e+00 0.00e+00 0.00e+00 [2,] 3.820502e-10 4.981599e-15 3.058201e-18 [3,] 8.372872e-07 9.796435e-12 6.384229e-15 [4,] 8.066285e-05 8.346635e-10 5.405623e-13 > I can now get W using: W <- as.complex(((fq/cf[j])^B[j])*(exp(-(fq/cf[j])+1)^B[j])) where j is the index for cf (i.e., each center frequency is run individually and written to a table). Still not the most efficient way to accomplish this step, but it is working for me. Jeff On Fri, Jul 15, 2011 at 5:34 AM, peter dalgaard wrote: > > On Jul 14, 2011, at 20:19 , Duncan Murdoch wrote: > > > On 14/07/2011 12:46 PM, warmstron1 wrote: > >> I resolved this issue. It appears that "^" won't work for this case, > but > >> "**" worked. I can't find any reference to this, but where "^" seems to > be > >> used to raise a value to a numerical function, "**" is used for a y > raised > >> to the power of x where x it a computation. > > > > Those should be equivalent. Can you post the code that wasn't working, > and describe what "not working" meant? > > More easily, demonstrate that code that _is_ working stops working if you > replace "**" with "^". Or stop spreading misinformation! > > -- > Peter Dalgaard > Center for Statistics, Copenhagen Business School > Solbjerg Plads 3, 2000 Frederiksberg, Denmark > Phone: (+45)38153501 > Email: pd@cbs.dk Priv: pda...@gmail.com > > -- *W. Jeffrey Armstrong, Ph.D. 
*Assistant Professor Exercise Science *Managing Editor Clinical Kinesiology* Official Journal of the American Kinesiotherapy Association [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Reorganize data fram
Hi: On Fri, Jul 15, 2011 at 6:04 AM, anglor wrote: > Hi, thanks for your reply. > > I didn't get cast() to work and didn't know how to find information about it > either. Hadley Wickham's home page http://had.co.nz/ has a link (last one under the heading 'R Packages') to the reshape package page, where some documentation resides. Could you post a representative sample of your input data frame using dput()? Here's an example: df <- data.frame(x = 1:4, y = rnorm(4), z = rpois(4, 3)) > dput(df) structure(list(x = 1:4, y = c(-0.49491054748322, -1.53013240418216, 0.0189088048735591, -0.0766510981813545), z = c(2, 2, 3, 4)), .Names = c("x", "y", "z"), row.names = c(NA, -4L), class = "data.frame") Copy and paste the result of dput() into your e-mail. This is the preferred way to transport data that is readable on all platforms while guaranteeing that a potential R-helper sees the same data structure you do. Clearly, you don't want to send 700,000 observations with dput(), but a small sample that is sufficient to illustrate the problem is desirable. If possible, also send the code that you tried and the expected result, as you did in your initial post. I used reshape but then I had to subset only those columns (actually > I have 28 columns of other data) Could cast or reshape work also with more > columns? Are these columns 'constant' apart from Temperature? If so, then the following should work, but this needs the 'new and improved' reshape2 package instead. I'm using the same data frame d as before with a couple added 'constant' variables: d$age <- 12 d$region <- 'NW' d$zone <- 'CET' d Date Temperature Category age region zone 1 2007102 16A 12 NW CET 2 2007102 17B 12 NW CET 3 2007102 18C 12 NW CET library(reshape2) dcast(d, ... ~ Category, value_var = 'Temperature') Date age region zone A B C 1 2007102 12 NW CET 16 17 18 If they're not (all) constant, then you need to post some data per above and describe your desired outcome. HTH, Dennis > > Angelica > > > > -- > View this message in context: > http://r.789695.n4.nabble.com/Reorganize-data-fram-tp3662123p3669899.html > Sent from the R help mailing list archive at Nabble.com. > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Problem in installing rJava from source
Hi all, I was trying to install rJava package (some older version) from source. However could not achieve using "Rcmd build -binary rJava" syntax with windows cmd. The building process stopped with following error: ERROR*> JavaSoft\{JRE|JDK} can't open registry keys. ERROR: cannot find Java Development Kit. Please set JAVA_HOME to specify it's location normally ERROR: configuration failed for package 'rJava' With this error it seems that, I need to install some additional tool(s), however I have Duncan's Rtools installed. Can somebody through some light on this issue, what I should do with this error? Thanks, [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problem in installing rJava from source
Hi, rJava depends on having appropriate version of Java installed. You can download the JDK from oracle for free at their website. It should automatically set the appropriate environment variables, but if you are having difficulty with that still, you may need to set JAVA_HOME to the directory where you installed Java. Cheers, Josh P.S. Duncan Murdoch's Rtools gives you what you need to build R and many packages but not all third party software a particular package may depend on. On Fri, Jul 15, 2011 at 9:40 AM, Bogaso Christofer wrote: > Hi all, I was trying to install rJava package (some older version) from > source. However could not achieve using "Rcmd build -binary rJava" syntax > with windows cmd. The building process stopped with following error: > > > > ERROR*> JavaSoft\{JRE|JDK} can't open registry keys. > > ERROR: cannot find Java Development Kit. > > Please set JAVA_HOME to specify it's location normally > > ERROR: configuration failed for package 'rJava' > > > > With this error it seems that, I need to install some additional tool(s), > however I have Duncan's Rtools installed. > > > > Can somebody through some light on this issue, what I should do with this > error? > > > > Thanks, > > > > > [[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles https://joshuawiley.com/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] grey colored lines and overwriting labels i qqplot2
Hi: What did you do and what do you mean by 'add[ing] those 14 lines'? A reproducible example would be helpful. I've used plyr successfully to get model coefficients before, so I'm interested in what you mean by 'ddply is not the right method to add my model.' Here's a toy reproducible example to counter your claim: library('plyr') set.seed(1036) df <- data.frame(gp = rep(1:5, each = 10), x = 1:10, y = 1.5 + 2 * rep(1:10, 5) + rnorm(50)) # function to generate the model coefficients for a generic data frame lmfun <- function(d) coef(lm(y ~ x, data = d)) # Apply the function to each sub-data frame associated with groups: ddply(df, .(gp), lmfun) gp (Intercept)x 1 1 1.2481481 2.011974 2 2 1.3125070 1.977223 3 3 0.5988811 2.212524 4 4 0.8575467 2.075925 5 5 2.1428869 1.903015 Internally, ddply() splits df into five sub-data frames corresponding to each level of gp. The function lmfun() is applied to each sub-data frame. Notice that the function argument is a data frame (observe that data = d inside lm()). It is often advantageous to run lm() by group, exporting the output to a list of lists (since the output from lm() is a list), from which plyr can use the ldply() function to pick off pieces of output from each group. I've done this several times before in this forum, so I'm not going to repeat it here. If you post what you tried that didn't work, perhaps I or someone else can get it to work for you. As mentioned above, reproducible code and data (with dput()) is ideal. Dennis On Fri, Jul 15, 2011 at 5:26 AM, Sigrid wrote: > Okay, seems like ddply is not the right method to add my model. That is okay, > though. I already calculated the slopes and intercepts fore each for the > treatments and country. How can I add those 14 lines? > > -- > View this message in context: > http://r.789695.n4.nabble.com/grey-colored-lines-and-overwriting-labels-i-qqplot2-tp3657119p3669823.html > Sent from the R help mailing list archive at Nabble.com. > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
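A sketch of the list-of-models route mentioned above, reusing the toy data frame df and plyr: dlply() returns one fitted lm per group, and ldply() then picks off whatever piece of each fit you need.

fits <- dlply(df, .(gp), function(d) lm(y ~ x, data = d))  # one lm per level of gp
ldply(fits, coef)                                          # per-group coefficients
ldply(fits, function(m) summary(m)$sigma)                  # e.g. per-group residual SE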
[R] Odd behaviour of as.POSIXct
Dear all, how come the first loop in the below fails, but the second performs as expected? days <- as.Date( c("2000-01-01", "2000-01-02") ) for(day in days) { as.POSIXct(day) } for( n in 1:length(days) ) { show(as.POSIXct(days[n])) } Many thanks, Jo [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] grey colored lines and overwriting labels i qqplot2
> You should only have one scale_ call for each scale type. Here, you have > three scale_colour_ calls, the first selecting a grey scale, the second > defining a single break with its label (and thus implicitly subsetting on > that single break value), and a second which defines a different > break/label/subset. Only the last one has any effect. Just to clarify: breaks/labels control the appearance of the legend/axis, limits modify what data is shown on the plot. Hadley -- Assistant Professor / Dobelman Family Junior Chair Department of Statistics / Rice University http://had.co.nz/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problem in installing rJava from source
If you cannot build a java program outside R then you won't be able to do so inside R. Find a Java development resource (the JDK is one such) and get command-line ability to compile java enabled, and then come back to interfacing R with Java. --- Jeff Newmiller The . . Go Live... DCN: Basics: ##.#. ##.#. Live Go... Live: OO#.. Dead: OO#.. Playing Research Engineer (Solar/Batteries O.O#. #.O#. with /Software/Embedded Controllers) .OO#. .OO#. rocks...1k --- Sent from my phone. Please excuse my brevity. Bogaso Christofer wrote: Hi all, I was trying to install rJava package (some older version) from source. However could not achieve using "Rcmd build -binary rJava" syntax with windows cmd. The building process stopped with following error: ERROR*> JavaSoft\{JRE|JDK} can't open registry keys. ERROR: cannot find Java Development Kit. Please set JAVA_HOME to specify it's location normally ERROR: configuration failed for package 'rJava' With this error it seems that, I need to install some additional tool(s), however I have Duncan's Rtools installed. Can somebody through some light on this issue, what I should do with this error? Thanks, [[alternative HTML version deleted]] _ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Odd behaviour of as.POSIXct
day doesn't exist? That would be the 1st problem. Johannes Egner wrote: > > Dear all, > > how come the first loop in the below fails, but the second performs as > expected? > > days <- as.Date( c("2000-01-01", "2000-01-02") ) > > for(day in days) > { > as.POSIXct(day) > } > > for( n in 1:length(days) ) > { > show(as.POSIXct(days[n])) > } > > Many thanks, Jo > > [[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- View this message in context: http://r.789695.n4.nabble.com/Odd-behaviour-of-as-POSIXct-tp3670414p3670454.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] turning coefficients into an lm obect
I'm working with a dataset and fitting and comparing various lms. I also have a fitted model parameter values and SE estimated from the literature. In doing my comparison, I'd like to turn these estimates into an lm object itself for ease of use with some of the code I'm writing. While putting in the coefficients is a simple matter - just take a fitted model object and change the values of the mylm$coefficients, for example, it is not transparent to me how I could incorporate the parameter variance and, say, the unexplained variance in the previous fit. Although, thinking about it further, the unexplained variance is specific to that dataset - so, I shouldn't have to worry about that. But how can I incorporate known variance in the parameter estimates? Thanks! -Jarrett __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Using str() in a function.
Thanks, everybody, this has been very edifying. One last question: It seems that sometimes when a function returns something and you don't assign it, it prints to the console, and sometimes it doesn't. I'm not sure I understand which is which. My best current theory is that, if the function returns NULL, by itself and not as part of some larger object, it does not print it, but non-null values are printed. Is that correct? Thanks! Andrew -- View this message in context: http://r.789695.n4.nabble.com/Using-str-in-a-function-tp3655785p3670513.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Error Message Help: Differing Number of Rows
Hello all, I'm relatively new to "R" and programming in general - I had previously used MATLAB, but decided to make the transition to R, as the computational times are much better! Anyway, I'm trying to use R to run a gamma distribution model to estimate mean transit times of water moving through a hydrological catchment. My inputs are 3 .txt files, as follows:

Precipitation_18o: 2 columns - column 1 is a date (Excel number format) and column 2 is an isotope ratio (e.g. -8.12)
Runoff_18o: same as above
Daily_Precip: 2 columns - column 1 is the same date format but column 2 is a weekly bulk precipitation value (e.g. 10mm)

When running the script, I keep getting the following error message:

Error in data.frame(cQ.ou[ind.mea], cQ[cal.cQ:nrow(cQ), 2]) : arguments imply differing number of rows: 42, 44

Now, I know it's not the script, as it runs perfectly for one site but not the other; but having read previous threads on other forums, they suggest that there isn't the same number of values in all 3 input files. A quick use of str(FILENAME) confirms that all 3 input files have 322 entries with 2 variables. I'd be grateful for ANY help, as this is really hampering my research progress right now! Cheers, Scott_M -- View this message in context: http://r.789695.n4.nabble.com/Error-Message-Help-Differing-Number-of-Rows-tp3670451p3670451.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
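Before anything else, it may help to check the lengths of the two arguments named in the error message directly (the object names below are taken from that message and are assumed to exist in your script):

length(cQ.ou[ind.mea])          # 42 in the reported error
length(cQ[cal.cQ:nrow(cQ), 2])  # 44 -- these must be equal before data.frame() will accept them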
Re: [R] Using str() in a function.
On Jul 15, 2011, at 1:31 PM, andrewH wrote: Thanks, everybody, this has been very edifying. One last question: It seems that sometimes when a function returns something and you don't assign it, it prints to the console, and sometimes it doesn't. I'm not sure I understand which is which. My best current theory is that, if the function returns NULL, by itself and not as part of some larger object, it does not print it, but non-null values are printed. Is that correct? I think you should start testing your theories: fn <- function() return(NULL) fn() -- David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Using str() in a function.
Below. -- Bert On Fri, Jul 15, 2011 at 10:31 AM, andrewH wrote: > Thanks, everybody, this has been very edifying. One last question: > > It seems that sometimes when a function returns something and you don't > assign it, it prints to the console, and sometimes it doesn't. I'm not sure > I understand which is which. My best current theory is that, if the function > returns NULL, by itself and not as part of some larger object, it does not > print it, but non-null values are printed. Is that correct? -- No. It depends on whether the function uses invisible() in the return, ?invisible If invisible() is not used and the value is not assigned, it's printed. Otherwise not.cf: f <- function()NULL g <- function()invisible(NULL) f() ## NULL is printed g() ## nothing printed z1 <- f() ## nothing printed z2 <- g() ## nothing printed z1 ## NULL z2 ##NULL Cheers, Bert > > Thanks! Andrew > > > -- > View this message in context: > http://r.789695.n4.nabble.com/Using-str-in-a-function-tp3655785p3670513.html > Sent from the R help mailing list archive at Nabble.com. > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- "Men by nature long to get on to the ultimate truths, and will often be impatient with elementary studies or fight shy of them. If it were possible to reach the ultimate truths without the elementary studies usually prefixed to them, these would not be preparatory studies but superfluous diversions." -- Maimonides (1135-1204) Bert Gunter Genentech Nonclinical Biostatistics __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Adding rows based on column value
Dear all, I have one problem and did not find any solution. I have attached the question in a text file as well, because sometimes the spacing is not good in mail. I have a file (file.txt) attached to this mail. I am reading it using this code to make a data frame (file)-

file=read.table("file.txt",fill=T,colClasses = "character",header=T)

file looks like this-

Chr  Pos        CaseA  CaseC  CaseG  CaseT
10   135344110   0.00  24.00   0.00   0.00
10   135344110   0.00   0.00  24.00   0.00
10   135344110   0.00   0.00  24.00   0.00
10   135344113   0.00   0.00  24.00   0.00
10   135344114  24.00   0.00   0.00   0.00
10   135344114  24.00   0.00   0.00   0.00
10   135344116   0.00   0.00   0.00  24.00
10   135344118   0.00  24.00   0.00   0.00
10   135344118   0.00   0.00   0.00  24.00
10   135344122  24.00   0.00   0.00   0.00
10   135344122   0.00  24.00   0.00   0.00
10   135344123   0.00  24.00   0.00   0.00
10   135344123   0.00  24.00   0.00   0.00
10   135344123   0.00   0.00   0.00  24.00
10   135344126   0.00   0.00  24.00   0.00

Now some of the values in the column Pos are the same. For these same positions I want to add the values of columns 3:6. I will explain with an example- the output for the first position should be-

Chr  Pos        CaseA  CaseC  CaseG  CaseT
10   135344110   0.00  24.00  48.00   0.00

because the first three rows have the same value in the Pos column. So the whole output for the above input should be-

Chr  Pos        CaseA  CaseC  CaseG  CaseT
10   135344110   0.00  24.00  48.00   0.00
10   135344113   0.00   0.00  24.00   0.00
10   135344114  48.00   0.00   0.00   0.00
10   135344116   0.00   0.00   0.00  24.00
10   135344118   0.00  24.00   0.00  24.00
10   135344122  24.00  24.00   0.00   0.00
10   135344123   0.00  48.00   0.00  24.00
10   135344126   0.00   0.00  24.00   0.00

Can you please help me.

Thanking you, Warm Regards Vikas Bansal Msc Bioinformatics Kings College London
Re: [R] Odd behaviour of as.POSIXct
On 15/07/2011 12:15 PM, Johannes Egner wrote: Dear all, how come the first loop in the below fails, but the second performs as expected? days<- as.Date( c("2000-01-01", "2000-01-02") ) for(day in days) { as.POSIXct(day) } "day" in the loop above is an integer without a class, it's not a Date. If you did for (day in days) { class(day) <- class(days) print(as.POSIXct(day)) } you won't get an error. (I don't know if you'll be happy with what you get; the time zone is an issue.) Duncan Murdoch for( n in 1:length(days) ) { show(as.POSIXct(days[n])) } Many thanks, Jo [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] transforming year.weeknumber into dates
Hello! I know how to transform dates into year.weeknumber format using zoo: library(zoo) as.numeric(format(as.Date("2010-10-02"), "%Y.%W")) But is there a straightforward way to do the opposite - to transform character strings like "2009.12" or "2009.30" back into dates (assuming that weeks start on Monday)? Thanks a lot! -- Dimitri Liakhovitski __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Export Unicode characters from R
On 15/07/2011 1:42 PM, Sverre Stausland wrote: >>> >>> >funny.g<- "\u1E21" >>> >funny.g >> >> [1] "ḡ" >> >>> >data.frame (funny.g) ->funny.g >>> >funny.g$funny.g >> >> [1] ḡ >> Levels: > > I think the problem is in the data.frame code, not in writing. Data.frames > try to display things in a readable way, and since you're on Windows where > UTF-8 is not really supported, the code helpfully changes that character to > the "" string. for display. I thought the data.frame function didn't alter the unicode coding, since funny.g$funny.g above still displays the right unicode character (although it does list the levels as). > You should be able to write the Unicode character to file if you use lower > level methods such as cat(), on a connection opened using the file() > function with the encoding set explicitly. I'm sorry, but I don't understand what it means "to use cat() on a connection opened using the file() function". Could you please clarify that? Sorry, I think my suggestion was wrong. What I meant was something like file <- file("your filename", encoding="UTF-8") cat("\u1E21", file= file) but this doesn't appear to work. Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Using str() in a function.
On 15/07/2011 1:44 PM, Bert Gunter wrote: Below. -- Bert On Fri, Jul 15, 2011 at 10:31 AM, andrewH wrote: > Thanks, everybody, this has been very edifying. One last question: > > It seems that sometimes when a function returns something and you don't > assign it, it prints to the console, and sometimes it doesn't. I'm not sure > I understand which is which. My best current theory is that, if the function > returns NULL, by itself and not as part of some larger object, it does not > print it, but non-null values are printed. Is that correct? -- No. It depends on whether the function uses invisible() in the return, ?invisible If invisible() is not used and the value is not assigned, it's printed. Otherwise not.cf: f<- function()NULL g<- function()invisible(NULL) f() ## NULL is printed g() ## nothing printed z1<- f() ## nothing printed z2<- g() ## nothing printed z1 ## NULL z2 ##NULL Right. And what invisible() does is set a flag so that the console is told "don't print this". You can see the flag if you use the withVisible() function. For example, with Bert's definitions, > withVisible(f()) $value NULL $visible [1] TRUE > withVisible(g()) $value NULL $visible [1] FALSE Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Querying RData Files, SQL style?
Hi: If you load the data into R, there is a package called sqldf that allows one to apply SQL syntax to an R data frame. Is that what you had in mind? If so, Google 'sqldf R' and you should get a pointer to its home page. HTH, Dennis On Fri, Jul 15, 2011 at 7:29 AM, Bhushan, Vipul wrote: > Hello. Is there a package or functionality available somewhere which will > allow for complex searches (such as what SQL can do) of collections of RData > files? Search capability within a given RData file at a time (which could be > put in a loop) would be good, but the capability to perform joins to data > across multiple RData files would be great. These queries might be ad-hoc, so > writing an R program to get(load(...)) each file and customize the search in > home-grown R code isn't feasible. > > This shouldn't be dependent on environment details, but just in case: I'm > running version 2.13.0 in a Unix environment (but could easily run in Windows > too). > > Thanks very much. > > > [[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
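A minimal sqldf sketch, assuming the .RData files have already been load()ed into the session and contain data frames; the object and column names below (df1, df2, id, x, y) are purely illustrative:

library(sqldf)
sqldf("select a.id, a.x, b.y
       from df1 a inner join df2 b on a.id = b.id
       where a.x > 10")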
Re: [R] Writing Complex Formulas
warmstron1 wrote: > >> for(j in 1:J) > + { > + Z <- dummy**B[j] > + U <- (-dummy+1)**B[j] > + } >> Z > I replaced ** with ^ and got the same results as you. But why are you doing a for loop here? At each iteration you are overwriting the previous results of Z and U and retaining only the values obtained for j=J. You could just as well do Z <- dummy^B[J]# J and not j U <- (-dummy+1)^B[J] # same I got the same results as you. >> for(i in 1:4) > + { > + for(j in 1:3){ > + U[i,j]<-ifelse( U[i,j]>30,30,U[i,j]) > + } > + U <- exp(U) > The second closing } seems to be missing. Why are you using ifelse elementwisely? It is a vectorized function. This is equivalent to what you are doing U <- ifelse(U>30,30,U) and still gives the same results. Berend -- View this message in context: http://r.789695.n4.nabble.com/Writing-Complex-Formulas-tp3638379p3670624.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
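A slightly more direct way to write the capping step described above, as a sketch (pmin() takes element-wise minima, so it has the same effect as the vectorised ifelse()):

U <- pmin(U, 30)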
Re: [R] Adding rows based on column value
Hi: This seems to work: library(plyr) # select the variables to summarize: vars <- paste('Case', c('A', 'C', 'G', 'T'), sep = '') # Alternatively, # vars <- names(df)[grep('Case', names(df))] # One way: the ddply() function in package plyr in # conjunction with the colwise() function > ddply(df, .(Pos), colwise(sum, vars)) Pos CaseA CaseC CaseG CaseT 1 135344110 02448 0 2 135344113 0 024 0 3 13534411448 0 0 0 4 135344116 0 0 024 5 135344118 024 024 6 1353441222424 0 0 7 135344123 048 024 8 135344126 0 024 0 The colwise() function applies the same function (here, sum) to each variable in the variable list given by vars. The wrapper function ddply() applies the colwise() function to each subset of the data defined by a unique value of Pos. Another way is to use the aggregate() function from base R. The following code comes from another thread on this list in the past couple of days due to Bill Dunlap. > aggregate(df[vars], by = df['Pos'], FUN = sum) Pos CaseA CaseC CaseG CaseT 1 135344110 02448 0 2 135344113 0 024 0 3 13534411448 0 0 0 4 135344116 0 0 024 5 135344118 024 024 6 1353441222424 0 0 7 135344123 048 024 8 135344126 0 024 0 HTH, Dennis 2011/7/15 Bansal, Vikas : > Dear all, > > I have one problem and did not find any solution. > I have attached the question in text file also because sometimes spacing is > not good in mail. > > I have a file(file.txt) attached with this mail.I am reading it using this > code to make a data frame (file)- > > file=read.table("file.txt",fill=T,colClasses = "character",header=T) > > file looks like this- > > Chr Pos CaseA CaseC CaseG CaseT > 10 135344110 0.00 24.00 0.00 0.00 > 10 135344110 0.00 0.00 24.00 0.00 > 10 135344110 0.00 0.00 24.00 0.00 > 10 135344113 0.00 0.00 24.00 0.00 > 10 135344114 24.00 0.00 0.00 0.00 > 10 135344114 24.00 0.00 0.00 0.00 > 10 135344116 0.00 0.00 0.00 24.00 > 10 135344118 0.00 24.00 0.00 0.00 > 10 135344118 0.00 0.00 0.00 24.00 > 10 135344122 24.00 0.00 0.00 0.00 > 10 135344122 0.00 24.00 0.00 0.00 > 10 135344123 0.00 24.00 0.00 0.00 > 10 135344123 0.00 24.00 0.00 0.00 > 10 135344123 0.00 0.00 0.00 24.00 > 10 135344126 0.00 0.00 24.00 0.00 > > Now some of the values in column Pos are same.For these same positions i want > to add the values of columns 3:6 > I will explain with an example- > The output of first row should be- > > Chr Pos CaseA CaseC CaseG CaseT > 10 135344110 0.00 24.00 48.00 0.00 > > because first three rows have same value in Pos column. > > so the whole output for above input should be- > > Chr Pos CaseA CaseC CaseG CaseT > 10 135344110 0.00 24.00 48.00 0.00 > 10 135344113 0.00 0.00 24.00 0.00 > 10 135344114 48.00 0.00 0.00 0.00 > 10 135344116 0.00 0.00 0.00 24.00 > 10 135344118 0.00 24.00 0.00 24.00 > 10 135344122 24.00 24.00 0.00 0.00 > 10 135344123 0.00 48.00 0.00 24.00 > 10 135344126 0.00 0.00 24.00 0.00 > > Can you please help me. > > > Thanking you, > Warm Regards > Vikas Bansal > Msc Bioinformatics > Kings College London > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
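One more base R option, as a sketch, assuming the Case columns of df are numeric: rowsum() sums the rows of a data frame (or matrix) by a grouping vector. Note that it returns the Pos values as row names rather than as a column.

rowsum(df[vars], group = df$Pos)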
Re: [R] Export Unicode characters from R
On 15/07/2011 1:42 PM, Sverre Stausland wrote: >>> >>> >funny.g<- "\u1E21" >>> >funny.g >> >> [1] "ḡ" >> >>> >data.frame (funny.g) ->funny.g >>> >funny.g$funny.g >> >> [1] ḡ >> Levels: > > I think the problem is in the data.frame code, not in writing. Data.frames > try to display things in a readable way, and since you're on Windows where > UTF-8 is not really supported, the code helpfully changes that character to > the "" string. for display. I thought the data.frame function didn't alter the unicode coding, since funny.g$funny.g above still displays the right unicode character (although it does list the levels as). > You should be able to write the Unicode character to file if you use lower > level methods such as cat(), on a connection opened using the file() > function with the encoding set explicitly. I'm sorry, but I don't understand what it means "to use cat() on a connection opened using the file() function". Could you please clarify that? I just checked on how R does it. We use UTF-8 encodings in the help pages, regardless of what kind of system you're running on. It converts the strings to UTF-8 internally first (your funny.g is already encoded that way; see Encoding(funny.g)) then uses writeLines( ..., useBytes=TRUE) to write it. The useBytes argument says not to try to make the file readable on the local system, just write out the bytes. Another way to do it is to get your strings in the UTF-8 encoding, convert them to raw vectors, and use writeBin() to write those out. For example, funny.g<- "\u1E21" rawstuff<- charToRaw(funny.g) writeBin(rawstuff, "funny.g.txt") All of this appears hard, because you're thinking of UTF-8 as text, but on Windows, R thinks of it as a binary encoding. Modern Windows systems can handle UTF-8, but not all programs on them can. Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
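To check the round trip, a sketch: read the bytes back while declaring the encoding (expect a warning about an incomplete final line, since writeBin() did not write an end-of-line):

readLines("funny.g.txt", encoding = "UTF-8")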
Re: [R] Export Unicode characters from R
>>> >>> > funny.g<- "\u1E21" >>> > funny.g >> >> [1] "ḡ" >> >>> > data.frame (funny.g) -> funny.g >>> > funny.g$funny.g >> >> [1] ḡ >> Levels: > > I think the problem is in the data.frame code, not in writing. Data.frames > try to display things in a readable way, and since you're on Windows where > UTF-8 is not really supported, the code helpfully changes that character to > the "" string. for display. I thought the data.frame function didn't alter the unicode coding, since funny.g$funny.g above still displays the right unicode character (although it does list the levels as ). > You should be able to write the Unicode character to file if you use lower > level methods such as cat(), on a connection opened using the file() > function with the encoding set explicitly. I'm sorry, but I don't understand what it means "to use cat() on a connection opened using the file() function". Could you please clarify that? Thanks Sverre __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Querying RData Files, SQL style?
Thanks very much for your response. This sqldf package looks promising. I just need to figure out if a dbms needs to be running/installed in our environment (to hold the temporary SQLite DB it creates). The examples in the documentation are helpful too. -Original Message- From: Dennis Murphy [mailto:djmu...@gmail.com] Sent: Friday, July 15, 2011 2:10 PM To: Bhushan, Vipul Cc: r-help@r-project.org Subject: Re: [R] Querying RData Files, SQL style? Hi: If you load the data into R, there is a package called sqldf that allows one to apply SQL syntax to an R data frame. Is that what you had in mind? If so, Google 'sqldf R' and you should get a pointer to its home page. HTH, Dennis On Fri, Jul 15, 2011 at 7:29 AM, Bhushan, Vipul wrote: > Hello. Is there a package or functionality available somewhere which will > allow for complex searches (such as what SQL can do) of collections of RData > files? Search capability within a given RData file at a time (which could be > put in a loop) would be good, but the capability to perform joins to data > across multiple RData files would be great. These queries might be ad-hoc, so > writing an R program to get(load(...)) each file and customize the search in > home-grown R code isn't feasible. > > This shouldn't be dependent on environment details, but just in case: I'm > running version 2.13.0 in a Unix environment (but could easily run in Windows > too). > > Thanks very much. > > > [[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] running previous versions of R
I'm having problems trying to get an older version of R (2.9.2) running in a Linux terminal. I have both R 2.9.2 and 2.12 installed and typing 'R' into the terminal results in version 2.12 running. I am trying to use a program that requires version 2.4 or greater, but will not run on version 2.10 or higher. Anyone have an idea of what to do? Thanks! -- View this message in context: http://r.789695.n4.nabble.com/running-previous-versions-of-R-tp3670587p3670587.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Executing a function correctly
Marc, Many thanks. -- View this message in context: http://r.789695.n4.nabble.com/Executing-a-function-correctly-tp3665765p3670602.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Querying RData Files, SQL style?
Thank you for your response. To clarify, I don't mind if R loads the data (in the background), but was hoping to have to only specify the query as a simple request and the list of input files. I'd like to do this relatively efficiently, so searching across ~100 RData files (10 to 100 KB each) only takes many seconds and not lots of minutes or hours. -Original Message- From: David Winsemius [mailto:dwinsem...@comcast.net] Sent: Friday, July 15, 2011 11:42 AM To: Bhushan, Vipul Cc: r-help@r-project.org Subject: Re: [R] Querying RData Files, SQL style? On Jul 15, 2011, at 10:29 AM, Bhushan, Vipul wrote: > Hello. Is there a package or functionality available somewhere which > will allow for complex searches (such as what SQL can do) of > collections of RData files? Search capability within a given RData > file at a time (which could be put in a loop) would be good, but the > capability to perform joins to data across multiple RData files > would be great. These queries might be ad-hoc, so writing an R > program to get(load(...)) each file and customize the search in home- > grown R code isn't feasible. As I read the question it appears that your are not expecting to load the data into R and are rather asking for a program other than R (or Rscript or littler) to read .Rdata files and perform database joins. As I understand it, that is not available. As I understand it, there is not even a package that can look at .Rdata files for their object names and structure without actual loading them. Hoping to be corrected on either of these points. > > This shouldn't be dependent on environment details, but just in > case: I'm running version 2.13.0 in a Unix environment (but could > easily run in Windows too). > > Thanks very much. -- David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Querying RData Files, SQL style?
My understanding is that sqldf works in conjunction with the sqlite and H2 DBMSs. You should be able to verify that from the sqldf home page; if I'm wrong, Gabor will quickly correct me :) Dennis On Fri, Jul 15, 2011 at 11:51 AM, Bhushan, Vipul wrote: > Thanks very much for your response. This sqldf package looks promising. I > just need to figure out if a dbms needs to be running/installed in our > environment (to hold the temporary SQLite DB it creates). The examples in the > documentation are helpful too. > > -Original Message- > From: Dennis Murphy [mailto:djmu...@gmail.com] > Sent: Friday, July 15, 2011 2:10 PM > To: Bhushan, Vipul > Cc: r-help@r-project.org > Subject: Re: [R] Querying RData Files, SQL style? > > Hi: > > If you load the data into R, there is a package called sqldf that > allows one to apply SQL syntax to an R data frame. Is that what you > had in mind? If so, Google 'sqldf R' and you should get a pointer to > its home page. > > HTH, > Dennis > > On Fri, Jul 15, 2011 at 7:29 AM, Bhushan, Vipul > wrote: >> Hello. Is there a package or functionality available somewhere which will >> allow for complex searches (such as what SQL can do) of collections of RData >> files? Search capability within a given RData file at a time (which could be >> put in a loop) would be good, but the capability to perform joins to data >> across multiple RData files would be great. These queries might be ad-hoc, >> so writing an R program to get(load(...)) each file and customize the search >> in home-grown R code isn't feasible. >> >> This shouldn't be dependent on environment details, but just in case: I'm >> running version 2.13.0 in a Unix environment (but could easily run in >> Windows too). >> >> Thanks very much. >> >> >> [[alternative HTML version deleted]] >> >> __ >> R-help@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] running previous versions of R
On Jul 15, 2011, at 1:06 PM, jstevens wrote: > I'm having problems trying to get an older version of R (2.9.2) running in a > Linux terminal. I have both R 2.9.2 and 2.12 installed and typing 'R' into > the terminal results in version 2.12 running. I am trying to use a program > that requires version 2.4 or greater, but will not run on version 2.10 or > higher. > > Anyone have an idea of what to do? > > Thanks! How did you install R? Did you build from source or use pre-compiled binaries (eg. RPMs or .debs)? Which Linux distribution are you running? If in fact, you have two versions of R installed, the likelihood is that the 2.12.??? binary is in your $PATH and 2.9.2 is not or is after the former. Hence, the newer version will be found and run. Typically, there is a symlink to the R executable placed in a common location such as /usr/bin or perhaps /usr/local/bin, which is in the default $PATH so that R can be run easily. This may be as simple as knowing where the 2.9.2 installation is located on your HD and running PATH.TO.THE.R.EXECUTABLE/R from the command line. Are you running a CRAN package that has not been updated for more recent versions of R, or are you replicating an analysis that was done some time ago and have to use the same versions? If the former, be sure to contact the package maintainer to request that it be fixed, if they have not already orphaned it. Cheers, Marc Schwartz __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] transforming year.weeknumber into dates
try this: > x <- c('2009.12', '2009.30') > as.Date(paste(x, '1'), format = "%Y.%W %w") [1] "2009-03-23" "2009-07-27" > > On Fri, Jul 15, 2011 at 1:54 PM, Dimitri Liakhovitski wrote: > Hello! > > I know how to transform dates into year.weeknumber format using zoo: > > library(zoo) > as.numeric(format(as.Date("2010-10-02"), "%Y.%W")) > > But is there a straightforward way to do the opposite - to transform > character strings like "2009.12" or "2009.30" back into dates > (assuming that weeks start on Monday)? > Thanks a lot! > > > -- > Dimitri Liakhovitski > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Jim Holtman Data Munger Guru What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
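A quick round-trip check of the suggestion above, as a sketch: formatting the parsed dates back to year.week should recover the original strings.

format(as.Date(paste(x, '1'), format = "%Y.%W %w"), "%Y.%W")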
[R] Migration Analysis?
Is it possible to do Migration Analysis in R? -- View this message in context: http://r.789695.n4.nabble.com/Migration-Analysis-tp3670866p3670866.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] running previous versions of R
I'm running Ubuntu - natty. Forgive me if I sound lost, its been just over a week since I switched over from Windows. I originally installed 2.12 using the Ubuntu software center, but have now switched to using the terminal. 2.9.2 was installed using a .tar.zip file downloaded from cran. Before I switched over to Linux I had both versions installed in Windows, but I had to switch to Linux because Windows is so limiting on RAM use. Upon further poking around I've figured out how to do it. I found the R executable file in /usr/lib/R-2.9.2/bin, so entering $/usr/lib/R-2.9.2/bin/R starts up version 2.9.2. Thanks for your help. I'm running a CRAN package that has not been updated since 2009, and I'm not rerunning an old analysis. The problem is that the software I'm using for the analysis gives me an error saying that it requires R version 2.4 or higher if I try to run it in version 2.10 or higher. I think someone was a little sloppy with some code so that the program only looks at two integers when verifying the version of R. I've let the author know about it, but the program is no longer in development. -- View this message in context: http://r.789695.n4.nabble.com/running-previous-versions-of-R-tp3670587p3670836.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] barplot question
Hi - I would like to make a barplot of my data, but am having issues. An example of my data is:

species              net           pair  pounds  type
Cod                  Control       1     46      kept
Little Skate         Control       1     0       kept
Summer Flounder      Control       1     9       kept
Windowpane Flounder  Control       1     0       kept
Winter Flounder      Control       1     0       kept
Winter Skate         Control       1     0       kept
Yellowtail Flounder  Control       1     76      kept
Cod                  Experimental  1     19      kept
Little Skate         Experimental  1     0       kept
Summer Flounder      Experimental  1     2       kept
Windowpane Flounder  Experimental  1     0       kept
Winter Flounder      Experimental  1     0       kept
Winter Skate         Experimental  1     0       kept
Yellowtail Flounder  Experimental  1     9       kept
Cod                  Control       1     14      discard
Little Skate         Control       1     75      discard
Summer Flounder      Control       1     1       discard
Windowpane Flounder  Control       1     32      discard
Winter Flounder      Control       1     16      discard
Winter Skate         Control       1     225     discard
Yellowtail Flounder  Control       1     7       discard
Cod                  Experimental  1     7       discard
Little Skate         Experimental  1     64      discard
Summer Flounder      Experimental  1     3       discard
Windowpane Flounder  Experimental  1     26      discard
Winter Flounder      Experimental  1     12      discard
Winter Skate         Experimental  1     136     discard
Yellowtail Flounder  Experimental  1     5       discard

I have 9 total pairs. I would like to be able to make a barplot by pair that shows the catch of the control net (kept & discard) stacked alongside the catch of the experimental net, also stacked by species, like the image below that I made in Excel. http://r.789695.n4.nabble.com/file/n3670861/image.jpg I can make barplots by net and pair, but I would like to have both nets on one barplot if possible. Thanks Sally -- View this message in context: http://r.789695.n4.nabble.com/barplot-question-tp3670861p3670861.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
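A base-graphics sketch for a single pair, assuming the data above live in a data frame called catch (an illustrative name) with the columns shown; this pools kept and discard within species, so use interaction(species, type) on the left-hand side of the formula if you want them as separate segments:

p1  <- subset(catch, pair == 1)
tab <- xtabs(pounds ~ species + net, data = p1)   # species x net matrix of summed pounds
barplot(tab, legend.text = rownames(tab), main = "Pair 1", ylab = "Pounds")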
Re: [R] help! kennard-stone algorithm in soil.spec packages does not work for my dataset!!!
I'm also trying to use the kennard-stone algorithm in the soil.spec package for my dataset, (to generate a training and test set from the data, based on this algorithm, because it's the most commonly used and well-performing algorithm in QSAR studies) but it's generating an error: > ken.sto(mydataIN) Error in ken.sto(mydataIN) : subscript out of bounds My data is a 42 row by 6 column all numerical (except header) matrix of the format: id x1 x2 x3 x4y1 2 66.77.710.079 4.58 3.0792 13 79.79.570.100 4.82 2.8451 5 77.73.100.071 1.42 0.4771 6 82.17.580.071 2.08 0.7160 32 98.85.600.143 3.27 1.7160 36 93.34.740.097 4.16 1.7160 ... I cannot find any documentation for the exact format of the data matrix for this function (http://www.inside-r.org/packages/cran/soil.spec/docs/ken.sto does not have this information). Any help would be appreciated!!! -- View this message in context: http://r.789695.n4.nabble.com/help-kennard-stone-algorithm-in-soil-spec-packages-does-not-work-for-my-dataset-tp3031344p3670857.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] scaling advice
Hi, I have a consultant's nightmare -- I was given a project that another consultant did, and I was told to do the same calculations, but there's no documentation on what he did. Basically, I have yes/no answers to survey questions about the effectiveness of product attributes by brands. There are 44 attributes and 13 brands. The other guy scaled the proportion of respondents who said Yes to have mean 0 and variance 1.0, apparently doing this by brand within each attribute. He then created a matrix of 44 rows for the attributes and 13 columns for the brands. No problem with this; I can always replicate this much. But then he apparently rescaled this 44x13 matrix so that the rows all sum to zero and the columns all sum to zero. None of the row and column standard deviations are 1.0. This I can't see how to do. How can I rescale the rows and columns so that they all sum to zero? Any suggestions? Thanks, Walt Walter R. Paczkowski, Ph.D. Data Analytics Corp. 44 Hamilton Lane Plainsboro, NJ 08536 (V) 609-936-8999 (F) 609-936-3733 w...@dataanalyticscorp.com www.dataanalyticscorp.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
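A sketch of one way to get both row and column sums to zero ("double-centring"), assuming M is the 44 x 13 matrix of scaled proportions:

M2 <- sweep(M,  1, rowMeans(M))   # subtract each row's mean: row sums become 0
M2 <- sweep(M2, 2, colMeans(M2))  # subtract column means: column sums become 0,
                                  # and row sums stay 0 because the column means
                                  # of a row-centred matrix themselves sum to 0
range(rowSums(M2)); range(colSums(M2))   # both should be ~ 0 up to rounding error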
[R] Calculate Az (A sub z) with R?
dd2es virginia.edu> writes: > > I am looking for (or interested in writing) a function that calculates Az, > an alternative measure of discriminability from SDT (alternative to d' and A'). > I have written my own functions for d', A', B"d, and am aware of the 'sdtalt' > package, but I have yet to find a way to calculate Az, since it requires the > phi operator. The Phi --- not Psi, see the paper --- function is simply the cumulative normal distribution, so you can use pnorm() instead. -- Hans Werner > For a relevant paper/discussion (and formula), please see Verde and > Macmillan, 2006 (Measures of sensitivity based on a single hit rate and > false alarm rate: The accuracy, precision, and robustness of d', Az, and A') > > Any help on this would be greatly appreciated! > > David Dobolyi > Graduate Student > Cognitive Psychology > University of Virginia > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
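A hedged sketch for the equal-variance case, where Az = pnorm(d'/sqrt(2)); the unequal-variance form discussed in the paper uses d_a in place of d'. The function name and arguments here are only illustrative:

Az <- function(hit, fa) {
  dprime <- qnorm(hit) - qnorm(fa)   # equal-variance d'
  pnorm(dprime / sqrt(2))
}
Az(hit = 0.8, fa = 0.2)   # roughly 0.88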
[R] Add permanently environment variable
Hello everyone, I know how to add a folder to my PATH environment variable, but it only works for the current R session. Is there a way to add it permanently? Here is my code: Sys.setenv(PATH=paste("C:\\Program Files\\Java\\jre1.6.0_13\\bin;", Sys.getenv(x="PATH"), sep="")) Thanks a lot! -- View this message in context: http://r.789695.n4.nabble.com/Add-permanently-environment-variable-tp3670920p3670920.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
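One common approach, as a sketch: R itself cannot change the system-wide PATH permanently, but you can put the Sys.setenv() call in a startup file such as ~/.Rprofile (or R_HOME/etc/Rprofile.site) so that it runs in every new session. The Java path below is simply the one from the example above:

Sys.setenv(PATH = paste("C:\\Program Files\\Java\\jre1.6.0_13\\bin",
                        Sys.getenv("PATH"), sep = ";"))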
Re: [R] running previous versions of R
On Jul 15, 2011, at 2:58 PM, jstevens wrote: > I'm running Ubuntu - natty. Forgive me if I sound lost, its been just over a > week since I switched over from Windows. I originally installed 2.12 using > the Ubuntu software center, but have now switched to using the terminal. > 2.9.2 was installed using a .tar.zip file downloaded from cran. Before I > switched over to Linux I had both versions installed in Windows, but I had > to switch to Linux because Windows is so limiting on RAM use. > > > Upon further poking around I've figured out how to do it. I found the R > executable file in /usr/lib/R-2.9.2/bin, so entering $/usr/lib/R-2.9.2/bin/R > starts up version 2.9.2. Thanks for your help. > > > I'm running a CRAN package that has not been updated since 2009, and I'm not > rerunning an old analysis. The problem is that the software I'm using for > the analysis gives me an error saying that it requires R version 2.4 or > higher if I try to run it in version 2.10 or higher. I think someone was a > little sloppy with some code so that the program only looks at two integers > when verifying the version of R. I've let the author know about it, but the > program is no longer in development. Hi, I am presuming that you obtained the 2.9.2 source tarball for R from CRAN and then compiled it? Otherwise, you would only have the source files from the extracted archive and R should not otherwise run. There are not pre-compiled tar files of binaries for R on CRAN for Linux. A couple of other comments: 1. There is a SIG e-mail list for R on Debian based distros, of which Ubuntu is one. Info here: https://stat.ethz.ch/mailman/listinfo/r-sig-debian I would recommend posting any technical questions related to using R on Ubuntu there. Also, a good *recent* intro book on Ubuntu would not be a bad idea. I was in the same boat about 10 years ago, when I made the transition from Windows to Linux (Red Hat then Fedora), though I have been running OSX for the past two years or so. 2. There are 64 bit versions of R for Windows now, if you have a 64 bit version of Windows running. See the R FAQ for Windows for more info. A final comment, which is that 2.13.1 is the current version of R, so you should look to upgrade from 2.12.??. Regards, Marc __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Adding rows based on column value
I have tried the aggregate command but it shows this error- vars <- paste('Case', c('A', 'C', 'G', 'T'), sep = '') > vars [1] "CaseA" "CaseC" "CaseG" "CaseT" > aggregate(file[vars], by = df['Pos'], FUN = sum) Error in aggregate.data.frame(file[vars], by = df["Pos"], FUN = sum) : arguments must have same length the thing is I cant use the plyr because I want the coding so that I can use it to make a tool. Can you please tell me why aggregate function is showing this error.I am confused. Thanking you, Warm Regards Vikas Bansal Msc Bioinformatics Kings College London From: Dennis Murphy [djmu...@gmail.com] Sent: Friday, July 15, 2011 7:38 PM To: Bansal, Vikas Cc: r-help@r-project.org Subject: Re: [R] Adding rows based on column value Hi: This seems to work: library(plyr) # select the variables to summarize: vars <- paste('Case', c('A', 'C', 'G', 'T'), sep = '') # Alternatively, # vars <- names(df)[grep('Case', names(df))] # One way: the ddply() function in package plyr in # conjunction with the colwise() function > ddply(df, .(Pos), colwise(sum, vars)) Pos CaseA CaseC CaseG CaseT 1 135344110 02448 0 2 135344113 0 024 0 3 13534411448 0 0 0 4 135344116 0 0 024 5 135344118 024 024 6 1353441222424 0 0 7 135344123 048 024 8 135344126 0 024 0 The colwise() function applies the same function (here, sum) to each variable in the variable list given by vars. The wrapper function ddply() applies the colwise() function to each subset of the data defined by a unique value of Pos. Another way is to use the aggregate() function from base R. The following code comes from another thread on this list in the past couple of days due to Bill Dunlap. > aggregate(df[vars], by = df['Pos'], FUN = sum) Pos CaseA CaseC CaseG CaseT 1 135344110 02448 0 2 135344113 0 024 0 3 13534411448 0 0 0 4 135344116 0 0 024 5 135344118 024 024 6 1353441222424 0 0 7 135344123 048 024 8 135344126 0 024 0 HTH, Dennis 2011/7/15 Bansal, Vikas : > Dear all, > > I have one problem and did not find any solution. > I have attached the question in text file also because sometimes spacing is > not good in mail. > > I have a file(file.txt) attached with this mail.I am reading it using this > code to make a data frame (file)- > > file=read.table("file.txt",fill=T,colClasses = "character",header=T) > > file looks like this- > > Chr PosCaseA CaseCCaseG CaseT > 10 135344110 0.00 24.00 0.00 0.00 > 10 135344110 0.00 0.00 24.00 0.00 > 10 135344110 0.00 0.00 24.00 0.00 > 10 135344113 0.00 0.00 24.00 0.00 > 10 135344114 24.00 0.00 0.00 0.00 > 10 135344114 24.00 0.00 0.00 0.00 > 10 135344116 0.00 0.00 0.00 24.00 > 10 135344118 0.00 24.00 0.00 0.00 > 10 135344118 0.00 0.00 0.00 24.00 > 10 135344122 24.00 0.00 0.00 0.00 > 10 135344122 0.00 24.00 0.00 0.00 > 10 135344123 0.00 24.00 0.00 0.00 > 10 135344123 0.00 24.00 0.00 0.00 > 10 135344123 0.00 0.00 0.00 24.00 > 10 135344126 0.00 0.00 24.00 0.00 > > Now some of the values in column Pos are same.For these same positions i want > to add the values of columns 3:6 > I will explain with an example- > The output of first row should be- > > Chr Pos CaseA CaseC CaseG CaseT > 10 135344110 0.00 24.00 48.00 0.00 > > because first three rows have same value in Pos column. 
>
> so the whole output for above input should be:
>
> Chr        Pos  CaseA  CaseC  CaseG  CaseT
> 10  135344110   0.00  24.00  48.00   0.00
> 10  135344113   0.00   0.00  24.00   0.00
> 10  135344114  48.00   0.00   0.00   0.00
> 10  135344116   0.00   0.00   0.00  24.00
> 10  135344118   0.00  24.00   0.00  24.00
> 10  135344122  24.00  24.00   0.00   0.00
> 10  135344123   0.00  48.00   0.00  24.00
> 10  135344126   0.00   0.00  24.00   0.00
>
> Can you please help me.
>
> Thanking you,
> Warm Regards
> Vikas Bansal
> Msc Bioinformatics
> Kings College London
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__ R-help@r-project.org mailin
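One note on the error quoted at the top of that message, offered as a guess since only fragments of the session are visible: the call mixes two objects, file[vars] for the columns to sum and df['Pos'] for the grouping variable. aggregate() needs both arguments to describe the same rows, so if df is a different (or differently sized) object from file, "arguments must have same length" is exactly the complaint you would get. A sketch of the call drawing everything from the one data frame read from file.txt:

# take both the summed columns and the grouping column from the same object
aggregate(file[vars], by = file['Pos'], FUN = sum)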
[R] Convert continuous variable into discrete variable
Dear all, I have a continuous variable that can take on values between 0 and 100, for example: x<-runif(100,0,100) I also have a second variable that defines a series of thresholds, for example: y<-c(3, 4.5, 6, 8) I would like to convert my continuous variable into a discrete one using the threshold variables: If x is between 0 and 3 the discrete variable should be 1 If x is between 3 and 4.5 the discrete variable should be 2 If x is between 4.5 and 6 the discrete variable should be 3 If x is between 6 and 8 the discrete variable should be 4 If x is larger than 8 the discrete variable should be 5 Is there a straightforward way of doing this (besides working with several if statements in a row)? Thanks, Michael [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Adding rows based on column value
I have tried the aggregate command but it shows this error:

vars <- paste('Case', c('A', 'C', 'G', 'T'), sep = '')
> vars
[1] "CaseA" "CaseC" "CaseG" "CaseT"
> aggregate(file[vars], by = file['Pos'], FUN = sum)
Error in FUN(X[[1L]], ...) : invalid 'type' (character) of argument

The thing is, I can't use plyr because I want the code so that I can use it to make a tool. Can you please tell me why the aggregate function is showing this error? I am confused. Thanking you, Warm Regards Vikas Bansal Msc Bioinformatics Kings College London

From: Dennis Murphy [djmu...@gmail.com] Sent: Friday, July 15, 2011 7:38 PM To: Bansal, Vikas Cc: r-help@r-project.org Subject: Re: [R] Adding rows based on column value

Hi: This seems to work:

library(plyr)
# select the variables to summarize:
vars <- paste('Case', c('A', 'C', 'G', 'T'), sep = '')
# Alternatively,
# vars <- names(df)[grep('Case', names(df))]

# One way: the ddply() function in package plyr in
# conjunction with the colwise() function
> ddply(df, .(Pos), colwise(sum, vars))
        Pos CaseA CaseC CaseG CaseT
1 135344110     0    24    48     0
2 135344113     0     0    24     0
3 135344114    48     0     0     0
4 135344116     0     0     0    24
5 135344118     0    24     0    24
6 135344122    24    24     0     0
7 135344123     0    48     0    24
8 135344126     0     0    24     0

The colwise() function applies the same function (here, sum) to each variable in the variable list given by vars. The wrapper function ddply() applies the colwise() function to each subset of the data defined by a unique value of Pos. Another way is to use the aggregate() function from base R. The following code comes from another thread on this list in the past couple of days, due to Bill Dunlap.

> aggregate(df[vars], by = df['Pos'], FUN = sum)
        Pos CaseA CaseC CaseG CaseT
1 135344110     0    24    48     0
2 135344113     0     0    24     0
3 135344114    48     0     0     0
4 135344116     0     0     0    24
5 135344118     0    24     0    24
6 135344122    24    24     0     0
7 135344123     0    48     0    24
8 135344126     0     0    24     0

HTH, Dennis

2011/7/15 Bansal, Vikas :
> Dear all,
>
> I have one problem and did not find any solution. I have attached the question in a text file also because sometimes spacing is not good in mail.
>
> I have a file (file.txt) attached with this mail. I am reading it using this code to make a data frame (file):
>
> file=read.table("file.txt",fill=T,colClasses = "character",header=T)
>
> file looks like this:
>
> Chr        Pos  CaseA  CaseC  CaseG  CaseT
> 10  135344110   0.00  24.00   0.00   0.00
> 10  135344110   0.00   0.00  24.00   0.00
> 10  135344110   0.00   0.00  24.00   0.00
> 10  135344113   0.00   0.00  24.00   0.00
> 10  135344114  24.00   0.00   0.00   0.00
> 10  135344114  24.00   0.00   0.00   0.00
> 10  135344116   0.00   0.00   0.00  24.00
> 10  135344118   0.00  24.00   0.00   0.00
> 10  135344118   0.00   0.00   0.00  24.00
> 10  135344122  24.00   0.00   0.00   0.00
> 10  135344122   0.00  24.00   0.00   0.00
> 10  135344123   0.00  24.00   0.00   0.00
> 10  135344123   0.00  24.00   0.00   0.00
> 10  135344123   0.00   0.00   0.00  24.00
> 10  135344126   0.00   0.00  24.00   0.00
>
> Now some of the values in the column Pos are the same. For these same positions I want to add the values of columns 3:6. I will explain with an example. The output of the first row should be:
>
> Chr        Pos  CaseA  CaseC  CaseG  CaseT
> 10  135344110   0.00  24.00  48.00   0.00
>
> because the first three rows have the same value in the Pos column.
>
> so the whole output for above input should be:
>
> Chr        Pos  CaseA  CaseC  CaseG  CaseT
> 10  135344110   0.00  24.00  48.00   0.00
> 10  135344113   0.00   0.00  24.00   0.00
> 10  135344114  48.00   0.00   0.00   0.00
> 10  135344116   0.00   0.00   0.00  24.00
> 10  135344118   0.00  24.00   0.00  24.00
> 10  135344122  24.00  24.00   0.00   0.00
> 10  135344123   0.00  48.00   0.00  24.00
> 10  135344126   0.00   0.00  24.00   0.00
>
> Can you please help me.
>
> Thanking you,
> Warm Regards
> Vikas Bansal
> Msc Bioinformatics
> Kings College London
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailma
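This second error has a clearer cause: the file was read with colClasses = "character", so the CaseA...CaseT columns are character vectors and sum() refuses them. Converting those columns to numeric first (or declaring proper column classes at read time) should let the aggregate() call go through; a sketch, assuming the column layout shown in the quoted post:

# convert the count columns to numeric, then aggregate as before
file[vars] <- lapply(file[vars], as.numeric)
aggregate(file[vars], by = file['Pos'], FUN = sum)

# or fix it at read time: two id columns plus four numeric count columns
# file <- read.table("file.txt", header = TRUE, fill = TRUE,
#                    colClasses = c("character", "character", rep("numeric", 4)))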
Re: [R] Export Unicode characters from R
Hi, I'm interested in the suggestion to use writeLines( ..., useBytes=TRUE), but how can I use this function on the way to exporting from R? Could you please provide a simple example? The following suggestion worked very well: > funny.g<- "\u1E21" > rawstuff<- charToRaw(funny.g) > writeBin(rawstuff, "funny.g.txt") But the function charToRaw() only allows an object with a single character, and writeBin cannot be used to export data frames. Is there any solution along these lines when I have a data frame with Unicode characters? Best Sverre On Fri, Jul 15, 2011 at 2:38 PM, Duncan Murdoch wrote: > On 15/07/2011 1:42 PM, Sverre Stausland wrote: >> >> >>> >> >>> > funny.g<- "\u1E21" >> >>> > funny.g >> >> >> >> [1] "ḡ" >> >> >> >>> > data.frame (funny.g) -> funny.g >> >>> > funny.g$funny.g >> >> >> >> [1] ḡ >> >> Levels: >> > >> > I think the problem is in the data.frame code, not in writing. >> > Data.frames >> > try to display things in a readable way, and since you're on Windows >> > where >> > UTF-8 is not really supported, the code helpfully changes that >> > character to >> > the "" string. for display. >> >> I thought the data.frame function didn't alter the unicode coding, >> since funny.g$funny.g above still displays the right unicode character >> (although it does list the levels as). >> >> > You should be able to write the Unicode character to file if you use >> > lower >> > level methods such as cat(), on a connection opened using the file() >> > function with the encoding set explicitly. >> >> I'm sorry, but I don't understand what it means "to use cat() on a >> connection opened using the file() function". Could you please clarify >> that? >> > > I just checked on how R does it. We use UTF-8 encodings in the help pages, > regardless of what kind of system you're running on. > > It converts the strings to UTF-8 internally first (your funny.g is already > encoded that way; see Encoding(funny.g)) then uses > > writeLines( ..., useBytes=TRUE) > > to write it. The useBytes argument says not to try to make the file > readable on the local system, just write out the bytes. > > Another way to do it is to get your strings in the UTF-8 encoding, convert > them to raw vectors, and use writeBin() to write those out. For example, > > funny.g<- "\u1E21" > rawstuff<- charToRaw(funny.g) > writeBin(rawstuff, "funny.g.txt") > > > All of this appears hard, because you're thinking of UTF-8 as text, but on > Windows, R thinks of it as a binary encoding. Modern Windows systems can > handle UTF-8, but not all programs on them can. > > Duncan Murdoch > > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
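Following up on the data-frame question with a sketch rather than a definitive recipe (the data frame, column names and file name below are invented): one way to extend the writeLines()/useBytes idea to a whole data frame is to build the lines yourself, force them to UTF-8, and push them through a connection opened in binary mode so nothing gets re-encoded on the way out.

df <- data.frame(g = "\u1E21", x = 1:2, stringsAsFactors = FALSE)

# header plus tab-separated rows, all converted/declared as UTF-8
lines <- c(paste(names(df), collapse = "\t"),
           do.call(paste, c(lapply(df, as.character), sep = "\t")))
lines <- enc2utf8(lines)

con <- file("funny_df.txt", open = "wb")   # binary mode: bytes are written untouched
writeLines(lines, con, useBytes = TRUE)
close(con)

write.table(df, "funny_df.txt", fileEncoding = "UTF-8") may also be worth trying, but the byte-level route above stays closest to the approach described in the reply.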
Re: [R] scaling advice
On Jul 15, 2011, at 23:05 , Data Analytics Corp. wrote: > Hi, > > I have a consultant's nightmare -- I was given a project that another > consultant did and I was told to do the same calculations, but there's no > documentation on what he did. Basically, I have yes/no answers to survey > questions about the effectiveness of product attributes by brands. There are > 44 attributes and 13 brands. The other guy scaled the proportion of > respondents who said Yes to be mean 0 and variance 1.0, apparently doing this > by brand within each attribute. He then created a matrix of 44 rows for the > attributes and 13 columns for the brands. No problem with this; I can always > replicate this much. But then he apparently rescaled this 44x13 matrix so > that the rows all sum to zero and the columns all sum to zero. None of the > row and column standard deviations are 1.0. This I can't see how to do. How > can I rescale the rows and columns so that they all sum to zero? Any > suggestions? > If the _sum_ is zero, there must be both negative and positive elements, so it can't be a pure scaling. sweep()'ing out the row and column means would be the first thing to come to my mind. -- Peter Dalgaard Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
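A small sketch of the sweep() idea with a stand-in matrix (4 x 3 here instead of 44 x 13): remove the row means first, then the column means of the result. Both the row sums and the column sums of the final matrix are then zero, up to rounding error.

set.seed(1)
A  <- matrix(rnorm(12), nrow = 4)    # stand-in for the 44 x 13 matrix of scaled proportions
A1 <- sweep(A,  1, rowMeans(A))      # subtract each row's mean
A2 <- sweep(A1, 2, colMeans(A1))     # then subtract each column's mean of the result
round(rowSums(A2), 12)               # all (numerically) zero
round(colSums(A2), 12)               # all (numerically) zero

Whether this reproduces the other consultant's numbers is of course a separate question; it only demonstrates one transformation with the stated zero-sum property.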
Re: [R] Splitting one column value into multiple rows
Hi, This works when I have a few lines and I type those input lines into the R window. But I want to apply this function to a variable which is part of a dataset, and the dataset is very large. Any help in this aspect will really help me a lot. Regards, Madana -- View this message in context: http://r.789695.n4.nabble.com/Splitting-one-column-value-into-multiple-rows-tp3668835p3671087.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Convert continuous variable into discrete variable
Hi: x<-runif(100,0,100) u <- cut(x, breaks = c(0, 3, 4.5, 6, 8, Inf), labels = c(1:5)) Based on the x I obtained, > table(u) u 1 2 3 4 5 3 2 1 2 92 cut() or findInterval() are the two basic functions for discretizing a numeric variable. HTH, Dennis On Fri, Jul 15, 2011 at 2:29 PM, Michael Haenlein wrote: > Dear all, > > I have a continuous variable that can take on values between 0 and 100, for > example: x<-runif(100,0,100) > > I also have a second variable that defines a series of thresholds, for > example: y<-c(3, 4.5, 6, 8) > > I would like to convert my continuous variable into a discrete one using the > threshold variables: > > If x is between 0 and 3 the discrete variable should be 1 > If x is between 3 and 4.5 the discrete variable should be 2 > If x is between 4.5 and 6 the discrete variable should be 3 > If x is between 6 and 8 the discrete variable should be 4 > If x is larger than 8 the discrete variable should be 5 > > Is there a straightforward way of doing this (besides working with several > if statements in a row)? > > Thanks, > > Michael > > [[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
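The findInterval() route mentioned at the end does the same job without creating a factor; a small sketch using the x and y from the original post (the set.seed() call is only there to make the example reproducible):

set.seed(42)
x <- runif(100, 0, 100)
y <- c(3, 4.5, 6, 8)

# findInterval() returns 0 for x < 3, 1 for 3 <= x < 4.5, ..., 4 for x >= 8,
# so adding 1 gives the categories 1 to 5 described in the question
disc <- findInterval(x, y) + 1
table(disc)

The boundary convention differs slightly from cut()'s default right-closed intervals, which rarely matters for continuous data.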
Re: [R] scaling advice
On Fri, Jul 15, 2011 at 2:05 PM, Data Analytics Corp. wrote: > But then he apparently rescaled this 44x13 > matrix so that the rows all sum to zero and the columns all sum to zero. > None of the row and column standard deviations are 1.0. This I can't see > how to do. How can I rescale the rows and columns so that they all sum to > zero? Any suggestions? Well, he could have used Gower's centering transformation, described for example in this pdf: www.stat.auckland.ac.nz/~mja/prog/PCO_UserNotes.pdf As you can convince yourself very easily, given any matrix A, the matrix G calculated as on page 3 of the document will have rows and columns that sum to zero. HTH, Peter __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
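The double centering that sits inside that construction can also be written directly with centering matrices; a sketch on a stand-in matrix (Gower's method proper works on a transformed distance matrix, as the linked notes describe, but the zero-sum property comes from this step):

A  <- matrix(1:12, nrow = 4)          # stand-in data, not the actual survey matrix
n  <- nrow(A); m <- ncol(A)
Cn <- diag(n) - 1/n                   # I - (1/n) * ones(n, n)
Cm <- diag(m) - 1/m
G  <- Cn %*% A %*% Cm                 # every row and every column of G sums to zero
round(rowSums(G), 12); round(colSums(G), 12)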
[R] help page becomes unavailable after a package is reinstalled
Hi all, I have noticed this problem ever since R changed its static HTML help pages to dynamic help pages: when I reinstall a package and try to view any help page of this package, I always get this error (in the terminal or html page) Error in fetch(key) : internal error -3 in R_decompress1 As a package developer, I often have to reinstall a package again and again, so I wish I do not have to restart R to see the new documentation. Anybody ever met a similar situation and has an idea? Thanks! I use R 2.13.1 under Ubuntu, and it also appears in Windows 7. Regards, Yihui -- Yihui Xie Phone: 515-294-2465 Web: http://yihui.name Department of Statistics, Iowa State University 2215 Snedecor Hall, Ames, IA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Splitting one column value into multiple rows
On Jul 15, 2011, at 6:05 PM, Madana_Babu wrote: Hi, This works when I have a few lines and I type those input lines into the R window. But I want to apply this function to a variable which is part of a dataset, and the dataset is very large. Any help in this aspect will really help me a lot. Define "very large". And provide machine specifics and the full text of any errors you are encountering. There is no reason you cannot offer a column of an R data.frame to the textConnection function. It will behave exactly like a file. -- David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
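A hedged sketch of that suggestion, since the function and column from the earlier part of the thread are not shown here (the data frame, column name and ";" separator below are all invented):

big <- data.frame(id   = c("a", "b"),
                  vals = c("1;2;3", "4;5"),
                  stringsAsFactors = FALSE)

# the packed column behaves like a file once wrapped in textConnection()
tc <- textConnection(big$vals)
read.table(tc, sep = ";", fill = TRUE)
close(tc)

# for a very large column, a vectorised split plus rep() to expand the other
# columns is usually faster than looping over rows
pieces <- strsplit(big$vals, ";", fixed = TRUE)
long <- data.frame(id  = rep(big$id, sapply(pieces, length)),
                   val = unlist(pieces))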
Re: [R] help page becomes unavailable after a package is reinstalled
I can verify that I get exactly the same error (also R 2.13.1 under Ubuntu). No idea what to *do* about it, though. :-( cheers, Rolf Turner On 16/07/11 11:25, Yihui Xie wrote: Hi all, I have noticed this problem ever since R changed its static HTML help pages to dynamic help pages: when I reinstall a package and try to view any help page of this package, I always get this error (in the terminal or html page) Error in fetch(key) : internal error -3 in R_decompress1 As a package developer, I often have to reinstall a package again and again, so I wish I do not have to restart R to see the new documentation. Anybody ever met a similar situation and has an idea? Thanks! I use R 2.13.1 under Ubuntu, and it also appears in Windows 7. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] summarized data set - how to use an "occurs" field
I have a data set with 22 fields and several thousand records in which one field (count) indicates the number of times that each specific combination of the other 21 fields occurred in a bigger and largely unavailable data set. So each record is unique in its combination of field values and has a field that identifies how many multiples of this record actually occurred. Without resorting to writing a program that re-expands the data set to several million rows by cloning each row by the number of times the "count" field indicated, is there a way in R to use that field to come up with summary stats and bargraphs of the distribution of any one of the other fields? best Matthew __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Placing eps files from R into Adobe InDesign documents: specifying fontfamily
Also try using pdf() instead of postscript(). It seems to keep everything happy, and retain higher resolution. -- View this message in context: http://r.789695.n4.nabble.com/Placing-eps-files-from-R-into-Adobe-InDesign-documents-specifying-fontfamily-tp1012186p3671150.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
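For anyone taking that route, a minimal sketch of the pdf() call (the file name and font family are only examples; nothing here is specific to InDesign on the R side):

pdf("figure_for_indesign.pdf", width = 5, height = 4,
    family = "Helvetica",       # font family, specified as with postscript()
    useDingbats = FALSE)        # avoids the Dingbats substitution for plotting symbols
plot(1:10, pch = 16, main = "Example figure")
dev.off()

If the fonts need to travel with the file, embedFonts("figure_for_indesign.pdf") can embed them afterwards; it relies on Ghostscript being installed.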
[R] Multiple ggplot in a single plot
Hello friends, I have created several ggplots. I have to combine them together into a new plot. Any ideas? I am new to R. I am attaching a sample plot -- http://r.789695.n4.nabble.com/file/n3671184/1A2.jpeg Thanks in advance. -- View this message in context: http://r.789695.n4.nabble.com/Multiple-ggplot-in-a-single-plot-tp3671184p3671184.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] help page becomes unavailable after a package is reinstalled
On Fri, Jul 15, 2011 at 8:14 PM, Rolf Turner wrote: > > I can verify that I get exactly the same error (also R 2.13.1 under Ubuntu). > No idea what to *do* about it, though. :-( > You could check if text help works: options(help_type = "text") ?by or help("by", help_type = "text") -- Statistics & Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] barplot question
Start out with ?barplot. Then please tell us what your "issues" are. The barplot function is pretty flexible. If I guess that you are having difficulty simultaneously plotting one set of stacked bars and another set of non-stacked bars next to them, I would recommend two approaches. One is to play with

> barplot(first_stuff)
> par(new=TRUE)
> barplot(second_stuff)

The other, and probably better, approach is to write your data of interest into a new matrix with some zeroes added to certain columns, and use the "beside=FALSE" argument to barplot(). Carl

Sally Roman wrote: Hi - I would like to make a barplot of my data, but am having issues. An example of my data is:

species              netpair         pounds  type
Cod                  Control 1           46  kept
Little Skate         Control 1            0  kept
Summer Flounder      Control 1            9  kept
Windowpane Flounder  Control 1            0  kept
Winter Flounder      Control 1            0  kept
Winter Skate         Control 1            0  kept
Yellowtail Flounder  Control 1           76  kept
Cod                  Experimental 1      19  kept
Little Skate         Experimental 1       0  kept
Summer Flounder      Experimental 1       2  kept
Windowpane Flounder  Experimental 1       0  kept
Winter Flounder      Experimental 1       0  kept
Winter Skate         Experimental 1       0  kept
Yellowtail Flounder  Experimental 1       9  kept
Cod                  Control 1           14  discard
Little Skate         Control 1           75  discard
Summer Flounder      Control 1            1  discard
Windowpane Flounder  Control 1           32  discard
Winter Flounder      Control 1           16  discard
Winter Skate         Control 1          225  discard
Yellowtail Flounder  Control 1            7  discard
Cod                  Experimental 1       7  discard
Little Skate         Experimental 1      64  discard
Summer Flounder      Experimental 1       3  discard
Windowpane Flounder  Experimental 1      26  discard
Winter Flounder      Experimental 1      12  discard
Winter Skate         Experimental 1     136  discard
Yellowtail Flounder  Experimental 1       5  discard

I have 9 total pairs. I would like to be able to make a barplot by pair that shows the catch of the control net (kept & discard) stacked with the catch of the experimental net also stacked by species, like the image below I did in Excel. http://r.789695.n4.nabble.com/file/n3670861/image.jpg I can make barplots by net and pair, but I would like to have both nets on one barplot if possible. -- - Sent from my Cray XK6 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
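If it helps, here is a minimal sketch of the second idea with made-up numbers (two pairs only; kept and discard are stacked within each bar, and the space argument keeps each control/experimental pair together). The real values would come from reshaping Sally's table, e.g. with xtabs() or tapply().

kept    <- c(46, 19, 55, 30)               # hypothetical kept pounds: C1, E1, C2, E2
discard <- c(14,  7, 20,  9)               # hypothetical discard pounds
m <- rbind(kept, discard)                  # rows of the matrix are stacked within a bar
colnames(m) <- c("Control 1", "Exper 1", "Control 2", "Exper 2")

barplot(m, space = c(0.8, 0.2, 0.8, 0.2),  # small gap within a pair, larger gap between pairs
        legend.text = TRUE, ylab = "Pounds")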
Re: [R] Multiple ggplot in a single plot
Hi, Here is one option, though this may be a bit tricky for you if you are new. ggplot2 is based on grid graphics, so using grid you can obtain more customization. There may be easier ways and even within grid it may be possible to do more simply than I am demonstrating, I am still finding my footing with that package. Example code follows that should reproduce the attached PDF. Cheers, Josh ### require(ggplot2) ## some plots using the built in "mtcars" dataset p1 <- ggplot(melt(as.data.frame(lapply(mtcars, scale))), aes(x = variable, y = value)) + geom_boxplot() + geom_jitter() p2 <- ggplot(melt(abs(cor(mtcars))), aes(x = X1, y = X2, fill = value)) + geom_tile() p3 <- ggplot(mtcars, aes(x = wt, y = disp, colour = factor(am))) + geom_point() + geom_smooth(method = "lm", aes(group = 1), se = FALSE) p4 <- ggplot(mtcars, aes(x = hp, y = mpg, size = factor(cyl))) + geom_point() ## Start a new device with a specified size dev.new(width = 11, height = 8.5) ## use the grid package to customize the layout pushViewport(vpList( viewport(x = 0, y = .45, width = .5, height = .45, just = c("left", "bottom"), name = "p1"), viewport(x = .5, y = .45, width = .5, height = .45, just = c("left", "bottom"), name = "p2"), viewport(x = 0, y = 0, width = .5, height = .4, just = c("left", "bottom"), name = "p3"), viewport(x = .5, y = 0, width = .5, height = .45, just = c("left", "bottom"), name = "p4"), viewport(x = 0, y = .9, width = 1, height = .1, just = c("left", "bottom"), name = "title"))) ## Add the plots from ggplot2 upViewport() downViewport("p1") print(p1, newpage = FALSE) upViewport() downViewport("p2") print(p2, newpage = FALSE) upViewport() downViewport("p3") print(p3, newpage = FALSE) upViewport() downViewport("p4") print(p4, newpage = FALSE) ## add an overall title (note I left space for it and gave it its own viewport) upViewport() downViewport("title") grid.text("Four Plots created by the Excellent ggplot2 package", x = .5, gp = gpar(fontsize = 18)) On Fri, Jul 15, 2011 at 4:46 PM, hrishi wrote: > Hello friends i have to created several ggplots. > I have to combine them together to a new plot. > any ideas ?? I am new to R. > I am attaching a sample plot > -- > http://r.789695.n4.nabble.com/file/n3671184/1A2.jpeg > Thanks in Advance. > > > -- > View this message in context: > http://r.789695.n4.nabble.com/Multiple-ggplot-in-a-single-plot-tp3671184p3671184.html > Sent from the R help mailing list archive at Nabble.com. > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles https://joshuawiley.com/ multiggplot2.pdf Description: Adobe PDF document __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] summarized data set - how to use an "occurs" field
Hi: Your count variable is a frequency associated with a given row of the data set. If you're more specific about what you want and can post a representative sample of (some facsimile of) your data using dput(), the list is likely to be more helpful. See the posting guide linked at the bottom of this message for guidelines. Dennis On Fri, Jul 15, 2011 at 3:10 PM, mloxton wrote: > I have a data set with 22 fields and several thousand records in which > one field (count) indicates the number of times that each specific > combination of the other 21 fields occurred in a bigger and largely > unavailable data set. > So each record is unique in its combination of field values and has a > field that identifies how many multiples of this record actually > occurred. > > Without resorting to writing a program that re-expands the data set to > several million rows by cloning each row by the number of times the > "count" field indicated, is there a way in R to use that field to come > up with summary stats and bargraphs of the distribution of any one of > the other fields? > > best > Matthew > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] summarized data set - how to use an "occurs" field
On Jul 15, 2011, at 6:10 PM, mloxton wrote: I have a data set with 22 fields and several thousand records in which one field (count) indicates the number of times that each specific combination of the other 21 fields occurred in a bigger and largely unavailable data set. So each record is unique in its combination of field values and has a field that identifies how many multiples of this record actually occurred. Without resorting to writing a program that re-expands the data set to several million rows by cloning each row by the number of times the "count" field indicated, is there a way in R to use that field to come up with summary stats and bargraphs of the distribution of any one of the other fields? > dfrm <- expand.grid(A=1:3, B=1:3) > dfrm$counts <- 1:9 > xtabs(counts~A, data=dfrm) A 1 2 3 12 15 18 >barplot(xtabs(counts~A, data=dfrm), xlab="Counts by A level") -- David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
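The same idea extends to other summaries without ever expanding all 22 fields; at most the single field being summarized needs replicating (continuing with the toy dfrm above):

# weighted mean of a numeric field, using the count column as weights
weighted.mean(dfrm$A, w = dfrm$counts)

# median/quantiles: expand just the one column of interest via rep()
quantile(rep(dfrm$A, times = dfrm$counts))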
Re: [R] Add permanently environment variable
On Fri, Jul 15, 2011 at 1:45 PM, Anna Lippel wrote: > Hello everyone, I know how to add a folder path to my EV path but it only > works for the current R session. Is there a way to add it permanently? Here Yes, you can add it permanently using Windows. If you are on Windows 7, something like this should work: WindowsKey + R (to bring up the run console) powershell RET (to bring up the powershell) # Create a new variable with the path to Java $newpath = "C:\Program Files\Java\jre1.6.0_13\bin;" # add the contents of the machine path to the above $newpath += [environment]::GetEnvironmentVariable("PATH", "Machine") $newpath # verify this is correct # now set the environment variable "PATH" to the contents of $newpath [Environment]::SetEnvironmentVariable("PATH", $newpath, "Process") # Check that things look as they should (again) [Environment]::GetEnvironmentVariable("PATH", "Process") # Note that where I put "Process" you would need to put # "Machine" if you want it to be permanent # but be careful because you could really mess things up # which is why I left it at the Process level which will be trashed # when you exit that session of the powershell Another, perhaps simpler option would be to use the control panel. Searching for windows set environment variable will bring up countless guides. HTH, Josh > is my code: > Sys.setenv(PATH=paste("C:\\Program Files\\Java\\jre1.6.0_13\\bin;", > Sys.getenv(x="PATH"), sep="")) > Thanks a lot! > > -- > View this message in context: > http://r.789695.n4.nabble.com/Add-permanently-environment-variable-tp3670920p3670920.html > Sent from the R help mailing list archive at Nabble.com. > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles https://joshuawiley.com/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
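If the goal is only that future R sessions see the extra PATH entry (rather than every program on the machine), R's own startup files are another route: ~/.Renviron and ~/.Rprofile are read at the start of each session (see ?Startup). A sketch, reusing the Java path from the question:

## placed in ~/.Rprofile, this runs at the start of every R session
Sys.setenv(PATH = paste("C:\\Program Files\\Java\\jre1.6.0_13\\bin",
                        Sys.getenv("PATH"), sep = ";"))

Equivalently, a single line in ~/.Renviron such as PATH="C:\Program Files\Java\jre1.6.0_13\bin;${PATH}" achieves the same thing without any R code.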
[R] Z-test
Hi, please could you recommend an R package that computes a two-sample z-test? thanks, Bogdan [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] help page becomes unavailable after a package is reinstalled
Unfortunately, no. We can use any package to reproduce this error, e.g. rgl > library(rgl) > options(help_type = "text") > ?rgl.open # works fine > install.packages('rgl') # reinstall it Installing package(s) into ‘/home/yihui/R/x86_64-pc-linux-gnu-library/2.13’ (as ‘lib’ is unspecified) trying URL 'http://streaming.stat.iastate.edu/CRAN/src/contrib/rgl_0.92.798.tar.gz' Content type 'application/x-gzip' length 162 bytes (1.6 Mb) opened URL == downloaded 1.6 Mb * installing *source* package ‘rgl’ ... checking for gcc... gcc ** testing if installed package can be loaded * DONE (rgl) The downloaded packages are in ‘/tmp/Rtmp0NdwY4/downloaded_packages’ > ?rgl.open Error in fetch(key) : internal error -3 in R_decompress1 > sessionInfo() R version 2.13.1 (2011-07-08) Platform: x86_64-pc-linux-gnu (64-bit) locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C [3] LC_TIME=en_US.UTF-8LC_COLLATE=en_US.UTF-8 [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8 [7] LC_PAPER=en_US.UTF-8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] rgl_0.92.798 loaded via a namespace (and not attached): [1] tools_2.13.1 Regards, Yihui -- Yihui Xie Phone: 515-294-2465 Web: http://yihui.name Department of Statistics, Iowa State University 2215 Snedecor Hall, Ames, IA On Fri, Jul 15, 2011 at 7:33 PM, Gabor Grothendieck wrote: > On Fri, Jul 15, 2011 at 8:14 PM, Rolf Turner wrote: >> >> I can verify that I get exactly the same error (also R 2.13.1 under Ubuntu). >> No idea what to *do* about it, though. :-( >> > You could check if text help works: > > options(help_type = "text") > ?by > > or > > help("by", help_type = "text") > > -- > Statistics & Software Consulting > GKX Group, GKX Associates Inc. > tel: 1-877-GKX-GROUP > email: ggrothendieck at gmail.com > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Z-test
Hi Bogdan, Look at ?pnorm Josh On Fri, Jul 15, 2011 at 9:10 PM, Bogdan Tanasa wrote: > Hi, > > please could you recommend a R package that computes a 2 sample z-test ? > > thanks, > > Bogdan > > [[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles https://joshuawiley.com/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Z-test
Hi, The Z is basically: (mean(x) - mean(y))/sqrt(var(x)/length(x) + var(y)/length(y)) and pnorm will give you a p-value, if you desire it. If the n - 1 divisior used in var() is a problem for you, it is trivial to work around: X <- cbind(x, y) XX <- crossprod(X - tcrossprod(matrix(1, nrow(X))) %*% X * (1/nrow(X))) * 1/nrow(X) diff(colMeans(X))/sqrt(sum(diag(XX)/nrow(X))) where the last line gives the Z and again, pnorm() will give you a p-value if desired. In most cases a t-test is preferred (and is available using the t.test function). HTH, Josh On Fri, Jul 15, 2011 at 9:56 PM, Bogdan Tanasa wrote: > Hi Josh, > > thanks for your email. I have been looking into pnorm, but hmmm ... it does > not seem to assess the difference between 2 populations, it says > that it works on a vector of quantiles, and sd=1, mean = 0. please let me > know if you have any suggestions. thanks, > > bogdan > > On Fri, Jul 15, 2011 at 9:49 PM, Joshua Wiley > wrote: >> >> Hi Bogdan, >> >> Look at ?pnorm >> >> Josh >> >> On Fri, Jul 15, 2011 at 9:10 PM, Bogdan Tanasa wrote: >> > Hi, >> > >> > please could you recommend a R package that computes a 2 sample z-test ? >> > >> > thanks, >> > >> > Bogdan >> > >> > [[alternative HTML version deleted]] >> > >> > __ >> > R-help@r-project.org mailing list >> > https://stat.ethz.ch/mailman/listinfo/r-help >> > PLEASE do read the posting guide >> > http://www.R-project.org/posting-guide.html >> > and provide commented, minimal, self-contained, reproducible code. >> > >> >> >> >> -- >> Joshua Wiley >> Ph.D. Student, Health Psychology >> University of California, Los Angeles >> https://joshuawiley.com/ > > -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles https://joshuawiley.com/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
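To wrap that formula up as a reusable function (two-sided p-value; the name z.test2 is arbitrary, and the sketch assumes samples large enough for the normal reference to be reasonable):

z.test2 <- function(x, y) {
  z <- (mean(x) - mean(y)) / sqrt(var(x)/length(x) + var(y)/length(y))
  p <- 2 * pnorm(-abs(z))            # two-sided p-value from the standard normal
  c(z = z, p.value = p)
}

set.seed(1)
z.test2(rnorm(200, mean = 0), rnorm(200, mean = 0.3))

(If memory serves, the BSDA package also ships a ready-made z.test() function, but the few lines above avoid the extra dependency.)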
[R] R sign test for censored data
Does anyone know of a statistical test implemented in R that can do a sign test for a difference of medians, but that can handle censored data? Thanks! [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.