[R] Best Practices for submitting packages to CRAN
Dear all, I (and a colleague) have been working on our first package (for fitting a certain type of cognitive models: http://www.psychologie.uni-freiburg.de/Members/singmann/R/mptinr) for quite a while now and have the feeling, that it is "good to go". That is, we want to submit it to CRAN. However, I still have two questions before doing so and would appreciate any comments. 1. How often is it appropriate to submit updated versions of your package? Background for this question: Although we think we have tested the package thoroughly, there are some errors that only pop up in daily use. That is, it could happen that, especially shortly after the release, fixes need to be released rather frequently (say, every 2 weeks). On the other hand, I know that humans have to operate CRAN and their time is limited. Therefore, any update will consume their time. 2. Is it necessary to put examples that take a considerable amount of time to run (> 1 hour) into a \dontrun block? Background: We have a really slow MCMC function. Some of the examples take ~1 hour to finish. If these examples are run each time the package is checked, it will significantly prolong the checking time. On the other hand, this check will ensure that all changes to the function do not corrupt it. Thank you for taking your time, Henrik -- View this message in context: http://r.789695.n4.nabble.com/Best-Practices-for-submitting-packages-to-CRAN-tp3480870p3480870.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Position of Axis Labels (Base Graphics)
Hi,

I wonder if there is a way to selectively manipulate the position of the axis labels (i.e., the numbers). As far as I understand ?par, the only way to do this is via mgp = c(x, y, z), where y (the second element) is the relevant number. However, this changes the position of the axis labels for both the x- and the y-axis.

The reason I want to do this is that, with the default values, the x-axis labels seem to sit somewhat further from the axis ticks than the y-axis labels. To make the two more similar, I would like to move the x-axis labels a little closer to the axis.

I know I can work around this problem by suppressing one axis (yaxt = "n") and then adding it afterwards via axis(side = 2, ...). But I wonder whether this can be done in a better way (hey, it is R).

Here is a plot that should show the different distances of the labels from the axis ticks:

plot(50, 50, xlab = "", ylab = "", xlim = c(30, 100), ylim = c(30, 100), cex.axis = 0.8)

Thanks,
Henrik Singmann
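For reference, the workaround mentioned in the question can be sketched as follows. It suppresses the x-axis (rather than the y-axis) and passes an axis-specific mgp to axis(), which accepts mgp as a graphical parameter. The value 0.6 and the name x_mgp are only illustrative assumptions, not recommended settings.

```r
# Draw the plot without an x-axis; the y-axis keeps the default mgp.
plot(50, 50, xlab = "", ylab = "", xlim = c(30, 100), ylim = c(30, 100),
     cex.axis = 0.8, xaxt = "n")

# The second element of mgp is the margin line of the tick labels.
# 0.6 (instead of the default 1) is an arbitrary example value.
x_mgp <- c(3, 0.6, 0)

# Add the x-axis back with its own label position.
axis(side = 1, mgp = x_mgp, cex.axis = 0.8)
```

Because mgp is given to axis() only, the setting in par() stays untouched and the y-axis is unaffected.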
[R] car::Anova - Can it be used for ANCOVA with repeated-measures factors.
Dear list,

I would like to run an ANCOVA using car::Anova with repeated-measures factors, but I can't figure out how to do it. My (between-subjects) covariate always interacts with my within-subject factors. As far as I understand ANCOVA, covariates usually do not interact with the effects of interest but are simply additive (or am I wrong here?).

More specifically, I can add a covariate as a factor to the between-subjects part when fitting the MLM, and there it behaves as expected (i.e., it does not interact with the other factors). But when calling Anova on the model, I don't know how to specify the between-within design (i.e., which parts of the model should interact with the repeated-measures factors). As far as I understand it, neither the idesign, icontrasts, or imatrix arguments nor the linearHypothesis function can specify the within-between design (as far as I can tell, they all specify the within- or intra-subject design; see John Fox's slides from useR! 2011: http://web.warwick.ac.uk/statsdept/useR-2011/TalkSlides/Contributed/17Aug_1705_FocusV_4-Multivariate_1-Fox.pdf).

If this is not possible using car::Anova, is there another way to achieve what I want, or is it plainly wrong? I have the feeling that this could be possible using R's "New Functions for Multivariate Analysis" (Dalgaard, 2007, R News), but some advice on how would be greatly appreciated, as this does not seem to be the most straightforward way.

Below is an example using the car::OBrienKaiser dataset with an added age covariate. The example is merely an adaptation of the one in ?Anova with minimal changes, and it includes, e.g., age:phase:hour, which I don't want to have.

Note that I posted this question to stackoverflow two days ago (http://stackoverflow.com/q/11567446/289572) and did not receive any responses. Please excuse my "crossposting", but I think R-help may be the better place.

Best,
Henrik

PS: I know that the posting guide says "No questions about contributed packages", but there are some questions about car on R-help, so I thought this would be the correct place.

## Example follows
require(car)
set.seed(1)
n.OBrienKaiser <- within(OBrienKaiser, age <- sample(18:35, size = 16, replace = TRUE))
phase <- factor(rep(c("pretest", "posttest", "followup"), c(5, 5, 5)),
                levels = c("pretest", "posttest", "followup"))
hour <- ordered(rep(1:5, 3))
idata <- data.frame(phase, hour)
mod.ok <- lm(cbind(pre.1, pre.2, pre.3, pre.4, pre.5,
                   post.1, post.2, post.3, post.4, post.5,
                   fup.1, fup.2, fup.3, fup.4, fup.5) ~ treatment * gender + age,
             data = n.OBrienKaiser)
(av.ok <- Anova(mod.ok, idata = idata, idesign = ~phase*hour, type = 3))
# Type II Repeated Measures MANOVA Tests: Pillai test statistic
#                             Df test stat approx F num Df den Df  Pr(>F)
# (Intercept)                  1     0.971    299.9      1      9 0.00032 ***
# treatment                    2     0.492      4.4      2      9 0.04726 *
# gender                       1     0.193      2.1      1      9 0.17700
# age                          1     0.045      0.4      1      9 0.53351
# treatment:gender             2     0.389      2.9      2      9 0.10867
# phase                        1     0.855     23.6      2      8 0.00044 ***
# treatment:phase              2     0.696      2.4      4     18 0.08823 .
# gender:phase                 1     0.079      0.3      2      8 0.71944
# age:phase                    1     0.140      0.7      2      8 0.54603
# treatment:gender:phase       2     0.305      0.8      4     18 0.53450
# hour                         1     0.939     23.3      4      6 0.00085 ***
# treatment:hour               2     0.346      0.4      8     14 0.92192
# gender:hour                  1     0.286      0.6      4      6 0.67579
# age:hour                     1     0.262      0.5      4      6 0.71800
# treatment:gender:hour        2     0.539      0.6      8     14 0.72919
# phase:hour                   1     0.663      0.5      8      2 0.80707
# treatment:phase:hour         2     0.893      0.3     16      6 0.97400
# gender:phase:hour            1     0.700      0.6      8      2 0.76021
# age:phase:hour               1     0.813      1.1      8      2 0.56210
# treatment:gender:phase:hour  2     1.003      0.4     16      6 0.94434
# ---
# Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

--
Dipl. Psych. Henrik Singmann
PhD Student
Albert-Ludwigs-Universität Freiburg
http://www.psychologie.uni-freiburg.de/Members/singmann
Re: [R] car::Anova - Can it be used for ANCOVA with repeated-measures factors.
Dear John,

thanks for your response. But if I simply ignore the unwanted effects, the estimates of the main effects for the within-subjects factors are distorted (for the rationale, see below). Or doesn't this hold for between-within interactions?

Or, put another way: Do you think this approach is the correct way of running an ANCOVA involving within-subject factors? As far as I understand ANCOVA, the covariate(s) should only be additive factors and should not interact with the factors of interest: "Suppose that differences in [the mean of the covariate] are due to sources of variation related to [the mean of the dependent variable], but not directly related to the treatment effects." (Winer, 1972, p. 753; the parts in square brackets replace the mathematical symbols with their definitions).

Best,
Henrik

PS: Showing that adding the interaction term massively changes the main effect for a between-factor:

# The ANCOVA:
Anova(lm(pre.1 ~ treatment + age, data = n.OBrienKaiser), type = 3)
Anova Table (Type III tests)

Response: pre.1
            Sum Sq Df F value Pr(>F)
(Intercept)    0.0  1    0.01   0.90
treatment      0.3  2    0.06   0.94
age            4.5  1    1.54   0.24
Residuals     34.9 12

# The ANOVA:
Anova(lm(pre.1 ~ treatment, data = n.OBrienKaiser), type = 3)
Anova Table (Type III tests)

Response: pre.1
            Sum Sq Df F value Pr(>F)
(Intercept)  225.6  1   74.47 0.0097 ***
treatment      1.1  2    0.17   0.84
Residuals     39.4 13
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

# The model with interaction
Anova(lm(pre.1 ~ treatment * age, data = n.OBrienKaiser), type = 3)
Anova Table (Type III tests)

Response: pre.1
              Sum Sq Df F value Pr(>F)
(Intercept)     3.01  1    1.40  0.264
treatment      13.71  2    3.18  0.085 .
age            11.56  1    5.37  0.043 *
treatment:age  13.37  2    3.11  0.089 .
Residuals      21.53 10
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

On 22.07.2012 16:59, John Fox wrote:

Dear Henrik,

As you discovered, entering the covariate age additively into the between-subject model doesn't prevent Anova() from reporting tests for the interactions between age and the within-subjects factors. I'm not sure why you would want to do so, but you could simply ignore these tests.

I hope this helps,
John

John Fox
Senator William McMaster Professor of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
http://socserv.mcmaster.ca/jfox
Re: [R] pvalue calculate
Hi Mary,

I think the good old t-test is what you want:

x <- sample(1:50)
t.test(x, mu = 300)

gives:

        One Sample t-test

data:  x
t = -133.2, df = 49, p-value < 2.2e-16
alternative hypothesis: true mean is not equal to 300
95 percent confidence interval:
 21.36 29.64
sample estimates:
mean of x
     25.5

Best,
Henrik

On 22.07.2012 21:37, Mary Kindall wrote:

I have a value a = 300 and observations x = sample(1:50). How do I find a p-value from this? I need to show that "a" is different from mean(x).

Thanks
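To make clear what the one-sample t-test computes here, the following sketch reproduces the test statistic and p-value by hand and compares them with t.test(). The variable names t_manual and p_manual are illustrative only; since sample(1:50) is just a permutation of 1:50, the result does not depend on the random seed.

```r
x  <- sample(1:50)   # a permutation of 1:50, so mean and sd are fixed
mu <- 300

# t statistic: distance of the sample mean from mu in standard-error units
se       <- sd(x) / sqrt(length(x))
t_manual <- (mean(x) - mu) / se

# two-sided p-value from the t distribution with n - 1 degrees of freedom
p_manual <- 2 * pt(-abs(t_manual), df = length(x) - 1)

res <- t.test(x, mu = mu)
c(manual = t_manual, t.test = unname(res$statistic))
```

The p-value is far below the smallest value R prints by default, which is why the output reports it only as being smaller than 2.2e-16.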
Re: [R] car::Anova - Can it be used for ANCOVA with repeated-measures factors.
[...] within-subjects contrasts are constructed by Anova() to be orthogonal in the row-basis of the design, so you should be able to safely ignore the effects in which (for some reason that escapes me) you are uninterested. This would also be true (except for the estimated error) for the between-subjects design if you used "type-II" tests.

It's true that the "type-III" between-subjects tests will be affected by the presence of an interaction, but for these tests to make sense at all, you have to formulate the model very carefully. For example, your type-III test for the "main effect" of treatment with the interaction in the model is for the treatment effect at age 0. Does that really make sense to you? Indeed, the type-III tests for the ANOVA (not ANCOVA) model only make sense because I was careful to use contrasts for the between-subjects factors that are orthogonal in the basis of the design:

> contrasts(OBrienKaiser$treatment)
        [,1] [,2]
control   -2    0
A          1   -1
B          1    1
> contrasts(OBrienKaiser$gender)
  [,1]
F    1
M   -1

Best,
John
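The remark that the "main effect" of treatment in a model with a treatment-by-age interaction refers to the treatment effect at age 0 can be illustrated with a small simulation in base R. This is only a sketch under stated assumptions: the data are simulated, the model uses R's default treatment contrasts rather than the contrasts of OBrienKaiser, and the names d, fit_raw, fit_ctr and group_gap are made up for this example.

```r
set.seed(1)
n <- 40
d <- data.frame(
  treatment = factor(rep(c("control", "A"), each = n / 2),
                     levels = c("control", "A")),
  age = sample(18:35, n, replace = TRUE)
)
# Simulated response with a treatment effect that depends on age
d$y <- 1 + 0.1 * d$age + (d$treatment == "A") * (0.5 + 0.2 * d$age) + rnorm(n)

# Same model, once with raw age and once with mean-centered age
fit_raw <- lm(y ~ treatment * age, data = d)
d$age_c <- d$age - mean(d$age)
fit_ctr <- lm(y ~ treatment * age_c, data = d)

# Predicted difference A - control at a given age, from the raw-age model
group_gap <- function(at_age) {
  lv <- levels(d$treatment)
  nd <- data.frame(treatment = factor(lv, levels = lv), age = at_age)
  unname(diff(predict(fit_raw, nd)))
}
gap_at_0    <- group_gap(0)
gap_at_mean <- group_gap(mean(d$age))

coef(fit_raw)["treatmentA"]  # the group difference at age 0
coef(fit_ctr)["treatmentA"]  # the group difference at the mean age
```

Both fits describe the same regression surface; only the point at which the treatment coefficient is evaluated changes. Centering the covariate is one common way to make that coefficient refer to a meaningful age.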
Re: [R] R Beginner : Loop and adding row to dataframe
Hi Phil,

I think you want:

merge(listA, listB, by = "NACE")

which will give you:

  NACE Name aaa bbb ccc
1    1    a   a   a   c
2    1    a   a   a   c
3    1    a   a   a   c
4    2    b   a   a   c
5    2    b   a   a   c
6    3    c   a   a   c

If you want to get rid of the Name column, the following should help:

tmp <- merge(listA, listB, by = "NACE")
tmp[,-2]

  NACE aaa bbb ccc
1    1   a   a   c
2    1   a   a   c
3    1   a   a   c
4    2   a   a   c
5    2   a   a   c
6    3   a   a   c

Cheers,
Henrik

On 22.07.2012 18:35, ph!l wrote:

Hi everybody,

I am currently quite inexperienced with R. I am trying to create a function that simply takes a value in one data frame, looks for this value in another data frame, and copies all the rows that have this value. This example is a simplified version of what I am doing, but it is enough to help me.

listA

Name NACE
a    1
b    2
c    3

ListB

NACE aaa bbb ccc
1    a   a   c
1    a   a   c
1    a   a   c
2    a   a   c
2    a   a   c
3    a   a   c
4    a   a   c
4    a   a   c
4    a   a   c

The output I would like to have:

NACE aaa bbb ccc
1    a   a   c
1    a   a   c
1    a   a   c
2    a   a   c
2    a   a   c
3    a   a   c

Code:

listpeer <- function (x) {
  for (i in 1:length(listA$NACE))
    TriNACE[i] <- subset.data.frame(ListB, NACE == NACEsample$NACE[i],)
  TriNACE
}

But the result is:

Warning message:
In `[<-.data.frame`(`*tmp*`, i, value = list(NACE = c(3L, 3L, 3L :
  provided xx variables to replace x variables

I guess there is something wrong with "TriNACE[i]"; instead I should use something that adds rows, but I really can't find anything. Does anybody have a clue?

Thank you for your time and help!
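The merge() answer can be tried out with the example data from the question. The sketch below rebuilds the two data frames (the names listA and listB follow the reply) and also shows a plain row filter with %in%, which is an alternative not mentioned in the thread and avoids the extra Name column altogether.

```r
listA <- data.frame(Name = c("a", "b", "c"), NACE = 1:3)
listB <- data.frame(
  NACE = c(1L, 1L, 1L, 2L, 2L, 3L, 4L, 4L, 4L),
  aaa = "a", bbb = "a", ccc = "c"
)

# Join on NACE, then drop the Name column by name instead of by position
by_merge <- merge(listA, listB, by = "NACE")
by_merge <- by_merge[, names(by_merge) != "Name"]

# Alternative: keep only the rows of listB whose NACE occurs in listA
by_filter <- listB[listB$NACE %in% listA$NACE, ]

by_merge
by_filter
```

Both results contain the six rows with NACE 1, 2 and 3. The filter keeps the original row order and row names of listB, while merge() sorts by the key and renumbers the rows.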
Re: [R] First value in a row
Hi Camilo,

as you want to work on all rows, apply() is your friend. In the following, I use an anonymous function that returns the first non-NA value while looping over each row:

dat <- read.table(text = "
Lat Lon x1 x2 x3
01 10 NA NA .1
01 11 NA .2 .3
01 12 .4 .5 .6
", header = TRUE)
apply(dat[,-(1:2)], 1, function(x) x[!is.na(x)][1])

gives:

[1] 0.1 0.2 0.4

Cheers,
Henrik

Camilo Mora wrote:

Hi. This is likely a trivial problem, but I have not found a solution. Imagine the following data frame:

Lat Lon x1 x2 x3
01 10 NA NA .1
01 11 NA .2 .3
01 12 .4 .5 .6

I want to generate another column that consists of the first value in each row from columns x1 to x3. That is:

NewColumn
.1
.2
.4

Any input greatly appreciated,

Thanks,
Camilo

Camilo Mora, Ph.D.
Department of Geography, University of Hawaii
Re: [R] First value in a row
Hi,

As Arun's idea was also my first idea, let me pinpoint the problem with this solution. It only works if the data in question (i.e., columns x1 to x3) follow the pattern of the example data, in that the NAs form a triangle-like structure. This is because it loops over columns instead of rows and takes advantage of the triangular NA structure.

For example, slightly changing the data leads to a result that does not match what Camilo seems to want:

dat1 <- read.table(text = "
Lat Lon x1 x2 x3
01 10 NA NA .1
01 11 .4 NA .3
01 12 NA .5 .6
", sep = "", header = TRUE)
# correct answer from description would be .1, .4, .5

# arun's solution (the NewColumn part):
rev(unlist(lapply(dat1[,3:5], function(x) x[!is.na(x)][1])))
#  x3  x2  x1
# 0.1 0.5 0.4

# my solution:
apply(dat1[,-(1:2)], 1, function(x) x[!is.na(x)][1])
# [1] 0.1 0.4 0.5

So the question is what you want and what the data look like.

Cheers,
Henrik

On 24.07.2012 14:27, arun wrote:

Hi,

Try this:

dat1 <- read.table(text = "
Lat Lon x1 x2 x3
01 10 NA NA .1
01 11 NA .2 .3
01 12 .4 .5 .6
", sep = "", header = TRUE)
dat2 <- dat1[,3:5]
dat3 <- data.frame(dat1, NewColumn = rev(unlist(lapply(dat2, function(x) x[!is.na(x)][1]))))
row.names(dat3) <- 1:nrow(dat3)
dat3
  Lat Lon  x1  x2  x3 NewColumn
1   1  10  NA  NA 0.1       0.1
2   1  11  NA 0.2 0.3       0.2
3   1  12 0.4 0.5 0.6       0.4

A.K.
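As a supplement to the row-wise apply() solution, here is a vectorized sketch that finds the first non-NA value per row with max.col() and matrix indexing. It is an alternative that does not appear in the thread; the names vals and first_col are made up for this example, and the data are the modified example without the triangular NA pattern.

```r
dat1 <- read.table(text = "
Lat Lon x1 x2 x3
01 10 NA NA .1
01 11 .4 NA .3
01 12 NA .5 .6
", header = TRUE)

vals <- as.matrix(dat1[, c("x1", "x2", "x3")])

# 1 where a value is present, 0 where it is NA; max.col() with
# ties.method = "first" then returns the first column holding a 1.
first_col <- max.col((!is.na(vals)) + 0L, ties.method = "first")

# Indexing a matrix with (row, column) pairs picks one value per row.
dat1$NewColumn <- vals[cbind(seq_len(nrow(vals)), first_col)]

# Row-wise reference solution from the thread
by_apply <- apply(dat1[, c("x1", "x2", "x3")], 1, function(x) x[!is.na(x)][1])
dat1
```

For a row that contains only NAs, max.col() points at the first column, so the new value is NA as well, which matches the behaviour of the apply() version.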
Re: [R] dump.frames and global environment
Dear Jannis,

is there any specific reason you use dump.frames instead of recover? As far as I can see, options(error = recover) would allow you to access the global environment. And as ?recover tells you:

"The use of recover largely supersedes dump.frames as an error option, unless you really want to wait to look at the error. If recover is called in non-interactive mode, it behaves like dump.frames. For computations involving large amounts of data, recover has the advantage that it does not need to copy out all the environments in order to browse in them. If you do decide to quit interactive debugging, call dump.frames directly while browsing in any frame (see the examples)."

However, as I have never used dump.frames, this is not really an answer to your question.

Hope it helps,
Henrik

On 24.07.2012 16:10, Jannis wrote:

Dear list members,

I am trying to use dump.frames to debug some code that I run non-interactively. I use the following method:

dump.frames.mod = function() {
  dump.frames(dumpto = 'test', to.file = TRUE)
  quit(save = 'no', status = 10)
}
options(error = dump.frames.mod)

Is there any way to access the contents of the global environment from the *.rda file created in case of an error? When I run the following, for example, I would like to access the contents of a, b and c from the debugging file:

dump.frames.mod = function() {
  dump.frames(dumpto = 'test', to.file = TRUE)
  quit(save = 'no', status = 10)
}
options(error = dump.frames.mod)
#uncomment with care:
#rm(list=ls())
a = 2
source('testscript.R', local = TRUE)
load('test.rda')
debugger(test)

testscript.R in this test case contains:

b = 2
c = 3
plot(d)

The only way I found is wrapping a function around the lines of code, but this would mean changing a lot of code. Any ideas?

Cheers
Jannis

> sessionInfo()
R version 2.14.1 (2011-12-22)
Platform: x86_64-unknown-linux-gnu (64-bit)
Re: [R] dump.frames and global environment
I was somehow unsure whether your procedure was interactive or not. But if it is non-interactive, your problem remains: "If recover is called in non-interactive mode, it behaves like dump.frames" (and will likely not save the global environment).

The only other idea I have is that you could save the global environment occasionally in your script and load it together with the dumped data:

save.image(file = "save.RData")
# after crash:
load("save.RData")

Best,
Henrik

On 24.07.2012 17:45, Jannis wrote:

Thanks, Henrik, for your reply. Well, the reason (until now) was that I thought recover would only work in interactive sessions. The question, however, would now be how to save the error object and access it later. Additionally, are you sure that the content of the global environment is saved with recover? The handling looks very much the same as browsing dump.frames objects.

Cheers
Jannis
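The idea of saving the workspace so that it is available after a crash can be sketched without dump.frames at all. The helper below is hypothetical (run_and_save_on_error is not an existing R function): it evaluates a set of expressions in an environment and, if one of them fails, writes all objects created so far to an .RData file. It uses tryCatch() instead of options(error = ...), so it is a different mechanism from the one discussed in the thread, and it does not capture the call stack.

```r
run_and_save_on_error <- function(exprs, env = new.env(),
                                  file = tempfile(fileext = ".RData")) {
  ok <- tryCatch({
    # Evaluate the expressions one by one in the given environment
    for (e in exprs) eval(e, envir = env)
    TRUE
  }, error = function(cond) {
    # On failure, store everything that exists in the environment so far
    save(list = ls(env), envir = env, file = file)
    message("Error: ", conditionMessage(cond), "; objects saved to ", file)
    FALSE
  })
  invisible(list(ok = ok, file = file))
}

# Stand-in for the script from the thread; the last step fails on purpose
script <- expression(a <- 2, b <- 2, c <- 3, stop("simulated failure"))
res <- run_and_save_on_error(script)

# After the "crash": load the saved objects into a fresh environment
restored <- new.env()
load(res$file, envir = restored)
ls(restored)
```

In a real batch job one would probably combine this with dump.frames(), so that both the call stack and the data are available afterwards.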
Re: [R] Regular Expression
Hi,

one problem, many solutions, only one of which uses regular expressions, but all work equally well.

dat1 <- read.table(text = "
MONTH QUARTER YEAR
2012-07 2012-3 2012
2001-07 2001-3 2001
2002-01 2002-1 2002
", sep = "", as.is = TRUE, header = TRUE)

# using substr:
substr(dat1$MONTH, 6, 7)
substr(dat1$QUARTER, 6, 7)

# using strsplit:
vapply(strsplit(dat1$MONTH, "-"), "[", i = 2, "")
vapply(strsplit(dat1$QUARTER, "-"), "[", i = 2, "")

# using sub:
sub("[[:digit:]]*-", "", dat1$MONTH)
sub("[[:digit:]]*-", "", dat1$QUARTER)

All produce the desired outcome:

[1] "07" "07" "01"

and

[1] "3" "3" "1"

If the data are regularly like this, I personally would prefer substr.

Cheers,
Henrik

On 24.07.2012 19:36, Fred G wrote:

Hi--

I have three columns in an input file:

MONTH QUARTER YEAR
2012-07 2012-3 2012
2001-07 2001-3 2001
2002-01 2002-1 2002

I want to make output like so:

MONTH QUARTER YEAR
07 3 2012
07 3 2001
01 1 2002

I was having some trouble getting the regular expression to work. I think it should be something like the following:

tmp <- uncurated$MONTH
tmp <- gsub("[^-\\d\\d]","",tmp,perl=TRUE)
tmp[tmp=="-"] <- ""
curated$MONTH <- tmp

tmp <- uncurated$QUARTER
tmp <- gsub("[^-\\d]","",tmp,perl=TRUE)
tmp[tmp=="-"] <- ""
curated$QUARTER <- tmp

but it's not quite working. I want to be able to isolate any digits that occur after the hyphen and to delete everything before and including the hyphen. Would greatly appreciate any clarification anyone can provide.
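A note on why the pattern in the question leaves the strings unchanged: "[^-\\d\\d]" is a negated character class, so it matches any single character that is neither a hyphen nor a digit, and the input contains nothing else. A capture group expresses "the digits after the hyphen" more directly. The sketch below is one possible variant; the helper name after_hyphen is made up for this example.

```r
month   <- c("2012-07", "2001-07", "2002-01")
quarter <- c("2012-3", "2001-3", "2002-1")

# Keep only the digits that follow the hyphen; "\\1" refers to the
# part of the match inside the parentheses.
after_hyphen <- function(x) sub("^[0-9]+-([0-9]+)$", "\\1", x)

after_hyphen(month)
after_hyphen(quarter)

# The pattern from the question removes nothing from these strings:
unchanged <- gsub("[^-\\d\\d]", "", month, perl = TRUE)
```

Strings that do not match the anchored pattern are returned unchanged by sub(), which makes malformed entries easy to spot afterwards.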
Re: [R] ERROR : cannot allocate vector of size (in MB & GB)
However, more RAM wouldn't help much with Windows XP, as it only allows for 2 GB (3 GB at most): http://cran.r-project.org/bin/windows/base/rw-FAQ.html#There-seems-to-be-a-limit-on-the-memory-it-uses_0021
If you want to use more RAM with Windows, you need to use a 64-bit version.

Cheers,
Henrik

On 24.07.2012 19:59, Sarah Goslee wrote:

Sure, get more RAM. 2 GB is a tiny amount if you need to load files of 1 GB into R, and as you've discovered, it won't work. You can try a few simpler things, like making sure there's nothing loaded into R except what you absolutely need. It looks like there's no reason to read the entire file into R at once for what you want to do, so you could also load a chunk, process that, then move on to the next one.

Sarah

On Tue, Jul 24, 2012 at 9:45 AM, Rantony wrote:

Hi,

Here in R, I need to load a huge .csv file. Its size is 200 MB (it may sometimes be more than 1 GB). When I tried to load it into a variable, it took too much time, and after that, when I do cbind by groups, I get an error like this:

"Error: cannot allocate vector of size 82.4 Mb"

My requirement is to split the data from the huge .csv file into a number of small .csv files. I will give the number of lines to split by as input. My code is below.

---
SplitLargeCSVToMany <- function(DataMatrix,Destination,NoOfLineToGroup)
{
  test <- data.frame(read.csv(DataMatrix))
  # create groups No.of rows
  group <- rep(1:NROW(test), each=NoOfLineToGroup)
  new.test <- cbind(test, group=group)
  new.test2 <- new.test
  new.test2[,ncol(new.test2)] <- NULL
  # now get indices to write out
  indices <- split(seq(nrow(test)), new.test[, 'group'])
  # now write out the files
  for (i in names(indices))
  {
    write.csv(new.test2[indices[[i]],], file=paste(Destination,"data.", i, ".csv", sep=""),row.names=FALSE)
  }
}
---

My system configuration is:
Intel Core2 Duo, speed: 3 GHz
2 GB RAM
OS: Windows XP [Service Pack 3]

Any hope to solve this issue?

Thanks in advance,
Antony.

-- 
Dipl. Psych. Henrik Singmann
PhD Student
Albert-Ludwigs-Universität Freiburg, Germany
http://www.psychologie.uni-freiburg.de/Members/singmann
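Sarah's suggestion of loading a chunk, processing it, and then moving on to the next one could look roughly like the sketch below. It is not code from the thread: the function name split_csv_in_chunks and its arguments are made up for illustration, and it assumes that every record sits on exactly one line (no line breaks inside quoted fields). It copies lines through an open connection with readLines() and writeLines(), so only one chunk is in memory at any time.

```r
# Hypothetical helper (not from the thread): split a large CSV file into
# smaller files while holding only one chunk of lines in memory at a time.
# Assumption: each record occupies exactly one line of the file.
split_csv_in_chunks <- function(infile, destination, lines_per_file) {
  con <- file(infile, open = "r")
  on.exit(close(con))
  header <- readLines(con, n = 1)        # header line, repeated in every output file
  n_written <- 0
  repeat {
    chunk <- readLines(con, n = lines_per_file)  # next block of data lines
    if (length(chunk) == 0) break                # end of file reached
    n_written <- n_written + 1
    outfile <- file.path(destination, paste0("data.", n_written, ".csv"))
    writeLines(c(header, chunk), outfile)
  }
  invisible(n_written)                   # number of files written
}

# Small demonstration with a temporary file (made-up data)
tmp_dir <- tempfile("split_demo")
dir.create(tmp_dir)
big_csv <- file.path(tmp_dir, "big.csv")
write.csv(data.frame(id = 1:10, x = (1:10)^2), big_csv, row.names = FALSE)
n_files <- split_csv_in_chunks(big_csv, tmp_dir, lines_per_file = 4)
```

As an aside, rep(1:NROW(test), each=NoOfLineToGroup) in the quoted function has NROW(test) * NoOfLineToGroup elements, far more than the number of rows, which is unlikely to help with memory.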
Re: [R] warning message while plotting taylor diagram
Dear Waheed,

As you correctly inferred, these are just warnings and don't need to bother you now. The maintainer/author of the taylor.diagram function should be more worried. These warnings just say that in upcoming versions of R the functions within taylor.diagram() will not work anymore. However, when this will happen is unclear. Currently, everything is okay.

Best,
Henrik

On 26.07.2012 03:46, yssp03 wrote:

Dear all,

I am new to R and do not know much about it. However, through googling I am able to plot a Taylor diagram. Here is the message from R:

> taylor.diagram(obs,M3,pos.cor=FALSE,add=FALSE,pcex=1,col="darkgreen")
Warning messages:
1: sd(<data.frame>) is deprecated. Use sapply(*, sd) instead.
2: sd(<data.frame>) is deprecated. Use sapply(*, sd) instead.
3: mean(<data.frame>) is deprecated. Use colMeans() or sapply(*, mean) instead.
4: mean(<data.frame>) is deprecated. Use colMeans() or sapply(*, mean) instead.
5: mean(<data.frame>) is deprecated. Use colMeans() or sapply(*, mean) instead.
6: mean(<data.frame>) is deprecated. Use colMeans() or sapply(*, mean) instead.

I am worried about these warning messages. Will this affect my Taylor diagram, or are they just warnings?

Thanking you in anticipation

-- 
Dipl. Psych. Henrik Singmann
PhD Student
Albert-Ludwigs-Universität Freiburg, Germany
http://www.psychologie.uni-freiburg.de/Members/singmann
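For reference, the replacements that the warnings recommend can be tried on a small data frame. This is only an illustration with made-up values; the name obs_df is an assumption and has nothing to do with the original poster's data.

```r
# Made-up one-column data frame standing in for an object like "obs"
obs_df <- data.frame(obs = c(2.1, 3.4, 1.9, 4.2))

# Column-wise alternatives named in the warning messages
sds   <- sapply(obs_df, sd)   # instead of sd(obs_df)
means <- colMeans(obs_df)     # instead of mean(obs_df)
```

Both calls return a named numeric vector with one element per column.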
Re: [R] How can I access an element of a string?
Dear Miao,

substr() is what you want.

substr("ABCD", 2, 2)
[1] "B"

Cheers,
Henrik

jpm miao wrote:

Dear Daniel and Jorge,

Thank you very much, it does help. If I have a string "ABCD", how can I access the second element of the string, "B"?

Thanks,
Miao

2012/7/27 Daniel Nordlund:

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of jpm miao
Sent: Thursday, July 26, 2012 9:12 PM
To: r-help
Subject: [R] How can I access the title of a table read via read.csv?

Hi,

I have a table which I can read via read.csv:

fx1<-read.csv(file="A_FX_M.csv", header=TRUE)

    TIME   REER    NTD    JPY      GBP    HKD
1 198001 124.26 36.030 237.96 2.263980 4.8366
2 198002 126.59 36.030 244.05 2.290426 4.8765
3 198003 128.33 36.026 248.62 2.206045 4.9960
4 198004 127.85 36.063 251.67 2.215330 4.9760
5 198005 124.40 36.050 228.35 2.302026 4.8891
6 198006 124.64 36.028 218.05 2.336995 4.9017
7 198007 125.17 36.007 220.95 2.371917 4.9046
8 198008 128.87 35.966 224.45 2.369107 4.9360
...

How can I access the title of the table? For example, I would like to access the character string "REER"; how can I do it?

Thanks,
Miao

Look at ?colnames.

colnames(fx1)[2]

Hope this is helpful,

Dan

Daniel Nordlund
Bothell, WA USA

-- 
Dipl. Psych. Henrik Singmann
PhD Student
Albert-Ludwigs-Universität Freiburg, Germany
http://www.psychologie.uni-freiburg.de/Members/singmann
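A slightly longer sketch of the same idea, in case more than one character is needed. The variable names are made up; strsplit() is shown only as an alternative and is not part of the answer above.

```r
s <- "ABCD"

# substr(x, start, stop) extracts the characters from position start to stop
second <- substr(s, 2, 2)

# Alternative: split the string into single characters and index the result
chars <- strsplit(s, "")[[1]]
second_again <- chars[2]
```

The strsplit() route is convenient when several positions are needed, because chars is an ordinary character vector.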
Re: [R] warning message while plotting taylor diagram
Dear Jim,

indeed, you are right. As the OP did not include the package name, I was too lazy to check where taylor.diagram actually is from. Had I known that it is from plotrix, I would have seen that it actually is not a problem of the author but of handing over the wrong object. As far as I know, functions in plotrix are very well kept, and those errors (i.e., using mean on a data.frame) would not occur, thanks to your work. I had better call ??taylor.diagram next time before posting prematurely.

Best,
Henrik

On 27.07.2012 11:59, Jim Lemon wrote:

On 07/26/2012 10:27 PM, Henrik Singmann wrote:

Dear Waheed,

As you correctly inferred, these are just warnings and don't need to bother you now. The maintainer/author of the taylor.diagram function should be more worried. These warnings just say that in upcoming versions of R the functions within taylor.diagram() will not work anymore. However, when this will happen is unclear. Currently, everything is okay.

Hi Waheed and Henrik,

The problem is that either "obs" or "M3" is a data frame. The first two arguments to the function are supposed to be vectors, so perhaps you have extracted one element of a data frame using something like this:

M3<-my.data.frame[1]

OR

M3<-my.data.frame["M3"]

which produces a one-column data frame. I'll stick a conditional "unlist" into the function to prevent this from causing trouble in future. Thanks for letting me know.

Jim

-- 
Dipl. Psych. Henrik Singmann
PhD Student
Albert-Ludwigs-Universität Freiburg, Germany
http://www.psychologie.uni-freiburg.de/Members/singmann
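The difference Jim describes can be checked on a toy example. The data below are made up; only the name my.data.frame and the column name "M3" are taken from his message.

```r
# Made-up data; only the names my.data.frame and "M3" come from the thread
my.data.frame <- data.frame(obs = c(1, 2, 3), M3 = c(1.5, 2.5, 2.0))

as_df  <- my.data.frame["M3"]     # single brackets: a one-column data frame
as_vec <- my.data.frame[["M3"]]   # double brackets (or $M3): a plain numeric vector

# unlist() turns the one-column data frame into a vector as well
from_unlist <- unlist(as_df, use.names = FALSE)
```

Passing as_vec (or from_unlist) hands a function a vector rather than a data frame.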
Re: [R] Simple question about formulae in R!?
-- 
Dipl. Psych. Henrik Singmann
PhD Student
Albert-Ludwigs-Universität Freiburg, Germany
http://www.psychologie.uni-freiburg.de/Members/singmann
Re: [R] Stopping all code execution when ANY error occurs (OR error handling without try/tryCatch)
Hi Phil,

I really have to concur with Uwe Ligges here. You probably want to wrap everything in a function, because this is usually where long stretches of code end up anyway. The browser() function makes this option really useful, as you can step inside the temporary environment and work from there as if you were in the global environment, with the added benefit that all your variables will be gone when you exit the execution, so you start with a new blank environment the next time you call the function. And if you put the browser() call at the bottom of your function, any error before it will halt execution.

The downside is that you need to re-evaluate the function definition each time you change it before rerunning it. But IDEs help with this by sourcing a function with simple keystrokes (e.g., C-c C-f in ESS-Emacs). However, if there is a syntax error, the original problem remains.

An alternative to browser() is to use options(error = recover) when working with functions. It is also extremely helpful, as it brings you to the point in the function where the error occurs, so you can work from there. Give it a try.

BTW, I got the info on browser() and options(error = recover) from the Chambers book, Software for Data Analysis. It is worth a buy.

Cheers,
Henrik

enocko wrote:

Hi,

thanks for the ideas, folks. I'm on Windows 7, R 2.15.0 x64, RStudio 0.97.71. I do appreciate your time...

I would like to say my goal of dealing with errors without R's error trapping tools is not nonsensical, given that those tools are cumbersome and not well suited to the development phase of coding, where one informally runs various snippets all the time.

The suggestion of looking at IDEs is a good one, because it would not be hard for an IDE to just wait and see if any line of code gives an error, and halt execution if so (a global option could enable this). RStudio doesn't have this. Does anyone know of something that does? I can post a suggestion to RStudio.

Thanks,
Phil

-- 
Dipl. Psych. Henrik Singmann
PhD Student
Albert-Ludwigs-Universität Freiburg, Germany
http://www.psychologie.uni-freiburg.de/Members/singmann
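The approach described in the reply above can be sketched as follows. The function run_analysis and its steps are invented for illustration; the browser() and options(error = recover) lines are left commented out because they are meant for interactive sessions.

```r
# Hypothetical example: a script wrapped in a function. An error in any step
# aborts the whole call, so none of the later steps run.
run_analysis <- function(x) {
  stopifnot(is.numeric(x))     # fails early on wrong input
  centered <- x - mean(x)      # local variable, gone once the function exits
  scaled <- centered / sd(x)
  # browser()                  # uncomment while developing to inspect the local environment
  scaled
}

# For interactive use: on an error, offer to enter the frame where it occurred
# options(error = recover)

result <- run_analysis(c(1, 2, 3))
```

Calling run_analysis("a") stops at the first check, and nothing after it is evaluated, which is the behaviour Phil was asking for at the level of a whole function call.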
Re: [R] ANOVA repeated measures and post-hoc
Hi Diego,

I have also been struggling with this question for some time, and there does not seem to be an easy and general solution to this problem. At least I haven't found one. However, if you have just one repeated-measures factor, use the solution described by me here: http://stats.stackexchange.com/a/15532/442

Furthermore, you might want to check the package phia with its accompanying vignette (but it uses multivariate tests). For running the ANOVA you could also check my new package afex.

Cheers,
Henrik

Diego Bucci wrote:

Hi,

I performed a repeated-measures ANOVA, but I still can't find any good news regarding the possibility to perform multiple comparisons. Can anyone help me?

Thanks

Diego Bucci
Fisiologia Veterinaria
Dipartimento di Scienze Mediche Veterinarie
Università degli Studi di Bologna
Via Tolara di Sopra, 50
40064 Ozzano dell'Emilia, BO
Tel. 00390512097904
mail diego.buc...@unibo.it

-- 
Dipl. Psych. Henrik Singmann
PhD Student
Albert-Ludwigs-Universität Freiburg, Germany
http://www.psychologie.uni-freiburg.de/Members/singmann
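For the simplest case of a single repeated-measures factor, a base-R sketch might look like the one below. Note the assumptions: this is neither the solution from the linked answer nor the afex or phia approach, the data are simulated, and the names dat, id, cond, and y are made up. It fits the ANOVA with aov() and an Error() term and then runs Holm-corrected paired t-tests as one possible set of multiple comparisons.

```r
# Simulated data: 12 subjects, each measured under three conditions (made up)
set.seed(1)
n_subj <- 12
dat <- expand.grid(id = factor(seq_len(n_subj)),
                   cond = factor(c("a", "b", "c")))
dat$y <- rnorm(nrow(dat)) + 0.5 * as.numeric(dat$cond)

# Repeated-measures ANOVA with subject as the error stratum
fit <- aov(y ~ cond + Error(id / cond), data = dat)
summary(fit)

# Paired comparisons need the same subject order within every condition
dat <- dat[order(dat$cond, dat$id), ]
posthoc <- pairwise.t.test(dat$y, dat$cond,
                           p.adjust.method = "holm", paired = TRUE)
posthoc$p.value   # lower-triangular matrix of adjusted p-values
```

Whether paired t-tests are an appropriate follow-up depends on the design; with more than one within-subject factor this sketch does not carry over directly.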