Hello Michael,

thank you for the reply; it really helped me simplify my script. My questions all come down to more or less the same thing, and with your hint I could solve most of my problems.
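For anyone reading this thread later, here is a minimal sketch of how the hint from the quoted reply (unlist() plus rep()) might be applied so that the Levene step works for any number of columns. The helper name to_long is made up for illustration, and the car package is assumed to be installed (the call is skipped if it is not).

```r
# Example data from the original question
dataset <- data.frame(
  A = c(62, 55, 57, 103, 59),
  B = c(36, 24, 61, 19, 79),
  C = c(33, 97, 54, 48, 166),
  D = c(106, 82, 116, 85, 94),
  E = c(32, 16, 9, 7, 46)
)

# Hypothetical helper: put all columns into one response vector and build
# the matching grouping factor, whatever the number of columns
to_long <- function(df) {
  n <- sapply(df, length)  # number of observations per column
  data.frame(
    y     = unlist(df, use.names = FALSE),
    group = factor(rep(names(df), n), levels = names(df))
  )
}

long <- to_long(dataset)

# Levene test from car, run only if the package is available
if (requireNamespace("car", quietly = TRUE)) {
  print(car::leveneTest(y ~ group, data = long))
}
```

Base R's stack(dataset) gives an equivalent long-format data frame (columns values and ind), so the helper is mainly there to make the unlist()/rep() idea explicit. Note that no attach() is needed this way.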
Met vriendelijke groeten - With kind regards,

Joachim Audenaert
onderzoeker gewasbescherming - crop protection researcher

PCS | proefcentrum voor sierteelt - ornamental plant research

Schaessestraat 18, 9070 Destelbergen, België
T: +32 (0)9 353 94 71 | F: +32 (0)9 353 94 95
E: joachim.audena...@pcsierteelt.be | W: www.pcsierteelt.be

From: Michael Dewey <li...@dewey.myzen.co.uk>
To: Joachim Audenaert <joachim.audena...@pcsierteelt.be>, r-help@r-project.org
Date: 14/04/2015 18:17
Subject: Re: [R] : automated levene test and other tests for variable datasets

You ask quite a lot of questions. I have given some hints about your first example inline.

On 14/04/2015 09:07, Joachim Audenaert wrote:
> Hello all,
>
> I am writing a script for statistical comparison of means. I'm doing many
> field trials with plants, where we have to compare the efficacy of
> different treatments on different groups of plants. Therefore I would
> like to automate this script so it can be used for different datasets of
> different experiments (which will have different dimensions). An example
> dataset is given below. I would like to test whether the data in the 5
> columns (A, B, C, D, E) are statistically different from each other, where
> A, B, C, D and E are different treatments of my plants and I have 5
> replications for this experiment.
>
> dataset <- structure(list(A = c(62, 55, 57, 103, 59),
>                           B = c(36, 24, 61, 19, 79),
>                           C = c(33, 97, 54, 48, 166),
>                           D = c(106, 82, 116, 85, 94),
>                           E = c(32, 16, 9, 7, 46)),
>                      .Names = c("A", "B", "C", "D", "E"),
>                      row.names = c(NA, 5L), class = "data.frame")
>
> 1) First I would like to do a Levene test to check the equality of
> variances of my datasets.
> Currently I do this as follows:
>
> library("car")
> attach(dataset)

Usually best to avoid this and use the data= parameter, or with() or within().

> y <- c(A,B,C,D,E)

You could use unlist() here.

> group <- as.factor(c(rep(1, length(A)), rep(2, length(B)), rep(3,
> length(C)), rep(4, length(D)), rep(5, length(E))))

You can get the lengths which you need with

lengths <- lapply(dataset, length)

or

lengths <- sapply(dataset, length)

depending on what you need; then

rep(letters[1:length(lengths)], lengths)

should get you the group variable you want. I have just typed all those in, so there may be typos, but at least you know where to look. I am not suggesting that I think automating all statistical analyses is necessarily a good idea either.

> leveneTest(y, group)
>
> Is there a way to automate this for all types of datasets, so that I can
> use the same script for a dataset with any number of columns of data to
> compare? My script above only works for a dataset with 5 columns to
> compare.
>
> 2) For my boxplots I use
>
> boxplot(dataset)
>
> which gives me the boxplots of all columns, so this is how I want it.
>
> 3) To check normality I currently use the Kolmogorov-Smirnov test as
> follows:
>
> ks.test(A,pnorm)
> ks.test(B,pnorm)
> ks.test(C,pnorm)
> ks.test(D,pnorm)
> ks.test(E,pnorm)
>
> Is there a way to replace the A, B, C, ... on the five lines with one line
> of code so that the Kolmogorov-Smirnov test is done on all columns of my
> dataset at once?
>
> 4) If the data are normally distributed and the variances are equal I want
> to do a t-test with pairwise comparison, currently like this:
>
> pairwise.t.test(y,group,p.adjust.method = "none")
>
> If the data are not normally distributed or the variances are unequal I do
> a pairwise comparison with the Wilcoxon test:
>
> pairwise.wilcox.test(y,group,p.adjust.method = "none")
>
> But again I would like to make this easier. Is there a way to replace the
> y and group in my data line by something so it works for any size of
> dataset?
>
> 5) Once I have my pairwise comparison results I know which groups are
> statistically different from others, so I can add a, b and c to the
> different groups in my graph. Currently I do this on a sheet of paper by
> comparing them one by one. Is there also a way to automate this, so that R
> gives me, for example, something like this?
>
> A: a
> B: a
> C: b
> D: ab
> E: c
>
> All help and comments are welcome. I'm quite new to R and not a
> statistical genius, so if I'm overlooking things or thinking in a wrong
> way please let me know how I can improve my way of working. In short, I
> would like to build a script that can compare the means of different
> groups of data and check if they are statistically different.
>
> Met vriendelijke groeten - With kind regards,
>
> Joachim Audenaert
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Michael
http://www.dewey.myzen.co.uk/home.html
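Questions 3 and 4 above could probably be handled along the following lines. This is only a sketch using base R: sapply() runs the Kolmogorov-Smirnov test on every column, and stack() builds the y/group pair for the pairwise tests. The object names ks_p, pt and pw are just illustrative. One thing to be aware of: ks.test(x, pnorm) on its own compares against a standard normal (mean 0, sd 1), so the sketch passes each column's own mean and sd instead.

```r
# Example data from the original question
dataset <- data.frame(
  A = c(62, 55, 57, 103, 59),
  B = c(36, 24, 61, 19, 79),
  C = c(33, 97, 54, 48, 166),
  D = c(106, 82, 116, 85, 94),
  E = c(32, 16, 9, 7, 46)
)

# Question 3: one Kolmogorov-Smirnov test per column, each against a normal
# distribution with that column's own mean and standard deviation
ks_p <- sapply(dataset, function(x) {
  ks.test(x, "pnorm", mean(x), sd(x))$p.value
})
print(ks_p)

# Question 4: long format for any number of columns;
# stack() returns the columns 'values' (data) and 'ind' (column name)
long <- stack(dataset)

# Pairwise t-tests and pairwise Wilcoxon tests, unadjusted as in the question
pt <- pairwise.t.test(long$values, long$ind, p.adjust.method = "none")
pw <- pairwise.wilcox.test(long$values, long$ind, p.adjust.method = "none")
print(pt)
print(pw)
```

Because the mean and sd are estimated from the same data, the Kolmogorov-Smirnov p-values are only approximate; shapiro.test() is a commonly used alternative and can be dropped into the same sapply() call.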
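For question 5 (the a/b/c letters), one possible route is the multcompLetters() function from the multcompView package, which takes a vector of p-values named like "B-A". The sketch below assumes that package is installed (the call is skipped otherwise) and that the column names contain no hyphens; the object names pm, idx and p are illustrative only.

```r
# Example data from the original question
dataset <- data.frame(
  A = c(62, 55, 57, 103, 59),
  B = c(36, 24, 61, 19, 79),
  C = c(33, 97, 54, 48, 166),
  D = c(106, 82, 116, 85, 94),
  E = c(32, 16, 9, 7, 46)
)
long <- stack(dataset)

# Lower-triangular matrix of pairwise p-values (rows B..E, columns A..D)
pm <- pairwise.t.test(long$values, long$ind, p.adjust.method = "none")$p.value

# Flatten the non-missing entries into a vector named "row-col", e.g. "B-A"
idx <- which(!is.na(pm), arr.ind = TRUE)
p <- pm[idx]
names(p) <- paste(rownames(pm)[idx[, "row"]],
                  colnames(pm)[idx[, "col"]], sep = "-")

# Letter display: groups sharing a letter do not differ at the 5% level
if (requireNamespace("multcompView", quietly = TRUE)) {
  print(multcompView::multcompLetters(p, threshold = 0.05)$Letters)
}
```

The same flattening works for the p-value matrix from pairwise.wilcox.test(). With many groups it may be worth choosing a p.adjust.method other than "none" before assigning letters.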