from:"Joachim Audenaert"

[R] - dunn.test gives strange output

2015-04-22 Thread Joachim Audenaert

Dear all,

I have a question concerning my output of the dunn.test function in R. I 
like to compare different datasets, which are not distributed normally, so 
I use the Dunn.test to do pairwise comparison. I have 2 questions 
concerning the output:

- why are my groupnames changed, the output gives 4 times swirskii, 
although my groupnames are longer (see my dataset)
- secondly when I check the p-values, I see something very odd: I see that 
p values where the same groups are compared sometimes indicate that they 
are different, why are not all those p-values = 1 ?

This is my dataset:

structure(list(controle = c(111, 88, 216, 169), chemie = c(47, 31, 35, 
30), IPM = c(0, 0, 0, 1), gallicus = c(102, 152, 102, 75), swirskii3 = 
c(1, 0, 0, 0), swirskiiA = c(0, 0, 1, 3), swirskiiP = c(0, 0, 1, 6), 
swirskii1x = c(12, 2, 75, 46)), .Names = c("controle", "chemie", "IPM", 
"gallicus", "swirskii3", "swirskiiA", "swirskiiP", "swirskii1x"), 
row.names = c(NA, 4L), class = "data.frame")

I get the following output:

 DUNN <- dunn.test(value, variable, method = "none",kw=TRUE)
 
  Kruskal-Wallis rank sum test

data: value and variable
Kruskal-Wallis chi-squared = 26.8894, df = 7, p-value = 0


Comparison of value by variable  
(No adjustment)  
Col Mean-|
Row Mean |   controle chemieIPM   gallicus   swirskii swirskii
-+--
  chemie |  -1.341044  -3.410083  -2.069039  -0.325682   1.015361 3.084401
 | 0.0900 0.0003 0.0193 0.3723 0.1550 0.0010
 |
 IPM |  -3.410083  -2.069039  -0.325682   1.015361   3.084401 
-3.410083
 | 0.0003 0.0193 0.3723 0.1550 0.0010 0.0003
 |
gallicus |  -0.325682   1.015361   3.084401  -3.410083  -2.069039 0.00
 | 0.3723 0.1550 0.0010 0.0003 0.0193 0.5000
 |
swirskii |  -3.410083  -2.069039   0.00  -3.084401  -3.007770 
-1.666726
 | 0.0003 0.0193 0.5000 0.0010 0.0013 0.0478
 |
swirskii |  -3.007770  -1.666726   0.402313  -2.682088   0.402313 
-2.969454
 | 0.0013 0.0478 0.3437 0.0037 0.3437 
0.0015
 |
swirskii |  -2.969454  -1.628410   0.440628  -2.643772   0.440628 0.038315
 | 0.0015 0.0517 0.3297 0.0041 0.3297 0.4847
 |
swirskii |  -1.475148  -0.134104   1.934935  -1.149466   1.934935 1.532621
 | 0.0701 0.4467 0.0265 0.1252 0.0265 0.0627
Col Mean-|
Row Mean |   swirskii
-+---
  chemie |  -3.410083
 | 0.0003
 |
 IPM |  -2.069039
 | 0.0193
 |
gallicus |  -3.084401
 | 0.0010
 |
swirskii |   0.402313
 | 0.3437
 |
swirskii |  -1.628410
 | 0.0517
 |
swirskii |  -1.475148
 | 0.0701
 |
swirskii |   1.494306
 | 0.0675 




Met vriendelijke groeten - With kind regards,

Joachim Audenaert 
onderzoeker gewasbescherming - crop protection researcher

PCS | proefcentrum voor sierteelt - ornamental plant research

Schaessestraat 18, 9070 Destelbergen, Belgi�
T: +32 (0)9 353 94 71 | F: +32 (0)9 353 94 95
E: joachim.audena...@pcsierteelt.be | W: www.pcsierteelt.be   

Heb je je individuele begeleiding bemesting (CVBB) al aangevraagd? | Het 
PCS op LinkedIn
Disclaimer | Please consider the environment before printing. Think green, 
keep it on the screen!
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] - Obtaining superscripts to affix to means that are not significantly different from each other with R

2015-04-23 Thread Joachim Audenaert

Hello all,

It is often time consuming to interpret p-values of multiple pairwise 
comparisons of groups and assign them a letter code for publication 
purposes. So I found this interesting link to a program that does this for 
you. 

http://www.jerrydallal.com/lhsp/similar.htm

I was wondering if something similar exists in R?


Met vriendelijke groeten - With kind regards,

Joachim Audenaert 
onderzoeker gewasbescherming - crop protection researcher

PCS | proefcentrum voor sierteelt - ornamental plant research

Schaessestraat 18, 9070 Destelbergen, Belgi�
T: +32 (0)9 353 94 71 | F: +32 (0)9 353 94 95
E: joachim.audena...@pcsierteelt.be | W: www.pcsierteelt.be   

Heb je je individuele begeleiding bemesting (CVBB) al aangevraagd? | Het 
PCS op LinkedIn
Disclaimer | Please consider the environment before printing. Think green, 
keep it on the screen!
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] - Obtaining superscripts to affix to means that are not significantly different from each other with R

2015-04-23 Thread Joachim Audenaert

Is there also a version for non parametric tests like: 

pairwise.wilcox.test {stats} 



Met vriendelijke groeten - With kind regards,

Joachim Audenaert 
onderzoeker gewasbescherming - crop protection researcher

PCS | proefcentrum voor sierteelt - ornamental plant research

Schaessestraat 18, 9070 Destelbergen, België
T: +32 (0)9 353 94 71 | F: +32 (0)9 353 94 95
E: joachim.audena...@pcsierteelt.be | W: www.pcsierteelt.be 



From:   David L Carlson 
To: Joachim Audenaert , 
"r-help@r-project.org" 
Date:   23/04/2015 14:51
Subject:RE: [R] - Obtaining superscripts to affix to means that 
are not significantly different from each other with R



The function cld() in package multcomp generates compact letter displays, 
but does not format them as exponents of the group names.

-
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352

-Original Message-
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Joachim 
Audenaert
Sent: Thursday, April 23, 2015 4:58 AM
To: r-help@r-project.org
Subject: [R] - Obtaining superscripts to affix to means that are not 
significantly different from each other with R

Hello all,

It is often time consuming to interpret p-values of multiple pairwise 
comparisons of groups and assign them a letter code for publication 
purposes. So I found this interesting link to a program that does this for 

you. 

http://www.jerrydallal.com/lhsp/similar.htm

I was wondering if something similar exists in R?


Met vriendelijke groeten - With kind regards,

Joachim Audenaert 
onderzoeker gewasbescherming - crop protection researcher

PCS | proefcentrum voor sierteelt - ornamental plant research

Schaessestraat 18, 9070 Destelbergen, Belgi�
T: +32 (0)9 353 94 71 | F: +32 (0)9 353 94 95
E: joachim.audena...@pcsierteelt.be | W: www.pcsierteelt.be 

Heb je je individuele begeleiding bemesting (CVBB) al aangevraagd? | Het 
PCS op LinkedIn
Disclaimer | Please consider the environment before printing. Think green, 

keep it on the screen!
 [[alternative HTML version deleted]]




Heb je je individuele begeleiding bemesting (CVBB) al aangevraagd? | Het 
PCS op LinkedIn
Disclaimer | Please consider the environment before printing. Think green, 
keep it on the screen!

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] : automated levene test and other tests for variable datasets

2015-04-14 Thread Joachim Audenaert

Hello all,

I am writing a script for statistical comparison of means. I'm doing many 
field trials with plants, where we have to compare the efficacy of 
different treatments on, different groups of plants. Therefore I would 
like to automate this script so it can be used for different datasets of 
different experiments (which will have different dimensions). An example 
dataset is given here under, I would like to compare if the data of 5 
columns (A,B,C,D,E) are statistically different from each other, where A, 
B, C, D and A are different treatments of my plants and I have 5 
replications for this experiment

dataset <- structure(list(A = c(62, 55, 57, 103, 59), B = c(36, 24, 61, 
19, 79), C = c(33, 97, 54, 48, 166), D = c(106, 82, 116, 85, 94), E = 
c(32, 16, 9, 7, 46)), .Names = c("A", "B", "C", "D","E"), row.names = 
c(NA, 5L), class = "data.frame")

1) First I would like to do a levene test to check the equality of 
variances of my datasets. Currently I do this as follows:

library("car")
attach(dataset)
y <- c(A,B,C,D,E)
group <- as.factor(c(rep(1, length(A)), rep(2, length(B)),rep(3, 
length(C)), rep(4, length(D)),rep(5, length(E
leveneTest(y, group)

Is there a way to automate this for all types of datasets, so that I can 
use the same script for a datasets with any number of columns of data to 
compare? My above script only works for a dataset with 5 columns to 
compare

2) For my boxplots I use

boxplot(dataset) 

which gives me all the boxplots of each dataset, so this is how I want it

3) To check normality I currently use the kolmogorov smirnov test as 
follows

ks.test(A,pnorm)
ks.test(B,pnorm)
ks.test(C,pnorm)
ks.test(D,pnorm)
ks.test(E,pnorm)

Is there a way to replace the A, B, C, ... on the five lines into one line 
of entry so that the kolmogorov smirnov test is done on all columns of my 
dataset at once?

4) if data is normally distributed and the variances are equal I want to 
do a t-test and do pairwise comparison, currently like this

pairwise.t.test(y,group,p.adjust.method = "none")

if data is not normally distributed or variances are unequal I do a 
pairwise comparison with the wilcoxon test

pairwise.wilcox.test(y,group,p.adjust.method = "none")

But again I would like to make this easier, is there a way to replace the 
y and group in my datalineby something so it works for any size of 
dataset?

5) Once I have my paiwise comparison results I know which groups are 
statistically different from others, so I can add a and b and c to 
different groups in my graph. Currently I do this on a sheet of paper by 
comparing them one by one. Is there also a way to automate this? So R 
gives me for example something like this

A: a
B: a
C: b
D: ab
E: c

All help and commentys are welcome. I'm quite new to R and not a 
statistical genious, so if I'm overseeing things or thinking in a wrong 
way please let me know how I can improve my way of working. In short I 
would like to build a script that can compare the means of different 
groups of data and check if they are statistically diiferent

Met vriendelijke groeten - With kind regards,

Joachim Audenaert 
onderzoeker gewasbescherming - crop protection researcher

PCS | proefcentrum voor sierteelt - ornamental plant research

Schaessestraat 18, 9070 Destelbergen, Belgi�
T: +32 (0)9 353 94 71 | F: +32 (0)9 353 94 95
E: joachim.audena...@pcsierteelt.be | W: www.pcsierteelt.be   

Heb je je individuele begeleiding bemesting (CVBB) al aangevraagd? | Het 
PCS op LinkedIn
Disclaimer | Please consider the environment before printing. Think green, 
keep it on the screen!
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] : automated levene test and other tests for variable datasets

2015-04-15 Thread Joachim Audenaert

Thank you very much for the reply Thierry,

It was very useful for me, currently I updated my script as follows, to be 
able to use the same script for different datasets:

adapting my dataset : y <- melt(dataset, na.rm=TRUE) where "na.rm = true" 
ommits missing data points

variable <- y[,1] 
value <- y[,2]

and then for the tests

leveneTest(value~variable,y)
apply(dataset,MARGIN=2,FUN=function(x) ks.test(x,pnorm)$p.value)

pairwise.t.test(value,variable,p.adjust.method = "none")
pairwise.wilcox.test(value,variable,p.adjust.method = "none")

Met vriendelijke groeten - With kind regards,

Joachim Audenaert 
onderzoeker gewasbescherming - crop protection researcher

PCS | proefcentrum voor sierteelt - ornamental plant research

Schaessestraat 18, 9070 Destelbergen, Belgi�
T: +32 (0)9 353 94 71 | F: +32 (0)9 353 94 95
E: joachim.audena...@pcsierteelt.be | W: www.pcsierteelt.be 



From:   Thierry Onkelinx 
To: Joachim Audenaert 
Cc: "r-help@r-project.org" 
Date:   15/04/2015 13:31
Subject:Re: [R] : automated levene test and other tests for 
variable datasets



Dear Joachim,

Storing your data in a long format will make this a lot easier.

library(reshape2)
long.data <- melt(dataset, measure.var = c("A", "B", "C", "D", "E"))
library(car)
leveneTest(value ~ variable, data = long.data)

library(plyr)
ddply(long.data, "variable", function(x){ks.test(x$value})

Best regards,



ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and 
Forest 
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance 
Kliniekstraat 25
1070 Anderlecht
Belgium

To call in the statistician after the experiment is done may be no more 
than asking him to perform a post-mortem examination: he may be able to 
say what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner 
The combination of some data and an aching desire for an answer does not 
ensure that a reasonable answer can be extracted from a given body of 
data. ~ John Tukey

2015-04-14 10:07 GMT+02:00 Joachim Audenaert <
joachim.audena...@pcsierteelt.be>:
Hello all,

I am writing a script for statistical comparison of means. I'm doing many
field trials with plants, where we have to compare the efficacy of
different treatments on, different groups of plants. Therefore I would
like to automate this script so it can be used for different datasets of
different experiments (which will have different dimensions). An example
dataset is given here under, I would like to compare if the data of 5
columns (A,B,C,D,E) are statistically different from each other, where A,
B, C, D and A are different treatments of my plants and I have 5
replications for this experiment

dataset <- structure(list(A = c(62, 55, 57, 103, 59), B = c(36, 24, 61,
19, 79), C = c(33, 97, 54, 48, 166), D = c(106, 82, 116, 85, 94), E =
c(32, 16, 9, 7, 46)), .Names = c("A", "B", "C", "D","E"), row.names =
c(NA, 5L), class = "data.frame")

1) First I would like to do a levene test to check the equality of
variances of my datasets. Currently I do this as follows:

library("car")
attach(dataset)
y <- c(A,B,C,D,E)
group <- as.factor(c(rep(1, length(A)), rep(2, length(B)),rep(3,
length(C)), rep(4, length(D)),rep(5, length(E
leveneTest(y, group)

Is there a way to automate this for all types of datasets, so that I can
use the same script for a datasets with any number of columns of data to
compare? My above script only works for a dataset with 5 columns to
compare

2) For my boxplots I use

boxplot(dataset)

which gives me all the boxplots of each dataset, so this is how I want it

3) To check normality I currently use the kolmogorov smirnov test as
follows

ks.test(A,pnorm)
ks.test(B,pnorm)
ks.test(C,pnorm)
ks.test(D,pnorm)
ks.test(E,pnorm)

Is there a way to replace the A, B, C, ... on the five lines into one line
of entry so that the kolmogorov smirnov test is done on all columns of my
dataset at once?

4) if data is normally distributed and the variances are equal I want to
do a t-test and do pairwise comparison, currently like this

pairwise.t.test(y,group,p.adjust.method = "none")

if data is not normally distributed or variances are unequal I do a
pairwise comparison with the wilcoxon test

pairwise.wilcox.test(y,group,p.adjust.method = "none")

But again I would like to make this easier, is there a way to replace the
y and group in my datalineby something so it works for any size of
dataset?

5) Once I have my paiwise comparison results I know which groups are
statistically different from others, so I can add a and b and c to
different groups in my graph. Currently I do this on a sheet of paper by
comparing them one by one. Is there also a way to automate this? So

Re: [R] : automated levene test and other tests for variable datasets

2015-04-15 Thread Joachim Audenaert

Hello Michael,

thank you for the reply, it realy helped me to simplify my script. 
Basically all my questions are a bit the same, but with your hint I could 
solve most of my problems. 

Met vriendelijke groeten - With kind regards,

Joachim Audenaert 
onderzoeker gewasbescherming - crop protection researcher

PCS | proefcentrum voor sierteelt - ornamental plant research

Schaessestraat 18, 9070 Destelbergen, België
T: +32 (0)9 353 94 71 | F: +32 (0)9 353 94 95
E: joachim.audena...@pcsierteelt.be | W: www.pcsierteelt.be 



From:   Michael Dewey 
To: Joachim Audenaert , 
r-help@r-project.org
Date:   14/04/2015 18:17
Subject:Re: [R] : automated levene test and other tests for 
variable datasets



You ask quite a lot of questions, I have given some hints about your 
first example inline

On 14/04/2015 09:07, Joachim Audenaert wrote:
> Hello all,
>
> I am writing a script for statistical comparison of means. I'm doing 
many
> field trials with plants, where we have to compare the efficacy of
> different treatments on, different groups of plants. Therefore I would
> like to automate this script so it can be used for different datasets of
> different experiments (which will have different dimensions). An example
> dataset is given here under, I would like to compare if the data of 5
> columns (A,B,C,D,E) are statistically different from each other, where 
A,
> B, C, D and A are different treatments of my plants and I have 5
> replications for this experiment
>
> dataset <- structure(list(A = c(62, 55, 57, 103, 59), B = c(36, 24, 61,
> 19, 79), C = c(33, 97, 54, 48, 166), D = c(106, 82, 116, 85, 94), E =
> c(32, 16, 9, 7, 46)), .Names = c("A", "B", "C", "D","E"), row.names 
=
> c(NA, 5L), class = "data.frame")
>
> 1) First I would like to do a levene test to check the equality of
> variances of my datasets. Currently I do this as follows:
>
> library("car")
> attach(dataset)
Usually best to avoid this and use the data=parameter or with or within

> y <- c(A,B,C,D,E)
you could use unlist( ) here
> group <- as.factor(c(rep(1, length(A)), rep(2, length(B)),rep(3,
> length(C)), rep(4, length(D)),rep(5, length(E
you can get the lengths which you need with
lengtha <- lapply(dataset, length)
or
lengths <- sapply(dataset, length)
depending

then
rep(letters[1:length(lengths)], lengths)
should get you the group variable you want.


I have just typed all those in so there may be typos but at least you 
know where to look. I am not suggesting that I think automating all 
statistical analyses is necessarily a good idea either.

> leveneTest(y, group)
>
> Is there a way to automate this for all types of datasets, so that I can
> use the same script for a datasets with any number of columns of data to
> compare? My above script only works for a dataset with 5 columns to
> compare
>
> 2) For my boxplots I use
>
> boxplot(dataset)
>
> which gives me all the boxplots of each dataset, so this is how I want 
it
>
> 3) To check normality I currently use the kolmogorov smirnov test as
> follows
>
> ks.test(A,pnorm)
> ks.test(B,pnorm)
> ks.test(C,pnorm)
> ks.test(D,pnorm)
> ks.test(E,pnorm)
>
> Is there a way to replace the A, B, C, ... on the five lines into one 
line
> of entry so that the kolmogorov smirnov test is done on all columns of 
my
> dataset at once?
>
> 4) if data is normally distributed and the variances are equal I want to
> do a t-test and do pairwise comparison, currently like this
>
> pairwise.t.test(y,group,p.adjust.method = "none")
>
> if data is not normally distributed or variances are unequal I do a
> pairwise comparison with the wilcoxon test
>
> pairwise.wilcox.test(y,group,p.adjust.method = "none")
>
> But again I would like to make this easier, is there a way to replace 
the
> y and group in my datalineby something so it works for any size of
> dataset?
>
> 5) Once I have my paiwise comparison results I know which groups are
> statistically different from others, so I can add a and b and c to
> different groups in my graph. Currently I do this on a sheet of paper by
> comparing them one by one. Is there also a way to automate this? So R
> gives me for example something like this
>
> A: a
> B: a
> C: b
> D: ab
> E: c
>
> All help and commentys are welcome. I'm quite new to R and not a
> statistical genious, so if I'm overseeing things or thinking in a wrong
> way please let me know how I can improve my way of working. In short I
> would like to build a script that can compare the means of different
> groups of data and check if they are statistically diiferent
>
> Met vriendelijke groeten - With kind regards,
>

[R] melt function chooses wrong id variable with large datasets

2015-04-16 Thread Joachim Audenaert

Hello all,

I'm using a large dataset consisting of 2 groups of data, 2 columns in 
excel with a header (group name) and 15 000 rows of data. I would like 
like to compare this data, so I transform my dataset with the melt 
function to get 1 column of data and 1 column of ID variables, then I can 
apply different statistical tests. With small datasets this works great, 
the melt function automatically chooses the name in row 1 as ID variable 
and melts the data, thus giving me a matrix with all ID variables in 
column one and the data accordingly in column 2. 
With this big dataset however it chooses the whole first column as ID 
variables in stead of the first row. Is there a reason why this happens 
and how can I make sure the first row is chosen as ID variabele and the 
lower rows as data? 

If I specify that I want the first row to be the id variable I also get 
error. 

melt(dataset,id.vars=dataset[1,], na.rm=TRUE)

Error: id variables not found in data: norm, jaar

Are there alternative ways to create a good reshaped dataset?

Met vriendelijke groeten - With kind regards,

Joachim Audenaert 
onderzoeker gewasbescherming - crop protection researcher

PCS | proefcentrum voor sierteelt - ornamental plant research

Schaessestraat 18, 9070 Destelbergen, Belgi�
T: +32 (0)9 353 94 71 | F: +32 (0)9 353 94 95
E: joachim.audena...@pcsierteelt.be | W: www.pcsierteelt.be   

Heb je je individuele begeleiding bemesting (CVBB) al aangevraagd? | Het 
PCS op LinkedIn
Disclaimer | Please consider the environment before printing. Think green, 
keep it on the screen!
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] melt function chooses wrong id variable with large datasets

2015-04-16 Thread Joachim Audenaert

Hello,

This is a part of my dataset:

structure(list(januari = c(38.1, 32.4, 34.5, 20.7, 21.5, 23.1, 
29.7, 36.6, 36.1, 20.6, 20.4, 30.1, 38.7, 41.4, 37, 36, 37, 38, 
23, 26.7), februari = c(31.5, 36.2, 38.2, 26.4, 20.9, 21.5, 30.2, 
33.4, 32.6, 22.2, 21.7, 30, 35.7, 32.8, 39.3, 25.5, 23, 19.9, 
21.3, 20.8), maart = c(34.2, 27, 24.2, 19.9, 19.7, 21.5, 30.6, 
30, 19, 19.6, 20.6, 23.6, 17.9, 17.3, 21.4, 24.1, 20.9, 30.1, 
32.6, 21.3), april = c(26.3, 29.6, 30.3, 23.6, 28.4, 20.7, 24.1, 
27.3, 23.2, 18.3, 24.6, 27.4, 20.4, 18.1, 25.2, 19.8, 21, 23.7, 
19.6, 18.1), mei = c(23.7, 24, 17.2, 23.2, 25.2, 17.2, 16, 15.6, 
13.4, 16, 16.8, 14.6, 19.4, 21, 19.5, 18.5, 13.3, 13.7, 14.3, 
14.1), juni = c(17.7, 14.2, 16.6, 15.7, 13.7, 14.7, 13.1, 12.9, 
15.4, 11.9, 15.2, 15.3, 16.5, 16.1, 11.7, 11.2, 11.5, 10.8, 16.1, 
14.8), juli = c(15.7, 14.5, 10.8, 10.5, 13.4, 12.2, 13.2, 13, 
12.4, 13.1, 9.8, 10.5, 13.4, 11, 13.1, 15, 16.7, 16.1, 18.2, 
15.7), augustus = c(12.9, 12.8, 15.2, 14.5, 17.2, 14.5, 14.4, 
11, 13.1, 13.6, 14.6, 12.7, 13.6, 12.7, 15.5, 17.4, 15.2, 14.2, 
17.7, 19.2), september = c(15.6, 15.5, 15.9, 15.1, 16, 19.4, 
21.5, 23.7, 18.7, 23.8, 18, 16.2, 18.5, 20.6, 18.3, 22.5, 26.9, 
19.4, 15.9, 20.5), oktober = c(21.4, 20.8, 14, 17, 23, 26.4, 
19.6, 22.7, 26.9, 14.7, 15.2, 19.8, 26.9, 20.2, 14.3, 14.8, 18.5, 
21.7, 21.4, 21.8), november = c(24.7, 26.2, 29, 21.6, 17.1, 16.9, 
19.1, 24.7, 25.4, 19.8, 18.2, 16.3, 17, 17.7, 15.5, 14.7, 15.8, 
19.9, 20.4, 23.3), december = c(19.8, 27, 21, 33, 22.6, 28.3, 
21.1, 19, 17.3, 27, 30.2, 24.8, 17.9, 17.9, 20.7, 30.9, 36.2, 
21, 20.2, 21.3), norm = c("45.8713463281901", "24.047250681782984", 
"3.7533684144746324", "38.594241119279324", "26.391897460120358", 
"61.746470001194638", "6.8321020448487992", "11.933109250115226", 
"51.951891096493924", "37.424611852237945", "5.1587836676942374", 
"36.552835044409434", "31.781209673851027", "29.09146215582853", 
"4.856812959269508", "5.3982910143166514", "46.553976273304215", 
"17.566272518985429", "20.552451905814117", "61.894775704479279"
)), .Names = c("januari", "februari", "maart", "april", "mei", 
"juni", "juli", "augustus", "september", "oktober", "november", 
"december", "norm"), row.names = c(NA, 20L), class = "data.frame")

I transform my dataset with the following script:

y <- melt(dataset,na.rm=TRUE)
variable <- y[,1] 
value <- y[,2]

and can then perform a levene test as follows:

LEVENE <- leveneTest(value~variable,y)

When the dataset is small, lets say less than 100 values per column 
everything works great. I get the message: 

No id variables; using all as measure variables

When the dataset is much bigger I get the following message

Using norm as id variables, why does this function pick norm as id 
variable? and how can I tell R that each column title is my variable

 
Met vriendelijke groeten - With kind regards,

Joachim Audenaert 
onderzoeker gewasbescherming - crop protection researcher

PCS | proefcentrum voor sierteelt - ornamental plant research

Schaessestraat 18, 9070 Destelbergen, Belgi�
T: +32 (0)9 353 94 71 | F: +32 (0)9 353 94 95
E: joachim.audena...@pcsierteelt.be | W: www.pcsierteelt.be 



From:   PIKAL Petr 
To: Joachim Audenaert , 
"r-help@r-project.org" 
Date:   16/04/2015 12:13
Subject:RE: [R]  melt function chooses wrong id variable with 
large datasets



Hi

There is something weird with your data and melt function.

AFAIK melt does not use first row as id.variables.

What is result of

str(dataset)

Instead of

melt(dataset,id.vars=dataset[1,], na.rm=TRUE)

melt expects something like

melt(dataset, id.vars=c("norm, "jaar"), na.rm=TRUE)

If you want more specific answer you shall show us part of your data, 
preferably copy output of

dput(dataset[1:20,])

into your mail.

Cheers
Petr

> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Joachim
> Audenaert
> Sent: Thursday, April 16, 2015 11:37 AM
> To: r-help@r-project.org
> Subject: [R] melt function chooses wrong id variable with large
> datasets
>
> Hello all,
>
> I'm using a large dataset consisting of 2 groups of data, 2 columns in
> excel with a header (group name) and 15 000 rows of data. I would like
> like to compare this data, so I transform my dataset with the melt
> function to get 1 column of data and 1 column of ID variables, then I
> can apply different statistical tests. With small datasets this works
> great, the melt function automatically chooses the name in row 1 as ID
> variable and melts the data, thus giving me a matrix with all ID
&

Re: [R] melt function chooses wrong id variable with large datasets

2015-04-16 Thread Joachim Audenaert

Thanks,

indeed norm should be in the same group as as the months. everything works 
fine when the number of data is quite small, but with big datasets (15 000 
values) things seem to go wrong and I can't explain why. It puts norm as 
an individual column in stead of in the group of months as it does when 
the dataset is small.

Met vriendelijke groeten - With kind regards,

Joachim Audenaert 
onderzoeker gewasbescherming - crop protection researcher

PCS | proefcentrum voor sierteelt - ornamental plant research

Schaessestraat 18, 9070 Destelbergen, Belgi�
T: +32 (0)9 353 94 71 | F: +32 (0)9 353 94 95
E: joachim.audena...@pcsierteelt.be | W: www.pcsierteelt.be 



From:   PIKAL Petr 
To: Joachim Audenaert 
Cc: "r-help@r-project.org" 
Date:   16/04/2015 13:41
Subject:RE: [R]  melt function chooses wrong id variable with 
large datasets



Hi
 
With this dataset I get
 
> dd.m0<-melt(dataset, na.rm=T)
Using norm as id variables
> head(dd.m0)
norm variable value
1   45.8713463281901  januari  38.1
2 24.047250681782984  januari  32.4
3 3.7533684144746324  januari  34.5
4 38.594241119279324  januari  20.7
5 26.391897460120358  januari  21.5
6 61.746470001194638  januari  23.1
> 
or
 
dd.m<-melt(dataset, id.vars=NULL, na.rm=T)
 
> head(dd.m)
  variable value
1  januari  38.1
2  januari  32.4
3  januari  34.5
4  januari  20.7
5  januari  21.5
6  januari  23.1
> tail(dd.m)
variable  value
255 norm  4.856812959269508
256 norm 5.3982910143166514
257 norm 46.553976273304215
258 norm 17.566272518985429
259 norm 20.552451905814117
260 norm 61.894775704479279
 
The latter will put norm to the same column as months. Is it intended?
 
Maybe you want
 
> dd.m1<-melt(dataset[,-13], na.rm=T)
No id variables; using all as measure variables
> head(dd.m1)
  variable value
1  januari  38.1
2  januari  32.4
3  januari  34.5
4  januari  20.7
5  januari  21.5
6  januari  23.1
> tail(dd.m1)
variable value
235 december  20.7
236 december  30.9
237 december  36.2
238 december  21.0
239 december  20.2
240 december  21.3
 
Cheers
Petr
 
From: Joachim Audenaert [mailto:joachim.audena...@pcsierteelt.be] 
Sent: Thursday, April 16, 2015 1:13 PM
To: PIKAL Petr
Cc: r-help@r-project.org
Subject: RE: [R] melt function chooses wrong id variable with large 
datasets
 
Hello, 

This is a part of my dataset: 

structure(list(januari = c(38.1, 32.4, 34.5, 20.7, 21.5, 23.1, 
29.7, 36.6, 36.1, 20.6, 20.4, 30.1, 38.7, 41.4, 37, 36, 37, 38, 
23, 26.7), februari = c(31.5, 36.2, 38.2, 26.4, 20.9, 21.5, 30.2, 
33.4, 32.6, 22.2, 21.7, 30, 35.7, 32.8, 39.3, 25.5, 23, 19.9, 
21.3, 20.8), maart = c(34.2, 27, 24.2, 19.9, 19.7, 21.5, 30.6, 
30, 19, 19.6, 20.6, 23.6, 17.9, 17.3, 21.4, 24.1, 20.9, 30.1, 
32.6, 21.3), april = c(26.3, 29.6, 30.3, 23.6, 28.4, 20.7, 24.1, 
27.3, 23.2, 18.3, 24.6, 27.4, 20.4, 18.1, 25.2, 19.8, 21, 23.7, 
19.6, 18.1), mei = c(23.7, 24, 17.2, 23.2, 25.2, 17.2, 16, 15.6, 
13.4, 16, 16.8, 14.6, 19.4, 21, 19.5, 18.5, 13.3, 13.7, 14.3, 
14.1), juni = c(17.7, 14.2, 16.6, 15.7, 13.7, 14.7, 13.1, 12.9, 
15.4, 11.9, 15.2, 15.3, 16.5, 16.1, 11.7, 11.2, 11.5, 10.8, 16.1, 
14.8), juli = c(15.7, 14.5, 10.8, 10.5, 13.4, 12.2, 13.2, 13, 
12.4, 13.1, 9.8, 10.5, 13.4, 11, 13.1, 15, 16.7, 16.1, 18.2, 
15.7), augustus = c(12.9, 12.8, 15.2, 14.5, 17.2, 14.5, 14.4, 
11, 13.1, 13.6, 14.6, 12.7, 13.6, 12.7, 15.5, 17.4, 15.2, 14.2, 
17.7, 19.2), september = c(15.6, 15.5, 15.9, 15.1, 16, 19.4, 
21.5, 23.7, 18.7, 23.8, 18, 16.2, 18.5, 20.6, 18.3, 22.5, 26.9, 
19.4, 15.9, 20.5), oktober = c(21.4, 20.8, 14, 17, 23, 26.4, 
19.6, 22.7, 26.9, 14.7, 15.2, 19.8, 26.9, 20.2, 14.3, 14.8, 18.5, 
21.7, 21.4, 21.8), november = c(24.7, 26.2, 29, 21.6, 17.1, 16.9, 
19.1, 24.7, 25.4, 19.8, 18.2, 16.3, 17, 17.7, 15.5, 14.7, 15.8, 
19.9, 20.4, 23.3), december = c(19.8, 27, 21, 33, 22.6, 28.3, 
21.1, 19, 17.3, 27, 30.2, 24.8, 17.9, 17.9, 20.7, 30.9, 36.2, 
21, 20.2, 21.3), norm = c("45.8713463281901", "24.047250681782984", 
"3.7533684144746324", "38.594241119279324", "26.391897460120358", 
"61.746470001194638", "6.8321020448487992", "11.933109250115226", 
"51.951891096493924", "37.424611852237945", "5.1587836676942374", 
"36.552835044409434", "31.781209673851027", "29.09146215582853", 
"4.856812959269508", "5.3982910143166514", "46.553976273304215", 
"17.566272518985429", "20.552451905814117", "61.894775704479279"
)), .Names = c("januari", "februari", "maart", "april", "mei", 
"juni", "juli", "augustus", "september", "oktober", "november", 
"december", "norm"), row.names = c(NA, 20L), class = "data.frame") 

I transfo

[R] - help with the predict function

2012-06-05 Thread Joachim Audenaert

Hi all,

I would like to predict some values for an nls regression function 
(functional response model Rogers type II). This is an asymptotic function 
of which I would like to predict the asymptotic value
I estimated the paramters with nls, but can't seem to get predictions for 
values of m choice..
This is my script:

RogersII_N <- 
nls(FR~N0-lambertW(attackR3_N*Th3_N*N0*exp(-attackR3_N*(24-Th3_N*N0)))/(attackR3_N*Th3_N),start=list(attackR3_N=0.04,Th3_N=1.46),control=list(maxiter=1))
lista <- c(1,2,100,1000)
predict(RogersII_N,newdata=lista)

I created a list (lista) with some values of which I would like the 
predict function to give me function values

What am I doing wrong?

Kind regards,
Met vriendelijke groeten,
Joachim

Don't waste paper! Think about the environment before printing this e-mail

______

Joachim Audenaert
Adviesdienst Gewasbescherming
Proefcentrum voor Sierteelt
Schaessestraat 18
B-9070 Destelbergen
Belgium
Tel. +32 9 353 94 71
Fax +32 9 353 94 95
E-mail: joachim.audena...@pcsierteelt.be
www.pcsierteelt.be
__
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] - detecting outliers

2012-06-06 Thread Joachim Audenaert

Hello all,

I am estimating parameters for regression functions on experimental data. 
Functional response of Rogers type II.

I would like to know which points of my dataset are outliers. What is the 
best method to do this with R?
I found a method via R help, but would like to know if there are better 
methods for my purpose. 
Here is the script I us now:

library("mvoutlier")
dat <- read.delim("C:/data.txt")
uni.plot(dat)

My data looks like the following (copied into a txt file):
(N0 is the initial number of eggs fed to the predator, FR is the number of 
eggs eaten by the predator during 24h)

N0  FR
37  30
27  15
36  14
37  13
45  8
25  0
47  20
34  6
25  8
21  7
24  24
34  17
23  10
29  5
38  38
24  24
20  17
14  8
18  15
15  10
26  5
33  5
22  21
38  3
22  20
23  19
20  6
20  4
21  18
25  5
13  13
9   8
8   4
7   7
8   5
11  9


Kind regards,
Met vriendelijke groeten,
Joachim

Don't waste paper! Think about the environment before printing this e-mail

______

Joachim Audenaert
Adviesdienst Gewasbescherming
Proefcentrum voor Sierteelt
Schaessestraat 18
B-9070 Destelbergen
Belgium
Tel. +32 9 353 94 71
Fax +32 9 353 94 95
E-mail: joachim.audena...@pcsierteelt.be
www.pcsierteelt.be
__
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] gnm and gnlr3

2012-04-02 Thread Joachim Audenaert

Hi,

I am quite new to R and would like to do nonlinear regressions with 
Poisson distributed data.
I would like to estimate paramters of an equation of this type:

FR = [c*NO * exp(a+b*NO)] / [(c+NO)*(1+exp(a+b*NO))]

a,b and c are parameters, NO are input values


I found both the gnm and gnlr3 function which should be able to do this 
regression but I can't manage to make it work.
How can I write my equation to fit into these functions? 
If I understand it correct I have to split my equation in a "mu", a 
"shape" and a "family" part for the gnlr function, but don't have a clue 
how to do this.

Is there maybe a different function that is better for my purpose? A 
version of nls where I can choose a different distribution?

Kind Regards
Joachim

Don't waste paper! Think about the environment before printing this e-mail

______

Joachim Audenaert
Adviesdienst Gewasbescherming
Proefcentrum voor Sierteelt
Schaessestraat 18
B-9070 Destelbergen
Tel. +32 9 353 94 71
Fax +32 9 353 94 95
E-mail: joachim.audena...@pcsierteelt.be
www.pcsierteelt.be
__
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] regression for poisson distributed data

2012-04-03 Thread Joachim Audenaert

Hello all,

I would like to get parameter estimates for different models. For one of 
them I give the code in example. I am estimating the parameters (i,j and 
k) with the nls function, which sees the error distribution as normal, I 
would however like to do the same as nls with the assumption that the 
errors are poisson distributed.

Is there a way to do this with R? Are there packages designed for this? I 
tried with the gnm package, but don't understand how to transform my 
equation to a generalised equation. Is there an option for nls to choose 
family = poisson? 

Lower in the mail the code with the model and visualisations I use to 
check my results. I also copied the test dataset from my txt file.

I am using R 2.15 and Rstudio to visualise it.


plot(FR~N0)
x <- 
nls(FR~(exp(i+j*N0)/(1+exp(i+j*N0)))*(k*N0/(k+N0)),start=list(i=0.02,j=0.002,k=6))
summary(x)
hatx <- predict(x)
lines(spline(N0,hatx))

N0  FR
10  2
10  3
10  2
10  4
10  2
10  2
10  1
10  2
10  2
10  2
20  2 
20  3
20  3
20  3
20  4
20  2
20  4
20  2
20  3
20  2
30  1
30  2
30  3
30  4
30  5
30  6
30  2
30  3
30  2
30  2
40  2
40  3
40  3
40  6
40  5
40  4
40  3
40  3
40  2
40  3
50  4
50  5
50  2
50  3
50  7
50  5
50  4
50  3
50  4
50  5
60  5
60  6
60  8
60  4
60  4
60  3
60  2
60  2
60  5
60  4

Kind Regards
Joachim

Don't waste paper! Think about the environment before printing this e-mail

______

Joachim Audenaert
Adviesdienst Gewasbescherming
Proefcentrum voor Sierteelt
Schaessestraat 18
B-9070 Destelbergen
Tel. +32 9 353 94 71
Fax +32 9 353 94 95
E-mail: joachim.audena...@pcsierteelt.be
www.pcsierteelt.be
__
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] automatically scan multiple starting values

2012-04-16 Thread Joachim Audenaert

Hi all,

I am doing nls regression of the Rogers type III equation. However I can't 
find good starting values and keep getting the error messages: "Missing 
value or an infinity produced when evaluating the model" OR "singular 
gradient matrix at initial parameter estimates"
Is there a way to automatically check different starting values and give a 
list of values for each parameter? In my model only Th3 is a parameter 
with a good estimate (2.5). for v,w and x I would like to check a broad 
range of values. Is it possible to do this in R?

Under my script for plots and nls regression and a copy of the values of 
my data file.

plot(FR~N0,main="Rogers III")
RogersIII <- 
nls(FR~N0*(1-exp(((v+w*FR)/(1+x*FR))*(Th3*FR-24))),start=list(v=0.002,w=0.00075,x=-0.1,Th3=2.5),control=list(maxiter=100,minFactor=0.1))
 
# 24 is tijd in uren
hatac <- predict(RogersIII)
lines(spline(N0,hatac))
summary(RogersIII)

N0  FR
5   4
3   2
5   3
5   4
6   4
5   4
4   2
5   4
6   5
5   3
10  5
14  7
12  6
17  9
10  4
16  8
15  8
14  5
15  6
15  5
19  10
20  11
22  10
21  12
25  12
26  11
25  10
25  12
25  11
24  9
35  12
33  12
35  13
36  12
31  12
37  12
39  13
36  14
35  12
35  10
46  11
44  10
41  12
43  14
45  12
48  11
45  13
45  9
49  12
45  11
56  12
59  15
52  11
54  10
51  12
55  12
61  13
55  12
54  10
55  12

Kind regards,
Met vriendelijke groeten,
Joachim

Don't waste paper! Think about the environment before printing this e-mail

______

Joachim Audenaert
Adviesdienst Gewasbescherming
Proefcentrum voor Sierteelt
Schaessestraat 18
B-9070 Destelbergen
Belgium
Tel. +32 9 353 94 71
Fax +32 9 353 94 95
E-mail: joachim.audena...@pcsierteelt.be
www.pcsierteelt.be
__
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] error estimating parameters with mle2

2012-04-18 Thread Joachim Audenaert

Hi all,

When I try to estimate the functional response of the Rogers type I 
equation (for the mle2 you need the package bbmle):

> RogersIbinom <- function(N0,attackR2_B,u_B) {attackR2_B+u_B*N0}
> RogersI_B <- 
mle2(FR~dbinom(size=N0,prob=RogersIbinom(N0,attackR2_B,u_B)/N0),start=list(attackR2_B=4.5,u_B=0.16),method="Nelder-Mead",data=data5)


 I get following error message

Error in optim(par = c(4.5, 0.16), fn = function (p)  : 
  function cannot be evaluated at initial parameters

Can someone tell me what I'm doing wrong? I used estimate starting values 
which were predicted with the nls function 

RogersI_N <- 
nls(FR~attackR2_N+u_N*N0,start=list(attackR2_N=1,u_N=4),control=list(maxiter=1))

For some other models, I do the exact same thing and it works well, so I 
don't understand why this one doesn't work

Kind regards,
Met vriendelijke groeten,
Joachim

Don't waste paper! Think about the environment before printing this e-mail

______

Joachim Audenaert
Adviesdienst Gewasbescherming
Proefcentrum voor Sierteelt
Schaessestraat 18
B-9070 Destelbergen
Belgium
Tel. +32 9 353 94 71
Fax +32 9 353 94 95
E-mail: joachim.audena...@pcsierteelt.be
www.pcsierteelt.be
__
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] non-numeric argument in mle2

2012-04-19 Thread Joachim Audenaert

Hi all,

I have some problems with the mle2 function


> RogersIIbinom <- function(N0,attackR3_B,Th3_B) 
{N0-lambertW(attackR3_B*Th3_B*N0*exp(-attackR3_B*(24-Th3_B*N0)))/(attackR3_B*Th3_B)}
> RogersII_B <- 
mle2(FR~dbinom(size=N0,prob=RogersIIbinom(N0,attackR3_B,Th3_B)/N0),start=list(attackR3_B=1.5,Th3_B=0.04),method="Nelder-Mead",data=dat)
Error in dbinom(x, size, prob, log) : 
  Non-numeric argument to mathematical function

Can somenone explain met what this error means? All my parameters and data 
are defined, so I don't understand what the non-numeric argument is??? 
With different equations the script does work...

For this function to run you need packages: bbmle and emdbook.
Testdata I pasted here

N0  FR
5   4
3   2
5   3
5   4
6   4
5   4
4   2
5   4
6   5
5   3
10  5
14  7
12  6
17  9
10  4
16  8
15  8
14  5
15  6
15  5
19  10
20  11
22  10
21  12
25  12
26  11
25  10
25  12
25  11
24  9
35  12
33  12
35  13
36  12
31  12
37  12
39  13
36  14
35  12
35  10
46  11
44  10
41  12
43  14
45  12
48  11
45  13
45  9
49  12
45  11
56  12
59  15
52  11
54  10
51  12
55  12
61  13
55  12
54  10
55  12

Kind regards,
Met vriendelijke groeten,
Joachim

Don't waste paper! Think about the environment before printing this e-mail

______

Joachim Audenaert
Adviesdienst Gewasbescherming
Proefcentrum voor Sierteelt
Schaessestraat 18
B-9070 Destelbergen
Belgium
Tel. +32 9 353 94 71
Fax +32 9 353 94 95
E-mail: joachim.audena...@pcsierteelt.be
www.pcsierteelt.be
__
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] - dunn.test gives strange output

[R] - Obtaining superscripts to affix to means that are not significantly different from each other with R

Re: [R] - Obtaining superscripts to affix to means that are not significantly different from each other with R

[R] : automated levene test and other tests for variable datasets

Re: [R] : automated levene test and other tests for variable datasets

Re: [R] : automated levene test and other tests for variable datasets

[R] melt function chooses wrong id variable with large datasets

Re: [R] melt function chooses wrong id variable with large datasets

Re: [R] melt function chooses wrong id variable with large datasets

[R] - help with the predict function

[R] - detecting outliers

[R] gnm and gnlr3

[R] regression for poisson distributed data

[R] automatically scan multiple starting values

[R] error estimating parameters with mle2

[R] non-numeric argument in mle2

16 matches

Site Navigation

Mail list logo

Footer information