Hi, thank you very much for your helpful reply =). Just one question: I don't know the distribution of my data (normal, t, etc.), so how should I set the type parameter? Is there a type value I can use for a distribution-free test?
Thank you so much!

Fernando Marmolejo-Ramos wrote:
>
> hi giov
>
> about the dixon test... I just ran a simple test with a sample of 40 and I
> got:
>
> Error in dixon.test(x) : Sample size must be in range 3-30
>
> So it seems that most of the tests in the "outliers" package are designed
> for small samples. See also the R News article published in May 2006 (vol
> 6/2) called "Processing data for outliers" by Lukasz Komsta (the developer
> of the package).
>
> However, that package also has a function called "scores" which works
> for big samples. You can also see the p-values and z scores for the
> observations you have and determine which values are considered outliers.
>
> Try this simple syntax:
>
> library(outliers)
> library(gamlss.dist)
>
> # this produces an exponential+Gaussian sample (which usually has
> # heaps of outliers!)
> x <- rexGAUS(100, 2000, 3000, 5000)
>
> # with n = 100 this stops with the sample-size error, which confirms that
> # Dixon only works for samples between 3 and 30 (try() lets the script go on)
> try(dixon.test(x))
>
> # just to see what the data set looks like and visually confirm the
> # outliers
> boxplot(x, notch = TRUE)
>
> # sort the observations in ascending order
> sort(x)
>
> # returns the p-value of each observation (using z scores), in ascending
> # order
> sort(scores(x, type = "z", prob = 1))
>
> # lists the observations flagged as outliers at the 0.95 probability level
> sort(x[scores(x, prob = 0.95)])
>
> The package author says this about the "prob" argument:
>
> prob ---- If set, the corresponding p-values instead of scores are given.
> If value is set to 1, p-value are returned. Otherwise, a logical vector is
> formed, indicating which values are exceeding specified probability. In
> "z" and "mad" types, there is also possibility to set this value to zero,
> and then scores are confirmed to (n-1)/sqrt(n) value, according to
> Shiffler (1998). The "iqr" type does not support probabilities, but "lim"
> value can be specified.
>
> The Shiffler reference is not the same as the one that appears in the help.
> It is this one:
>
> Shiffler, R. E. (1988). Maximum Z scores and outliers. Am. Stat. 42, 1,
> 79-80.
>
> I hope this helps,
>
> Fernando
>

--
View this message in context: http://www.nabble.com/dixon-test-tp18940260p18960162.html
Sent from the R help mailing list archive at Nabble.com.
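
On the question of what to use when the distribution is unknown: the help text quoted above mentions the "mad" and "iqr" types, which are built on the median and quartiles rather than the mean and standard deviation. The base R sketch below shows what I understand those two scores to compute; it is not the package's own code. The helper names mad_score and iqr_score, the toy data, and the cutoffs 3 and 1.5 are all assumptions made for illustration. These are robust scores, not formal distribution-free tests, so the cutoffs are rules of thumb and not significance levels.

```r
# Robust outlier scores in base R (no extra packages needed).
# mad_score and iqr_score are hypothetical helper names, not outliers:: functions.

# Distance from the median in units of the median absolute deviation.
# Note: mad() rescales by 1.4826 by default (consistency with the normal sd).
mad_score <- function(x) {
  (x - median(x)) / mad(x)
}

# Zero for values inside [Q1, Q3]; otherwise the distance to the nearest
# quartile in units of the interquartile range.
iqr_score <- function(x) {
  q <- unname(quantile(x, c(0.25, 0.75)))
  iqr <- q[2] - q[1]
  score <- numeric(length(x))
  low <- x < q[1]
  high <- x > q[2]
  score[low] <- (x[low] - q[1]) / iqr
  score[high] <- (x[high] - q[2]) / iqr
  score
}

# Toy data (made up for this example): ten ordinary values and one extreme.
x <- c(10, 12, 11, 13, 12, 11, 10, 14, 13, 12, 95)

# Illustrative cutoffs, chosen by convention and not derived from the data.
mad_flagged <- x[abs(mad_score(x)) > 3]
iqr_flagged <- x[abs(iqr_score(x)) > 1.5]

print(mad_flagged)  # only 95 is flagged
print(iqr_flagged)  # only 95 is flagged
```

With the package loaded, scores(x, type = "mad") and scores(x, type = "iqr", lim = 1.5) should be the corresponding calls; as the quoted help says, the "iqr" type takes "lim" rather than "prob".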