This is my first attempt at this, so hopefully a few kind pointers can get me
going in the right direction...
I have a large data frame of 20+ columns and 20,000 rows. I'd like to
evaluate the distribution of values in each row, to determine whether they
are consistent with a normal distribution. I'd loop this over all the rows
in the data frame, and output the summary results to a new data frame.
I have a loop that should run a Shapiro-Wilk test over each row:

    # y is the data frame
    for (j in 1:nr) {
      y.temp <- list(y[j, ])
      testsw <- lapply(y.temp, shapiro.test)
      testtable <- t(sapply(testsw, function(x) c(x$statistic, x$p.value)))
      colnames(testtable) <- c("W", "p.value")
    }

but it is currently throwing an error:
"Error in `rownames<-`(`*tmp*`, value = "1") :
attempt to set rownames on object with no dimensions"
...which I guess is unrelated to the evaluation of normality, and more
likely a faulty loop?
Any suggestions, either for this test or for a better way to assess
normality (e.g. qq-plot residuals for each row), would be gratefully
received. Thanks.
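
For illustration, here is a minimal sketch of one way the row-wise test described above could be written. It is not a definitive fix. It assumes every column of y is numeric and that each row has between 3 and 5000 non-missing values, which is the range shapiro.test accepts. The simulated y and the name results are hypothetical stand-ins. The likely problem in the loop above is that y[j, ] is a one-row data frame, whereas shapiro.test expects a plain numeric vector. In addition, testtable is overwritten on every pass rather than accumulated.

```r
# Stand-in data: 6 rows by 20 columns (hypothetical; replace with the real y).
set.seed(1)
y <- as.data.frame(matrix(rnorm(6 * 20), nrow = 6))

# as.matrix() turns the data frame into a numeric matrix, so apply() hands
# each row to the function as a plain numeric vector.
testtable <- t(apply(as.matrix(y), 1, function(row) {
  res <- shapiro.test(row)
  c(W = unname(res$statistic), p.value = res$p.value)
}))

# One row of output per row of y, with columns W and p.value.
results <- as.data.frame(testtable)
head(results)
```

With 20,000 rows this runs 20,000 separate tests, so some small p-values are to be expected by chance alone.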