So to simplify this a bit:
Using dataframe:
name x1 x2 x3 x4 x5 x6 x7 x8
1 fred 2 3 4 6 7 8 9 12
2 fred 4 5 6 8 9 10 11 14
3 fred 6 7 8 10 11 12 13 16
4 fred 8 9 10 12 13 14 15 18
5 james 10 11 12 14 15 16 17 20
6 james 12 13 14 16 17 18 19 22
7 james 14 15 16 18
Thanks Mike - this doesn't quite do it, but I think that you've hit on the
right method.
I am just trying to use 'plot' initially - I don't care so much about the
arrangement in the file.
plot(df$y,group=df$f) outputs the Y column in the appropriate plot. What I
would like to do is have 10 Y col
In terms of a reproducible example:
ProbeSet.ID.F ProbeSet.ID Feature.ID Gene.Symbol X0030V120810.4
X0143V120110.4 X0258V111710.4 X0283V111710.4 X0430V120710.4 X0472V111610.4
X0520V111610.4 X0546V113010.4 X0578V111810.4 X0624V111810.4
7896741_479302 7896741 479302 OR4F17
Suspect that this is easier than I realize, but taking some figuring out
currently. Any help would be appreciated.
I have a data frame (testhm) with many rows such as:
ProbeSet.ID.F ProbeSet.ID Feature.ID G.S X0030V120810.14 X0143V120110.14
X0258V111710.14 X0283V111710.14 X0430V120710.14 X047
Hugo - thanks for the link, but that's a bit beyond me right now. I have
ideas of the distribution for some 'nuggets', but others might be new. I
hope that there's an easier way to pick them out than going after
pre-defined patterns...
Greg - I use bioconductor for a lot of processing, but I'm no
Greg, Dennis - thanks for your input, I really appreciate the feedback, as it
is not easy to source.
In terms of the data; I've described it as 20 columns, which is the smallest
dataset, but this can run to 320 columns, so in some cases there is likely
to be enough power to detect non-normality.
Thanks Peter.
I understand your point, and that there is potentially a high false
discovery rate - but I'd expect the interesting data points (genes on a
microarray) to be within that list too. The next step would be to filter
based on some greater understanding of the biology...
Alternative ap
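On the false discovery rate concern, one common option in base R is to adjust the per-row p-values with `p.adjust` before filtering. The sketch below is only an illustration: the p-values are simulated, and the names `raw_p` and `adj_p` are placeholders rather than anything from the thread.

```r
# Simulated raw p-values, one per row (assumption: 1000 tests, no real data)
set.seed(7)
raw_p <- runif(1000)

# Benjamini-Hochberg adjustment controls the false discovery rate
adj_p <- p.adjust(raw_p, method = "BH")

# Compare how many rows pass an illustrative 0.05 cutoff before and after
n_raw <- sum(raw_p < 0.05)
n_adj <- sum(adj_p < 0.05)
c(raw = n_raw, adjusted = n_adj)
```

The adjusted list is usually shorter, which makes a later filter based on the biology easier to do by hand.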
Yes, that was dumb - I got that...
--
View this message in context:
http://r.789695.n4.nabble.com/Finding-non-normal-distributions-per-row-of-data-frame-tp3259439p3260843.html
Hi Greg,
In addition to the reply above, to address your questions - I fully
appreciate that my understanding of the code is basic - this is my first
attempt at putting this together...
My starting point is a data frame with numeric and text columns, but I can
cut columns to make a fully numeric
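For the step of cutting a mixed data frame down to its numeric columns, something like the following may work. It is a sketch under assumptions: the frame `mixed` and its values are invented for illustration, with only the column names borrowed from the examples in this thread.

```r
# Invented example frame with text and numeric columns (values are made up)
mixed <- data.frame(
  ProbeSet.ID = c("7896741", "7896742", "7896743"),
  Gene.Symbol = c("OR4F17", "geneB", "geneC"),
  sample1 = c(2.1, 3.4, 5.6),
  sample2 = c(2.3, 3.1, 5.9),
  sample3 = c(1.9, 3.8, 6.2),
  stringsAsFactors = FALSE
)

# Logical vector marking which columns are numeric
is_num <- sapply(mixed, is.numeric)

# Keep only the numeric columns; drop = FALSE keeps a data frame
# even if a single column survives
numeric_only <- mixed[, is_num, drop = FALSE]
str(numeric_only)
```

The text columns stay available in `mixed`, so row results can be matched back to probe IDs by position.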
Thanks David - but '1' (if I understood correctly) returns the same value for
each row, which I took to be an error.
nt
V1 V2 V3 V4 V5 V6
1 24.71 23.56 24.71 23.56 24.71 23.56
2 25.64 25.06 25.64 25.06 25.64 25.06
3 21.29 20.87 21.29 20.87 21.29 20.87
4 25.92 26.92 25.92 26.92
Thanks for the feedback Patrizio - but your function is performing the
shapiro.test on columns instead of rows...
I tried:
nt<-data.frame(#a dataframe with 6 columns and 9 rows)
nr <- nrow(nt)
test <- apply(nt, nt[1:nr,], shapiro.test)
Error in ds[-MARGIN] : invalid subscript type 'list'
fr
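The error above appears to come from the second argument to `apply`: `MARGIN` expects 1 (rows) or 2 (columns), not a subset of the data frame. A minimal sketch, with simulated values standing in for the poster's `nt`:

```r
# Hypothetical stand-in for `nt`: 9 rows, 6 numeric columns (simulated values)
set.seed(1)
nt <- as.data.frame(matrix(rnorm(54, mean = 24, sd = 2), nrow = 9, ncol = 6))

# MARGIN = 1 applies the function to each row; MARGIN = 2 would use columns
row_tests <- apply(nt, 1, shapiro.test)

# Each element is an "htest" object; pull out the p-values
row_p <- sapply(row_tests, function(res) res$p.value)
row_p
```

`row_tests` is a list with one test result per row, so the W statistic can be extracted the same way via `res$statistic`.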
This is my first attempt at this, so hopefully a few kind pointers can get me
going in the right direction...
I have a large data frame of 20+ columns and 20,000 rows. I'd like to
evaluate the distribution of values in each row, to determine whether they
meet the criteria of a normal distribution
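One way to approach this for a large frame is to compute a Shapiro-Wilk p-value per row and flag the rows below a chosen cutoff. The sketch below uses simulated data; the helper `row_shapiro_p`, the matrix `expr`, and the 0.05 threshold are illustrative assumptions, not part of the original question.

```r
# Simulated stand-in for the real data: 200 rows x 20 columns (assumption)
set.seed(42)
expr <- matrix(rnorm(200 * 20), nrow = 200, ncol = 20)
# Make the first 10 rows clearly skewed so there is something to detect
expr[1:10, ] <- matrix(rexp(10 * 20), nrow = 10)

# Hypothetical helper: Shapiro-Wilk p-value for one row, NA if untestable
row_shapiro_p <- function(values) {
  values <- values[!is.na(values)]
  # shapiro.test needs 3 to 5000 values that are not all identical
  if (length(values) < 3 || length(unique(values)) < 2) return(NA_real_)
  shapiro.test(values)$p.value
}

# One p-value per row
p_values <- apply(expr, 1, row_shapiro_p)

# Rows whose values look non-normal at an illustrative 0.05 threshold
candidate_rows <- which(p_values < 0.05)
head(candidate_rows)
```

With only 20 values per row the test has limited power, so a row that passes is not necessarily normal.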