Re: [R] Please help me in Converting this from C# to R
2008/9/14 rajivv <[EMAIL PROTECTED]>:
>
> Random r = new Random();
> DirectedGraph graph = GetGraph();
> decimal B = 0.1m;
> decimal D = 0.05m;

[ deletia ]

> if (P[i] < 0)
> P[i] = 0;
> }
> }
>
> }

If you convert it into English first, then more people will be able to help. It's much easier to convert English to any programming language than from programming language A to programming language B. Given that this code must have derived from a specification written in a human language (such as English) just supply us with that.

Sometimes you don't need the original spec if the code is well-commented. But this code is null-commented.

At a guess, it seems to get a graph from out of nowhere (graph=GetGraph()) and then do 100 iterations of some calculation based on the graph adjacency.

This should not be too difficult to convert to R, but with any conversion problem there are always hidden traps to beware of. Here's one in your code. You have:

for (int i = 7; i <= 10; ++i)

in one loop, and:

for (int t = 0; t < 100; ++t)

Now, much as C style loop specifications are concise and elegant, they can cause confusion. The subtle differences here (using < instead of <=, and the 'preincrement' ++i) confuse me as to what values the loop variable takes in the loop.

The way to get by all these issues in any conversion problem is to have a good set of test cases. You run the test cases in language A and get a set of answers. You then run the test cases using the converted code in language B and if you don't get the same answers then the conversion has failed.

If you can describe what the code does, add some meaningful comments, and produce a set of sample data test cases and results then perhaps you'll get more help than just pasting the code in and asking nicely (you did say 'please', which is more than some people do on this list!).
Barry __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
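For reference, the two loop headers discussed in this message map onto R ranges as sketched below. This is only an illustration of the loop bounds, assuming standard C# semantics (the pre-increment in the header does not change which values the loop variable takes); the vector length of 11 is a made-up value chosen so that every index fits.

```r
# C#: for (int i = 7; i <= 10; ++i)  ->  i takes 7, 8, 9, 10
i_values <- 7:10

# C#: for (int t = 0; t < 100; ++t)  ->  t takes 0, 1, ..., 99 (100 iterations)
t_values <- 0:99

# C# arrays are 0-based and R vectors are 1-based, so the C# element P[i]
# corresponds to P[i + 1] in R. The length 11 is an assumption for this demo.
P <- numeric(11)
for (i in i_values) P[i + 1] <- 1   # touches R positions 8 to 11

print(length(t_values))
print(which(P != 0))
```

Running the same small check in both languages is one cheap instance of the test-case approach recommended above.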
[R] Problem with misclass function on tree classification
I am working through Tom Minka's lectures on Data Mining and am now on Day 32. The following is the link:
http://alumni.media.mit.edu/~tpminka/courses/36-350.2001/lectures/day32/

In order to use the functions cited I followed the instructions as follows:

Installed tree package from CRAN mirror (Ca-1)
Downloaded and sourced the file "tree.r"
Downloaded the function "clus1.r"

Having defined a tree "tr", when I write "misclass(tr,x$test)" as shown in the link I get an error message that "R does not find the function pred1.tree".

Is this function included in the tree package? If so it was not in my download. Is this a bug? Do you know of a fix?

Thanks for your help
Meir

Meir Preiszler - Research Engineer
Itamar Medical Ltd.
Caesarea, Israel
Tel: +(972) 4 617 7000 ext 232
Fax: +(972) 4 627 5598
Cell: +(972) 54 699 9630
Email: [EMAIL PROTECTED]
Web: www.Itamar-medical.com
Re: [R] Where is the error
rajivv wrote:
>
> P <- vector(mode="numeric",length =10)
>
> SS<-function(){for(id in 0:9){
> if(0 < P[id]) print("ss")
> else
> print("ss")
> }}
>
> SS()
> ---
> Error in if (0 < P[id]) print("ss") else print("ss") :
> argument is of length zero
>

Use for(id in 1:10)

Berend
[R] Where is the error
P <- vector(mode="numeric",length =10)

SS<-function(){for(id in 0:9){
if(0 < P[id]) print("ss")
else
print("ss")
}}

SS()
---
Error in if (0 < P[id]) print("ss") else print("ss") :
argument is of length zero
Re: [R] Where is the error
2008/9/14 rajivv <[EMAIL PROTECTED]>:
>
> P <- vector(mode="numeric",length =10)
>
> SS<-function(){for(id in 0:9){
> if(0 < P[id]) print("ss")
> else
> print("ss")
> }}
>
> SS()
> ---
> Error in if (0 < P[id]) print("ss") else print("ss") :
> argument is of length zero

Arrays/vectors/matrices in R are indexed from 1, not 0.

Regards,
Nicky Chorley
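To see why index 0 produces exactly this error, the short sketch below inspects what P[0] returns and then loops over valid indices. The function body and the printed strings are illustrative only; nothing here is taken from the original poster's intended logic.

```r
P <- vector(mode = "numeric", length = 10)

# Index 0 selects nothing, so the comparison has length zero
# and if() has no TRUE/FALSE value to test.
print(length(P[0]))
print(length(0 < P[0]))

# Looping over the valid indices 1..length(P) works.
SS <- function(P) {
  for (id in seq_along(P)) {
    if (0 < P[id]) print("positive") else print("not positive")
  }
}
SS(P)
```

seq_along(P) is used instead of 1:10 so the loop still behaves when the vector length changes.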
Re: [R] Please help me in Converting this from C# to R
# It is a bit hard to provide a simple conversion without definitions of the
# class 'Node', the template 'DirectedGraph' and the function 'Writed'!
# I've used the package 'igraph' as a drop-in; I hope it is still clear.
#
# By the way:
# - your curly braces don't match,
# - not all elements of P are initialised before they are used.
#
# Original code (cleaned to make comparison easier):
#
# Random r = new Random();
# DirectedGraph graph = GetGraph();
# decimal B = 0.1m;
# decimal D = 0.05m;
# int nodes = graph.NodesCount;
# decimal[] E = new decimal[nodes];
# decimal[] P = new decimal[nodes];
#
# for (int i = 7; i <= 10; ++i) P[i] = (decimal)r.NextDouble();
#
# for (int t = 0; t < 100; ++t){
#   Writed(P, "P");
#
#   foreach (SimpleNode n in graph.Nodes) {
#     int id = graph.index[n];
#
#     decimal product = 1;
#     foreach (var item in graph.GetAdjacentNodes(n)){
#       int j = graph.index[item];
#       product *= (1 - B * P[j]);
#     }
#
#     E[id] = product;
#   }
#
#   foreach (SimpleNode n in graph.Nodes){
#     int i = graph.index[n];
#     P[i] = 1 - ((1 - P[i]) * E[i] + D * (1 - P[i]) * E[i] + 0.5m * D * P[i] * (1 - E[i]));
#     if (P[i] < 0) P[i] = 0;
#   }
# }
#
# }
#
# Drop-in for your method GetGraph (produces a 10-node 'random' directed graph).
# I only assign to temporaries so I can use the same 'grph' and 'P' in both
# implementations.
#
library(igraph)
GetGraph <- function() graph.adjacency(matrix(sample(0:1, size=100, replace=TRUE), nrow=10))
grph.t <- GetGraph()
nodes <- vcount(grph.t)  # must be defined before runif() uses it
P.t <- runif(nodes)      # assume you meant to initialise all elements of P

#
# IMPLEMENTATION 1.
# A 'mirror' implementation. Some of the code relies
# on the specifics of package igraph, but I've tried to
# be as similar as possible. Hope it still makes sense!
#
B <- 0.1
D <- 0.05
grph <- grph.t
nodes <- vcount(grph)
E <- numeric(nodes)
P <- P.t

for(t in 0:99){
  cat('P:', P, '\n')  # is this equivalent to 'Writed(P, "P")' ???

  # returns a list of vectors, where each vector holds the nodes a node is connected to.
  graph.Nodes <- get.adjlist(grph)

  id <- 0  # we loop over the vectors and so must index separately
  for(n in graph.Nodes){
    # n is a vector containing the vertices that the vertex at index id+1 is connected to.
    id <- id+1
    product <- 1
    for(item in n){
      # vertices are indexed from 0 here, hence the +1; igraph releases from 0.6 on
      # index from 1, in which case drop the +1. There is no operator*= in R.
      product <- product * (1 - B * P[item+1])
    }
    E[id] <- product
  }

  for(i in 1:nodes){
    # we are accessing nodes in order so the indexes are also ordered.
    P[i] <- 1 - ((1 - P[i]) * E[i] + D * (1 - P[i]) * E[i] + 0.5 * D * P[i] * (1 - E[i]))
    if (P[i] < 0) P[i] <- 0
  }
}
P  # print the result

#
# IMPLEMENTATION 2.
# A more 'R-ish' implementation.
#
B <- 0.1
D <- 0.05
P <- P.t
grph <- grph.t

for(t in 0:99){
  E <- sapply(get.adjlist(grph), function(node) prod(1-B*P[node+1]))
  P <- 1 - ((1 - P) * E + D * (1 - P) * E + 0.5 * D * P * (1 - E))
  P <- pmax(P, 0)  # same clamp as 'if (P[i] < 0) P[i] <- 0' above
}
P  # print the result
Re: [R] Problem with misclass function on tree classification
Did you say:

library("tree")

at the top of your script?

On Sun, Sep 14, 2008 at 5:47 PM, Meir Preiszler <[EMAIL PROTECTED]> wrote:
>
> I am working through Tom Minka's lectures on Data Mining and am now on Day 32.
> The following is the link:
> http://alumni.media.mit.edu/~tpminka/courses/36-350.2001/lectures/day32/
> In order to use the functions cited I followed the instructions as follows:
>
> Installed tree package from CRAN mirror (Ca-1)
> Downloaded and sourced the file "tree.r"
> Downloaded the function "clus1.r"
>
> Having defined a tree "tr", when I write "misclass(tr,x$test)" as shown in the link
> I get an error message that "R does not find the function pred1.tree".
>
> Is this function included in the tree package? If so it was not in my download.
> Is this a bug? Do you know of a fix?
>
> Thanks for your help
> Meir
[R] Data format for BiodiversityR
Greetings, dear friends.

I am having real trouble getting the program to read my datasets (attached here). I have converted the datasets to CSV and imported them, but I still cannot get them to load as needed. I would be very happy if someone could help me soon.

Thanks

Ndoh Mbue Innocent
International corporation office
China University of Geosciences
388 Lumo road, 430074, Wuhan-China
Tel: 0086 27 67885947/0086 15927262962

A gentleman should be truly a moral person, a straightforward and reliable personality, in solidarity with the community and rooted in self-respect
[R] Problem plotting axes on graphs
2008/9/14 [EMAIL PROTECTED]

I ran your example:

Speed <- cars$speed
Distance <- cars$dist
Speed
 [1]  4  4  7  7  8  9 10 10 10 11 11 12 12 12 12 13 13 13 13 14 14 14 14 15 15
[26] 15 16 16 17 17 17 18 18 18 18 19 19 19 20 20 20 20 20 22 23 24 24 24 24 25
Distance
 [1]   2  10   4  22  16  10  18  26  34  17  28  14  20  24  28  26  34  34  46
[20]  26  36  60  80  20  26  54  32  40  32  40  50  42  56  76  84  36  46  68
[39]  32  48  52  56  64  66  54  70  92  93 120  85
plot(Speed, Distance, panel.first = grid(8,8), pch = 0, cex = 1.2, col = "blue")
plot(Speed, Distance, panel.first = lines(stats::lowess(Speed, Distance), lty = "dashed"), pch = 0, cex = 1.2, col = "blue")

And got the following:

Error in axis(side = side, at = at, labels = labels, ...) :
too few arguments

I got a graph of the points with a dashed line through them but did not get any axes.

I am running R 2.7.2 under Windows XP Service Pack 2 on an Acer Extensa 5200.

Les Stather
Re: [R] Greyed text in the background of a plot
Yes, it's easy but you get a significant "jump" between one time step and the next, which makes the animation unpleasant and difficult to follow. I think that this problem is because of the use of plot(), which redraws everything, and that there is no way around it within R (is there?).

I also understand that these "jumps" are avoided in your package by creating the animated gif and html pages. Actually your animation with the word "Animation" in your page
http://animation.yihui.name/animation:start#generate_an_animation_sequence
is much "softer" than what you get by running the code within R.

Regarding what you have in
http://animation.yihui.name/da:ts:hans_rosling_s_talk
I'm missing the function Rosling.bubbles(), so cannot actually try it.

And congratulations, great site and package.

Agus

Yihui Xie wrote:

Well, his talk seems to have attracted a lot of people... You may simply use gray text in your plot. Here is an example:

##
x = runif(10)
y = runif(10)
z = runif(10, 0.1, 0.3)
cl = rgb(runif(10), runif(10), runif(10), 0.5) # transparent colors!
par(mar = c(4, 4, 0.2, 0.2))
for (i in 1917:2007) {
    x = x + rnorm(10, 0, 0.02)
    y = y + rnorm(10, 0, 0.02)
    z = abs(z + rnorm(10, 0, 0.05))
    plot(x, y, xlim = c(0, 1), ylim = c(0, 1), type = "n", panel.first = {
        grid()
        text(0.5, 0.5, i, cex = 5, col = "gray") # here is the text!
    })
    symbols(x, y, circles = z, add = TRUE, bg = cl, inches = 0.8)
    box()
    Sys.sleep(0.2)
}
##

Not difficult at all, right? :)

BTW, if you are interested in such animations, you may as well take a look at my "animation" package:
http://cran.r-project.org/web/packages/animation/index.html
http://animation.yihui.name/

Regards,
Yihui

On Fri, Sep 12, 2008 at 8:35 PM, Agustin Lobo <[EMAIL PROTECTED]> wrote:

Hi!

Is there any way of having a greyed ("ghosted") text (i.e, 2006) in the background of a plot? I'm making a dynamic plot and would like to show the year of each time step as a big greyed text in the background.

(the idea comes from Hans Rosling video:
http://video.google.com/videoplay?docid=4237353244338529080&sourceid=searchfeed )

Thanks
Agus

--
Dr. Agustin Lobo
Institut de Ciencies de la Terra "Jaume Almera" (CSIC)
LLuis Sole Sabaris s/n
08028 Barcelona
Spain
Tel. 34 934095410
Fax. 34 934110012
email: [EMAIL PROTECTED]
http://www.ija.csic.es/gt/obster
[R] Help please! How to code a mixed-model with 2 within-subject factors using lme or lmer?
Hello,

I'm using aov() to analyse changes in brain volume between males and females. For every subject (there are 331 in total) I have 8 volume measurements (4 different brain lobes and 2 different tissues (grey/white matter)). The data looks like this:

Subject  Sex  Lobe  Tissue  Volume
subect1  1    F     g       262374
subect1  1    F     w       173758
subect1  1    O     g        67155
subect1  1    O     w        30067
subect1  1    P     g       117981
subect1  1    P     w        85441
subect1  1    T     g       185241
subect1  1    T     w        83183
subect2  1    F     g       255309
subect2  1    F     w       164335
subect2  1    O     g        71769
subect2  1    O     w        31879
subect2  1    P     g       120518
subect2  1    P     w        90334
subect2  1    T     g       168413
subect2  1    T     w        75790
subect3  0    F     g       243621
subect3  0    F     w       167025
subect3  0    O     g        65998
subect3  0    O     w        29758
subect3  0    P     g       118026
subect3  0    P     w        91903
subect3  0    T     g       156279
subect3  0    T     w        82349

I'm trying to see if there is an interaction Sex*Lobe*Tissue. This is the command I use with aov():

mod1<-aov(Volume~Sex*Lobe*Tissue+Error(Subject/(Lobe*Tissue)),data.vslt)

Subject is a random effect, Sex, Lobe and Tissue are fixed effects; Sex is an outer factor (between subjects), and Lobe and Tissue are inner factors (within-subjects); and there is indeed a significant 3-way interaction.

I was told, however, that the results reported by aov() may depend on the order of the factors (type I anova), and that it is better to use lme() or lmer() with type II, but I'm struggling to find the right syntax...

To begin, how should I write the model using lme() or lmer()?

I tried this with lme():

gvslt<-groupedData(Volume~1|Subject,outer=~Val,inner=list(~Lobe,~Tissue),data=vslt)
mod2<-lme(Volume~Val*Lobe*Tissue,random=~1|Subject,data=gvslt)

but I have interaction terms for every level of Lobe and Tissue, and 8 times the number of DF I should have... (around 331*8 instead of ~331).

Using lmer(), the specification of Subject as a random effect is straightforward:

mod2<-lmer(Volume~Sex*Lobe*Tissue+(1|Subject),data.vslt)

but I can't figure out the /(Lobe*Tissue) part...

Thank you very much in advance!
roberto
Re: [R] Join data by minimum distance
> I am wondering if there is a function which will do a join between 2
> data.frames by minimum distance, as it is done in ArcGIS for example. For
> people who are not familiar with ArcGIS here it is an explanation:
>
> Suppose you have a data.frame with x, y, coordinates called track, and a
> second data frame with different x, y coordinates and some other attributes
> called classif. The track data.frame has a different number of rows than
> classif. I want to join the rows from classif to track in such a way that for
> each row in track I add only the row from classif that has coordinates
> closest to the coordinates in the track row (and hence minimum distance in
> between the 2 rows), and also add a new column which will record this minimum
> distance. Even if the coordinates in the 2 data.frames have same name, the
> values are not identical between the data.frames, so a merge by column is not
> what I am after.

#---
# Get the distance between two points on the globe.
#
# args:
#   lat1   - latitude of first point.
#   long1  - longitude of first point.
#   lat2   - latitude of second point.
#   long2  - longitude of second point.
#   radius - average radius of the earth in km
#
# see: http://en.wikipedia.org/wiki/Great_circle_distance
#---
greatCircleDistance <- function(lat1, long1, lat2, long2, radius=6372.795){
  sf <- pi/180   # degrees to radians
  lat1 <- lat1*sf
  lat2 <- lat2*sf
  long1 <- long1*sf
  long2 <- long2*sf
  lod <- abs(long1-long2)
  radius * atan2(
    sqrt((cos(lat1)*sin(lod))^2 +
         (cos(lat2)*sin(lat1)-sin(lat2)*cos(lat1)*cos(lod))^2),
    sin(lat2)*sin(lat1)+cos(lat2)*cos(lat1)*cos(lod)
  )
}

#---
# Calculate the nearest point using latitude and longitude,
# and attach the other columns and the nearest distance from the
# other data.frame.
#
# args:
#   x        - as you describe 'track'
#   y        - as you describe 'classif'
#   xlongnme - name of longitude variable in x
#   xlatnme  - name of latitude variable in x
#   ylongnme - name of longitude variable in y
#   ylatnme  - name of latitude variable in y
#---
dist.merge <- function(x, y, xlongnme, xlatnme, ylongnme, ylatnme){
  # for each row of x: index of the nearest row of y and the distance to it
  tmp <- t(apply(x[,c(xlongnme, xlatnme)], 1, function(x, y){
    dists <- apply(y, 1, function(x, y) greatCircleDistance(x[2], x[1], y[2], y[1]), x)
    cbind(1:nrow(y), dists)[dists == min(dists),,drop=FALSE][1,]
  }, y[,c(ylongnme, ylatnme)]))

  tmp <- cbind(x, min.dist=tmp[,2], y[tmp[,1],-match(c(ylongnme, ylatnme), names(y))])
  row.names(tmp) <- NULL
  tmp
}

# demo (longitudes in [0, 360], latitudes in [-90, 90])
track <- data.frame(xt=runif(10,0,360), yt=runif(10,-90,90))
classif <- data.frame(xc=runif(20,0,360), yc=runif(20,-90,90), v1=letters[1:20], v2=1:20)
dist.merge(track, classif, 'xt', 'yt', 'xc', 'yc')
Re: [R] Help please! How to code a mixed-model with 2 within-subject factors using lme or lmer?
Hi Roberto,

>> but I can't figure out the /(Lobe*Tissue) part...

This type of nesting is easier to do using lmer(). To do it using lme() you have to generate the crossed factor yourself. Do something like this:

##
tfac <- with(vslt, interaction(Lobe, Tissue, drop=T))
str(tfac); head(tfac)
mod2<-lme(Volume ~ Val*Lobe*Tissue, random = ~1|Subject/tfac, data = vslt)

Pre-Scriptum: You can also use ?":" but ?interaction is more flexible and powerful.

Regards,
Mark.
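As a small illustration of what interaction() produces, the sketch below builds the crossed factor on a made-up data frame. The subject, lobe and tissue labels are placeholders rather than the poster's data, and no mixed model is fitted here.

```r
# Toy design: two subjects, two lobes, two tissues (labels are made up).
d <- expand.grid(Subject = c("s1", "s2"),
                 Lobe    = c("F", "O"),
                 Tissue  = c("g", "w"))

# One factor level per Lobe/Tissue combination; drop = TRUE removes
# combinations that never occur in the data.
tfac <- with(d, interaction(Lobe, Tissue, drop = TRUE))

print(levels(tfac))
print(table(tfac))
```

Each subject contributes one observation per level of tfac, which is the grouping that random = ~1|Subject/tfac nests inside Subject.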
Re: [R] Greyed text in the background of a plot
Hi Agus,

Yes, you are absolutely right about the awkward jumps in the animations, and this has also been my big problem for a long time. To solve this problem, I think I need third-party software, as I don't know of any solution using R alone. Maybe the "swfc" utility in the SWF Tools or the Processing language can be possible solutions. I'll try them when I have enough time.

As for the function Rosling.bubbles(), you have to wait until version 1.0-2 is published on CRAN. (I've submitted the new version this morning.)

Sorry, it seems I have been discussing a different topic under this thread...

Yihui

--
Yihui Xie <[EMAIL PROTECTED]>
Phone: +86-(0)10-82509086 Fax: +86-(0)10-82509086
Mobile: +86-15810805877
Homepage: http://www.yihui.name
School of Statistics, Room 1037, Mingde Main Building,
Renmin University of China, Beijing, 100872, China
[R] Scaling X axis from -1 to 1
Hi,

I have a density plot in which the x axis ranges from 0 to 2000.

How can I scale the data so that the x-axis runs from -1 to 1?

- Gundala Viswanath
Jakarta - Indonesia
Re: [R] Scaling X axis from -1 to 1
2 * (x - min(x))/(max(x) - min(x)) - 1

On Sun, Sep 14, 2008 at 10:13 PM, Gundala Viswanath <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I have a density plot in which the x axis
> ranged from 0 to 2000.
>
> How can I scale the data so that the x-axis
> is scaled in -1 to 1 form?
>
> - Gundala Viswanath
> Jakarta - Indonesia
>

--
Yihui Xie <[EMAIL PROTECTED]>
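The formula above can be wrapped in a helper and checked on a few values. A minimal sketch: the function name rescale_pm1 and the sample values are made up for illustration.

```r
# Linear map sending min(x) to -1 and max(x) to 1.
rescale_pm1 <- function(x) 2 * (x - min(x)) / (max(x) - min(x)) - 1

x <- c(0, 500, 1000, 2000)   # made-up sample values
z <- rescale_pm1(x)
print(z)

# The density is then estimated on the rescaled data. Its x grid extends a
# little beyond [-1, 1] because the kernel smooths past the data range.
d <- density(z)
print(range(d$x))
```

Calling plot(d) then draws the density with the x-axis on the rescaled values.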
[R] Help please! How to code a mixed-model with 2 within-subject factors using lme or lmer?
Thanks for answering Mark!

I tried with the coding of the interaction you suggested:

> tfac<-with(vlt,interaction(Lobe,Tissue,drop=T))
> mod<-lme(Volume~Sex*Lobe*Tissue,random=~1|Subject/tfac,data=vlt)

But is it normal that the DF are 2303? DF is 2303 even for the estimate of LobeO that has only 662 values (331 for Tissue=white and 331 for Tissue=grey).

I'm also not sure that Sex, Lobe and Tissue are correctly handled: why are there different estimates called Sex:LobeO, Sex:LobeP, etc, and not just Sex:Lobe as with aov()? Why is there Tissuew, but not Sex1, for example?

Thanks again!
roberto

ps1. How would you code this with lmer()?
ps2. this is part of the output of mod<-lme:

> summary(mod)
Linear mixed-effects model fit by REML
 Data: vlt
       AIC      BIC    logLik
  57528.35 57639.98 -28745.17

Random effects:
 Formula: ~1 | Subject
        (Intercept)
StdDev:    11294.65

 Formula: ~1 | tfac %in% Subject
        (Intercept) Residual
StdDev:    10569.03 4587.472

Fixed effects: Volume ~ Sex * Lobe * Tissue
                       Value Std.Error   DF    t-value p-value
(Intercept)        245224.61  1511.124 2303  162.27963  0.
Sex                  2800.01  1866.312  329    1.50029  0.1345
LobeO             -180794.83  1526.084 2303 -118.46975  0.
LobeP             -131609.27  1526.084 2303  -86.23984  0.
LobeT              -73189.97  1526.084 2303  -47.95932  0.
Tissuew            -72461.05  1526.084 2303  -47.48168  0.
Sex:LobeO            -663.27  1884.789 2303   -0.35191  0.7249
Sex:LobeP           -2146.08  1884.789 2303   -1.13863  0.2550
Sex:LobeT            1379.49  1884.789 2303    0.73191  0.4643
Sex:Tissuew          5387.65  1884.789 2303    2.85849  0.0043
LobeO:Tissuew       43296.99  2158.209 2303   20.06154  0.
LobeP:Tissuew       50952.21  2158.209 2303   23.60856  0.
LobeT:Tissuew      -15959.31  2158.209 2303   -7.39470  0.
Sex:LobeO:Tissuew   -5228.66  2665.494 2303   -1.96161  0.0499
Sex:LobeP:Tissuew   -1482.83  2665.494 2303   -0.55631  0.5781
Sex:LobeT:Tissuew   -6037.49  2665.494 2303   -2.26506  0.0236
Re: [R] Help please! How to code a mixed-model with 2 within-subject factors using lme or lmer?
Hi Roberto,

It's difficult to comment further on specifics without access to your data set. A general point is that the output from summary(aov.object) is not directly comparable with summary(lme.object). The latter gives you a summary of a fitted linear regression model, not an analysis of variance model, and what you "see" will depend on what contrasts were in place when the model was fitted.

If you haven't changed these then they will be so-called treatment contrasts. What you are seeing for Lobe (which plainly is coded as a factor) in the output from summary(lme.object) are the regression coefficients for each level of Lobe relative to its reference/treatment/baseline level, which is your (Intercept). If you fitted your model with, say, Helmert or sum-to-zero contrasts then these values would change.

To see what your current reference level is do levels(dataset$Lobe). See ?levels. What you want to look at to begin with is: anova(lme.object).

HTH,
Mark.
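The point about contrasts can be seen with an ordinary lm() fit on made-up data. This is a sketch only: the data frame, the volumes and the noise are invented, and lm() stands in for lme() because the naming of the fixed-effect coefficients follows the same factor coding.

```r
set.seed(1)

# Made-up data: four lobes, five observations each.
d <- data.frame(Lobe = factor(rep(c("F", "O", "P", "T"), each = 5)))
means <- c(F = 250, O = 70, P = 120, T = 180)
d$Volume <- unname(means[as.character(d$Lobe)]) + rnorm(20)

print(levels(d$Lobe))      # the first level is the reference level

# Treatment contrasts (the default): one coefficient per non-reference level.
fit <- lm(Volume ~ Lobe, data = d)
print(names(coef(fit)))

# The anova table has a single row for Lobe as a whole.
print(anova(fit))

# Sum-to-zero contrasts give different coefficients for the same model.
fit_sum <- lm(Volume ~ Lobe, data = d, contrasts = list(Lobe = "contr.sum"))
print(names(coef(fit_sum)))
```

A coefficient labelled plain Sex, with no level suffix, is what R prints for a numeric variable rather than a factor, which may be why the 0/1 coded Sex column shows no Sex1 term in the summary quoted in this thread.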
[R] ksvm accessing the slots of S4 object
I am using kernlab to build svm models. I am not sure how to access the different slots of the object. For instance, I want to get the number of support vectors for each model I am building and store it in a vector.

> ksvm.model <- ksvm(Class ~ ., data = somedata, kernel = "vanilladot", cross = 10, type = "C-svc")
> names(attributes(ksvm.model))
 [1] "param"      "scaling"    "coef"       "alphaindex" "b"
 [6] "obj"        "SVindex"    "nSV"        "prior"      "prob.model"
[11] "alpha"      "type"       "kernelf"    "kpar"       "xmatrix"
[16] "ymatrix"    "fitted"     "lev"        "nclass"     "error"
[21] "cross"      "n.action"   "terms"      "kcall"      "class"
> ksvm.model
Support Vector Machine object of class "ksvm"

SV type: C-svc  (classification)
 parameter : cost C = 1

Linear (vanilla) kernel function.

Number of Support Vectors : 144

Objective Function Value : -4.3162
Training error : 0
Cross validation error : 0.4

In the above dummy example, how do I access the number of support vectors? I tried the following:

ksvm.model$nSV
nSV(ksvm.model)

Thanks
../Murli
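As a general note on S4 objects, slots are read with the `@` operator or with `slot()`, not with `$`. The sketch below uses a made-up class, `ToyModel`, from the base `methods` package rather than kernlab itself, so it runs without the poster's data; the class and its values are purely illustrative. For a real ksvm object the same mechanism would suggest `ksvm.model@nSV`, and kernlab's documentation also lists an `nSV()` accessor, though the post does not say what happened when that was tried.

```r
library(methods)

# A made-up S4 class standing in for a fitted model (not kernlab's "ksvm")
setClass("ToyModel", representation(nSV = "integer", b = "numeric"))

fit <- new("ToyModel", nSV = 144L, b = -0.5)

slotNames(fit)                 # lists the available slots
n_direct <- fit@nSV            # slot access with @
n_slot   <- slot(fit, "nSV")   # the same, with the slot name as a string

# `$` is not defined for S4 objects of this kind, so this attempt errors
dollar_fails <- inherits(try(fit$nSV, silent = TRUE), "try-error")

# Collecting one slot from several models into a vector
models <- list(fit, new("ToyModel", nSV = 98L, b = 0.1))
n_sv <- vapply(models, function(m) m@nSV, integer(1))
print(n_sv)
```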
[R] Question on glm.nb vs zeroinfl vs hurdle models
Good afternoon,

I'm in need of advice regarding the proper use of glm.nb, zeroinfl or hurdle with my data frame. I cannot provide a self-contained example, since I need advice on this particular dataset and its “contradictory” results.

I have a dataset which contains 1309 cases and 11 variables, highly right-skewed and heavily zero-inflated (over 1100 cases have a 0 value for my variables, both dependent and independent; e.g. variable A has 1220 cases with 0 value, variable B has 1283 with 0 value, and so on).

I tried to fit 3 models: glm.nb, zeroinfl and hurdle, and I was expecting “similar” results and similar conclusions. What was similar was the log-likelihood (very close for all 3 models) and the number of predicted 0s (identical for each model), but what surprised me were the following results:

- glm.nb identified as influential the same variables that the hurdle model identified in its zero model;
- the zeroinfl model also identified variable d as influential.

Now my question is the following: having seen the vignette (Regression Models for Count Data in R), I noticed that glm.nb, hurdle and zeroinfl give similar results for the count model, while hurdle and zeroinfl may differ somewhat more in the zero component. In my example, however, the count model from glm.nb is similar to the zero-component part of hurdle and zeroinfl. Why is that? Is there a problem with the fact that my dataset is extremely zero-inflated, and there are few cases with values different from 0?

Any kind of help would be most welcome. Thank you and have a great day ahead.

> summary(aaa)

Call:
hurdle(formula = as.integer(x) ~ as.integer(a) + as.integer(b) + as.integer(c) +
    as.integer(d) + as.integer(e) + as.integer(f) + as.integer(g) + as.integer(h),
    data = dep, dist = "negbin")

Count model coefficients (truncated negbin with log link):
               Estimate Std. Error z value Pr(>|z|)
(Intercept)    -0.02178    0.30753  -0.071    0.944
as.integer(a)  -0.48886    0.54023  -0.905    0.366
as.integer(b)  -0.09555    0.11688  -0.817    0.414
as.integer(c)  -0.08654    0.20809  -0.416    0.678
as.integer(d)   0.17446    0.16956   1.029    0.304
as.integer(e)   0.27180    0.55702   0.488    0.626
as.integer(f)   0.15512    0.42721   0.363    0.717
as.integer(g)  -0.07687    0.21750  -0.353    0.724
as.integer(h)  -0.16906    0.44986  -0.376    0.707
Log(theta)     -0.76274    0.51800  -1.472    0.141

Zero hurdle model coefficients (binomial with logit link):
               Estimate Std. Error z value Pr(>|z|)
(Intercept)    -1.13498    0.07906 -14.356  < 2e-16 ***
as.integer(a)  -0.33134    0.30239  -1.096  0.27320
as.integer(b)  -0.26394    0.08397  -3.143  0.00167 **
as.integer(c)   0.06689    0.12796   0.523  0.60115
as.integer(d)  -0.12045    0.11984  -1.005  0.31486
as.integer(e)  -0.79314    0.29106  -2.725  0.00643 **
as.integer(f)  -0.28547    0.40790  -0.700  0.48402
as.integer(g)  -0.33186    0.18887  -1.757  0.07890 .
as.integer(h)  -0.37008    0.31035  -1.192  0.23308
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Theta: count = 0.4664
Number of iterations in BFGS optimization: 28
Log-likelihood: -1073 on 19 Df

> summary(a)

Call:
glm.nb(formula = as.integer(x) ~ as.integer(a) + as.integer(b) + as.integer(c) +
    as.integer(d) + as.integer(e) + as.integer(f) + as.integer(g) + as.integer(h),
    data = dep, init.theta = 0.187836108765364, link = log)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-0.8607  -0.7236  -0.6809  -0.4610   2.7575

Coefficients:
               Estimate Std. Error z value Pr(>|z|)
(Intercept)    -0.56381    0.08820  -6.392 1.64e-10 ***
as.integer(a)  -0.51517    0.33477  -1.539  0.12384
as.integer(b)  -0.21835    0.07250  -3.011  0.00260 **
as.integer(c)   0.08920    0.14546   0.613  0.53974
as.integer(d)  -0.01742    0.10877  -0.160  0.87274
as.integer(e)  -0.69085    0.23446  -2.946  0.00321 **
as.integer(f)  -0.14182    0.42142  -0.337  0.73647
as.integer(g)  -0.24976    0.15819  -1.579  0.11437
as.integer(h)  -0.37652    0.30043  -1.253  0.21009
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for Negative Binomial(0.1878) family taken to be 1)

    Null deviance: 707.18  on 1308  degrees of freedom
Residual deviance: 677.09  on 1300  degrees of freedom
AIC: 2181.5

Number of Fisher Scoring iterations: 1

              Theta:  0.1878
          Std. Err.:  0.0186
Warning while fitting theta: alternation limit reached

> summary(aa)

Call:
zeroinfl(formula = a
[R] Fetching a range of columns
Hello,

I realize that using x[x > 3 & x < 5] I can fetch all elements between 3 and 5. However, I read in from a CSV file, and I would like to fetch all columns within a range (842-2411). In the past, I have done this to fetch just a select few columns:

data <- read.csv(filein, header=TRUE, nrows=320, skip=nskip)
data_filter <- data[c(2,12,17)]
write.table(data_filter, fileout, append = TRUE, sep= ",", row.names= FALSE, col.names = FALSE)
nskip <- nskip+320

This time, however, instead of grabbing columns 2, 12, 17, I would like all columns in the range 842-2411. I can't seem to do this correctly. Could somebody please provide some insight? Thanks in advance.

--
Jason Thibodeau
Re: [R] library instal
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

--- On Fri, 9/12/08, gaurav1983 <[EMAIL PROTECTED]> wrote:
> From: gaurav1983 <[EMAIL PROTECTED]>
> Subject: [R] library instal
> To: r-help@r-project.org
> Received: Friday, September 12, 2008, 6:43 AM
> I am finding real trouble in installing evd library in R for linux
[R] difference of two data frames
Hello I have 2 data frames DF1 and DF2 where DF2 is a subset of DF1: DF1= data.frame(V1=1:6, V2= letters[1:6]) DF2= data.frame(V1=1:3, V2= letters[1:3]) How do I create a new data frame of the difference between DF1 and DF2 newDF=data.frame(V1=4:6, V2= letters[4:6]) In my real data, the rows are not in order as in the example I provided. Thanks much Joseph [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] difference of two data frames
Hi Joseph, Try this: DF1[!DF1$V1%in%DF2$V1,] subset(DF1,!V1%in%DF2$V1) HTH, Jorge On Sun, Sep 14, 2008 at 12:49 PM, joseph <[EMAIL PROTECTED]> wrote: > Hello > I have 2 data frames DF1 and DF2 where DF2 is a subset of DF1: > DF1= data.frame(V1=1:6, V2= letters[1:6]) > DF2= data.frame(V1=1:3, V2= letters[1:3]) > How do I create a new data frame of the difference between DF1 and DF2 > newDF=data.frame(V1=4:6, V2= letters[4:6]) > In my real data, the rows are not in order as in the example I provided. > Thanks much > Joseph > > > >[[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] difference of two data frames
Hi Mark as you guessed it, I meant a dataframe of the rows in DF1 that are not in DF2 . Here is what I got: > complement<-setdiff(DF1$V2,DF2$V2) > DF1[,complement] Error in `[.data.frame`(DF1, , complement) : undefined columns selected > - Original Message From: Mark Leeds <[EMAIL PROTECTED]> To: joseph <[EMAIL PROTECTED]> Cc: [EMAIL PROTECTED] Sent: Sunday, September 14, 2008 10:07:48 AM Subject: RE: [R] difference of two data frames Hi: If you mean a dataframe of the rows in DF1 that are not in DF2 , then I think below will work for the letters, which , according to what I'm understanding, will also make it work for the rows so no need to consider the numbers ? complement<-setdiff(DF1$V2,DF2$V2) DFnew<=DF1[,complement) But, 3 things to consider: 1) I'm not sure if I understand the problem. 2) I'm also at home and I don't use R here so I can't test it. 3) I'm also not sure about the order of the setdiff operation so you may have to switch the order of the two columns I used. Atleast, it will get you started though and I'm confident someone else will answer. Good luck. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of joseph Sent: Sunday, September 14, 2008 12:50 PM To: r-help@r-project.org Cc: r-help@r-project.org Subject: [R] difference of two data frames Hello I have 2 data frames DF1 and DF2 where DF2 is a subset of DF1: DF1= data.frame(V1=1:6, V2= letters[1:6]) DF2= data.frame(V1=1:3, V2= letters[1:3]) How do I create a new data frame of the difference between DF1 and DF2 newDF=data.frame(V1=4:6, V2= letters[4:6]) In my real data, the rows are not in order as in the example I provided. Thanks much Joseph [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
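The "undefined columns selected" error reported above has a simple reading: `setdiff(DF1$V2, DF2$V2)` returns values of V2, and `DF1[, complement]` then treats those values as column names. A minimal sketch of the row-wise version, using the two data frames from the original question:

```r
DF1 <- data.frame(V1 = 1:6, V2 = letters[1:6])
DF2 <- data.frame(V1 = 1:3, V2 = letters[1:3])

# Values of V2 that occur in DF1 but not in DF2
complement <- setdiff(DF1$V2, DF2$V2)

# DF1[, complement] would look for columns named "d", "e", "f" and fail.
# Selecting ROWS whose V2 value is in the complement is what was intended:
newDF <- DF1[DF1$V2 %in% complement, ]
print(newDF)
```

This still compares a single column only, so it assumes V2 on its own identifies a row.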
[R] string functions
Hello, trying to locate all the string commands in the base version of R, can't seem to find an area that describes them. I am in need to do some serious parsing of text data to create my dataset. Is there a summary link to all the character operators? string manipulations that would help in parsing text. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] difference of two data frames
Hi Jorge both commands work; can you extend it to several coulmns? the reason I am asking is that in my real data the uniqueness of the rows is made of all the columns; in other words V1 might have duplicates. Thanks - Original Message From: Jorge Ivan Velez <[EMAIL PROTECTED]> To: joseph <[EMAIL PROTECTED]> Cc: r-help@r-project.org Sent: Sunday, September 14, 2008 10:23:33 AM Subject: Re: [R] difference of two data frames Hi Joseph, Try this: DF1[!DF1$V1%in%DF2$V1,] subset(DF1,!V1%in%DF2$V1) HTH, Jorge On Sun, Sep 14, 2008 at 12:49 PM, joseph <[EMAIL PROTECTED]> wrote: Hello I have 2 data frames DF1 and DF2 where DF2 is a subset of DF1: DF1= data.frame(V1=1:6, V2= letters[1:6]) DF2= data.frame(V1=1:3, V2= letters[1:3]) How do I create a new data frame of the difference between DF1 and DF2 newDF=data.frame(V1=4:6, V2= letters[4:6]) In my real data, the rows are not in order as in the example I provided. Thanks much Joseph [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Fetching a range of columns
On Sep 14, 2008, at 12:22 PM, Jason Thibodeau wrote:

> Hello,
>
> I realize that using x[x > 3 & x < 5] I can fetch all elements between 3
> and 5. However, I read in from a CSV file, and I would like to fetch all
> columns within a range (842-2411). In the past, I have done this to
> fetch just a select few columns:
>
> data <- read.csv(filein, header=TRUE, nrows=320, skip=nskip)
> data_filter <- data[c(2,12,17)]
> write.table(data_filter, fileout, append = TRUE, sep= ",", row.names= FALSE, col.names = FALSE)
> nskip <- nskip+320
>
> This time, however, instead of grabbing columns 2, 12, 17, I would like all
> columns in the range 842-2411. I can't seem to do this correctly. Could
> somebody please provide some insight? Thanks in advance.

Have you tried:

data_filter <- data[seq(842,2411)]
write.table(data_filter, fileout, append = TRUE, sep= ",", row.names= FALSE, col.names = FALSE)

When I use that format on a dataframe I have lying around, I get the expected results, and I do not find in testing that dataframes are challenged by assigning 5000 columns.

--
David Winsemius
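For illustration, here is a small self-contained sketch of range indexing on a toy data frame (10 columns standing in for the roughly 3000 in the real file; the names `wide`, `first_col` and `last_col` are made up). It also checks the upper bound first, since asking for a column index beyond `ncol()` is what produces "undefined columns selected".

```r
# Toy data frame: 3 rows, 10 columns named V1..V10
wide <- as.data.frame(matrix(1:30, nrow = 3))

first_col <- 3
last_col  <- 7

# Guard: the requested range must exist in the data frame
stopifnot(last_col <= ncol(wide))

by_seq   <- wide[seq(first_col, last_col)]             # list-style indexing
by_colon <- wide[first_col:last_col]                   # same thing, shorter
by_comma <- wide[, first_col:last_col, drop = FALSE]   # matrix-style indexing

print(names(by_seq))
```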
Re: [R] Fetching a range of columns
Have you tried:

data_filter <- data[842:2411]

Also, if you have a lot of data to read, I would suggest that you use a connection and, if all the data is numeric, possibly 'scan'. If you do use a connection, this would eliminate having to 'skip' each time, which could be time consuming on a large file. Since it appears that you are not writing out the column names in the output file, you could bypass the header line of the file with readLines after the open. So something like this might work:

input <- file('yourfile','r')
invisible(readLines(input, n=1))  # skip the header
while (TRUE){  # read file
    x <- try(read.csv(input, nrows=320, header=FALSE), silent=TRUE)  # catch EOF
    if (inherits(x, 'try-error')) break
    write.csv(...)
}
close(input)

On Sun, Sep 14, 2008 at 12:22 PM, Jason Thibodeau <[EMAIL PROTECTED]> wrote:
> Hello,
>
> I realize that using: x[x > 3 & x < 5] I can fetch all elements between 3
> and 5. However I read in from a CSV file, and I would like to fetch all
> columns from within a range ( 842-2411). In the past, I have done this to
> fetch just select few columns:
>
> data <- read.csv(filein, header=TRUE, nrows=320, skip=nskip)
> data_filter <- data[c(2,12,17)]
> write.table(data_filter, fileout, append = TRUE,
>     sep= ",", row.names= FALSE, col.names = FALSE)
> nskip <- nskip+320
>
> This time, however, instead of grabbing columns 2, 12, 17, I would like all
> columns in the range of 842-2411. I can't seem to do this correctly. Could
> somebody please provide some insight? Thanks in advance.
>
> --
> Jason Thibodeau

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
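The connection-based loop can be tried end to end on a toy file. The sketch below is only an illustration under stated assumptions: the file is generated on the fly, the chunk size (3 rows) and the column range (2 to 4) are made up, and `write.table` with `append = TRUE` stands in for the elided `write.csv(...)` call. It relies on `read.csv` raising an error once no lines are left on the open connection, and it uses the `nrows` argument to limit each read.

```r
# Build a toy input file: 7 data rows, 5 columns, with a header line
infile  <- tempfile(fileext = ".csv")
outfile <- tempfile(fileext = ".csv")
toy <- as.data.frame(matrix(1:35, nrow = 7))
write.csv(toy, infile, row.names = FALSE)

con <- file(infile, open = "r")
invisible(readLines(con, n = 1))   # consume the header once

repeat {
  # Each call continues where the previous one stopped on the connection
  chunk <- try(read.csv(con, header = FALSE, nrows = 3), silent = TRUE)
  if (inherits(chunk, "try-error")) break   # no lines left: end of file

  write.table(chunk[2:4], outfile, append = TRUE, sep = ",",
              row.names = FALSE, col.names = FALSE)
}
close(con)

result <- read.csv(outfile, header = FALSE)
print(dim(result))
```

Because the connection stays open between reads, nothing has to be skipped again on each pass, which is the saving described in the message.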
Re: [R] string functions
On Sep 14, 2008, at 1:53 PM, zubin wrote:

> Hello, trying to locate all the string commands in the base version of R,
> can't seem to find an area that describes them. I am in need to do some
> serious parsing of text data to create my dataset. Is there a summary link
> to all the character operators? string manipulations that would help in
> parsing text.

A bit of use of the ? operator on paste and strsplit produces (among other things):

See Also (from ?paste)
  String manipulation with as.character, substr, nchar, strsplit; further, cat which concatenates and writes to a file, and sprintf for C like string construction.

See Also (from ?strsplit)
  paste for the reverse, grep and sub for string search and manipulation; further nchar, substr.

You might look at the results of:

help.search("string")
help.search("character")

--
David Winsemius
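To give a feel for the base functions named in those "See Also" sections, here is a short sketch that parses one made-up line of text (the `key=value` record format is an assumption for illustration, not something from the original question).

```r
# A made-up record in "key=value; key=value" form
line <- "id=42; name=Ada Lovelace; score=9.5"

# strsplit: break the line into fields on ";" plus optional whitespace
fields <- strsplit(line, ";\\s*")[[1]]

# sub: strip everything after (or before) the first "="
keys   <- sub("=.*$", "", fields)
values <- sub("^[^=]*=", "", fields)
record <- setNames(values, keys)

name_length <- nchar(record[["name"]])                # number of characters
first_name  <- substr(record[["name"]], 1, 3)         # characters 1 to 3
id_is_digit <- grepl("^[0-9]+$", record[["id"]])      # regular expression test
summary_txt <- sprintf("%s scored %.1f",              # C-like formatting
                       record[["name"]], as.numeric(record[["score"]]))
key_list    <- paste(keys, collapse = ",")            # the reverse of strsplit

print(summary_txt)
```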
Re: [R] string functions
Start with ?grep and then follow the "See Also". Exactly what type of serious parsing are you trying to do? R can do some, but if it is very complex, you might want to consider awk/perl. On Sun, Sep 14, 2008 at 1:53 PM, zubin <[EMAIL PROTECTED]> wrote: > Hello, trying to locate all the string commands in the base version of R, > can't seem to find an area that describes them. I am in need to do some > serious parsing of text data to create my dataset. Is there a summary link > to all the character operators? string manipulations that would help in > parsing text. > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Fetching a range of columns
Jim, this is a GREAT help. I was trying something similar before, but I was unable to detect EOF. Thanks for the help! Also, David, your suggestion worked perfectly. Thanks for all the help, everyone! On Sun, Sep 14, 2008 at 2:08 PM, jim holtman <[EMAIL PROTECTED]> wrote: > Have you tried: > > data_filter <- data[842:2411] > > Also if you have a lot of data to read, I would suggest that you use a > connection, and it all the data is numeric, possibly 'scan'. If you > do use a connection, this would eliminate having to 'skip' each time > which could be time consuming on a large file. Since it appears that > you are not writing out the column names in the output file, you could > bypass the header line on the file by readLine after the open. So > something like this might work: > > input <- file('yourfile','r') > invisible(readLines(input, n=1)) # skip the header > while (TRUE){ # read file >x <- try(read.csv(input, n=320, header=FALSE), silent=TRUE) # catch EOF >if (inherits(x, 'try-error')) break >write.csv(...) > } > > > > On Sun, Sep 14, 2008 at 12:22 PM, Jason Thibodeau <[EMAIL PROTECTED]> > wrote: > > Hello, > > > > I realize that using: x[x > 3 & x < 5] I can fetch all elements between 3 > > and 5. However I read in from a CSV file, and I would like to fetch all > > columns from within a range ( 842-2411). In teh past, I have done this to > > fetch just select few columns: > > > > data <- read.csv(filein, header=TRUE, nrows=320, skip=nskip) > >data_filter <- data[c(2,12,17)] > >write.table(data_filter, fileout, append = TRUE, > > sep= ",", row.names= FALSE, col.names = FALSE) > >nskip <- nskip+320 > > > > This time, however, instead of grabbing columns 2, 12, 17, I woudl like > all > > columns in the range of 842-2411. I can't seem to do this correctly. > Could > > somebody please provide some insight? Thanks in advance. 
> > > > -- > > Jason Thibodeau > > > >[[alternative HTML version deleted]] > > > > __ > > R-help@r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. > > > > > > -- > Jim Holtman > Cincinnati, OH > +1 513 646 9390 > > What is the problem that you are trying to solve? > -- Jason Thibodeau [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] string functions
Try this: help.search(keyword = "character", package = "base") Then read each of the pages listed to get info on the indicated command plus related commands also described on those pages (but not necessarily listed in the help.search list). You might also want to look at the gsubfn package and its vignette (i.e. its pdf document). The gsubfn and strapply commands in that package can be used for certain parsing tasks. Its home page is at: http://gsubfn.googlecode.com On Sun, Sep 14, 2008 at 1:53 PM, zubin <[EMAIL PROTECTED]> wrote: > Hello, trying to locate all the string commands in the base version of R, > can't seem to find an area that describes them. I am in need to do some > serious parsing of text data to create my dataset. Is there a summary link > to all the character operators? string manipulations that would help in > parsing text. > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] difference of two data frames
It would be useful to have indexed both dataframes with a unique identifier, such as in rownames etc. Without that information, you could possibly try to use the same approach as duplicated() does by "pasting together a character representation of rows" using "|" (or any other separator).

keys1 <- apply(DF1, 1, paste, collapse="|")
keys1
[1] "1|a" "2|b" "3|c" "4|d" "5|e" "6|f"
duplicated(keys1)
[1] FALSE FALSE FALSE FALSE FALSE FALSE

keys2 <- apply(DF2, 1, paste, collapse="|")
keys2
[1] "1|a" "2|b" "3|c"
duplicated(keys2)
[1] FALSE FALSE FALSE

The duplicated part is necessary to ensure the key generated is truly unique. You might want to experiment and see if you can create a unique key using just a few columns.

keys1 %in% keys2
[1]  TRUE  TRUE  TRUE FALSE FALSE FALSE

w <- !( keys1 %in% keys2 )
DF1[ w, ]
  V1 V2
4  4  d
5  5  e
6  6  f

Regards, Adai

joseph wrote:
> Hi Jorge
> both commands work; can you extend it to several columns? the reason I am
> asking is that in my real data the uniqueness of the rows is made of all
> the columns; in other words V1 might have duplicates.
> Thanks
>
> - Original Message
> From: Jorge Ivan Velez <[EMAIL PROTECTED]>
> To: joseph <[EMAIL PROTECTED]>
> Cc: r-help@r-project.org
> Sent: Sunday, September 14, 2008 10:23:33 AM
> Subject: Re: [R] difference of two data frames
>
> Hi Joseph,
> Try this:
>
> DF1[!DF1$V1%in%DF2$V1,]
> subset(DF1,!V1%in%DF2$V1)
>
> HTH,
> Jorge
>
> On Sun, Sep 14, 2008 at 12:49 PM, joseph <[EMAIL PROTECTED]> wrote:
>> Hello
>> I have 2 data frames DF1 and DF2 where DF2 is a subset of DF1:
>> DF1= data.frame(V1=1:6, V2= letters[1:6])
>> DF2= data.frame(V1=1:3, V2= letters[1:3])
>> How do I create a new data frame of the difference between DF1 and DF2
>> newDF=data.frame(V1=4:6, V2= letters[4:6])
>> In my real data, the rows are not in order as in the example I provided.
>> Thanks much
>> Joseph
Re: [R] difference of two data frames
Actually you got it, the data sets you created are a perfect example (row#1 and row#2 in DF1 have the same V1 and differ only in V2) , but here is the problem: row#2 in DF1 exists in DF1 and not in DF2, however it does not show in the Difference. It seems to me that both V1 and V2 should be considered when calculating the difference. - Original Message From: Jorge Ivan Velez <[EMAIL PROTECTED]> To: joseph <[EMAIL PROTECTED]> Sent: Sunday, September 14, 2008 11:14:11 AM Subject: Re: [R] difference of two data frames Hi Joseph, I'm not sure if I understood your point, but try this: # Data sets DF1= data.frame(V1=c(1,1,2,3,3,4,5,5,6), V2= letters[1:9]) DF2= data.frame(V1=1:3, V2= letters[1:3]) # Difference DF1[! DF1$V1 %in% DF2$V1,] HTH, Jorge On Sun, Sep 14, 2008 at 1:57 PM, joseph <[EMAIL PROTECTED]> wrote: Hi Jorge both commands work; can you extend it to several coulmns? the reason I am asking is that in my real data the uniqueness of the rows is made of all the columns; in other words V1 might have duplicates. Thanks - Original Message From: Jorge Ivan Velez <[EMAIL PROTECTED]> To: joseph <[EMAIL PROTECTED]> Cc: r-help@r-project.org Sent: Sunday, September 14, 2008 10:23:33 AM Subject: Re: [R] difference of two data frames Hi Joseph, Try this: DF1[!DF1$V1%in%DF2$V1,] subset(DF1,!V1%in%DF2$V1) HTH, Jorge On Sun, Sep 14, 2008 at 12:49 PM, joseph <[EMAIL PROTECTED]> wrote: Hello I have 2 data frames DF1 and DF2 where DF2 is a subset of DF1: DF1= data.frame(V1=1:6, V2= letters[1:6]) DF2= data.frame(V1=1:3, V2= letters[1:3]) How do I create a new data frame of the difference between DF1 and DF2 newDF=data.frame(V1=4:6, V2= letters[4:6]) In my real data, the rows are not in order as in the example I provided. 
Thanks much Joseph
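The concern raised in this message can be reproduced with the two data sets quoted above. The sketch below compares matching on V1 alone with matching on whole rows; the helper name `row_key` and the separator are made up for illustration. Pasting the columns directly with `do.call(paste, ...)` is used here instead of `apply()`, on the assumption that it avoids any padding introduced when a mixed data frame is first converted to a character matrix.

```r
DF1 <- data.frame(V1 = c(1, 1, 2, 3, 3, 4, 5, 5, 6), V2 = letters[1:9])
DF2 <- data.frame(V1 = 1:3, V2 = letters[1:3])

# Hypothetical helper: one character key per row, built from ALL columns.
# "\r" is an arbitrary separator assumed not to occur in the data.
row_key <- function(df) do.call(paste, c(unname(as.list(df)), sep = "\r"))

# Matching on V1 only drops every DF1 row whose V1 appears anywhere in DF2
by_v1 <- DF1[!(DF1$V1 %in% DF2$V1), ]

# Matching on whole rows keeps rows that share V1 but differ in V2
by_row <- DF1[!(row_key(DF1) %in% row_key(DF2)), ]

print(by_v1)
print(by_row)
```

Row 2 of DF1 (V1 = 1, V2 = "b") is missing from the first result but present in the second, which is the behaviour asked for.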
Re: [R] string functions
On Sep 14, 2008, at 1:53 PM, zubin wrote: Hello, trying to locate all the string commands in the base version of R, can't seem to find an area that describes them. I am in need to do some serious parsing of text data to create my dataset. Is there a summary link to all the character operators? string manipulations that would help in parsing text. A further thought would be to look at the Natural Language Processing TaskView: http://cran.r-project.org/web/views/NaturalLanguageProcessing.html -- David Winsemius, MD Heritage Laboratories __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Fetching a range of columns
TEST_filter("line50grab.csv","line50grab_filterout.csv") Error in `[.data.frame`(data_tmp, seq(842, 2411)) : undefined columns selected I know my file has about 3000 columns. This happened when I used: data_tmp <- read.csv(filein, header=TRUE, nrows=10, skip=nskip) data_filter <- data_tmp[seq(842,2411)] write.table(data_filter, fileout, append = TRUE, sep= ",", row.names= FALSE, col.names = FALSE) Also using data_tmp[842:2411] did not yield any output being written to my file. I have another slightly unrelated problem, but I'll propose that after this one can be solved. Thanks a lot. On Sun, Sep 14, 2008 at 2:14 PM, Jason Thibodeau <[EMAIL PROTECTED]>wrote: > Jim, this is a GREAT help. I was trying something similar before, but I was > unable to detect EOF. Thanks for the help! > > Also, David, your suggestion worked perfectly. > > Thanks for all the help, everyone! > > > On Sun, Sep 14, 2008 at 2:08 PM, jim holtman <[EMAIL PROTECTED]> wrote: > >> Have you tried: >> >> data_filter <- data[842:2411] >> >> Also if you have a lot of data to read, I would suggest that you use a >> connection, and it all the data is numeric, possibly 'scan'. If you >> do use a connection, this would eliminate having to 'skip' each time >> which could be time consuming on a large file. Since it appears that >> you are not writing out the column names in the output file, you could >> bypass the header line on the file by readLine after the open. So >> something like this might work: >> >> input <- file('yourfile','r') >> invisible(readLines(input, n=1)) # skip the header >> while (TRUE){ # read file >>x <- try(read.csv(input, n=320, header=FALSE), silent=TRUE) # catch >> EOF >>if (inherits(x, 'try-error')) break >>write.csv(...) >> } >> >> >> >> On Sun, Sep 14, 2008 at 12:22 PM, Jason Thibodeau <[EMAIL PROTECTED]> >> wrote: >> > Hello, >> > >> > I realize that using: x[x > 3 & x < 5] I can fetch all elements between >> 3 >> > and 5. 
However I read in from a CSV file, and I would like to fetch all >> > columns from within a range ( 842-2411). In teh past, I have done this >> to >> > fetch just select few columns: >> > >> > data <- read.csv(filein, header=TRUE, nrows=320, skip=nskip) >> >data_filter <- data[c(2,12,17)] >> >write.table(data_filter, fileout, append = TRUE, >> > sep= ",", row.names= FALSE, col.names = FALSE) >> >nskip <- nskip+320 >> > >> > This time, however, instead of grabbing columns 2, 12, 17, I woudl like >> all >> > columns in the range of 842-2411. I can't seem to do this correctly. >> Could >> > somebody please provide some insight? Thanks in advance. >> > >> > -- >> > Jason Thibodeau >> > >> >[[alternative HTML version deleted]] >> > >> > __ >> > R-help@r-project.org mailing list >> > https://stat.ethz.ch/mailman/listinfo/r-help >> > PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> > and provide commented, minimal, self-contained, reproducible code. >> > >> >> >> >> -- >> Jim Holtman >> Cincinnati, OH >> +1 513 646 9390 >> >> What is the problem that you are trying to solve? >> > > > > -- > Jason Thibodeau > -- Jason Thibodeau [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Fetching a range of columns
On Sep 14, 2008, at 4:01 PM, Jason Thibodeau wrote:

> TEST_filter("line50grab.csv","line50grab_filterout.csv")
> Error in `[.data.frame`(data_tmp, seq(842, 2411)) :
>   undefined columns selected

I am guessing that you wrapped some code into a function but you did not provide the function. You are not really following the posting guidelines here.

> I know my file has about 3000 columns.
>
> This happened when I used:
> data_tmp <- read.csv(filein, header=TRUE, nrows=10, skip=nskip)
> data_filter <- data_tmp[seq(842,2411)]
> write.table(data_filter, fileout, append = TRUE, sep= ",", row.names= FALSE, col.names = FALSE)
>
> Also using data_tmp[842:2411] did not yield any output being written to my file.

Not a big surprise. It appears the error preceded the write.table call.

> I have another slightly unrelated problem, but I'll propose that after this one can be solved.

If the problem is not with the syntax or semantics of TEST_filter (which is what I suspect), then perhaps you should examine the input file from R's perspective with:

?count.fields

Hard to tell without the actual code and sample data.

-- David Winsemius

> Thanks a lot.
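A minimal sketch of the count.fields check suggested above, assuming a comma-separated file. The temporary file and the column range 3:5 are made up for illustration; they stand in for the real line50grab.csv and the range 842:2411.

```r
# Hypothetical stand-in for the real input: 6 data rows, 5 columns.
demo_file <- tempfile(fileext = ".csv")
write.csv(as.data.frame(matrix(1:30, nrow = 6)), demo_file, row.names = FALSE)

# count.fields() reports how many fields R sees on each line (header included).
fields_per_line <- count.fields(demo_file, sep = ",")

# A column range can only be selected if every line has at least that many fields.
wanted_cols <- 3:5
range_is_available <- max(wanted_cols) <= min(fields_per_line)
```

If range_is_available comes out FALSE for the real file, "undefined columns selected" is the error one would expect from the column subsetting.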
Re: [R] moving from aov() to lmer()
On Sat, 13 Sep 2008, roberto toro wrote:

> Hello,
>
> I've used this command to analyse changes in brain volume:
>
> mod1<-aov(Volume~Sex*Lobe*Tissue+Error(Subject/(Lobe*Tissue)),data.vslt)
>
> I'm comparing males/females. For every subject I have 8 volume measurements (4 different brain lobes and 2 different tissues (grey/white matter)).
>
> As aov() provides only type I anovas, I would like to use lmer() with type II, however, I have struggled to find the right syntax.
>
> How should I write the model I use with aov() using lmer()?
>
> Specifying Subject as a random effect is straightforward
>
> mod2<-lmer(Volume~Sex*Lobe*Tissue+(1|Subject),data.vslt)
>
> but I can't figure out the /(Lobe*Tissue) part...

You're trying to model a separate effect of lobe, of tissue, and of the interaction between lobe and tissue for each subject, so you want

mod2<-lmer(Volume~Sex*Lobe*Tissue+(Lobe*Tissue|Subject),data.vslt)

...the resulting fixed effect for Lobe, Tissue, and L:T in the summary() then corresponds to the within-subjects effect aggregated (but not exactly AVERAGED) across subjects.

So, it's not exactly providing you a Type II ANOVA...it's doing a mixed-effects model (or HLM, if you prefer), which as you've written it is a Type III analysis (though once again, not an ANOVA in the classical sense).

To get something more akin to type II using the lmer function (and I trust someone will pipe up if there is a better way), you could first fit

mod2.additive<-lmer(Volume~Sex*Lobe+Tissue+(Lobe+Tissue|Subject),data.vslt)

...and interpret the coefficients and effects provided by it, then fit the crossed model to get the coefficients and effects for the higher-order terms.

I hope this made sense and that I have understood you correctly.

--Adam
[R] histogram
I calculated the density and want to do something like this: separate the range into 0-19-29-39-49-59-69-79-99 and put 8 densities (each 0.something) into these intervals. I have the frequencies in % and have already divided them by 20 or 10 to get the densities. I tried and tried: I made a breaks vector to separate the intervals, but couldn't put the other vector with the frequency densities onto it directly. Does anyone know how to do it? Thanks
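One possible way to draw such a plot, sketched with base graphics only. The break points are taken from the post as written; the percentages are invented placeholders, and writing to a temporary PDF is just a convenience for the example. If the class widths are really 20 and 10, as the division mentioned in the post suggests, the breaks vector would need adjusting.

```r
# Breaks as listed in the post; 9 break points give 8 classes.
breaks <- c(0, 19, 29, 39, 49, 59, 69, 79, 99)
widths <- diff(breaks)

# Hypothetical relative frequencies (in %) per class; replace with the real ones.
freq_pct <- c(10, 15, 20, 20, 15, 10, 5, 5)

# Density = relative frequency / class width, so that the bar areas sum to 1.
dens <- (freq_pct / 100) / widths

# Draw bars whose widths follow the classes, with no gaps between them.
out_pdf <- tempfile(fileext = ".pdf")
pdf(out_pdf)
barplot(dens, width = widths, space = 0, ylab = "Density")
axis(1, at = breaks)   # label the x axis at the class limits
invisible(dev.off())
```

With space = 0 the first bar starts at x = 0, so the axis ticks placed at the break points line up with the bar edges.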
Re: [R] Help please! How to code a mixed-model with 2 within-subject factors using lme or lmer?
On Sun, 14 Sep 2008, roberto toro wrote:

> Thanks for answering Mark!
>
> I tried with the coding of the interaction you suggested:
>
> tfac<-with(vlt,interaction(Lobe,Tissue,drop=T))
> mod<-lme(Volume~Sex*Lobe*Tissue,random=~1|Subject/tfac,data=vlt)
>
> But is it normal that the DF are 2303? DF is 2303 even for the estimate of LobeO that has only 662 values (331 for Tissue=white and 331 for Tissue=grey).
>
> I'm not sure either that Sex, Lobe and Tissue are correctly handled: why are there different estimates called Sex:LobeO, Sex:LobeP, etc, and not just Sex:Lobe as with aov()? Why is there Tissuew, but not Sex1, for example?

lme is basically doing a regression, not an ANOVA as you're used to it. You may want anova(mod) instead of summary(mod) to see aggregated effects. Or, you could define contrasts among your levels by assigning to contrasts(vlt$Lobe), for example.

Also, in the above model, you're only looking at modeling a separate average volume for each subject-within-tfac; if I read you correctly, you actually want to model a lobe and tissue effect for each subject for each tfac, in which case you would want something like what was in my last post.

--Adam

> Thanks again!
> roberto
>
> ps1. How would you code this with lmer()?
> ps2. this is part of the output of mod<-lme:
>
> summary(mod)
> Linear mixed-effects model fit by REML
>  Data: vlt
>        AIC      BIC    logLik
>   57528.35 57639.98 -28745.17
>
> Random effects:
>  Formula: ~1 | Subject
>         (Intercept)
> StdDev:    11294.65
>
>  Formula: ~1 | tfac %in% Subject
>         (Intercept) Residual
> StdDev:    10569.03 4587.472
>
> Fixed effects: Volume ~ Sex * Lobe * Tissue
>                        Value Std.Error   DF    t-value p-value
> (Intercept)        245224.61  1511.124 2303  162.27963  0.
> Sex                  2800.01  1866.312  329    1.50029  0.1345
> LobeO             -180794.83  1526.084 2303 -118.46975  0.
> LobeP             -131609.27  1526.084 2303  -86.23984  0.
> LobeT              -73189.97  1526.084 2303  -47.95932  0.
> Tissuew            -72461.05  1526.084 2303  -47.48168  0.
> Sex:LobeO            -663.27  1884.789 2303   -0.35191  0.7249
> Sex:LobeP           -2146.08  1884.789 2303   -1.13863  0.2550
> Sex:LobeT            1379.49  1884.789 2303    0.73191  0.4643
> Sex:Tissuew          5387.65  1884.789 2303    2.85849  0.0043
> LobeO:Tissuew       43296.99  2158.209 2303   20.06154  0.
> LobeP:Tissuew       50952.21  2158.209 2303   23.60856  0.
> LobeT:Tissuew      -15959.31  2158.209 2303   -7.39470  0.
> Sex:LobeO:Tissuew   -5228.66  2665.494 2303   -1.96161  0.0499
> Sex:LobeP:Tissuew   -1482.83  2665.494 2303   -0.55631  0.5781
> Sex:LobeT:Tissuew   -6037.49  2665.494 2303   -2.26506  0.0236
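To illustrate the point about regression-style output, here is a small sketch using plain lm() on simulated data. The variable names mimic the thread, but the numbers are random and the data frame toy is invented for the example. It shows why a factor appears as several coefficients (LobeO, LobeP, LobeT) in the coefficient table, yet as a single row in anova().

```r
set.seed(1)
# Simulated stand-in for the real data: a 4-level factor and a 2-level factor.
toy <- expand.grid(Lobe = c("F", "O", "P", "T"),
                   Tissue = c("g", "w"),
                   rep = 1:10)
toy$Lobe <- factor(toy$Lobe)
toy$Tissue <- factor(toy$Tissue)
toy$Volume <- rnorm(nrow(toy), mean = 100)

fit <- lm(Volume ~ Lobe * Tissue, data = toy)

# Default treatment contrasts: one coefficient per non-reference level,
# hence names such as LobeO, LobeP, LobeT and Tissuew.
coef_names <- names(coef(fit))

# anova() aggregates those coefficients into one row per model term.
term_names <- rownames(anova(fit))

# The coding behind the coefficient names can be inspected (or reassigned).
lobe_coding <- contrasts(toy$Lobe)
```

One guess, not confirmed anywhere in the thread: if Sex is stored as a number rather than as a factor, it gets a single slope named Sex, which would explain why no Sex1 appears in the output.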
Re: [R] Fetching a range of columns
Hi Jason,

data[] is a data frame, remember--you need to specify rows AND columns. So, data[,c(2,12,17)] is what you should be doing in the first place, and data[,842:2411] in the second place.

Not sure if the help you needed was using the comma, or the : syntax, or if you're trying to read only certain columns during the read.csv process (which I don't think is possible).

--Adam
[R] k-sample Kolmogorov-Smirnov test?
Hello, I would like to conduct a k-sample K-S test, but cannot find reference to its implementation in R. Does anyone have experience with this? Thanks, Mark
Re: [R] Fetching a range of columns
On Sep 14, 2008, at 4:40 PM, Jason Thibodeau wrote:

> I cannot provide (all) the sample data (NDA) but here is the entire
> function:
>
> TEST_filter <- function(filein,fileout)
> {
>     file.remove(fileout)
>     nskip<-0
>     while(1)
>     {
>         data_tmp <- read.csv(filein, header=TRUE, nrows=10, skip=nskip)
>         data_filter <- data_tmp[842,2411]

Looks like you forgot a few syntactically essential items here: data_tmp[842,2411] would only be the 842nd row in the 2411th column. And, since you only have 10 rows, you got an informative error.

>         write.table(data_filter, fileout, append = TRUE, sep= ",", row.names= FALSE, col.names = FALSE)
>         nskip <- nskip+10
>     }
> }

You also say file.remove(fileout), then you try to append to fileout. Does that make sense?

-- David Winsemius, MD Heritage Laboratories

> Thanks for the help.
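A sketch of how the function discussed in this thread might be rewritten along the lines Jim suggested: one open connection, chunked reads, and column selection with a range. The name filter_columns and its arguments are invented for this example, and treating any read error as end of file is a simplifying assumption.

```r
# Hypothetical rewrite: copy a range of columns from a large CSV, chunk by chunk.
filter_columns <- function(filein, fileout, cols, chunk_rows = 320) {
  # Start from an empty output file, since every chunk is appended.
  if (file.exists(fileout)) file.remove(fileout)

  input <- file(filein, "r")
  on.exit(close(input))
  invisible(readLines(input, n = 1))   # skip the header line once

  repeat {
    # At end of file read.csv() signals an error; here that ends the loop.
    # Assumption: any other read error is treated the same way.
    chunk <- tryCatch(
      read.csv(input, header = FALSE, nrows = chunk_rows),
      error = function(e) NULL
    )
    if (is.null(chunk)) break

    # chunk[cols] selects columns; chunk[i, j] would select one cell instead.
    write.table(chunk[cols], fileout, append = TRUE, sep = ",",
                row.names = FALSE, col.names = FALSE)
  }
  invisible(NULL)
}

# Small made-up input: 25 rows, 10 columns; keep columns 3 to 5.
src <- as.data.frame(matrix(1:250, nrow = 25))
fin <- tempfile(fileext = ".csv")
fout <- tempfile(fileext = ".csv")
write.csv(src, fin, row.names = FALSE)
filter_columns(fin, fout, cols = 3:5, chunk_rows = 10)
out <- read.csv(fout, header = FALSE)
```

For the real file the call would presumably use cols = 842:2411. Because the connection stays open between reads, no skip counter is needed.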
[R] using R for accessing web site data -
Hello, what's the most efficient way of using R to automate a data collection task I have:

- Log in to a web site using my ID and password
- Submit a query within the site using the search form after login
- Extract the result of the search into R so I can cleanse it and use it for analysis

It is kind of a web scraping task, but I'd like to do this in R. I checked out RCurl; this seems very low level? This would lead to using R to perform mashups of various sites for data analysis.

-zubin
Re: [R] Fw: Complex sampling survey _ Use of survey package
On Fri, 12 Sep 2008, Ahoussou Sylvie wrote:

> From: "Ahoussou Sylvie" <[EMAIL PROTECTED]>
> Sent: Friday, September 12, 2008 9:48 AM
> To: "Thomas Lumley" <[EMAIL PROTECTED]>
> Subject: Re: [R] Complex sampling survey _ Use of survey package
>
> Thanks for your answer. I think I made a mistake when I recopied the first 5 rows of my database. Here is the table with the columns of interest:
>
> num   esp  fpc1  Totanim  Id_An
> 2045  G    551   12       10
> 2046  C    551   68       11
> 2070  G    551   9        50
> 2070  S    551   9        51
> 2070  S    551   9        52
>
> Yes, Totanim is the total number of animals in the farm and num is the total number of herds

Do you mean 'fpc1 is the total number of herds'? That is what your svydesign() call says.

> I keep on obtaining this error message
>
> clustot<-svydesign(id=~num+ ~ Id_An, fpc=~fpc1+~Totanim, data=tab1)
> Erreur dans as.fpc(fpc, strata, ids) :
>   FPC implies >100% sampling in some strata.

Well, we seem to have either a bug or a problem with the data.

If you do options(error=recover) before the svydesign() call you can go into as.fpc() and look at the data. As an example:

Error in as.fpc(fpc, strata, ids) :
  FPC implies >100% sampling in some strata.

Enter a frame number, or 0 to exit

1: svydesign(id = ~dnum + snum, fpc = ~fpc1 + I(pmin(fpc2, 4)), data = apiclus2)
2: svydesign.default(id = ~dnum + snum, fpc = ~fpc1 + I(pmin(fpc2, 4)), data = apiclus2)
3: as.fpc(fpc, strata, ids)

Selection: 3
Called from: eval(expr, envir, enclos)
Browse[1]> which(sampsize>popsize, arr.ind=TRUE)
   row col
22  22   2
23  23   2
24  24   2
...
Browse[1]> sampsize[22,2]
[1] 5
Browse[1]> popsize[22,2]
[1] 4
Browse[1]> ids[22,]
   dnum    snum
22  200 200.841

So in this case one of the problems is in dnum 200, snum 841, where the population size was specified as 4 but the sample size is 5.

-thomas

Thomas Lumley
Assoc. Professor, Biostatistics
[EMAIL PROTECTED]
University of Washington, Seattle
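The comparison made in the browser session above (sample size against stated population size) can also be approximated by hand before calling svydesign(). The sketch below uses only base R and the five rows quoted in the message. It is a rough pre-check under the assumption that fpc1 is the number of herds and Totanim the number of animals per herd, not a reimplementation of as.fpc().

```r
# The five rows quoted in the message.
tab1 <- data.frame(
  num     = c(2045, 2046, 2070, 2070, 2070),
  esp     = c("G", "C", "G", "S", "S"),
  fpc1    = 551,
  Totanim = c(12, 68, 9, 9, 9),
  Id_An   = c(10, 11, 50, 51, 52)
)

# Stage 1: number of sampled herds must not exceed the stated number of herds.
herds_sampled <- length(unique(tab1$num))
stage1_ok <- herds_sampled <= unique(tab1$fpc1)

# Stage 2: within each herd, sampled animals must not exceed Totanim.
animals_sampled <- tapply(tab1$Id_An, tab1$num, function(x) length(unique(x)))
animals_total   <- tapply(tab1$Totanim, tab1$num, max)
bad_herds <- names(animals_sampled)[animals_sampled > animals_total]
```

On these five rows nothing is flagged, so the offending herds would have to be somewhere in the rest of the data; running the same check on the full table should list them in bad_herds.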
Re: [R] k-sample Kolmogorov-Smirnov test?
Maybe you should look a little harder. help.search("Kolmogorov") PLEASE do read the posting guide http://www.R-project.org/posting-guide.html --Adam
[R] Ward's Clustering Doubts
Hi Everybody,

Now I have a question that is more statistical than technical. I'm working on the ecology of recent Foraminifera. At the lab we used to perform cluster analysis using 1-Pearson's R and Ward's method (we had already seen it in the literature of the area), which renders good results with our biological data. Recently, using R (the vegan and cluster packages), which allows the combination of any kind of distance matrix with any clustering method, we tried Bray-Curtis + Ward's (which seems to be more appropriate for a matrix with a lot of zeros) and it renders a better result. Furthermore, the results agree with our hypothesis and with the results we got from the Distance-based Redundancy Analysis (dbRDA or CAP). That is, the analysis (Q-mode) clusters the stations according to the main physical, sedimentary and biological characteristics of the study area.

We received some critical comments noting that Ward's method accepts Euclidean distance only. So we made the analysis again using Euclidean distance, but we don't get the better results we had using 1-Pearson's R + Ward's or Bray-Curtis + Ward's (actually, any other distance + method combination rendered better results).

Trying to find answers in the specialized literature, we just got a little more confused, because at no point did we see a statement like "You must use it with Euclidean Distance" and, as I said above, we have already seen other kinds of distance associated with Ward's clustering method in articles from respected journals.

Is it wrong, or is it nonsense, to do the analysis the way we were doing it? The results with Ward's combined with 1-Pearson's R or Bray-Curtis fit better with our hypothesis and have excellent agglomerative coefficients, but we don't want to use inappropriate statistical procedures. I'm starting to realize how powerful R is, but that doesn't justify doing nonsense statistics...

I hope one of you may help us! Thank you in advance.

Rodrigo.
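For readers who want to see the combinations described above side by side, here is a small base-R sketch. The abundance matrix is invented, and Bray-Curtis is left out because it needs the vegan package. One fact worth knowing when rerunning old scripts: in R 3.1.0 and later, the hclust() option that used to be called "ward" is named "ward.D", while "ward.D2" is the option documented as implementing Ward's criterion on Euclidean distances. The sketch only shows that hclust() accepts any dissimilarity; whether the result is statistically meaningful is the question raised in the post and is not settled by the code.

```r
# Hypothetical abundance matrix: 6 stations (rows) by 5 species (columns).
abund <- matrix(c(12, 0, 3, 0, 5,
                  10, 1, 4, 0, 6,
                   0, 8, 0, 7, 1,
                   1, 9, 0, 6, 0,
                   3, 3, 9, 2, 2,
                   4, 2, 8, 3, 1),
                nrow = 6, byrow = TRUE,
                dimnames = list(paste0("St", 1:6), paste0("sp", 1:5)))

# 1 - Pearson's r between stations (Q-mode), turned into a "dist" object.
d_pearson <- as.dist(1 - cor(t(abund)))

# Euclidean distance between the same stations, for comparison.
d_euclid <- dist(abund, method = "euclidean")

# hclust() runs on either dissimilarity without complaint.
hc_pearson <- hclust(d_pearson, method = "ward.D")
hc_euclid  <- hclust(d_euclid,  method = "ward.D2")

# Two-group solutions, to compare how the stations are split.
groups_pearson <- cutree(hc_pearson, k = 2)
groups_euclid  <- cutree(hc_euclid,  k = 2)
```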
Re: [R] Combining tables
Thanks Jim, it worked great!

On Sun, 2008-09-14 at 21:27 -0400, jim holtman wrote:
> try this:
>
> > c(tx,ty)
> 1 2 3 3 4
> 3 2 1 4 1
> > z <- c(tx,ty)
> > tapply(z, names(z), sum)
> 1 2 3 4
> 3 2 5 1
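The quoted session can be reproduced as follows. The two input tables are reconstructed so that their counts match the printed output; the underlying raw vectors are made up.

```r
# Two frequency tables with an overlapping category ("3").
tx <- table(c(1, 1, 1, 2, 2, 3))   # counts: 1 -> 3, 2 -> 2, 3 -> 1
ty <- table(c(3, 3, 3, 3, 4))      # counts: 3 -> 4, 4 -> 1

# Concatenate the counts, keeping the category labels as names.
z <- c(tx, ty)

# Sum the counts that share a label.
combined <- tapply(z, names(z), sum)
```

Note that tapply() groups by the names as character strings, so the result is ordered by label rather than by order of appearance.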
[R] any package to do generalized linear mixed model?
I checked the glmmML package. However, it can only handle the binomial and Poisson distributions. How about others, such as gamma or negative binomial? Thank you so much! wensui
Re: [R] Nonlinear regression question&[EMAIL PROTECTED]
On Sep 14, 2008, at 6:53 PM, Esther Meenken wrote:

> I was unable to open this file Bill Venables' excellent "Exegeses on Linear Models" posted at
> http://www.stats.ox.ac.uk/pub/MASS3/Exegeses.ps.gz
> I'd be very interested in reading it?

It's a gzipped file that expands into a ps file. On my Mac running Leopard, Stuffit does the expansion and Preview does the viewing. You need a utility on whatever (unspecified) OS you are running that will un-gzip it, and then you need a Postscript viewer.

If that OS happens to be a flavor of Windows, then I know from experience that Ghostscript and its associated viewer, Ghostview, will work for the second half of the process. Google is your friend. I suspect you need UnRAR, or something along those lines for the decompression step.

-- David Winsemius
[R] NAs and stl
I would like to decompose the log of a time series. There will be time slots that are zero, and zero is a perfectly legitimate value. In order to handle this I insert NA for all of the zeros in the time series. With those NAs in the time series, stl requires an na.action argument, because the default is na.fail, which I don't want. I am afraid that if I use na.omit it will skew the seasonality and trend: what if there are seasonal components which are zero? Anyway, I am looking for recommendations for what the value of na.action should be for stl. Ideally I would like to just keep the NAs there, not flag an error, yet still have those values "count" for the seasonal or trend calculations. Thank you. Kevin
[R] need help please (HoltWinters function)
every time i try to run HoltWinters i get this error message:

> HoltWinters(z, seasonal="additive")
Error in decompose(ts(x[1:wind], start = start(x), frequency = f), seasonal) :
  time series has no or less than 3 periods

what's going on? somebody please help me.
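The object z is not shown in the post, so the cause can only be guessed. One common way to run into an error of this kind is a series whose seasonal period was never declared (frequency 1), or one too short to contain enough full periods. The sketch below uses the built-in monthly co2 series to show both situations; the variable names are invented, and the exact wording of the error message may differ between R versions.

```r
# Same numbers as the built-in monthly co2 series, but with the
# seasonal period thrown away (frequency = 1).
no_period <- ts(as.numeric(co2))
err <- tryCatch(HoltWinters(no_period, seasonal = "additive"),
                error = function(e) e)

# Declaring the period (12 observations per year) lets the seasonal fit run.
monthly <- ts(as.numeric(co2), start = c(1959, 1), frequency = 12)
fit <- HoltWinters(monthly, seasonal = "additive")
```

Checking frequency(z) and length(z) on the real series would show whether this guess applies.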
[R] How to draw a plot like this?
Hi there, I hope to draw a plot like this: http://www.sg-chem.net/swizard/Ru-bqdi-spectra.gif is it possible to draw it using R? thanks for any suggestions. regards, Jinsong
Re: [R] Fetching a range of columns
On Sep 14, 2008, at 5:39 PM, Adam D. I. Kramer wrote:

> Hi Jason,
>
> data[] is a data frame, remember--you need to specify rows AND columns. So, data[,c(2,12,17)] is what you should be doing in the first place, and data[,842:2411] in the second place.

Actually, the construction df[c(2,12,17)] will return just the 2nd, 12th and 17th named column vectors. Try it and see:

df <- data.frame(a=1:10, b= 10:1, c=LETTERS[1:10])
df
df[c(2,3)]

> Not sure if the help you needed was using the comma, or the : syntax, or if you're trying to read only certain columns during the read.csv process (which I don't think that's possible).

? colClasses # with the vector element of NULL for each unwanted column.

I am not the person to be advising how to do this properly, as all of my efforts to use this facility to date have failed and I have resorted to reading in lines with as.is=TRUE and then post-processing. But the facility does exist. Maybe someone could give me a clue how one might construct a vector to send to colClasses inside read.table?

> mt <- matrix(1:200,nrow=4)
> write.table(file=file.choose(), as.data.frame(mt))
> read.table(file.choose())
  V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14 V15 V16 V17 V18 V19 V20 V21 V22 V23 V24 V25 V26 V27 V28 V29
1  1  5  9 13 17 21 25 29 33  37  41  45  49  53  57  61  65  69  73  77  81  85  89  93  97 101 105 109 113
2  2  6 10 14 18 22 26 30 34  38  42  46  50  54  58  62  66  70  74  78  82  86  90  94  98 102 106 110 114
3  3  7 11 15 19 23 27 31 35  39  43  47  51  55  59  63  67  71  75  79  83  87  91  95  99 103 107 111 115
4  4  8 12 16 20 24 28 32 36  40  44  48  52  56  60  64  68  72  76  80  84  88  92  96 100 104 108 112 116
  V30 V31 V32 V33 V34 V35 V36 V37 V38 V39 V40 V41 V42 V43 V44 V45 V46 V47 V48 V49 V50
1 117 121 125 129 133 137 141 145 149 153 157 161 165 169 173 177 181 185 189 193 197
2 118 122 126 130 134 138 142 146 150 154 158 162 166 170 174 178 182 186 190 194 198
3 119 123 127 131 135 139 143 147 151 155 159 163 167 171 175 179 183 187 191 195 199
4 120 124 128 132 136 140 144 148 152 156 160 164 168 172 176 180 184 188 192 196 200

Not working efforts:

tstdta <- read.table(file.choose(), colClasses = c(c(paste(rep("NULL", 49),sep=","),"numeric"),header=TRUE)

tstdta <- read.table(file.choose(), colClasses = paste(rep("NULL", 49),"numeric",sep=","),header=TRUE)

-- David Winsemius
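A sketch of one way to build the colClasses vector asked about above, under the assumption that the file is written as CSV without row names, so that every line has exactly 50 fields. The temporary file and variable names are invented for the example. In the two failed attempts, the first has header=TRUE inside the c() call because of a misplaced parenthesis, and the second pastes each "NULL" together with "numeric" into the single string "NULL,numeric", which is not a class name. Also, write.table() with its defaults adds a row-names column, so that file has 51 fields per data line.

```r
# Write a small 4 x 50 table, as in the message, but as CSV without row names.
mt <- matrix(1:200, nrow = 4)
demo_file <- tempfile(fileext = ".csv")
write.csv(as.data.frame(mt), demo_file, row.names = FALSE)

# colClasses takes a character vector with one entry per column:
# the string "NULL" skips a column, a class name reads it.
keep_last <- c(rep("NULL", 49), "numeric")
tstdta <- read.csv(demo_file, header = TRUE, colClasses = keep_last)

# The same idea for a range: skip everything except columns 10 to 20.
classes <- rep("NULL", 50)
classes[10:20] <- NA          # NA lets read.csv work out the class itself
some_cols <- read.csv(demo_file, header = TRUE, colClasses = classes)
```

For the 3000-column file in this thread, the analogous vector would presumably be rep("NULL", ncols) with positions 842:2411 set to NA, where ncols is the actual number of fields per line.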
Re: [R] making spearman correlation cor() call fail with log(0) as input
Hi Martin,

I got my initial question fully answered. I do not have enough experience to judge whether the behavior of R with regard to Inf is "excellent" or "better" than Perl. In my opinion, both Perl and R are great languages, designed for very different applications.

So instead of me trying to impose The Perl Way upon R, I would like to say how very grateful I am to the contributors to the R core and other packages, and to the contributors to the R mailing lists. Because this is what I really feel. R and its packages have been very useful to me on countless occasions.

Thank you, Martin and Greg!

Best regards,
Timur

On Sat, Sep 13, 2008 at 8:48 AM, Martin Maechler <[EMAIL PROTECTED]> wrote:
> > "TS" == Timur Shtatland <[EMAIL PROTECTED]>
> >     on Fri, 12 Sep 2008 11:52:25 -0400 writes:
>
>     TS> I am more used to getting an error if you try to take
>     TS> the log of 0, like this (in Perl):
>
>     TS> perl -le 'for my $num (1, 0, -1, -2) { print log $num; }'
>     TS> 0
>     TS> Can't take log of 0 at -e line 1.
>
>     TS> R is different. With R, you do not even get a *warning*
>     TS> about log(0). Only log() of negative number produces a
>     TS> warning:
>
> []
>
> and why do you think the perl behavior to be better??
> R has been very carefully designed in such matters:
>
> The principle is that *limits* should work (using +/-Inf) where
> possible.
> For log(.) the limit only exists from the right and clearly is
> -Inf, so that's a feature.
>
> BTW, S/R behavior of 1/0 |--> Inf could be considered as
> more dangerous, since really the +Inf is the limit from the
> right only with the limit from the left being ``quite
> different''.
> But no, I'm not proposing to change R here (and actually would
> "fight" to keep it if that was necessary).
>
>     TS> I agree with you that Spearman's correlation's invariance to monotone
>     TS> transformations is an advantage. It is R's happy
>     TS> attitude to -Inf and Inf that puzzled me at
>     TS> first. Anyhow, verifying and/or preprocessing the input
>     TS> to cor() is the answer to my questions. Thank you again
>     TS> for the help!
>
> So you now have understood that R's behavior of handling +/- Inf
> in this respect is rather excellent than bogus ?
>
> Martin Maechler, ETH Zurich (and R-core team)
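A short sketch of the "verify the input to cor()" idea mentioned above. The helper checked_cor is invented for this example and is not part of R; it simply refuses non-finite input instead of passing it on.

```r
# log() returns -Inf for 0 without a warning, and NaN (with a warning) for negatives.
vals <- suppressWarnings(log(c(1, 0, -1)))

# Hypothetical helper: stop before computing a correlation on non-finite input.
checked_cor <- function(x, y, ...) {
  if (!all(is.finite(x)) || !all(is.finite(y))) {
    stop("non-finite values (NA, NaN, Inf or -Inf) in input to cor()")
  }
  cor(x, y, ...)
}

x <- c(1, 0, 3, 4)
y <- c(2, 1, 5, 3)

# Finite input passes through to cor() unchanged.
ok <- checked_cor(x, y, method = "spearman")

# log(x) contains -Inf, so the helper raises an error, captured here as text.
bad <- tryCatch(checked_cor(log(x), y, method = "spearman"),
                error = function(e) conditionMessage(e))
```

Whether to stop, drop, or recode such values is a judgment call that depends on the data; the helper only makes the decision explicit.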
Re: [R] k-sample Kolmogorov-Smirnov test?
On 15/09/2008, at 10:02 AM, Adam D. I. Kramer wrote:

> Maybe you should look a little harder.
>
> help.search("Kolmogorov")
>
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html

Please do read the guy's question!!!

>> Hello, I would like to conduct a k-sample K-S test, but cannot find reference to its implementation in R. Does anyone have experience with this?

It's about a ***k-sample*** K-S test. As far as I can discern, help.search("Kolmogorov") points one only to ks.test() which (again, as far as I can discern) effects only one or two sample KS tests.

The multi-sample KS test does exist --- I must admit this was a new one on me. (But then, so many things are!) See e.g. JASA vol. 68, No. 344, pp. 994--997 ``Tables of Critical Values for a k-sample Kolmogorov-Smirnov Test Statistic'' by Edward H. Wolf and Joseph I. Naus.

cheers,

Rolf Turner

P.S. RSiteSearch("Kolmogorov k-sample") turned up nothing useful.

R. T.
Re: [R] Nonlinear regression question
Try this version:

http://www.stats.ox.ac.uk/pub/MASS3/Exegeses.pdf

On Sun, Sep 14, 2008 at 6:53 PM, Esther Meenken <[EMAIL PROTECTED]> wrote:
> I was unable to open this file Bill Venables' excellent "Exegeses on
> Linear Models" posted at
> http://www.stats.ox.ac.uk/pub/MASS3/Exegeses.ps.gz I'd be very
> interested in reading it?
>
> Thanks
>
> Esther Meenken
> Biometrician
> Crop & Food Research
[R] Ward's Clustering Doubts
Hi everybody,

I have a question that is more statistical than technical. I am working on the ecology of recent Foraminifera. In our lab we used to perform cluster analysis using 1 - Pearson's r with Ward's method (a combination we had already seen in the literature of our field), which gives good results with our biological data.

Recently, using R (the vegan and cluster packages), which allows any distance matrix to be combined with any clustering method, we tried Bray-Curtis + Ward's (Bray-Curtis seems more appropriate for a matrix with a lot of zeros), and it gives a better result. Furthermore, the results agree with our hypothesis and with the results we got from distance-based redundancy analysis (dbRDA, or CAP). That is, the Q-mode analysis clusters the stations according to the main physical, sedimentary and biological characteristics of the study area.

We received some critical comments pointing out that Ward's method accepts Euclidean distance only. So we ran the analysis again using Euclidean distance, but we did not get results as good as those from 1 - Pearson's r + Ward's or Bray-Curtis + Ward's (actually, any other distance + method combination gave better results).

Trying to find answers in the specialized literature only left us more confused: at no point did we find a statement like "You must use it with Euclidean distance", and, as I said above, we have already seen articles in respected journals that combine other distances with Ward's clustering method.

Is it wrong, or nonsense, to do the analysis the way we were doing it? The results from Ward's method combined with 1 - Pearson's r or Bray-Curtis fit our hypothesis better and have excellent agglomerative coefficients, but we do not want to use inappropriate statistical procedures.

I'm starting to realize how powerful R is, but that doesn't justify doing nonsense statistics...

I hope one of you can help us! Thank you in advance.

Rodrigo.
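For readers who want to reproduce the comparison described above, here is a minimal sketch in base R. The abundance matrix is invented for illustration, Bray-Curtis is computed by hand (the post used the vegan package instead), and `method = "ward.D2"` is the name used by current versions of hclust(), which is an assumption about the reader's R version. The code only shows the mechanics of combining each dissimilarity with Ward's method; it does not settle whether the non-Euclidean combinations are statistically appropriate.

```r
# Hypothetical abundance matrix: 6 stations (rows) x 5 species (columns).
counts <- rbind(
  st1 = c(10, 0, 3, 0, 1),
  st2 = c( 8, 1, 4, 0, 0),
  st3 = c( 0, 7, 0, 5, 2),
  st4 = c( 1, 9, 0, 6, 0),
  st5 = c( 0, 0, 6, 1, 9),
  st6 = c( 2, 0, 5, 0, 7)
)

# Bray-Curtis dissimilarity between rows: sum|x - y| / sum(x + y).
bray_curtis <- function(m) {
  n <- nrow(m)
  d <- matrix(0, n, n, dimnames = list(rownames(m), rownames(m)))
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      d[i, j] <- sum(abs(m[i, ] - m[j, ])) / sum(m[i, ] + m[j, ])
    }
  }
  as.dist(d)
}

d_bc  <- bray_curtis(counts)            # Bray-Curtis
d_cor <- as.dist(1 - cor(t(counts)))    # 1 - Pearson's r between stations
d_euc <- dist(counts)                   # Euclidean

# Ward's method applied to each dissimilarity (Q-mode: stations are clustered).
hc_bc  <- hclust(d_bc,  method = "ward.D2")
hc_cor <- hclust(d_cor, method = "ward.D2")
hc_euc <- hclust(d_euc, method = "ward.D2")

# Compare the three-group solutions side by side.
groups <- cbind(bray    = cutree(hc_bc,  k = 3),
                pearson = cutree(hc_cor, k = 3),
                euclid  = cutree(hc_euc, k = 3))
groups
```

Tabulating the group memberships this way makes it easy to see where the three dissimilarities agree and where they split the stations differently.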
[R] Combining tables
Hello

Say I have the following data, and its distribution given by table():

  > x <- c(1, 1, 1, 2, 2, 3)
  > tx <- table(x)
  > tx
  x
  1 2 3
  3 2 1

Now say I have new data,

  > y <- c(3, 3, 3, 3, 4)
  > ty <- table(y)
  > ty
  y
  3 4
  4 1

Is there a way to "combine" tx and ty in such a way as to give me the distribution below?

  1 2 3 4
  3 2 5 1

Essentially what I'm looking for is something equivalent to table(c(x,y)), but x and y are too large and I'd like to avoid the concatenation.

Thanks in advance,

Andre
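One possible approach, sketched here under the assumption that both tables are one-dimensional, is to concatenate the two small tables of counts (not the raw data) and sum the counts that share a name. The helper name `combine_tables` is hypothetical.

```r
x <- c(1, 1, 1, 2, 2, 3)
y <- c(3, 3, 3, 3, 4)
tx <- table(x)
ty <- table(y)

# Hypothetical helper: merge two one-dimensional tables by summing
# the counts of values that appear in both.
combine_tables <- function(a, b) {
  both <- c(a, b)                 # named vector of counts; names are the values
  tapply(both, names(both), sum)  # add up counts that share a name
}

txy <- combine_tables(tx, ty)
txy
```

Only the tables are concatenated, so the cost depends on the number of distinct values rather than on the lengths of x and y. Note that the result is ordered by the names as character strings, so "10" would sort before "2"; reorder if numeric order matters.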
Re: [R] k-sample Kolmogorov-Smirnov test?
On Sep 14, 2008, at 5:46 PM, Mark Na wrote:

> Hello, I would like to conduct a k-sample K-S test, but cannot find
> reference to its implementation in R. Does anyone have experience with this?
>
> Thanks, Mark

I didn't have any luck with the method Kramer suggested, perhaps because I do not have a large number of packages installed. There is S-Plus code at the end of the article

"k-Sample tests based on the likelihood ratio", Jin Zhang, Yuehua Wu; Computational Statistics & Data Analysis, Volume 51, Issue 9, 15 May 2007, Pages 4682-4691

--
David Winsemius
Re: [R] histogram
If I understand you correctly, you have already pre-computed the frequencies and bin widths and want to display them as a histogram. If so, then what you are asking for is analogous to what bxp() is to boxplot(). I am not sure if such a function exists.

Instead, you can think of the task as drawing a bunch of rectangles (perhaps using symbols?). Or you can hack the hist() code and try

  br <- c(0,20,30,40,50,60,70,80,100)
  dens <- runif( length(br) - 1 )
  r <- structure(list(breaks = br, density = dens), class = "histogram")
  plot(r, main="Felipe's Histogram")

However, I do emphasize that this is a hack. If you have the original data that you used to calculate the densities, consider using the breaks argument with hist(). It is better to use tried and tested code.

Regards, Adai

Felipe wrote:

> i calculated the density and wanna do something like this separate in
> 0-19-29-39-49-59-69-79-99 and put in these spaces 8 densities ..
> 0.something i have the frequency in % and divided already in 20 or 10 to
> get the density i tried and tried..made breaks vector to separate but
> couldn't put the other vector with the frequency density onit directly
> anyone know how to do it?? tks
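The "bunch of rectangles" idea can be sketched with base graphics as below. The percentages in `pct` are made up for illustration (Felipe's actual figures were not posted); the densities are obtained, as he describes, by dividing each percentage by its bin width.

```r
# Bin edges from the thread; bins are 20 or 10 units wide.
br <- c(0, 20, 30, 40, 50, 60, 70, 80, 100)

# Hypothetical frequencies in percent, one per bin (they sum to 100).
pct <- c(5, 10, 20, 25, 15, 10, 10, 5)

# Density = proportion / bin width, so the bar areas add up to 1.
dens <- (pct / 100) / diff(br)

# Draw one rectangle per bin on an empty plot.
plot.new()
plot.window(xlim = range(br), ylim = c(0, max(dens)))
rect(xleft = br[-length(br)], ybottom = 0,
     xright = br[-1], ytop = dens, col = "grey")
axis(1, at = br)
axis(2)
title(main = "Histogram from pre-computed densities",
      xlab = "Value", ylab = "Density")
```

Unlike the structure() hack, this does not depend on the internals of the "histogram" class, at the cost of setting up the axes by hand.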