[R] ROCR package question
I use ROCR to plot the performance of multiple runs. Using the sample code as an example:

# plot ROC curves for several cross-validation runs (dotted
# in grey), overlaid by the vertical average curve and boxplots
# showing the vertical spread around the average.
data(ROCR.xval)
pred <- prediction(ROCR.xval$predictions, ROCR.xval$labels)
perf <- performance(pred,"tpr","fpr")
plot(perf,col="grey82",lty=3)
plot(perf,lwd=3,avg="vertical",spread.estimate="boxplot",add=TRUE)

I can follow the code and plot without any problem. However, I don't know how to extract the averaged area under the ROC curve (AUC) value. Can someone help?

Thanks.

--
Waverley @ Palo Alto

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
[R] ROCR package question
Thanks for the reply. I am not sure I follow:

1. For the sample code, I tried p...@auc but got "auc object not found".
2. I am SPECIFICALLY interested in the averaged AUC value of the multiple runs. How do I get that out? I typed perf and it comes out as a list.
3. In the plot that uses box-and-whisker plots to show the distribution of the multiple runs, the outliers drawn beyond the whiskers are very distracting. How do I get rid of them? I tried passing the boxplot option outline=FALSE in the plot call and it did not work.

Please help me with the specifics of the above 3 questions. Code rather than description would be helpful.

Thanks a lot in advance.

> Waverley,
> use @ (instead of $) to extract the slots from the performance object (it's S4 class system).
> HTH,
> Tobias

--
Waverley @ Palo Alto
[R] ROCR package question
Thanks for the quick reply. That is very clear for my questions 1 and 2.

How about question 3? When I plot, is there a way to hide the boxplot outliers when evaluating the multiple runs? I tried passing the boxplot option outline=FALSE, but it did not work. Can you help?

Thanks again for your kind help.

> Waverley,
> see help('performance-class') for a description of the slots. Your AUCs will be in p...@y.values, which itself is a list (one list element per run). Thus, you can use functions like unlist or s/lapply to access them, e.g.
> mean(unlist(p...@y.values))
> Kind regards,
> Tobias

--
Waverley @ Palo Alto
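The slot access Tobias describes can be sketched without ROCR installed. The class `ToyPerformance` and the AUC numbers below are made up for illustration; they only mimic the `y.name` and `y.values` slots of ROCR's performance class and are not the real class.

```r
library(methods)

# Hypothetical stand-in for an S4 performance object (NOT ROCR's own class).
setClass("ToyPerformance",
         representation(y.name = "character", y.values = "list"))

# Made-up AUC values for three runs, one list element per run.
perf_auc <- new("ToyPerformance",
                y.name   = "Area under the ROC curve",
                y.values = list(0.80, 0.90, 0.85))

slotNames(perf_auc)                       # names of the available slots
auc_per_run <- unlist(perf_auc@y.values)  # S4 slots are read with @, not $
mean_auc    <- mean(auc_per_run)          # average over the runs

# slot(perf_auc, "y.values") is equivalent to perf_auc@y.values
mean_auc
```

With ROCR itself the analogous object would presumably come from `performance(pred, "auc")`; the `tpr`/`fpr` object in the sample code holds curve coordinates in `y.values` rather than AUCs.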
[R] how to change the thickness of the lines of the boxplot outliers
Hi,

I am using the boxplot function. Following ?boxplot, I can change the line width of the box and whiskers with the lwd parameter. However, when outline=TRUE, the line thickness of the outlier circles does not change in proportion when I change the line width of the box. There must be another parameter for that, but I don't know which one.

Please help, and thanks much in advance.

--
Waverley @ Palo Alto
[R] how to change the thickness of the lines of the boxplot outliers
Thanks Paul. I reproduced my problem using the example from ?boxplot, which is also at http://stat.ethz.ch/R-manual/R-patched/library/graphics/html/boxplot.html

The sample code is as follows:

boxplot(count ~ spray, data = InsectSprays, col = "lightgray", lwd=40)

lwd = 40 is exaggerated, but it shows the problem: the circles of the outliers keep their original line width while the box becomes a black blob.

BTW: I am using Red Hat Linux ES 3.0, 64 bit, with R version 2.9.0.

Thanks.

> Hi
>
> This sounds like a problem that was recently fixed in the development version. It would be useful if you could send me (directly if you like) a sample of code that produces the problem.
>
> Paul
> --
> Dr Paul Murrell
> Department of Statistics
> The University of Auckland
> http://www.stat.auckland.ac.nz/~paul/

--
Waverley @ Palo Alto
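One possible workaround, sketched here on the assumption that the outlier symbols are controlled by the `outlwd` and `outcex` graphical parameters that `boxplot` passes on to `bxp`: set the outlier line width explicitly instead of relying on `lwd` to propagate. The width 4 is an arbitrary illustrative value.

```r
# Draw to a null PDF device so the sketch also runs without a display.
pdf(NULL)
b <- boxplot(count ~ spray, data = InsectSprays, col = "lightgray",
             lwd    = 4,    # line width of box, whiskers and staples
             outlwd = 4,    # line width of the outlier symbols (assumed fix)
             outcex = 1.5)  # size of the outlier symbols
dev.off()

b$out  # the values that were drawn as outliers
```

In R versions that include the fix Paul mentions, `lwd` alone may already be enough; `outlwd` just makes the choice explicit.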
[R] how to specify xlim in forestplot
Hi,

I am using the forestplot function from rmeta. The values to plot and their 95% upper and lower limits include both positive and negative values, so I need to specify the x axis range. It looks like forestplot only allows positive values for x. Is that true? In plot you can just specify xlim to define the x axis range. Can someone help me?

--
Waverley @ Palo Alto
[R] reference of object in R
Is there a way to pass an object by reference, like that seen in C, to a function in R?

Thanks in advance!

--
Waverley @ Palo Alto
[R] pass object to function by reference in R
How can I pass an object to a function by reference in R, like in C or C++?

--
Waverley @ Palo Alto
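One common way to get reference-like behaviour in base R is to pass an environment, since environments are not copied when handed to a function. A minimal sketch; the names `increment` and `counter` are made up for illustration:

```r
# Ordinary R objects behave as copies inside a function,
# but an environment is shared between caller and callee.
increment <- function(counter_env) {
  counter_env$count <- counter_env$count + 1  # changes the caller's object
  invisible(NULL)
}

counter <- new.env()
counter$count <- 0

increment(counter)
increment(counter)
counter$count  # 2
```

The function returns nothing useful here; the change is visible only through the shared environment, which is the closest base R analogue to passing a pointer.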
[R] PAMR package question: How to plot Estimated probabilities for the training data and test data
Hi,

I have spent some time trying to figure out how to use pamr to plot multiclass estimated probabilities for the training data and test data. Specifically, I want to recreate a figure from the PAMR publication in PNAS by Tibshirani et al. The publication is attached. The plot I want is Figure 5.

I have downloaded the pamr package. The function that gives a similar plot is pamr.plotcvprob, but this is different from a plot of the estimated probabilities. One function, pamr.xl.plotcvprob.compute, sounds like the one I am looking for, but it is an internal function and is not supposed to be called by the user.

Can any R guru or expression analysis guru who is familiar with pamr help me?

That said, I hope the pamr authors can provide a public function that produces a plot like Figure 5 in their PNAS paper. After all, they should encourage users to recreate their nice work through easy APIs.

Thanks.

--
Waverley @ Palo Alto
[R] Question how to get upper triangle of a matrix
Is there a simple way to get the upper triangle of a matrix and return it as a vector?

Thanks much.

--
Waverley @ Palo Alto
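A short sketch using `upper.tri` from base R, which builds a logical mask that can index the matrix directly. The 3 x 3 matrix is just an example.

```r
m <- matrix(1:9, nrow = 3)

# Elements strictly above the diagonal, returned column by column.
upper <- m[upper.tri(m)]                          # 4 7 8

# Same, but including the diagonal.
upper_with_diag <- m[upper.tri(m, diag = TRUE)]   # 1 4 5 7 8 9
```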
[R] question of R to do perl matching and matched string extraction
Hi,

I want to extract a substring by pattern matching, but I don't know how to do it in R.

In Perl:

my $url = "/pages-cell.net/deepan/sony/";
if($url =~ m/\/(.*)\//g) {
    my @result = $1;
    return @result;
}

How does the same work in R?

Thanks much in advance.

--
Waverley @ Palo Alto
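A minimal sketch of one R equivalent, using `regexec` and `regmatches` from base R to pull out the capture group. The variable names are illustrative.

```r
url <- "/pages-cell.net/deepan/sony/"

# regexec() records the whole match and each capture group;
# regmatches() turns those positions into strings.
m <- regmatches(url, regexec("/(.*)/", url))

# Element 1 is the whole match, element 2 is the first capture group ($1 in Perl).
result <- m[[1]][2]
result  # "pages-cell.net/deepan/sony"

# Alternative with a back-reference: sub("^/(.*)/$", "\\1", url)
```

If the pattern does not match, `m[[1]]` is empty and `result` becomes `NA`, which plays the role of the failed `if` in the Perl version.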
[R] Question about ROCR package
Hi,

I have a question about the ROCR package. I got the ROC curve plotted without any problem following the manual. However, I don't know how to extract the values, e.g. y.values (I think it is the area under the curve, the AUC measure).

The return value is an object of class "performance", which has slots, and one of the slots is "y.values". When I type the object's name I can see the values on screen, but I want to extract them for further programming and computation. I ran summary on the object and its mode is "S4", which I don't understand.

Can someone help? Thanks a lot in advance.

--
Waverley @ Palo Alto
[R] how to perform power analysis and sample size estimation/projection using R
Hi,

I have a question regarding how to perform power analysis and sample size estimation/projection using R.

I know power.t.test. It works really well for a single feature. I have a set of features which collectively can discriminate between two classes. I can run power.t.test on each feature to get a distribution of the sample sizes needed to achieve a certain power and significance level. But how do I analyze this set of features simultaneously as "one group" for the power analysis and sample size estimation?

I also know that the samr package has a utility for this, but it does not seem to work well in my situation.

Please advise. Thanks a lot in advance for the help.

--
Waverley @ Palo Alto
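One crude way to treat several features jointly, sketched here as an assumption rather than a recommended method, is to keep using `power.t.test` per feature but with a Bonferroni-adjusted significance level. The effect size, standard deviation and number of features below are made-up illustrative values.

```r
n_features <- 10  # hypothetical number of features tested together

# Sample size per group for a single feature at the usual 5% level.
single <- power.t.test(delta = 1, sd = 1, power = 0.8, sig.level = 0.05)

# Same effect size, but each test run at 0.05 / n_features (Bonferroni).
adjusted <- power.t.test(delta = 1, sd = 1, power = 0.8,
                         sig.level = 0.05 / n_features)

ceiling(single$n)    # per-group n for one feature
ceiling(adjusted$n)  # per-group n when controlling for all features
```

This ignores correlation between features and says nothing about the power of a combined classifier, so it is at best a conservative starting point.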
[R] Question about normalization to a set of internal standards
Hi,

I have a question about how to normalize data sets according to a set of internal measurements.

For example, I have performed two batches of experiments contrasting two different conditions (positive versus negative), one at a time.

1. In each experiment, I measure the signals of variables v1 to v100. I want to understand how v1 to v100 change between these two contrasting conditions.

2. I also know that a different set of variables, v101 to v110 (10 in total), although they differ from each other, should have the same or similar values under these two contrasting conditions.

3. How do I do the internal normalization? How can I use the values of v101 to v110 to normalize the measurements of v1 to v100 under either the positive or the negative condition to minimize the batch effect? I hope the comparisons of v1 to v100 between the two conditions can then be more accurate and more robust to external noise.

In general, I have a couple of matrices of the same dimensions and a matrix of reference values to normalize to. How should I do that?

--
Waverley @ Palo Alto
[R] How to normalize to a set of internal references
Thanks for the advice. My question is more about how to do this. Let me use a gene analysis example from biology to illustrate.

In biology, there are always some housekeeping genes which differ little even under pathological conditions. We know that external factors affect the measurements in different batches. For example, the overall signal intensity might differ because of lab reagents.

A simplified picture:

Day 1: Using control samples, I measured genes #1 to #110 and got data.
Day 2: Using disease samples, I measured genes #1 to #110 again and got data.

For those two data sets, I noticed that the overall signal intensity, for each gene, is higher on Day 1 than on Day 2. I know from the biological literature that genes 101 to 110 are housekeeping genes and should not change much between disease and control.

My question is, technically, how do I use the values of genes 101 to 110 to adjust the signals of genes 1 to 100 so that the batch effect is corrected? The differences revealed by the comparative analysis of genes 1 to 100 between disease and control should then be due to biology rather than lab artifacts.

So the question is how to do that mathematically. If I had only one housekeeping gene, I could divide every gene by it to normalize, then compare. But now I have 10 genes which can be used for normalization. I assume that, in this context, the more reference genes used, the better.

Can you help again? Thanks much in advance.

> I don't understand your problem well, but in general internal normalization is by and large an attempt to avoid appropriate modeling (e.g., incorporating block effects or certain covariates in a regression model), and results in overstated confidence of the final estimates by not taking into account the imprecision in the normalizing factors.
>
> Frank
> --
> Frank E Harrell Jr
> Professor and Chair
> School of Medicine, Department of Biostatistics
> Vanderbilt University

--
Waverley @ Palo Alto
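The "divide by one housekeeping gene" idea extends to several reference genes by dividing by their geometric mean, which on the log scale is a subtraction of the mean reference signal. The sketch below uses simulated data; the gene names, the batch offset of 0.8 and the noise level are all made up, and the approach is only one possibility under the assumption that the reference genes truly do not change.

```r
set.seed(1)
genes     <- paste0("gene", 1:110)
ref_genes <- genes[101:110]           # assumed housekeeping genes

# Simulated log2 signals: same biology on both days, Day 1 shifted by 0.8.
true_log2 <- runif(110, min = 5, max = 10)
day1 <- setNames(true_log2 + 0.8 + rnorm(110, sd = 0.05), genes)
day2 <- setNames(true_log2 +       rnorm(110, sd = 0.05), genes)

# Subtracting the mean log signal of the reference genes is the same as
# dividing the raw signal by the geometric mean of the reference genes.
normalize_to_refs <- function(log_signal, refs) {
  log_signal - mean(log_signal[refs])
}

day1_norm <- normalize_to_refs(day1, ref_genes)
day2_norm <- normalize_to_refs(day2, ref_genes)

shift_before <- mean(day1[1:100] - day2[1:100])            # close to 0.8
shift_after  <- mean(day1_norm[1:100] - day2_norm[1:100])  # close to 0
```

As Frank's reply points out, this treats the normalizing factor as exact; putting batch into a regression model instead would carry its uncertainty into the final estimates.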
[R] question about order
I have a data vector as follows:

> z
 [1] 183.1370 201.9610 113.7250 140.7840 156.2750  42.1569  42.1569  42.1569
 [9] 240.1960 308.4310  42.1569  42.1569  42.1569  42.1569  42.1569  42.1569
[17]  42.1569  42.1569  42.1569  42.1569 279.8040  42.1569  42.1569

When I sort, it gives me the right order:

> sort(z)
 [1]  42.1569  42.1569  42.1569  42.1569  42.1569  42.1569  42.1569  42.1569
 [9]  42.1569  42.1569  42.1569  42.1569  42.1569  42.1569  42.1569 113.7250
[17] 140.7840 156.2750 183.1370 201.9610 240.1960 279.8040 308.4310

BUT when I use order, the returned index looks strange and wrong. You can check the first 4 values.

> order (z)
 [1]  6  7  8 11 12 13 14 15 16 17 18 19 20 22 23  3  4  5  1  2  9 21 10

I am not sure why R does not order it correctly when handling a vector with repeated values. When I use just the first 4 values of z, it orders them correctly.

> order (z[1:4])
[1] 3 4 1 2

Can someone help? What is the problem here? Is this an R bug? How do I order a vector with repeated values?

--
Waverley @ Palo Alto
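The output above is what `order` is documented to return: the positions in `z` that would sort it, not the rank of each element. A short sketch on the first few values of the vector from the post shows the difference from `rank`.

```r
z <- c(183.137, 201.961, 113.725, 140.784, 156.275, 42.1569, 42.1569)

idx <- order(z)  # positions of z in increasing order of value
idx              # 6 7 3 4 5 1 2: the smallest values sit at positions 6 and 7

z[idx]           # indexing with order() gives the same result as sort(z)

# If the rank of each element is what is wanted, use rank();
# tied values share the lowest rank with ties.method = "min".
ranks <- rank(z, ties.method = "min")
ranks            # 6 7 3 4 5 1 1
```

For `z[1:4]` the two functions happen to return the same vector, which is likely why the short example looked correct.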
[R] how to create ROC curve for 2 dimensional classifiers
Hi,

I understand that for 1D classifiers you can use the ROCR package. Is there a package that can plot ROC curves for 2D classifiers?

One of my colleagues asked me about this. I have been quite puzzled, conceptually, about how you would construct a ROC curve for a 2D classifier. Can someone share his/her knowledge or experience?

Thanks in advance.

--
Waverley @ Palo Alto
[R] 1D classifier and 2D classifier
Hi,

Is there any package which provides functions to create one-dimensional and/or two-dimensional classifiers?

Thanks much.

--
Waverley @ Palo Alto
[R] question about matrix one column values matching a vector of values
Hi,

I have a matrix and a vector:

a = matrix (1:16, 4, 4)
b = c (2,3)

I want to find the rows of a where a[,1] equals any of the values in b.

I know that if b is only one value, e.g. b=2, then what I want is

a[a[,1] == 2,]

But what if it is not one value but a vector of values?

Thanks much in advance.

--
Waverley @ Palo Alto
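A minimal sketch using `%in%`, which tests each element of the first column for membership in `b`. The `drop = FALSE` is an addition so the result stays a matrix even when only one row matches.

```r
a <- matrix(1:16, 4, 4)
b <- c(2, 3)

# TRUE for every row whose first-column value occurs anywhere in b.
keep <- a[, 1] %in% b

rows <- a[keep, , drop = FALSE]
rows
```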
[R] how to get the value of aov summary into another variable
Hi,

I have a question about aov, e.g.

aov.ex = aov(x~y)
summary(aov.ex)

The aov summary is printed to the screen. How can I extract the aov result, in particular the values of Pr(>F) and F value, into a vector so that I can use them elsewhere?

Thanks.

--
Waverley @ Palo Alto
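For a plain `aov` fit without an `Error` term, the summary is a list whose first element is a data frame, so the columns can be pulled out by name. This sketch uses the built-in `PlantGrowth` data in place of the `x` and `y` from the post.

```r
aov.ex <- aov(weight ~ group, data = PlantGrowth)

# First list element: one row per model term plus a Residuals row.
tab <- summary(aov.ex)[[1]]

f_values <- tab[["F value"]]  # NA in the Residuals row
p_values <- tab[["Pr(>F)"]]

f_value <- f_values[1]        # F statistic for the group term
p_value <- p_values[1]        # its p value, ready for further computation
```

A model fitted with an `Error` term returns one such table per error stratum, so an extra level of indexing would be needed there.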
[R] question about anova
Hi,

I have a question regarding two-way ANOVA. I have a data sheet with the following columns:

ID -- identifies the sample; the same sample gets several repeated measures at the same time point
group -- treated vs untreated
time -- time point (3, 9, 15 hours)
signal -- measured signal

I use the aov function with

formula = signal ~ (group*time) + Error(ID/(group*time))

Questions:

1. Is Error(ID/(group*time)) the right way to handle multiple repeated measures of the same sample (by ID) at the same time point? I have checked on the web and there seems to be some controversy. Can you educate me here?

2. In the result of model.tables(aov.study, "means"), the means under "group:time" for treated and untreated at each time point differ from what I calculated by hand, although they are fairly close. Can someone explain the discrepancy?

3. An R programming question: I need to extract the p value from summary(aov.study) but am not sure how to do it. It looks like the summary is not a data frame or list. I need to take the p value and use it later in the R program for other reports.

Thanks much in advance.

--
Waverley @ Palo Alto
[R] need help explain the routine input parameters for seROC and cROC found in the R archive
Please help. I found the code below in the archive. The author of the script says: "The first function (seROC) calculate the standard error of ROC curve, the second function (cROC) compare ROC curves."

Can someone explain to me what the na, nn and r parameters are, which are used as inputs to the following two functions?

Thanks much in advance.

> From: Bernardo Rangel Tura
> Date: Thu 16 Dec 2004 - 07:30:37 EST
>
> seROC<-function(AUC,na,nn){
> a<-AUC
> q1<-a/(2-a)
> q2<-(2*a^2)/(1+a)
> se<-sqrt((a*(1-a)+(na-1)*(q1-a^2)+(nn-1)*(q2-a^2))/(nn*na))
> se
> }
>
> cROC<-function(AUC1,na1,nn1,AUC2,na2,nn2,r){
> se1<-seROC(AUC1,na1,nn1)
> se2<-seROC(AUC2,na2,nn2)
>
> sed<-sqrt(se1^2+se2^2-2*r*se1*se2)
> zad<-(AUC1-AUC2)/sed
> p<-dnorm(zad)
> a<-list(zad,p)
> a
> }

--
Waverley @ Palo Alto
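The formula in `seROC` looks like the Hanley and McNeil standard error of the AUC. Under that reading, which is an assumption and not something the quoted post states, `na` would be the number of abnormal (positive) cases, `nn` the number of normal (negative) cases, and `r` the correlation between the two AUC estimates. The commented sketch below renames the functions (`se_auc` and `compare_auc` are made-up names) and replaces `dnorm(zad)`, which returns a density, with a two-sided p value.

```r
# Standard error of an AUC, assuming the Hanley-McNeil form.
# na: number of abnormal (positive) cases, nn: number of normal (negative) cases.
se_auc <- function(auc, na, nn) {
  q1 <- auc / (2 - auc)
  q2 <- 2 * auc^2 / (1 + auc)
  sqrt((auc * (1 - auc) + (na - 1) * (q1 - auc^2) + (nn - 1) * (q2 - auc^2)) /
         (na * nn))
}

# z test for the difference of two AUCs.
# r: assumed correlation between the two AUC estimates (0 for independent samples).
compare_auc <- function(auc1, na1, nn1, auc2, na2, nn2, r = 0) {
  se1 <- se_auc(auc1, na1, nn1)
  se2 <- se_auc(auc2, na2, nn2)
  se_diff <- sqrt(se1^2 + se2^2 - 2 * r * se1 * se2)
  z <- (auc1 - auc2) / se_diff
  list(z = z, p = 2 * pnorm(-abs(z)))  # two-sided p value, not dnorm(z)
}

# Illustrative call with made-up AUCs and group sizes.
res <- compare_auc(0.85, 50, 50, 0.75, 50, 50, r = 0)
```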
[R] how to fetch rows with certain characteristics
Hi,

I have a matrix. The first column holds values and the second column holds class labels (a factor), e.g.

1.2 1
1.3 1
1.3 1
1.5 1
2.1 2
2.0 2
9.9 2
1.4 3
1.8 3
1.9 3

I want to find the minimum value of column 1 for each class (column 2). For the above example, I want to return the matrix

1.2 1
2.0 2
1.4 3

Can someone suggest how to code that? The second column can also hold characters.

Thanks much.

--
Waverley @ Palo Alto
Re: [R] how to fetch rows with certain characteristics
Thanks. That works.

However, in my own case there are more columns holding other kinds of data. So for me it is more important to get the row indices of the rows that have the minimum value of a particular column within each class (which is another column).

Can you help with how to get those row indices? One issue is that some classes may share the same min value, so using %in% does not work. My goal is to reduce the original matrix and get the result back in the original matrix format.

Thanks.

On Wed, Oct 28, 2009 at 11:55 PM, Ista Zahn wrote:
> There are various ways, including
>
> x <- read.table(textConnection("1.2 1
> + 1.3 1
> + 1.3 1
> + 1.5 1
> + 2.1 2
> + 2.0 2
> + 9.9 2
> + 1.4 3
> + 1.8 3
> + 1.9 3") )
>
> x <- as.matrix(x)
>
> x.min <- cbind(tapply(x[,1], x[,2], min), unique(x[,"V2"]))
>
> Most of that is just formatting it in the way you requested. All you
> need to compute the values is
>
> tapply(x[,1], x[,2], min)
>
> -Ista
> --
> Ista Zahn
> Graduate student
> University of Rochester
> Department of Clinical and Social Psychology
> http://yourpsyche.org

--
Waverley @ Palo Alto
Re: [R] how to fetch rows with certain characteristics
The reason %in% does not work is that a value that is not the minimum of its own class may equal the minimum of a different class. In the example I provided before, this situation did not arise. See the new example:

1.2 1
1.3 1
1.5 1
1.1 2
1.2 2
9.9 2
0.1 3
1.1 3
1.9 3

If you use %in%, then

1.2 2
1.1 3

will also show up in the final result. That is why I need the row indices of the min value of each class. If I can get those, that would be best.

Thanks.

On Thu, Oct 29, 2009 at 11:58 AM, Ista Zahn wrote:
> Hi,
> I guess I don't understand why you think %in% won't work.
>
>> x <- read.table(textConnection("1.2 1
> + 1.2 1
> + 1.3 1
> + 1.5 1
> + 2.1 2
> + 2.0 2
> + 9.9 2
> + 1.4 3
> + 1.8 3
> + 1.9 3") )
>> x <- as.matrix(x)
>> x.min <- tapply(x[,1], x[,2], min)
>> x[x[,1] %in% x.min,]
>> ## all matches
>      V1 V2
> [1,] 1.2  1
> [2,] 1.2  1
> [3,] 2.0  2
> [4,] 1.4  3
>> ## unique matches
>> unique(x[x[,1] %in% x.min,])
>      V1 V2
> [1,] 1.2  1
> [2,] 2.0  2
> [3,] 1.4  3
>
> -Ista

--
Waverley @ Palo Alto
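One way to get the row indices while comparing each value only against the minimum of its own class is `ave`, which returns the per-class minimum aligned with the original rows. A minimal sketch on the second example from this thread; the column names `value` and `class` are made up for illustration.

```r
x <- cbind(value = c(1.2, 1.3, 1.5, 1.1, 1.2, 9.9, 0.1, 1.1, 1.9),
           class = rep(1:3, each = 3))

# For every row, the minimum of "value" within that row's class.
class_min <- ave(x[, "value"], x[, "class"], FUN = min)

# Rows whose value equals the minimum of their OWN class only.
rows <- which(x[, "value"] == class_min)
rows                      # 1 4 7

x[rows, , drop = FALSE]   # original matrix format, all columns kept
```

If a class has tied minima, every tied row is returned; something like `rows[!duplicated(x[rows, "class"])]` would keep only the first row per class. With character class labels a data frame would be needed instead of a numeric matrix.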
[R] question about function heatmap
Hi,

I am using the heatmap function (stats package) to draw a microarray heatmap; columns are samples and rows are gene features. I did 2D clustering during the heatmap drawing. The features and samples indeed cluster into several blocks both vertically and horizontally.

I can get the indices of the reordered rows and columns after drawing the heatmap by printing the return value of the heatmap function. However, I cannot separate these indices according to the dendrogram. The labels at the bottom and right of the plot are all jammed together, so I cannot tell from the plot where the cluster borders are.

Can someone help? Essentially, I want the groups of genes from the gene dendrogram after clustering so that, e.g., I can check whether genes clustered together are in the same pathway.

Thanks in advance.

--
Waverley @ Palo Alto
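One possible approach is to do the row clustering explicitly with `hclust`, cut the tree into groups with `cutree`, and hand the same tree to `heatmap`. The sketch uses the built-in `mtcars` data as a stand-in for an expression matrix, and `k = 3` is an arbitrary number of clusters chosen for illustration.

```r
x <- scale(as.matrix(mtcars))  # rows play the role of genes here

# Same distance and clustering defaults that heatmap() uses for its rows.
row_hc <- hclust(dist(x))

# Cut the row dendrogram into k groups; k = 3 is an arbitrary choice.
groups <- cutree(row_hc, k = 3)
members <- split(rownames(x), groups)  # row names per cluster

# Reuse the same tree in the plot so the groups match what is drawn.
pdf(NULL)                              # null device, no file written
hm <- heatmap(x, Rowv = as.dendrogram(row_hc), keep.dendro = TRUE)
dev.off()

rownames(x)[hm$rowInd]                 # row labels in plotted order
```

`cutree(row_hc, h = ...)` cuts at a given height instead of a given number of groups, which may be closer to reading borders off the dendrogram.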
[R] Question to use R plot GO pie chart
Hi, I have a list of IPI gene IDs. Is there a package that can map gene ontology terms to these IPIs and plot a pie chart showing the distribution of molecular functions? The input looks like the following gene IPI IDs:

IPI:IPI8860.1|SWISS-PROT:Q9BXJ4-1|TREMBL:Q542Y2|ENSEMBL:ENSP0231338;EN
IPI:IPI00019922.5|SWISS-PROT:Q8N0Y2-1|TREMBL:Q53F81|ENSEMBL:ENSP0338860;ENSP0375594|REFSEQ:NP_060807|H-INV:HIT28861|VEGA:OTTHUMP0078377 Tax_Id=9606 Gene_Symbol=ZN
IPI:IPI00647423.2|SWISS-PROT:Q8N819-1|REFSEQ:NP_001073870|VEGA:OTTHUMP0076687 Tax_Id=9606 Gene_Symbol=FLJ40125 Isoform 1 of
IPI:IPI00219000.2|SWISS-PROT:P27658|TREMBL:Q53XI6|ENSEMBL:ENSP0261037|REFS
IPI:IPI00291878.4|SWISS-PROT:P35247|ENSEMBL:ENSP0361366|REFSEQ:NP_003010|H-INV:HIT39466|VEGA:OTTHUMP0019944
IPI:IPI00013945.1|SWISS-PROT:P07911-1|TREMBL:Q8NHW8|ENSEMBL:ENSP0306279|RE
IPI:IPI0634.1|SWISS-PROT:Q16204|TREMBL:Q6GSG7|ENSEMBL:ENSP0263102|REFS

I want to plot the distribution of these genes across the GO molecular function categories as a pie chart. An example is shown at the following link: http://www.proteomesci.com/content/7/1/6/figure/F2?highres=y

Can someone help? Thanks much in advance. Merry Christmas!!
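The two mechanical parts of this task, pulling the IPI accession out of each header and drawing a pie chart from category counts, can be sketched in base R. The mapping from IPI IDs to GO terms is not shown, because it depends on an annotation source; the `go.mf` vector below is a hypothetical placeholder, not a real lookup result for these proteins.

```r
# Two header lines taken from the message above (shortened)
headers <- c(
  "IPI:IPI00019922.5|SWISS-PROT:Q8N0Y2-1|TREMBL:Q53F81",
  "IPI:IPI00291878.4|SWISS-PROT:P35247|ENSEMBL:ENSP0361366"
)

# Keep only the accession, dropping the 'IPI:' prefix and the version suffix
ipi.ids <- sub("^IPI:(IPI[0-9]+)\\..*$", "\\1", headers)
print(ipi.ids)

# HYPOTHETICAL molecular-function labels standing in for a real GO mapping
go.mf <- c("binding", "binding", "catalytic activity",
           "transporter activity", "binding")

# Count genes per category and draw the pie chart
counts <- table(go.mf)
pie(as.vector(counts),
    labels = sprintf("%s (%d)", names(counts), as.vector(counts)),
    main = "GO molecular function")
```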
[R] solicit help to read in 384 plate color image
Hi, I am doing an experiment that produces different colors of different intensities in a 384-well microtiter plate. I scanned the plate to a JPEG file, and now I want to 1. read in the image file, 2. grid the content, and 3. extract the intensity and color of each well. Is there an R package I can use for that? There is a package called "gitter" which almost satisfies my needs. However, it can only read in grey values, not colors. If someone has the code, please share. Thanks. waverley
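The gridding and per-well color steps can be sketched in base R once the image is available as a height x width x 3 array of values in [0, 1]. Reading the file is left out here; the `jpeg` package's readJPEG() returns an array of that shape, but that step is an assumption and the array below is random stand-in data. The sketch also assumes the image is cropped exactly to the plate and that wells are evenly spaced, with `well.px` pixels per well side as a made-up value.

```r
n.rows  <- 16   # 384-well plate: 16 rows (A-P)
n.cols  <- 24   # and 24 columns
well.px <- 10   # assumed pixels per well side

set.seed(42)
# Stand-in for the scanned image: height x width x RGB, values in [0, 1]
img <- array(runif(n.rows * well.px * n.cols * well.px * 3),
             dim = c(n.rows * well.px, n.cols * well.px, 3))

# Well index for every pixel row and pixel column
row.id <- (seq_len(dim(img)[1]) - 1) %/% well.px + 1
col.id <- (seq_len(dim(img)[2]) - 1) %/% well.px + 1

# Pixel-to-well lookup; rows vary fastest, matching R's column-major storage
px <- expand.grid(r = row.id, c = col.id)

# Mean of one color channel per well, as a 16 x 24 matrix
well.mean <- function(channel) {
  tapply(as.vector(img[, , channel]), list(px$r, px$c), mean)
}

well.rgb <- sapply(1:3, function(ch) as.vector(well.mean(ch)))  # 384 x 3

wells <- data.frame(
  row       = rep(LETTERS[1:n.rows], times = n.cols),
  col       = rep(seq_len(n.cols), each = n.rows),
  red       = well.rgb[, 1],
  green     = well.rgb[, 2],
  blue      = well.rgb[, 3],
  intensity = rowMeans(well.rgb),
  color     = rgb(well.rgb[, 1], well.rgb[, 2], well.rgb[, 3])
)
head(wells)
```

Averaging over the whole well square includes the plastic between wells; restricting each well to a central disc would be a natural refinement.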
[R] program R using mac Xcode
Hi, I am starting to use Xcode a lot for C/C++ programming. Can you do R programming in Xcode? If so, how do I configure it? Many thanks in advance. -- Waverley @ Palo Alto
[R] how to implement string pattern extraction in R
Hi,

In Perl, extracting a substring that matches a particular pattern can be done as in the following example:

    $x = ".txt";
    if ($x =~ /(.*?)\.txt/) {
        $prefix = $1;
    }

How can I do the same thing in R? Can someone provide a code sample?

Thanks much in advance.

--
Waverley @ Palo Alto
Re: [R] how to implement string pattern extraction in R
Thanks for the reply pointing me to the grep functions. I had checked the help page http://pbil.univ-lyon1.fr/library/base/html/grep.html before I sent the help request. I just don't know how to extract a substring matching a pattern out of a string. Can someone give me example code, similar to the Perl code, that extracts the prefix from the string? Thanks much.

On Sun, Aug 22, 2010 at 3:05 PM, Waverley @ Palo Alto wrote:
> Hi,
>
> In perl, to get a substring matching a particular pattern can be
> implemented like the following example:
>
> $x = ".txt";
> if ($x=~ /(.*?)\.txt/){
> $prefix = $1;
> }
>
> So how to do the same thing in R?
>
> Can someone provide me the code sample?
>
> Thanks much in advance.
>
> --
> Waverley @ Palo Alto

--
Waverley @ Palo Alto
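A rough R counterpart to the Perl capture group could look like the sketch below. The file name "report.txt" is a made-up example, and the `perl = TRUE` argument to regexec() assumes a reasonably recent R version.

```r
x <- "report.txt"   # hypothetical input string

# 1. Capture group via regexec()/regmatches(), closest to Perl's $1
pattern <- "(.*?)\\.txt"
m <- regmatches(x, regexec(pattern, x, perl = TRUE))[[1]]
# m[1] is the whole match, m[2] the first capture group
prefix <- if (length(m) > 1) m[2] else NA_character_

# 2. Replace the whole string by its first capture group
prefix.sub <- sub("^(.*?)\\.txt$", "\\1", x, perl = TRUE)

# 3. For file names specifically, strip the extension
prefix.tools <- tools::file_path_sans_ext(x)

print(c(prefix, prefix.sub, prefix.tools))
```

The first form returns NA when the pattern does not match, which mirrors the `if` test in the Perl version; sub() would instead return the input unchanged.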
[R] R program google search
Hi, Can someone help with how to run a Google search from R code? I know that other languages have a Google search API. If someone can give me some links or sample code, I would greatly appreciate it. Thanks. -- Waverley @ Palo Alto
Re: [R] R program google search
My question is how to use R to program a Google search. I found this information: "The SOAP Search API was created for developers and researchers interested in using Google Search as a resource in their applications." Unfortunately, Google no longer supports that; they now support the AJAX Search API. What about R? Thanks.

On Fri, Sep 3, 2010 at 2:23 PM, Waverley @ Palo Alto wrote:
> Hi,
>
> Can someone help as how to use R to program google search in the R
> code? I know that other languages can allow or have the google search
> API
>
> If someone can give me some links or sample code I would greatly appreciate.
>
> Thanks.
>
> --
> Waverley @ Palo Alto

--
Waverley @ Palo Alto
Re: [R] R program google search
Hi,

I sent a request earlier about how to embed the Google search API in R. I remember people on one mailing list previously discussing doing this from R. I looked into it and found that the SOAP-based Google API has been retired and replaced with the AJAX Search API. I found the following Perl code, which uses the Google AJAX API to obtain search results. Does anyone know how to do the equivalent in R? The Perl code follows. Thanks much in advance.

    #!/usr/bin/perl
    # This example request includes an optional API key which you will need to
    # remove or replace with your own key.
    # Read more about why it's useful to have an API key.
    # The request also includes the userip parameter which provides the end
    # user's IP address. Doing so will help distinguish this legitimate
    # server-side traffic from traffic which doesn't come from an end-user.
    # YOUR-API-KEY: get your own key from Google. YOUR-MACHINE-IP: your machine's IP.
    my $url = "http://ajax.googleapis.com/ajax/services/search/web?v=1.0&"
        . "q=Paris%20Hilton&key=YOUR-API-KEY&userip=YOUR-MACHINE-IP";

    # Load our modules
    # Please note that you MUST have LWP::UserAgent and JSON installed to use this
    # You can get both from CPAN.
    use LWP::UserAgent;
    use JSON;

    # Initialize the UserAgent object and send the request.
    # Notice that referer is set manually to a URL string.
    my $ua = LWP::UserAgent->new();
    $ua->default_header("HTTP_REFERER" => "your website");
    my $body = $ua->get($url);

    # process the json string
    my $json = from_json($body->decoded_content);

    # have some fun with the results
    my $i = 0;
    foreach my $result (@{$json->{responseData}->{results}}) {
        $i++;
        print $i . ". " . $result->{titleNoFormatting} . "(" . $result->{url} . ")\n";
        # etc
    }
    if (!$i) {
        print "Sorry, but there were no results.\n";
    }

On Fri, Sep 3, 2010 at 2:23 PM, Waverley @ Palo Alto wrote:
> Hi,
>
> Can someone help as how to use R to program google search in the R
> code? I know that other languages can allow or have the google search
> API
>
> If someone can give me some links or sample code I would greatly appreciate.
>
> Thanks.
>
> --
> Waverley @ Palo Alto

--
Waverley @ Palo Alto
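The two pieces of the Perl script that carry over directly to base R are building the request URL and walking the parsed result list. The sketch below shows only those pieces. It does not send a request: the endpoint is copied from the Perl sample and may no longer respond, fetching and JSON parsing would need an HTTP/JSON package, and the `json` list is a hand-built stand-in with hypothetical values shaped like the structure the Perl loop reads.

```r
# Build the query URL; URLencode() turns the space into %20
base.url <- "http://ajax.googleapis.com/ajax/services/search/web"
query    <- "Paris Hilton"
url      <- paste0(base.url, "?v=1.0&q=", URLencode(query, reserved = TRUE))
print(url)

# HYPOTHETICAL stand-in for the parsed JSON response (no request is made)
json <- list(responseData = list(results = list(
  list(titleNoFormatting = "Example title", url = "http://www.example.com/")
)))

# Same loop as in the Perl code: number each hit, print title and URL
results <- json$responseData$results
out <- character(0)
if (length(results) == 0) {
  cat("Sorry, but there were no results.\n")
} else {
  out <- vapply(seq_along(results), function(i) {
    sprintf("%d. %s (%s)", i,
            results[[i]]$titleNoFormatting, results[[i]]$url)
  }, character(1))
  cat(out, sep = "\n")
}
```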
[R] User R to create MySQL database and table
Hi, I am thinking about using R to create a database and then create a table in a MySQL server. Can I do that using the RMySQL package? I am familiar with RMySQL, but most of the sample code in the online help assumes the database already exists and only works with tables inside it. Can someone provide some sample code that first creates a database and then creates a table inside it? Thanks a lot in advance. -- Waverley @ Palo Alto
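One way this might be done is to connect to the server without naming a database and send plain SQL statements. The sketch below assumes the DBI and RMySQL packages are installed and that the account has CREATE privileges; the database name, table name, columns, and the helper `create.db.and.table()` are all hypothetical. The helper is only defined here, not called, since it needs a running server.

```r
# Hypothetical names, for illustration only
db.name    <- "testdb"
table.name <- "readings"

# Plain SQL: create the database first, then a table inside it
sql.create.db <- sprintf("CREATE DATABASE IF NOT EXISTS %s", db.name)
sql.create.table <- sprintf(
  paste("CREATE TABLE IF NOT EXISTS %s.%s",
        "(id INT AUTO_INCREMENT PRIMARY KEY, well VARCHAR(8), reading DOUBLE)"),
  db.name, table.name)

create.db.and.table <- function(user, password, host = "localhost") {
  # Connect to the server without a dbname, since the database may not exist yet
  con <- DBI::dbConnect(RMySQL::MySQL(),
                        user = user, password = password, host = host)
  on.exit(DBI::dbDisconnect(con))
  DBI::dbExecute(con, sql.create.db)
  DBI::dbExecute(con, sql.create.table)
  invisible(TRUE)
}

cat(sql.create.db, sql.create.table, sep = "\n")
```

Qualifying the table as `testdb.readings` avoids a separate `USE` statement; reconnecting with `dbname = db.name` after the first statement would be an alternative.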
[R] online recursive lssvm r package
Hi, I am looking for an R package that can do online recursive training, e.g. an online recursive LSSVM algorithm. Can someone help? Thanks. Waverley.