[R] help in function in R akin to macro in SAS

2010-04-06 Thread Changbin Du
Dear Whom it may concern, I need help to figure the "macro" function in R: I need to plot the different data sets by a plotxyf function, I want the title to be different for different data set. # get the data set final.xyf<- xyf(data=as.matrix(my.final), Y=classvec2classmat(final$outcome), xwe

[R] help output figures in R

2010-04-06 Thread Changbin Du
erent files, if I want output pdf file with the same name as for each dataset I feed to the function somfunc. HOw should I DO? THANKS! -- Sincerely, Changbin -- Changbin Du DOE Joint Genome Institute Bldg 400 Rm 457 2800 Mitchell Dr Walnut Creet, CA 94598 Phone: 925-927-2856

[R] help with "macro" in R

2010-04-06 Thread Changbin Du
library(lattice) hisfunc<- function (vari) { histogram(~ vari|target, data=total, type="count", layout=c(1,3), labels=TRUE, main="Histograms by target", col="skyblue") } hisfunc(total$acid) HI, guys, I am using the hisfunc to get histograms for different variables, for the title of the histogr

Re: [R] help output figures in R

2010-04-06 Thread Changbin Du
Use package sos or use something like > RSiteSearch('') to generate potential hits. > > HTH, > Dennis > > On Tue, Apr 6, 2010 at 3:00 PM, Changbin Du wrote: > >> somfunc<- function (file) { >> >> aa_som<-scale(file) >> >> >>

[R] help in histogram

2010-04-07 Thread Changbin Du
x<- sample(1:14, 319, rep=T) hist(x, freq=F, xlab='',ylab="Percent of Total", col="skyblue", labels=TRUE, right=FALSE,main="Position of Hypothetical Protein") Is there is way to round the labels to 2 decimal digits, for example, 0.088 is changed to 0.09. Thanks! -- Sincerely, Changbin --

[R] label the bars by the percentage values in the conditional histogram?

2010-04-07 Thread Changbin Du
HI, Dear R-community: I have the following codes to plot the conditional histogram, is a way to label the bars by the percentage values in the conditional histogram? h<- sample(1:14, 319, rep=T) c<- sample(1:14, 608, rep=T) n<- sample(1:14, 1140, rep=T) vt<-c(h, c, n) ta<-rep(c("h", "c", "n"), c

Re: [R] help in histogram

2010-04-07 Thread Changbin Du
densities, converted to > character, to the labels argument (instead of just labels=TRUE): > > > hist(x, freq=F, xlab='',ylab="Percent of Total", col="skyblue", > labels=as.character(round(foo$density,2)), right=FALSE,main="Position of > Hypotheti
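
The reply above is cut off before `foo` is defined. A minimal sketch, assuming `foo` is the object returned by hist() with plot = FALSE (that part is a guess, and the seed is arbitrary):

```r
set.seed(1)                                   # arbitrary seed, for reproducibility only
x <- sample(1:14, 319, replace = TRUE)

# Compute the histogram without drawing it, to get the bar densities
# (assumption: this is how `foo` was created in the truncated reply)
foo <- hist(x, plot = FALSE, right = FALSE)

# Round the densities to 2 digits and turn them into character labels
bar_labels <- as.character(round(foo$density, 2))

# Draw the histogram with the rounded labels instead of labels = TRUE
hist(x, freq = FALSE, xlab = "", ylab = "Percent of Total", col = "skyblue",
     labels = bar_labels, right = FALSE,
     main = "Position of Hypothetical Protein")
```

Both hist() calls use the same data and the same `right = FALSE`, so the labels line up with the bars.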

[R] help in attach function

2010-04-07 Thread Changbin Du
Hi, r-community, This morning, I MET the following problem several times when I try to attach the data set. When I closed the current console and reopen the R console, the problem disappear. BUt with the time passed on, the problem occurs again. Can anyone help me with this? > attach(total)

[R] attach

2010-04-07 Thread Changbin Du
I found the following message maybe help. And I will try it. Hi there, I have just found that the ``attach'' function can get you into trouble when called many times. For example, you have a simulation routine called ``f()'', in which you used ``attach'' and no corresponding ``detach''. Then y

Re: [R] help in attach function

2010-04-08 Thread Changbin Du
Thanks so much! Duncan, I appreciated! On Thu, Apr 8, 2010 at 5:03 AM, Duncan Murdoch wrote: > On 07/04/2010 4:24 PM, Changbin Du wrote: > >> Hi, r-community, >> >> This morning, I MET the following problem several times when I try to >> attach >> the data

[R] r-loop

2010-04-15 Thread Changbin Du
HI, Dear community, I am building the following loop, ww<-function(file) { lossw<-vector() for (x in seq(0.1, 0.9, by=0.1)) { cat('xweight ', x, '\n') lossw[i] <- cross.validation(file, x)$avg } return(lossw) } MY question is how to index the lossw[

Re: [R] r-loop

2010-04-15 Thread Changbin Du
Thanks so much, Marius! It works! On Thu, Apr 15, 2010 at 1:35 PM, Marius 't Hart wrote: > lossw[i] <- cross.validation(file, x)$avg > > change to: > > lossw <- append(lossw,cross.validation(file, x)$avg) > > > > > Changbin Du wrote: > >> HI,
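
The original loop failed because `i` was never defined. append() works, as Marius suggests; another option is to loop over an index and fill a preallocated vector. A sketch only: `cross.validation()` below is a made-up stand-in, since the poster's function is not shown.

```r
# Hypothetical stand-in for the poster's cross.validation();
# it only needs to return a list with an $avg element
cross.validation <- function(file, x) list(avg = (x - 0.5)^2)

ww <- function(file) {
  weights <- seq(0.1, 0.9, by = 0.1)
  lossw <- numeric(length(weights))          # preallocate one slot per weight
  for (i in seq_along(weights)) {            # i is now a proper index
    cat("xweight ", weights[i], "\n")
    lossw[i] <- cross.validation(file, weights[i])$avg
  }
  lossw
}

res <- ww(NULL)
```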

[R] help in output file

2010-04-19 Thread Changbin Du
HI, Dear R-community, I AM using the following codes to grow tree and plot tree: # Classification Tree with rpart library(rpart) pdf(file="/home/cdu/changbin/dimer_tree.pdf") # grow tree fit.dimer <- rpart(outcome ~ ., method="class", data=p.dimer[,2:402]) plotcp(fit.dimer) # visualize cross-v

[R] label the bars by the percentage values in the conditional histogram?

2010-04-19 Thread Changbin Du
HI, Dear R community, HOW to LABEL the bars by the percentage values in the conditional histogram: Thanks so much! h<- sample(1:14, 319, rep=T) c<- sample(1:14, 608, rep=T) n<- sample(1:14, 1140, rep=T) vt<-c(h, c, n) ta<-rep(c("h", "c", "n"), c(319, 608, 1140)) to<-data.frame(vt,ta) library(

[R] label the bars by the percentage values in the conditional histogram?

2010-04-20 Thread Changbin Du
HI, Dear R Community, Does anyone know how to label the values in the conditional histogram? Thanks so much!!! h<- sample(1:14, 319, rep=T) c<- sample(1:14, 608, rep=T) n<- sample(1:14, 1140, rep=T) vt<-c(h, c, n) ta<-rep(c("h", "c", "n"), c(319, 608, 1140)) to<-data.frame(vt,ta) library(lattic

[R] ?rpart

2010-04-21 Thread Changbin Du
HI, Dear R community, Last friday, I used the codes, it works, but today, it does not run? > fit.dimer <- rpart(outcome ~., method="class", data=p.df) Error in `[.data.frame`(frame, predictors) : undefined columns selected DOEs anyone have comments or suggestions? Thanks in advance! -- S

Re: [R] ?rpart

2010-04-21 Thread Changbin Du
Yes, outcome is there. On Wed, Apr 21, 2010 at 3:47 PM, Steve Lianoglou < mailinglist.honey...@gmail.com> wrote: > Hi, > > On Wed, Apr 21, 2010 at 5:20 PM, Changbin Du wrote: > > HI, Dear R community, > > > > Last friday, I used the codes, it

Re: [R] help in conditional histogram

2010-04-23 Thread Changbin Du
hpaths() [1] ".GlobalEnv" "/home/cdu/library/lattice" [3] "/home/cdu/library/stats" "/home/cdu/library/graphics" [5] "/home/cdu/library/grDevices" "/home/cdu/library/utils" [7] "/home/cdu/library/datasets"

[R] boosting with decision tree

2010-04-25 Thread Changbin Du
Hi, Dear R community, Does anyone know how to construct a decision tree with boosting? Is there any tutorial I can read? -- Sincerely, Changbin -- [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman

[R] R.GBM package

2010-04-26 Thread Changbin Du
HI, Dear Greg, I AM A NEW to GBM package. Can boosting decision tree be implemented in 'gbm' package? Or 'gbm' can only be used for regression? IF can, DO I need to combine the rpart and gbm command? Thanks so much! -- Sincerely, Changbin -- [[alternative HTML version deleted]] _

Re: [R] R.GBM package

2010-04-26 Thread Changbin Du
need to combine rpart and > gbm. You're best bet is to just load the package and run a demo > >demo(bernoulli). > > -- > *From:* Changbin Du [mailto:changb...@gmail.com] > *Sent:* Monday, April 26, 2010 9:48 AM > *To:* r-help@r-project.org > *Cc:* Ridgeway, Greg &

Re: [R] R.GBM package

2010-04-26 Thread Changbin Du
Got it, thanks so much! Greg. On Mon, Apr 26, 2010 at 11:02 AM, Ridgeway, Greg wrote: > Y~X1+X2+X3 is the standard R formula syntax. It simply means "Y is > predicted by X1 and X2 and X3". > > Greg > > ------ > *From:* Changbin Du [mail

Re: [R] help in conditional histogram

2010-04-28 Thread Changbin Du
istogram"? you need to > use the one from the lattice package: > > rm(panel.histogram) > > > > On 24 April 2010 01:48, Changbin Du wrote: > > Dear Dr. Sarkar, > > > > When I try to run the codes, I found the following problem: > > > > > &

[R] relative influence plot

2010-04-28 Thread Changbin Du
HI, Dear Greg, I have one question about the variable relative influence plot: THE following is the rel.inf value of 25 variables, but wen I plot, not all the variables are labeled. i.e. num_genes, wg, hydrophob_per etc are not labeled on the y-axis. also the variables are labeled vertically,

[R] variable importance in Random Forest

2010-04-28 Thread Changbin Du
HI, Dear Andy, I run the RandomFOrest in R, and get the following resutls in variable importance: What is the meaning of MeanDecreaseAccuracy and MeanDecreaseGini? I found they are raw values, they are not scaled to 1, right? Which column if most similar to the variable rel.influence in Boosti

Re: [R] variable importance in Random Forest

2010-04-29 Thread Changbin Du
ure is at all comparable to the two in RF. > > Andy > > -- > *From:* Changbin Du [mailto:changb...@gmail.com] > *Sent:* Wednesday, April 28, 2010 8:58 PM > *To:* Liaw, Andy > *Cc:* r-help@r-project.org > *Subject:* variable importance in Random Forest >

[R] RandomForest diagnostics plot

2010-04-29 Thread Changbin Du
HI, Andy, ON the RandomForest diagnostics plot, three lines were drawn. ON the help document of plot.randomForest, only the error rate and MSE are described. I dont know which line is which? Can you help me with that? Thanks so much! -- Sincerely, Changbin -- [[alternative HTML ver

[R] can not print probabilities in svm of e1071

2010-04-29 Thread Changbin Du
> x <- train[,c( 2:18, 20:21, 24, 27:31)] > y <- train$out > > svm.pr <- svm(x, y, probability = TRUE, method="C-classification", kernel="radial", cost=bestc, gamma=bestg, cross=10) > > pred <- predict(svm.pr, valid[,c( 2:18, 20:21, 24, 27:31)], decision.values = TRUE, probability = TRUE) > at

Re: [R] can not print probabilities in svm of e1071

2010-04-29 Thread Changbin Du
decision.values = TRUE, > probability = TRUE) > library(ROCR) > svm.roc <- prediction(attributes(svm.pred)$decision.values, test.set) > svm.auc <- performance(svm.roc, 'tpr', 'fpr') > plot(svm.auc) > > > On Thu, Apr 29, 2010 at 4:17 PM, C

Re: [R] can not print probabilities in svm of e1071

2010-04-29 Thread Changbin Du
ming that it is in the first column. I think this is a > typo in my example :) > > On Thu, Apr 29, 2010 at 5:13 PM, Changbin Du wrote: > > > > > > HI, Saeed, > > > > Thanks so much for the help, I run your code and found the following > > problem, do

[R] predict.gbm

2010-04-29 Thread Changbin Du
Hi, Dear Greg, I have one question, if I boosting decision tree, the distribution = "bernoulli", after that, I use predict.gbm to predict the fitted value for new data set. predict(gbm1, newdata, n.trees=best..iteration, type="response") If type="response" then gbm converts back to the same scal

[R] decisions.values meaning in SVM

2010-04-30 Thread Changbin Du
97841 172 1.0192415342 176 1.3536947861 215 0.9405960067 222 0.9792365851 255 1.2270351367 267 1.7377883390 279 1.1427732884 282 1.2548137295 292 1.1336236065 320 0.4953096976 333 1.0867080386 338 2.5335080606 -- Sincerely, Changbin -- Changbin Du DOE Joint Genome Institute Bldg 4

[R] ROC curve in randomForest

2010-04-30 Thread Changbin Du
[ind == 1,], control = cforest_unbiased(mtry = ncol(BreastCancer)-2)) 036x.cf.pred <- predict(x.cf, newdata=BreastCancer[ind == 2,]) 037x.cf.prob <- 1- unlist(treeresponse(x.cf, BreastCancer[ind == 2,]), use.names=F)[seq(1,nrow(BreastCancer[ind == 2,])*2,2)] 038 -- Sinc

[R] bag.fraction in gbm package

2010-05-01 Thread Changbin Du
Hi, Dear Greg, Sorry to bother you again. I have several questions about the 'gbm' package. if the train.fraction is less than 1 (ie. 0.5) , then the* first* 50% will be used to fit the model, the other 50% can be used to estimate the performance. if bag.fraction is 0.5, then gbm use the* rando

Re: [R] bag.fraction in gbm package

2010-05-01 Thread Changbin Du
Thanks, it really helps! On Sat, May 1, 2010 at 2:34 PM, Ridgeway, Greg wrote: > See friedman's paper "stochastic gradient boosting" > > Greg > > ------ > *From*: Changbin Du > *To*: Ridgeway, Greg > *Cc*: r-help@r-project

[R] output from the gbm package

2010-06-15 Thread Changbin Du
at least once? IF SOME obs are not selected, how to calculate the training error? Thanks? -- Sincerely, Changbin -- Changbin Du DOE Joint Genome Institute Bldg 400 Rm 457 2800 Mitchell Dr Walnut Creet, CA 94598 Phone: 925-927-2856 [[alternative HTML version deleted

[R] nnet

2010-06-17 Thread Changbin Du
HI, Dear R community, I am using the nnet to fit a neural network model to do classification on binary target variable (0, 1). I am using the following codes: nnet.fit<-nnet(as.factor(out) ~ ., data=train, size=5, rang=0.3, decay=5e-4, maxit=500) I want to know what is the activation function f

[R] help with neural network nnet package

2010-06-17 Thread Changbin Du
HI, Dear R community, I am using the nnet to fit a neural network model to do classification on binary target variable (0, 1). I am using the following codes: nnet.fit<-nnet(as.factor(out) ~ ., data=train, size=5, rang=0.3, decay=5e-4, maxit=500) I want to know what is the activation function f

[R] help with nnet

2010-06-17 Thread Changbin Du
> nnet.fit<-nnet(as.factor(out) ~ ., data=all_h, size=5, rang=0.3, decay=5e-4, maxit=500) # model fitting > summary(nnet.fit) a 23-5-1 network with 126 weights options were - entropy fitting decay=5e-04 HI, Guys, I can not find the manual to describe how the model is built, is there a more

[R] question about boosting(Adaboosting. M1)

2010-06-19 Thread Changbin Du
HI, Guys, I am trying to use the AdaBoosting. M.1 algorithm to integrate three models. I found the sum of weights for each model is not equal to one. How to deal with this? Thanks, any response or suggestions are appreciated! -- Sincerely, Changbin -- [[alternative HTML version del

[R] help in SVM

2010-06-24 Thread Changbin Du
HI, GUYS, I used the following codes to run SVM and get prediction on new data set hh. dim(all_h) [1] 2034 24 dim(hh)# it contains all the variables besides the variables in all_h data set. [1] 640 415 require(e1071) svm.tune<-tune(svm, as.factor(out) ~ ., data=all_h, ranges=list(gamma

[R] how to distinguish bi-mode distribution from mono-mode distribution

2010-06-29 Thread Changbin Du
HI, Dear community, How to distinguish bi-mode distribution from mono-mode distribution? I have only the histograms of 3500 data set. Thanks! -- Sincerely, Changbin -- [[alternative HTML version deleted]] __ R-help@r-project.org mailing list

Re: [R] ROC curve in R

2010-07-01 Thread Changbin Du
posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Sincerely, Changbin -- Changbin Du DOE Joint Genome Institute Bldg 400 Rm 457 2800 Mitchell Dr Walnut Creet, CA 94598 Phone: 925-927-2856 [[alternative HTML version deleted]] _

[R] help with the xtable package

2010-07-01 Thread Changbin Du
HI, Dear R community, I am using the xtable to create the table, but how can I see the table? The following is the codes I used: > data(tli) > tli.table <- xtable(tli[1:10, ]) > digits(tli.table)[c(2, 6)] <- 0 > print(tli.table, floating = FALSE) % latex table generated in R 2.11.0 by xtable

[R] strange data set output

2010-07-02 Thread Changbin Du
Hi, Dear Community, My data set logit.pred contains 2 columns and 1400 rows. When I want to use the first column, it is very strange. Where the $ come out? Thanks so much! > dim(logit.pred) [1] 14002 > head(logit.pred) tree.pred valid.out 754 0.6550606 1 1080 0.6353524

[R] help with predict.lda

2010-07-03 Thread Changbin Du
HI, Dear community, I am using the linear discriminant analysis to build model and make new predictions: > dim(train) #training data [1] 1272 22 > dim(valid) # validation data [1] 140 22 lda.fit <- lda(out ~ ., data=train, na.action="na.omit", CV=TRUE) # model fitting of linear discriminan

Re: [R] help with predict.lda

2010-07-06 Thread Changbin Du
Thanks all so much for your help! I went out for 2 days vacation and could not reply your guys email. Yes, the CV=False works. Thanks again! On Sun, Jul 4, 2010 at 2:47 AM, Peter Ehlers wrote: > On 2010-07-03 21:33, Changbin Du wrote: > >> HI, Dear community, >> >&g

[R] How to read this file into R.

2010-09-24 Thread Changbin Du
Dear community, I have one file named ca_boost_feature.txt, Feature selection (Boosting:0.0025,5)! H.2.C C.1.D C.3.R E.0.N C.2.S C.0.G H.3.G log file: ep If I want to use the second line of this file, how to read it into R? varr<-read.table("/home/cdu/operon/carbonic/ca_boost_feature.txt", sep

Re: [R] How to read this file into R.

2010-09-24 Thread Changbin Du
" > > Hope this helps. >- Phil Spector > Statistical Computing Facility > Department of Statistics > UC Berkeley >
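
Phil's answer is truncated here, so the following is not his code. One way to read only the second line with base R is scan() with `skip` and `nlines`. The file is recreated in a temporary location, and its exact line breaks are an assumption based on the question.

```r
# Recreate a small file with the layout described in the question
# (assumption: the feature names sit alone on line 2)
f <- tempfile(fileext = ".txt")
writeLines(c("Feature selection (Boosting:0.0025,5)!",
             "H.2.C C.1.D C.3.R E.0.N C.2.S C.0.G H.3.G",
             "log file: ep"), f)

# Skip the first line, then read exactly one line of
# whitespace-separated strings into a character vector
varr <- scan(f, what = "", skip = 1, nlines = 1, quiet = TRUE)
print(varr)
```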

[R] need help with nnet

2010-10-12 Thread Changbin Du
HI, Dear R community, My data set has 2409 variables, the last one is response variable. I have used the nnet after feature selection and works. But this time, I am using nnet to fit a model without feature selection. I got the following error information: > dim(train) [1] 1827 2409 nnet.fit<

Re: [R] need help with nnet

2010-10-12 Thread Changbin Du
Thanks, Claudia! On Tue, Oct 12, 2010 at 9:54 AM, Claudia Beleites wrote: > I'm not sure how much fun it is to fit > 7000 weights with 1800 samples, > but you can tell nnet to allow more weights with MaxNWts, see ?nnet > > > > On 10/12/2010 06:45 PM, Changbin Du wrote:
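
To see why nnet refuses the wide data set, it helps to count the weights. The sketch below assumes a single hidden layer with bias terms and no skip-layer connections. `n_weights()` is a helper made up for illustration, and `size = 3` and `MaxNWts = 8000` are example values, not taken from the thread.

```r
# Weights in a single-hidden-layer network with bias units:
# (inputs + 1) * hidden + (hidden + 1) * outputs
n_weights <- function(n_inputs, n_hidden, n_outputs = 1) {
  (n_inputs + 1) * n_hidden + (n_hidden + 1) * n_outputs
}

small <- n_weights(23, 5)     # a 23-5-1 network, as reported in another post
wide  <- n_weights(2408, 3)   # 2408 predictors with a hypothetical size = 3
print(c(small = small, wide = wide))

# nnet stops when the count exceeds MaxNWts (default 1000), so a wide
# model needs the limit raised explicitly, along these lines:
# nnet.fit <- nnet(as.factor(out) ~ ., data = train, size = 3, MaxNWts = 8000)
```

As Claudia hints, fitting several thousand weights to about 1800 samples may still be a poor idea even once nnet allows it.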

Re: [R] Random Forest AUC

2010-10-23 Thread Changbin Du
I think you should use 10-fold cross validation to judge your performance on the validation parts. What you did will overfit for sure, since you tested on the same training set used for model building. On Sat, Oct 23, 2010 at 6:39 AM, mxkuhn wrote: > I think the issue is that you really can'

[R] help with adding lines to current plot

2010-10-25 Thread Changbin Du
HI, Dear R community, I am using the following codes to plot, however, the lines code works. But the line was not drawn on the previous plot and did not shown up. How comes? # specify the data for missense simulation x <- seq(0,10, by=1) y <- c(0.952, 0.947, 0.943, 0.941, 0.933, 0.932, 0.939, 0

Re: [R] help with adding lines to current plot

2010-10-25 Thread Changbin Du
red", yaxt="n", lty=3, xlab="", > ylab="", ylim = c(min(c(y, z)), max(c(y, z > > # add x vs. fp > lines(x, z, type="b", pch=22, col="blue", lty=2) > > > Cheers, > > Josh > > On Mon, Oct 25, 2010 at 2:38 PM,
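
The point of Josh's reply seems to be that lines() draws inside the axis range set by the first plot() call, so a second series outside that range is invisible. A sketch with made-up values (the thread's own data is truncated above):

```r
x <- seq(0, 10, by = 1)

# Invented values for illustration only; y is near 0.9, z near 0.1
y <- seq(0.95, 0.92, length.out = length(x))
z <- seq(0.07, 0.12, length.out = length(x))

# Make the y axis wide enough for both series before plotting the first
yl <- range(c(y, z))
plot(x, y, type = "b", pch = 21, col = "red", lty = 3,
     xlab = "", ylab = "", ylim = yl)

# Now the second series falls inside the plotting region
lines(x, z, type = "b", pch = 22, col = "blue", lty = 2)
```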

[R] svm online course in R

2010-10-28 Thread Changbin Du
Hi, Dear Community, Several days ago, I received one email about the online svm course in R, I try to find it. Can someone forward the information to me. Thanks! -- Sincerely, Changbin -- [[alternative HTML version deleted]] __ R-help@r-pro

Re: [R] online course: SVM in R with Lutz Hamel at statistics.com

2010-10-28 Thread Changbin Du
Sorry, I was a little numb at that time. On Thu, Oct 28, 2010 at 10:45 AM, David Winsemius wrote: > In a sense you deserve what you have asked for. You have asked thousands of > people to send you a copy when you could have instead searched the archives > yourself and gotten a much quicker ans

[R] how to save this result in a vector

2010-10-31 Thread Changbin Du
HI, Dear R community, I have the following codes to calculate the commulative coverage. I want to save the output in a vector, How to do this? test<-seq(10, 342, by=2) #cover is a vector cover_per<-function (cover) { for (i in min(cover):max(cover)) {print(100*sum(ifelse(cover >= i, 1, 0))/lengt

Re: [R] how to save this result in a vector

2010-10-31 Thread Changbin Du
min(data):max(data)) { x<-(100*sum(ifelse(data >= i, 1, 0))/length(data)) output<-c(output, x) } return(output) } result<-cover_per(test) On Sun, Oct 31, 2010 at 5:46 PM, David Winsemius wrote: > > On Oct 31,

Re: [R] how to save this result in a vector

2010-10-31 Thread Changbin Du
> for (i in min(cover):max(cover)) { > output[j] <- 100*sum(ifelse(cover >= i, 1, 0))/length(cover) >j <- j + 1 > } > return(output) > } > > Josh > -- Sincerely, Changbin -- Changbin Du DOE Joint Genome Institute Bldg 400 Rm 457 2800 Mitchell Dr
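
The same result can be collected without a manual counter by letting vapply() build the vector. This is a sketch of an alternative, not the code posted in the thread:

```r
# Percentage of values in `cover` that are >= each threshold
# between min(cover) and max(cover)
cover_per <- function(cover) {
  thresholds <- min(cover):max(cover)
  vapply(thresholds, function(i) 100 * mean(cover >= i), numeric(1))
}

test <- seq(10, 342, by = 2)
result <- cover_per(test)
head(result)
```

vapply() returns a plain numeric vector, so `result` can be used directly in later steps.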

Re: [R] how to save this result in a vector

2010-11-01 Thread Changbin Du
> On 11/1/2010 2:24 AM, Changbin Du wrote: > > Thanks Joshua! Yes, i is not going up sequentially by 1, as i here is > the > > raw number of reads for each DNA base. Thanks so much for the great help! > > > > > > On Sun, Oct 31, 2010 at 6:03 PM, Joshua Wiley >w

[R] how to work with long vectors

2010-11-04 Thread Changbin Du
HI, Dear R community, I have one data set like this, What I want to do is to calculate the cumulative coverage. The following codes works for small data set (#rows = 100), but when feed the whole data set, it still running after 24 hours. Can someone give some suggestions for long vector? id

Re: [R] how to work with long vectors

2010-11-04 Thread Changbin Du
647 > 5 Contig79:5 17 50.0 > 6 Contig79:620 58.82353 > 7 Contig79:725 73.52941 > 8 Contig79:827 79.41176 > 9 Contig79:932 94.11765 > 10 Contig79:1033 97.05882 > 11 Contig79:1134 100.0 > > > On Thu, Nov 4, 2010 at 1

Re: [R] how to work with long vectors

2010-11-04 Thread Changbin Du
! On Thu, Nov 4, 2010 at 9:12 AM, Henrique Dallazuanna wrote: > Try this: > > rev(100 * cumsum(matt$reads > 1) / length(matt$reads) ) > > On Thu, Nov 4, 2010 at 1:46 PM, Changbin Du wrote: > >> HI, Dear R community, >> >> I have one data set like t

Re: [R] how to work with long vectors

2010-11-04 Thread Changbin Du
Thanks Martin, I will try this. On Thu, Nov 4, 2010 at 10:06 AM, Martin Morgan wrote: > On 11/04/2010 09:45 AM, Changbin Du wrote: > > Thanks, Jim! > > > > This is not what I want, What I want is calculate the percentage of > reads > > bigger or equal to that re

Re: [R] how to work with long vectors

2010-11-04 Thread Changbin Du
pector > Statistical Computing Facility > Department of Statistics > UC Berkeley > spec...@stat.berkeley.edu > > > > > > >

Re: [R] how to work with long vectors

2010-11-05 Thread Changbin Du
umeric(l) + for(i in 1:l)output[i] = sum(data >= data[i]) + 100 * output / l + } > result3<-cover_per_2(cover) On Thu, Nov 4, 2010 at 10:37 AM, Changbin Du wrote: > Thanks Phil, that is great! I WILL try this and let you know how it goes. > > > > > On Thu, N
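
The loop above compares every element with the whole vector, so its run time grows with the square of the length. A sketch of one vectorized alternative using rank(); it is not taken from the replies in this thread, and the test data is invented:

```r
# Quadratic version, as in the post above
cover_per_loop <- function(data) {
  l <- length(data)
  output <- numeric(l)
  for (i in 1:l) output[i] <- sum(data >= data[i])
  100 * output / l
}

# With ties.method = "min", rank - 1 is the number of strictly smaller
# values, so n - rank + 1 is the number of values >= the current one
cover_per_rank <- function(data) {
  n <- length(data)
  100 * (n - rank(data, ties.method = "min") + 1) / n
}

set.seed(42)                                  # arbitrary seed
cover <- sample(1:50, 1000, replace = TRUE)   # made-up read counts
same <- isTRUE(all.equal(cover_per_loop(cover), cover_per_rank(cover)))
print(same)
```

The sketch assumes there are no missing values in the data.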

Re: [R] how to work with long vectors

2010-11-05 Thread Changbin Du
Thanks Martin! I will try it and will let your guys know how it goes. On Fri, Nov 5, 2010 at 9:42 AM, Martin Morgan wrote: > On 11/05/2010 09:13 AM, Changbin Du wrote: > > HI, Phil, > > > > I used the following codes and run it overnight for 15 hours, this > morning, &g

Re: [R] how to work with long vectors

2010-11-05 Thread Changbin Du
0.02 > > identical(v_3,v) > [1] TRUE > > Bill Dunlap > Spotfire, TIBCO Software > wdunlap tibco.com > > > -Original Message- > > From: r-help-boun...@r-project.org > > [mailto:r-help-boun...@r-project.org] On Behalf Of Changbin Du > > Sent: Frida

[R] X11 module cannot be loaded

2010-11-16 Thread Changbin Du
HI, Dear R community, I have used the following codes this morning, but this afternoon, I got the following errors: > x <- seq(0,10, by=1) > y <- c(0.952, 0.947, 0.943, 0.941, 0.933, 0.932, 0.939, 0.932, 0.924, 0.918, 0.920) # missense > z <- c(0.068, 0.082, 0.080, 0.099, 0.108, 0.107, 0.101, 0.1

[R] my function does not work for large data set

2010-12-16 Thread Changbin Du
er.nn)) * Error in unlist(X, recursive = FALSE, use.names = FALSE) : negative length vectors are not allowed* Thanks so much! -- Sincerely, Changbin -- Changbin Du DOE Joint Genome Institute Bldg 400 Rm 457 2800 Mitchell Dr Walnut Creet, CA 94598 Phone: 925-927-2856 [[alter

Re: [R] my function does not work for large data set

2010-12-16 Thread Changbin Du
Thanks Jim! I will split the data and run again. On Thu, Dec 16, 2010 at 2:57 PM, Jim Holtman wrote: > I think that your object exceeds the limit of 2^31 elements. > > Sent from my iPad > > On Dec 16, 2010, at 17:44, Changbin Du wrote: > > > Dear R community, > &g

Re: [R] auc function

2011-01-20 Thread Changbin Du
ttp://www.r-project.org/posting-guide.html> > and provide commented, minimal, self-contained, reproducible code. > -- Sincerely, Changbin -- Changbin Du DOE Joint Genome Institute Bldg 400 Rm 457 2800 Mitchell Dr Walnut Creet, CA 94598 Phone: 925-927-2856 [[alternative HTML v

[R] change the for loops with lapply

2010-09-07 Thread Changbin Du
cv.fold<-function(i, size=3, rang=0.3){ cat('Fold ', i, '\n') out.fold.c <-((i-1)*c.each.part +1):(i*c.each.part) out.fold.n <-((i-1)*n.each.part +1):(i*n.each.part) train.cv <- n.cc[-out.fold.c, c(2:2401, 2417)] train.nv <- n.nn[-out.fold.n, c(2:2401, 2417)]

Re: [R] change the for loops with lapply

2010-09-07 Thread Changbin Du
Thanks so much, David! The following codes works! result.fun <- lapply(1:2, function(i) cv.fold(i, 3, 0.3)) On Tue, Sep 7, 2010 at 3:35 PM, David Winsemius wrote: > > On Sep 7, 2010, at 5:43 PM, Changbin Du wrote: > > cv.fold<-function(i, size=3, rang=0.3){ >>
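
A small self-contained illustration of the lapply() call that worked. `cv.fold()` here is a made-up stand-in that only echoes its arguments; the poster's real function fits a model per fold.

```r
# Hypothetical stand-in for the poster's cv.fold()
cv.fold <- function(i, size = 3, rang = 0.3) {
  list(fold = i, size = size, rang = rang)
}

# The form from the post: an anonymous function wraps the call
result.fun <- lapply(1:2, function(i) cv.fold(i, 3, 0.3))

# Equivalent: lapply() passes extra named arguments through to cv.fold()
result.alt <- lapply(1:2, cv.fold, size = 3, rang = 0.3)

str(result.fun)
```

Either way the result is a list with one element per fold.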

Re: [R] randomforests - how to classify

2010-05-04 Thread Changbin Du
use (as.factor(target) ~., data =your data, ...) On Tue, May 4, 2010 at 12:07 PM, pdb wrote: > > Hi, > > I'm experimenting with random forests and want to perform a binary > classification task. > I've tried some of the sample codes in the help files and things run, but I > get a message to

[R] sort the data set by one variable

2010-05-05 Thread Changbin Du
> #sort the data by predicted probability > b.order<-bo.id.pred[(order(-predict)),] > b.order[1:20,] gene_idpredict 43 637882902 0.07823997 53 638101634 0.66256490 61 639084581 0.08587504 41 637832824 0.02461066 25 637261662 0.11613879 22 637240022 0.06350477 62 639084582 0.02238538 63 639

Re: [R] sort the data set by one variable

2010-05-05 Thread Changbin Du
2658 > 22 637240022 0.06350477 > 44 637943079 0.04532625 > 24 637261661 0.02561841 > 41 637832824 0.02461066 > 62 639084582 0.02238538 > 13 637047086 0.01493464 > 49 638072100 0.01391633 > 74 639787397 0.01283783 > > > I would check your data. Do 'str

Re: [R] sort the data set by one variable

2010-05-05 Thread Changbin Du
- Phil Spector > Statistical Computing Facility > Department of Statistics > UC Berkeley > spec...@stat.berkeley.edu > &g

[R] probabilities in svm output in e1071 package

2010-05-05 Thread Changbin Du
variable. I trained the model svm.fit in training data. And want to predict the out in the new data set hh. WHy the probabilities are both 0 in 1 and 0 class? -- Sincerely, Changbin -- Changbin Du DOE Joint Genome Institute Bldg 400 Rm 457 2800 Mitchell Dr Walnut Creet, CA 94598 Phone: 925-927

Re: [R] probabilities in svm output in e1071 package

2010-05-05 Thread Changbin Du
Thanks Steve! I will try and let you know how it comes. On Wed, May 5, 2010 at 6:07 PM, Steve Lianoglou < mailinglist.honey...@gmail.com> wrote: > Hi Changbin, > > On Wed, May 5, 2010 at 6:46 PM, Changbin Du wrote: > > svm.fit<-svm(as.factor(out) ~ ., data=all_h,

Re: [R] probabilities in svm output in e1071 package

2010-05-05 Thread Changbin Du
Thanks, Steve and David! svm.fit<-svm(as.factor(out) ~ ., data=all_h, method="C-classification", kernel="radial", cost=bestc, gamma=bestg, cross=10, probability=TRUE) It works this time! On Wed, May 5, 2010 at 6:24 PM, Changbin Du wrote: > Thanks Steve! > >

[R] how to extract the variables used in decision tree

2010-05-11 Thread Changbin Du
HI, Dear R community, How to extract the variables actually used in tree construction? I want to extract these variables and combine other variable as my features in next step model building. > printcp(fit.dimer) Classification tree: rpart(formula = outcome ~ ., data = p_df, method = "class") V

[R] exact the variables used in tree construction

2010-05-12 Thread Changbin Du
> fit.dimer <- rpart(as.factor(out) ~ ., method="class", data=p_df) > > fit.dimer$frame[, "var"] [1] NE WC TA WG WD WW WC [11]CT FC YG QT [21] NW DP DY SK [31] 401 Levels: AA AC AD AE AF AG AH AI AK AL AM AN AP AQ AR AS AT A

Re: [R] exact the variables used in tree construction

2010-05-12 Thread Changbin Du
Thanks so much, David! On Wed, May 12, 2010 at 2:52 PM, David Winsemius wrote: > > On May 12, 2010, at 5:31 PM, Changbin Du wrote: > > fit.dimer <- rpart(as.factor(out) ~ ., method="class", data=p_df) >>> >>> fit.dimer$frame[, "var"] >&

Re: [R] "rpart": how to use each variable only once?

2010-05-14 Thread Changbin Du
is this random decision tree, I dont know is there any package can run it. If you know, please let me know. On Fri, May 14, 2010 at 10:23 AM, Shi, Tao wrote: > Hi list, > > Is there a way in "rpart" to force the variables only used once when doing > the splits? > > This is how the question cam

[R] get the row sums

2010-05-18 Thread Changbin Du
gave me errors. CAN someone help me with this? -- Sincerely, Changbin -- Changbin Du DOE Joint Genome Institute Bldg 400 Rm 457 2800 Mitchell Dr Walnut Creet, CA 94598 Phone: 925-927-2856 [[alternative HTML version deleted]] __ R-help@r-proje

Re: [R] get the row sums

2010-05-18 Thread Changbin Du
Thanks, David! Yes, I found it just as you said. It works now after change to numeric. On Tue, May 18, 2010 at 1:53 PM, David Winsemius wrote: > > On May 18, 2010, at 4:32 PM, Changbin Du wrote: > > head(en.id.pr) >>> >>valid.gene_id b.pred rf.pred svm.pred &
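
A sketch of the fix described above, assuming the prediction columns had been read as factors. The column names follow the snippet and the values are invented.

```r
# Made-up data with prediction columns stored as factors
en.id.pr <- data.frame(
  valid.gene_id = c("g1", "g2", "g3"),
  b.pred   = c("0.10", "0.80", "0.35"),
  rf.pred  = c("0.20", "0.70", "0.40"),
  svm.pred = c("0.30", "0.90", "0.25"),
  stringsAsFactors = TRUE
)

pred_cols <- c("b.pred", "rf.pred", "svm.pred")

# Go through as.character() first: as.numeric() on a factor alone
# returns the internal level codes, not the printed values
en.id.pr[pred_cols] <- lapply(en.id.pr[pred_cols],
                              function(col) as.numeric(as.character(col)))

en.id.pr$total <- rowSums(en.id.pr[pred_cols])
print(en.id.pr)
```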

[R] col allocation is not right

2010-05-19 Thread Changbin Du
plot(svm.auc, col=2, main="ROC curves comparing classification performance\n of six machine learning models") legend(0.5, 0.6, c(ns, nb, nr, nt, nl,ne), 2:6, 9) # Draw a legend. plot(bo.auc, col=3, add=T) # add=TRUE draws on the existing chart plot(rf.auc, col=4, add=T) plot(tree.auc, col=5, add=T

Re: [R] col allocation is not right

2010-05-19 Thread Changbin Du
- Phil Spector > Statistical Computing Facility > Department of Statistics > UC Berkeley > spec...@stat.berkeley.edu > > > > On Wed, 19 May

[R] ROC curve

2010-05-23 Thread Changbin Du
HI, Dear R community, I want to know how to select the optimal decision threshold from the ROC curve? At what threshold will give the highest accuracy? Thanks! -- Sincerely, Changbin -- [[alternative HTML version deleted]] __ R-help@r-projec

[R] R eat my data

2010-05-25 Thread Changbin Du
F, fill=T) > dim(gene_name) [1] 10683 -- Sincerely, Changbin -- Changbin Du DOE Joint Genome Institute Bldg 400 Rm 457 2800 Mitchell Dr Walnut Creet, CA 94598 Phone: 925-927-2856 [[alternative HTML version deleted]] __ R-help@r-p

Re: [R] R eat my data

2010-05-25 Thread Changbin Du
ount.fields("id_name_gh5.txt")) > Regards Mohamed > > > > > Changbin Du a écrit : > > HI, Dear R community, >> >> My original file has 1932 lines, but when I read into R, it changed to >> 1068 >> lines, how comes? >> >> >>
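
One common reason for read.table() returning fewer rows than the file has lines is a stray quote character, such as an apostrophe in a gene name. Whether that was the cause in this thread is not confirmed by the snippets. A sketch with an invented four-line file:

```r
# Invented tab-separated file; two names contain an apostrophe
f <- tempfile(fileext = ".txt")
writeLines(c("1\tABC-2 type transporter",
             "2\t5'-nucleotidase",
             "3\tconserved hypothetical protein",
             "4\t3'-phosphatase"), f)

n_lines <- length(readLines(f))      # lines actually in the file

# Default settings treat ' as a quote, so a field can run across lines
# and the row count can come out smaller than n_lines
default_read <- read.table(f, sep = "\t", header = FALSE, fill = TRUE)

# Turning off quote and comment handling reads each line as one record
plain_read <- read.table(f, sep = "\t", header = FALSE, quote = "",
                         comment.char = "", stringsAsFactors = FALSE)

# count.fields() with the same settings shows the fields found per line
fields <- count.fields(f, sep = "\t", quote = "", comment.char = "")

print(c(lines = n_lines, default = nrow(default_read), plain = nrow(plain_read)))
```

Comparing the three counts is a quick way to tell whether quoting is what drops the lines.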

Re: [R] R eat my data

2010-05-25 Thread Changbin Du
:42 AM, Changbin Du wrote: > > HI, Dear R community, >> >> My original file has 1932 lines, but when I read into R, it changed to >> 1068 >> lines, how comes? >> > > We are being asked to investigate this quest, how? > > Have you looked at the last line to see

Re: [R] R eat my data

2010-05-25 Thread Changbin Du
c...@nuuk:~/operon$ grep '^#' id_name_gh5.txt c...@nuuk:~/operon$ no lines starts with # On Tue, May 25, 2010 at 9:11 AM, Barry Rowlingson < b.rowling...@lancaster.ac.uk> wrote: > On Tue, May 25, 2010 at 4:42 PM, Changbin Du wrote: > > HI, Dear R community, > >

Re: [R] R eat my data

2010-05-25 Thread Changbin Du
o > > tail(gene)_name, 2) > > Come on, man, show some initiative. > > On May 25, 2010, at 12:12 PM, Changbin Du wrote: > > 644727344ABC-2 type transporterABC-2 type transporter > 644727345conserved hypothetical proteinconserved hypothetical > protein &g

Re: [R] R eat my data

2010-05-25 Thread Changbin Du
ggest). I encounter this all the time. So try to be very thorough about > your search (the first place I'll look for is the line where R stop reading. > See if any thing strange there.) > > Also, changing "read.table" to "read.delim" often works. > &

Re: [R] R eat my data

2010-05-25 Thread Changbin Du
> your search (the first place I'll look for is the line where R stop reading. > See if any thing strange there.) > > Also, changing "read.table" to "read.delim" often works. > > ...Tao > > > > > > - Original Message > > From

Re: [R] R eat my data

2010-05-25 Thread Changbin Du
t.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Sincerely, Changbin -- Changbin Du DOE Joint Genome Institute Bldg 400 Rm 457 2800 Mitchell Dr Walnu

[R] how to Store loop output from a function

2010-05-26 Thread Changbin Du
valid.out 987 Fold 8 Dim of tree.pred 1128 2 length of valid.out 1128 Fold 9 Dim of tree.pred 1269 2 length of valid.out 1269 Fold 10 Dim of tree.pred 1410 2 length of valid.out 1410 Minsplit 5 Minbucket 5 10-cross validation is done! if use return, it will print on the screen, you still can no

Re: [R] how to Store loop output from a function

2010-05-26 Thread Changbin Du
ent to the global environment rather than the > function's environment. In general this seems risky though as your > function could be overwriting data in your main workspace without you > knowing it. > > HTH, > > Josh > > > > On Wed, May 26, 2010 at 9:26 AM, Changb
