To whom it may concern,
I need help figuring out the "macro" function in R: I need to plot
different data sets with a plotxyf function, and I want the title to be
different for each data set.
# get the data set
final.xyf<- xyf(data=as.matrix(my.final),
Y=classvec2classmat(final$outcome), xwe
erent files, if I want the output PDF file to have the same name
as each data set I feed to the function somfunc.
How should I do this?
Thanks!
--
Sincerely,
Changbin
--
Changbin Du
DOE Joint Genome Institute
Bldg 400 Rm 457
2800 Mitchell Dr
Walnut Creek, CA 94598
Phone: 925-927-2856
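One common way to give each plot a title and an output file named after the data set is to capture the argument name with deparse(substitute()). The following is only a sketch under assumptions: it uses base graphics instead of the poster's plotxyf call, and plot_named and its arguments are hypothetical names, not part of any package.

```r
# Hypothetical helper: the title and the PDF file name both come from the
# name of the object passed in, e.g. plot_named(cars) writes "cars.pdf".
plot_named <- function(dat, dir = tempdir()) {
  nm  <- deparse(substitute(dat))           # argument name as typed by the caller
  out <- file.path(dir, paste0(nm, ".pdf"))
  pdf(out)
  on.exit(dev.off())                        # close the device even on error
  plot(dat[[1]], dat[[2]],
       xlab = names(dat)[1], ylab = names(dat)[2],
       main = paste("Data set:", nm))
  invisible(out)                            # return the path of the file written
}

out_file <- plot_named(cars)                # 'cars' ships with R
```

The same idea should carry over to any plotting function that accepts a main argument.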
library(lattice)
hisfunc<- function (vari) {
histogram(~ vari|target, data=total, type="count", layout=c(1,3),
labels=TRUE, main="Histograms by target", col="skyblue")
}
hisfunc(total$acid)
Hi, guys,
I am using hisfunc to get histograms for different variables. For the
title of the histogr
Use package sos or use something like
> RSiteSearch('') to generate potential hits.
>
> HTH,
> Dennis
>
> On Tue, Apr 6, 2010 at 3:00 PM, Changbin Du wrote:
>
>> somfunc<- function (file) {
>>
>> aa_som<-scale(file)
>>
>>
>>
x <- sample(1:14, 319, replace = TRUE)
hist(x, freq = FALSE, xlab = '', ylab = "Percent of Total", col = "skyblue",
     labels = TRUE, right = FALSE, main = "Position of Hypothetical Protein")
Is there a way to round the labels to 2 decimal digits, so that, for example,
0.088 becomes 0.09?
Thanks!
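Since the labels argument of hist() also accepts a character vector, one option is to compute the densities first and format them before drawing. A minimal sketch, assuming the same breaks are used in both calls (bar_labels is just an illustrative name):

```r
set.seed(1)
x <- sample(1:14, 319, replace = TRUE)

# First pass: compute the histogram without drawing it
h <- hist(x, right = FALSE, plot = FALSE)

# Format each bar's density with two decimal places
bar_labels <- sprintf("%.2f", h$density)

# Second pass: draw with the same breaks and the formatted labels
hist(x, freq = FALSE, right = FALSE, breaks = h$breaks,
     xlab = "", ylab = "Percent of Total", col = "skyblue",
     labels = bar_labels, main = "Position of Hypothetical Protein")
```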
Hi, dear R community,
I have the following code to plot a conditional histogram. Is there a way to
label the bars with the percentage values in the conditional histogram?
h<- sample(1:14, 319, rep=T)
c<- sample(1:14, 608, rep=T)
n<- sample(1:14, 1140, rep=T)
vt<-c(h, c, n)
ta<-rep(c("h", "c", "n"), c
densities, converted to
> character, to the labels argument (instead of just labels=TRUE):
>
>
> hist(x, freq=F, xlab='',ylab="Percent of Total", col="skyblue",
> labels=as.character(round(foo$density,2)), right=FALSE,main="Position of
> Hypotheti
Hi, R community,
This morning I ran into the following problem several times when I tried to
attach the data set.
When I closed the current console and reopened the R console, the problem
disappeared. But as time passed, the problem occurred again.
Can anyone help me with this?
> attach(total)
I found the following message, which may help. I will try it.
Hi there,
I have just found that the ``attach'' function
can get you into trouble when called many times.
For example, you have a simulation routine called ``f()'',
in which you used ``attach'' and no corresponding ``detach''.
Then y
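Two usual ways around repeated attach() calls are to avoid attach() altogether with with(), or to pair every attach() with a detach(). A small sketch; the data values are made up, and mean_attached is a hypothetical function name:

```r
# Made-up stand-in for the poster's 'total' data frame
total <- data.frame(acid = c(1.2, 3.4, 2.2), target = c("a", "b", "a"))

# Option 1: no attach at all; with() evaluates the expression inside the data
mean_with <- with(total, mean(acid))

# Option 2: if attach() is used inside a function, pair it with detach()
mean_attached <- function(df) {
  attach(df)
  on.exit(detach(df))    # runs when the function exits, even after an error
  mean(acid)
}

before   <- search()
mean_fun <- mean_attached(total)
after    <- search()     # the search path is the same as before the call
```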
Thanks so much, Duncan! I appreciate it!
On Thu, Apr 8, 2010 at 5:03 AM, Duncan Murdoch wrote:
> On 07/04/2010 4:24 PM, Changbin Du wrote:
>
>> Hi, r-community,
>>
>> This morning, I MET the following problem several times when I try to
>> attach
>> the data
Hi, dear community,
I am building the following loop:
ww <- function(file) {
  lossw <- vector()
  for (x in seq(0.1, 0.9, by = 0.1)) {
    cat('xweight ', x, '\n')
    lossw[i] <- cross.validation(file, x)$avg  # i is not defined; this is the question
  }
  return(lossw)
}
My question is how to index the lossw[
Thanks so much, Marius! It works!
On Thu, Apr 15, 2010 at 1:35 PM, Marius 't Hart wrote:
> lossw[i] <- cross.validation(file, x)$avg
>
> change to:
>
> lossw <- append(lossw,cross.validation(file, x)$avg)
>
>
>
>
> Changbin Du wrote:
>
>> HI,
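Besides append(), the loop can iterate over positions and keep the weight values in a separate vector, which also allows the result to be preallocated. A sketch only: the cross.validation() below is a hypothetical stand-in, since the poster's function is not shown.

```r
# Hypothetical stand-in for the poster's cross.validation(); it only needs
# to return a list with an 'avg' element for the loop to run.
cross.validation <- function(file, x) list(avg = (x - 0.5)^2)

ww <- function(file) {
  weights <- seq(0.1, 0.9, by = 0.1)
  lossw   <- numeric(length(weights))     # preallocate the result vector
  for (i in seq_along(weights)) {         # i is the position, weights[i] the value
    cat('xweight ', weights[i], '\n')
    lossw[i] <- cross.validation(file, weights[i])$avg
  }
  names(lossw) <- weights
  lossw
}

losses <- ww(NULL)
```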
Hi, dear R community,
I am using the following code to grow and plot a tree:
# Classification Tree with rpart
library(rpart)
pdf(file="/home/cdu/changbin/dimer_tree.pdf")
# grow tree
fit.dimer <- rpart(outcome ~ ., method="class", data=p.dimer[,2:402])
plotcp(fit.dimer) # visualize cross-v
Hi, dear R community,
How can I label the bars with the percentage values in the conditional
histogram? Thanks so much!
h<- sample(1:14, 319, rep=T)
c<- sample(1:14, 608, rep=T)
n<- sample(1:14, 1140, rep=T)
vt<-c(h, c, n)
ta<-rep(c("h", "c", "n"), c(319, 608, 1140))
to<-data.frame(vt,ta)
library(
Hi, dear R community,
Does anyone know how to label the values in the conditional histogram?
Thanks so much!!!
h<- sample(1:14, 319, rep=T)
c<- sample(1:14, 608, rep=T)
n<- sample(1:14, 1140, rep=T)
vt<-c(h, c, n)
ta<-rep(c("h", "c", "n"), c(319, 608, 1140))
to<-data.frame(vt,ta)
library(lattic
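One possible approach to labelling the bars is a custom panel function that draws the usual histogram and then adds text at the top of each bar. This is a sketch, not necessarily the answer given on the list; it assumes type = "percent" and recomputes the bins per panel with hist().

```r
library(lattice)

set.seed(1)
to <- data.frame(
  vt = sample(1:14, 319 + 608 + 1140, replace = TRUE),
  ta = factor(rep(c("h", "c", "n"), c(319, 608, 1140)))
)

p <- histogram(~ vt | ta, data = to, type = "percent", layout = c(1, 3),
  panel = function(x, breaks, ...) {
    panel.histogram(x, breaks = breaks, ...)       # draw the bars as usual
    h   <- hist(x, breaks = breaks, plot = FALSE)  # same bins, for this panel only
    pct <- 100 * h$counts / sum(h$counts)          # percentage within the panel
    panel.text(h$mids, pct, labels = round(pct, 1), pos = 3, cex = 0.6)
  })
print(p)
```

Labels on the tallest bars may be clipped; extending ylim a little leaves room for them.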
Hi, dear R community,
Last Friday this code worked, but today it does not run:
> fit.dimer <- rpart(outcome ~., method="class", data=p.df)
Error in `[.data.frame`(frame, predictors) : undefined columns selected
Does anyone have comments or suggestions? Thanks in advance!
Yes, outcome is there.
On Wed, Apr 21, 2010 at 3:47 PM, Steve Lianoglou <
mailinglist.honey...@gmail.com> wrote:
> Hi,
>
> On Wed, Apr 21, 2010 at 5:20 PM, Changbin Du wrote:
> > HI, Dear R community,
> >
> > Last friday, I used the codes, it
hpaths()
[1] ".GlobalEnv" "/home/cdu/library/lattice"
[3] "/home/cdu/library/stats" "/home/cdu/library/graphics"
[5] "/home/cdu/library/grDevices" "/home/cdu/library/utils"
[7] "/home/cdu/library/datasets"
Hi, dear R community,
Does anyone know how to construct a decision tree with boosting? Is there any
tutorial I can read?
[[alternative HTML version deleted]]
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman
Hi, dear Greg,
I am new to the gbm package. Can a boosted decision tree be fitted with the
'gbm' package, or can 'gbm' only be used for regression?
If it can, do I need to combine the rpart and gbm commands?
Thanks so much!
need to combine rpart and
> gbm. Your best bet is to just load the package and run a demo
> > demo(bernoulli).
>
> --
> *From:* Changbin Du [mailto:changb...@gmail.com]
> *Sent:* Monday, April 26, 2010 9:48 AM
> *To:* r-help@r-project.org
> *Cc:* Ridgeway, Greg
Got it, thanks so much, Greg!
On Mon, Apr 26, 2010 at 11:02 AM, Ridgeway, Greg wrote:
> Y~X1+X2+X3 is the standard R formula syntax. It simply means "Y is
> predicted by X1 and X2 and X3".
>
> Greg
>
> ------
> *From:* Changbin Du [mail
istogram"? you need to
> use the one from the lattice package:
>
> rm(panel.histogram)
>
>
>
> On 24 April 2010 01:48, Changbin Du wrote:
> > Dear Dr. Sarkar,
> >
> > When I try to run the codes, I found the following problem:
> >
> >
Hi, dear Greg,
I have one question about the variable relative influence plot. The
following are the rel.inf values of 25 variables, but when I plot them, not all
the variables are labeled.
For example, num_genes, wg, hydrophob_per, etc. are not labeled on the y-axis. Also,
the variables are labeled vertically,
Hi, dear Andy,
I ran randomForest in R and got the following results for variable
importance:
What is the meaning of MeanDecreaseAccuracy and MeanDecreaseGini?
I found that they are raw values; they are not scaled to 1, right?
Which column is most similar to the variable rel.influence in Boosti
ure is at all comparable to the two in RF.
>
> Andy
>
> --
> *From:* Changbin Du [mailto:changb...@gmail.com]
> *Sent:* Wednesday, April 28, 2010 8:58 PM
> *To:* Liaw, Andy
> *Cc:* r-help@r-project.org
> *Subject:* variable importance in Random Forest
>
Hi, Andy,
In the randomForest diagnostic plot, three lines are drawn.
The help page for plot.randomForest describes only the error rate and MSE,
so I don't know which line is which.
Can you help me with that?
Thanks so much!
> x <- train[,c( 2:18, 20:21, 24, 27:31)]
> y <- train$out
>
> svm.pr <- svm(x, y, probability = TRUE, method="C-classification",
kernel="radial", cost=bestc, gamma=bestg, cross=10)
>
> pred <- predict(svm.pr, valid[,c( 2:18, 20:21, 24, 27:31)],
decision.values = TRUE, probability = TRUE)
> at
decision.values = TRUE,
> probability = TRUE)
> library(ROCR)
> svm.roc <- prediction(attributes(svm.pred)$decision.values, test.set)
> svm.auc <- performance(svm.roc, 'tpr', 'fpr')
> plot(svm.auc)
>
>
> On Thu, Apr 29, 2010 at 4:17 PM, C
ming that it is in the first column. I think this is a
> typo in my example :)
>
> On Thu, Apr 29, 2010 at 5:13 PM, Changbin Du wrote:
> >
> >
> > HI, Saeed,
> >
> > Thanks so much for the help, I run your code and found the following
> > problem, do
Hi, dear Greg,
I have one question. If I boost decision trees with distribution =
"bernoulli" and then use predict.gbm to predict the fitted values for a
new data set:
predict(gbm1, newdata, n.trees=best..iteration, type="response")
If type="response" then gbm converts back to the same scal
97841
172 1.0192415342
176 1.3536947861
215 0.9405960067
222 0.9792365851
255 1.2270351367
267 1.7377883390
279 1.1427732884
282 1.2548137295
292 1.1336236065
320 0.4953096976
333 1.0867080386
338 2.5335080606
[ind == 1,], control =
cforest_unbiased(mtry = ncol(BreastCancer)-2))
x.cf.pred <- predict(x.cf, newdata=BreastCancer[ind == 2,])
x.cf.prob <- 1- unlist(treeresponse(x.cf, BreastCancer[ind == 2,]),
use.names=F)[seq(1,nrow(BreastCancer[ind == 2,])*2,2)]
Hi, dear Greg,
Sorry to bother you again.
I have several questions about the 'gbm' package.
If train.fraction is less than 1 (e.g. 0.5), then the *first* 50% will
be used to fit the model, and the other 50% can be used to estimate the
performance.
If bag.fraction is 0.5, then gbm uses the *rando
Thanks, it really helps!
On Sat, May 1, 2010 at 2:34 PM, Ridgeway, Greg wrote:
> See Friedman's paper "Stochastic Gradient Boosting"
>
> Greg
>
> ------
> *From*: Changbin Du
> *To*: Ridgeway, Greg
> *Cc*: r-help@r-project
at least once? If some observations are not selected, how is the
training error calculated?
Thanks!
Hi, dear R community,
I am using nnet to fit a neural network model for classification on a
binary target variable (0, 1). I am using the following code:
nnet.fit<-nnet(as.factor(out) ~ ., data=train, size=5, rang=0.3,
decay=5e-4, maxit=500)
I want to know what is the activation function f
> nnet.fit<-nnet(as.factor(out) ~ ., data=all_h, size=5, rang=0.3,
decay=5e-4, maxit=500) # model fitting
> summary(nnet.fit)
a 23-5-1 network with 126 weights
options were - entropy fitting decay=5e-04
Hi, guys,
I cannot find the manual that describes how the model is built. Is there a
more
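The "23-5-1 network with 126 weights" line in the summary() output can be checked by hand. A small sketch, assuming a single hidden layer and no skip-layer connections; nnet_weights is a hypothetical helper, not an nnet function.

```r
# Weights in a single-hidden-layer network without skip-layer connections:
# each hidden unit has one weight per input plus a bias, and each output
# unit has one weight per hidden unit plus a bias.
nnet_weights <- function(n_in, n_hidden, n_out) {
  (n_in + 1) * n_hidden + (n_hidden + 1) * n_out
}

nnet_weights(23, 5, 1)   # (23 + 1) * 5 + (5 + 1) * 1 = 126
```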
Hi, guys,
I am trying to use the AdaBoost.M1 algorithm to combine three models.
I found that the sum of the weights for the models is not equal to one.
How should I deal with this?
Thanks, any responses or suggestions are appreciated!
Hi, guys,
I used the following code to run SVM and get predictions on the new data set hh.
dim(all_h)
[1] 2034 24
dim(hh)# it contains all the variables besides the variables in all_h
data set.
[1] 640 415
require(e1071)
svm.tune<-tune(svm, as.factor(out) ~ ., data=all_h,
ranges=list(gamma
Hi, dear community,
How can I distinguish a bimodal distribution from a unimodal distribution? I have
only the histograms of 3500 data sets.
Thanks!
Hi, dear R community,
I am using xtable to create a table, but how can I view the table?
The following is the code I used:
> data(tli)
> tli.table <- xtable(tli[1:10, ])
> digits(tli.table)[c(2, 6)] <- 0
> print(tli.table, floating = FALSE)
% latex table generated in R 2.11.0 by xtable
Hi, dear community,
My data set logit.pred contains 2 columns and 1400 rows. When I try to use
the first column, something strange happens. Where does the $ come from? Thanks so much!
> dim(logit.pred)
[1] 1400    2
> head(logit.pred)
tree.pred valid.out
754 0.6550606 1
1080 0.6353524
Hi, dear community,
I am using linear discriminant analysis to build a model and make new
predictions:
> dim(train) #training data
[1] 1272 22
> dim(valid) # validation data
[1] 140 22
lda.fit <- lda(out ~ ., data=train, na.action="na.omit", CV=TRUE) # model
fitting of linear discriminan
Thanks so much to all of you for your help! I was away on vacation for 2 days and could
not reply to your emails. Yes, CV=FALSE works.
Thanks again!
On Sun, Jul 4, 2010 at 2:47 AM, Peter Ehlers wrote:
> On 2010-07-03 21:33, Changbin Du wrote:
>
>> HI, Dear community,
>>
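The reason CV=FALSE works is that lda() with CV = TRUE returns leave-one-out results rather than a fitted model, so there is nothing for predict() to use. A sketch with the built-in iris data standing in for the poster's train and valid sets:

```r
library(MASS)

set.seed(1)
idx   <- sample(nrow(iris), 100)
train <- iris[idx, ]      # stand-in for the poster's training data
valid <- iris[-idx, ]     # stand-in for the poster's validation data

# CV = TRUE: a plain list with leave-one-out classes and posteriors
cv_res <- lda(Species ~ ., data = train, CV = TRUE)
cv_acc <- mean(cv_res$class == train$Species)

# CV = FALSE (the default): an "lda" object that predict() accepts
lda.fit <- lda(Species ~ ., data = train)
pred    <- predict(lda.fit, newdata = valid)
```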
Dear community,
I have a file named ca_boost_feature.txt:
Feature selection (Boosting:0.0025,5)!
H.2.C C.1.D C.3.R E.0.N C.2.S C.0.G H.3.G
log file: ep
I want to use the second line of this file. How can I read it into R?
varr<-read.table("/home/cdu/operon/carbonic/ca_boost_feature.txt", sep
>
> Hope this helps.
>- Phil Spector
> Statistical Computing Facility
> Department of Statistics
> UC Berkeley
>
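One way to read only the second line is scan() with skip and nlines. This sketch recreates the three lines shown in the question in a temporary file; the real separator (space or tab) is an assumption, although scan() splits on either by default.

```r
# Recreate a small file with the layout shown in the question
f <- tempfile(fileext = ".txt")
writeLines(c("Feature selection (Boosting:0.0025,5)!",
             "H.2.C C.1.D C.3.R E.0.N C.2.S C.0.G H.3.G",
             "log file: ep"), f)

# Skip the first line, read exactly one line, split on whitespace
varr <- scan(f, what = character(), skip = 1, nlines = 1, quiet = TRUE)
varr
```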
Hi, dear R community,
My data set has 2409 variables; the last one is the response variable. I have
used nnet after feature selection and it worked. This time, I am using
nnet to fit a model without feature selection, and I got the following error
message:
> dim(train)
[1] 1827 2409
nnet.fit<
Thanks, Claudia!
On Tue, Oct 12, 2010 at 9:54 AM, Claudia Beleites wrote:
> I'm not sure how much fun it is to fit > 7000 weights with 1800 samples,
> but you can tell nnet to allow more weights with MaxNWts, see ?nnet
>
>
>
> On 10/12/2010 06:45 PM, Changbin Du wrote:
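As a small illustration of the MaxNWts suggestion: nnet() refuses to fit when the number of weights exceeds MaxNWts (1000 by default), and raising the limit lets the fit proceed. The data below are random and the sizes are hypothetical, chosen only so that the default limit is exceeded.

```r
library(nnet)

set.seed(1)
p <- 300                                   # hypothetical number of predictors
train <- as.data.frame(matrix(rnorm(50 * p), nrow = 50))
train$out <- factor(rep(0:1, 25))

# size = 5 needs (300 + 1) * 5 + (5 + 1) = 1511 weights, which is above the
# default MaxNWts of 1000, so the limit is raised explicitly.
nnet.fit <- nnet(out ~ ., data = train, size = 5, MaxNWts = 2000,
                 maxit = 20, trace = FALSE)
length(nnet.fit$wts)
```

Whether a model with that many weights is sensible for the sample size is a separate question, as the reply above points out.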
I think you should use 10-fold cross-validation to judge performance on
the validation parts. What you did will certainly overfit, because you tested on
the same training set that was used for model building.
On Sat, Oct 23, 2010 at 6:39 AM, mxkuhn wrote:
> I think the issue is that you really can'
Hi, dear R community,
I am using the following code to plot. The lines() call runs without error, but
the line is not drawn on the previous plot and does not show up.
Why is that?
# specify the data for missense simulation
x <- seq(0,10, by=1)
y <- c(0.952, 0.947, 0.943, 0.941, 0.933, 0.932, 0.939, 0
red", yaxt="n", lty=3, xlab="",
> ylab="", ylim = c(min(c(y, z)), max(c(y, z
>
> # add x vs. fp
> lines(x, z, type="b", pch=22, col="blue", lty=2)
>
>
> Cheers,
>
> Josh
>
> On Mon, Oct 25, 2010 at 2:38 PM,
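The ylim suggestion in the reply can be seen in a short self-contained example. It uses only the first seven values of each series, because the rest are cut off in this archive; the plotting symbols and line types are assumptions.

```r
# First seven values of each series as quoted in this archive
x <- 0:6
y <- c(0.952, 0.947, 0.943, 0.941, 0.933, 0.932, 0.939)
z <- c(0.068, 0.082, 0.080, 0.099, 0.108, 0.107, 0.101)

# The axis limits are fixed by the first plot() call, so they must cover
# both series; otherwise lines() draws outside the visible region.
yl <- range(c(y, z))
plot(x, y, type = "b", pch = 21, col = "red", lty = 3,
     xlab = "", ylab = "", ylim = yl)
lines(x, z, type = "b", pch = 22, col = "blue", lty = 2)
```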
Hi, dear community,
Several days ago I received an email about an online SVM course in R, and I
am trying to find it. Can someone forward the information to me?
Thanks!
Sorry, I was a little numb at that time.
On Thu, Oct 28, 2010 at 10:45 AM, David Winsemius wrote:
> In a sense you deserve what you have asked for. You have asked thousands of
> people to send you a copy when you could have instead searched the archives
> yourself and gotten a much quicker ans
Hi, dear R community,
I have the following code to calculate the cumulative coverage. I want to
save the output in a vector. How can I do this?
test<-seq(10, 342, by=2)
#cover is a vector
cover_per<-function (cover) {
for (i in min(cover):max(cover)) {print(100*sum(ifelse(cover >= i, 1,
0))/lengt
min(data):max(data)) {
x<-(100*sum(ifelse(data >= i, 1, 0))/length(data))
output<-c(output, x)
}
return(output)
}
result<-cover_per(test)
On Sun, Oct 31, 2010 at 5:46 PM, David Winsemius wrote:
>
> On Oct 31,
> for (i in min(cover):max(cover)) {
> output[j] <- 100*sum(ifelse(cover >= i, 1, 0))/length(cover)
> j <- j + 1
> }
> return(output)
> }
>
> Josh
>
> On 11/1/2010 2:24 AM, Changbin Du wrote:
> > Thanks Joshua! Yes, i is not going up sequentially by 1, as i here is
> the
> > raw number of reads for each DNA base. Thanks so much for the great help!
> >
> >
> > On Sun, Oct 31, 2010 at 6:03 PM, Joshua Wiley >w
Hi, dear R community,
I have a data set like the one below, and I want to calculate the
cumulative coverage. The following code works for a small data set (#rows =
100), but when I feed it the whole data set, it is still running after 24 hours.
Can someone give some suggestions for long vectors?
id
647
> 5  Contig79:5   17  50.0
> 6  Contig79:6   20  58.82353
> 7  Contig79:7   25  73.52941
> 8  Contig79:8   27  79.41176
> 9  Contig79:9   32  94.11765
> 10 Contig79:10  33  97.05882
> 11 Contig79:11  34 100.0
>
>
> On Thu, Nov 4, 2010 at 1
!
On Thu, Nov 4, 2010 at 9:12 AM, Henrique Dallazuanna wrote:
> Try this:
>
> rev(100 * cumsum(matt$reads > 1) / length(matt$reads) )
>
> On Thu, Nov 4, 2010 at 1:46 PM, Changbin Du wrote:
>
>> HI, Dear R community,
>>
>> I have one data set like t
Thanks Martin, I will try this.
On Thu, Nov 4, 2010 at 10:06 AM, Martin Morgan wrote:
> On 11/04/2010 09:45 AM, Changbin Du wrote:
> > Thanks, Jim!
> >
> > This is not what I want, What I want is calculate the percentage of
> reads
> > bigger or equal to that re
pector
> Statistical Computing Facility
> Department of Statistics
> UC Berkeley
> spec...@stat.berkeley.edu
>
>
>
>
>
>
>
umeric(l)
+ for(i in 1:l)output[i] = sum(data >= data[i])
+ 100 * output / l
+ }
> result3<-cover_per_2(cover)
On Thu, Nov 4, 2010 at 10:37 AM, Changbin Du wrote:
> Thanks Phil, that is great! I WILL try this and let you know how it goes.
>
>
>
>
> On Thu, N
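For long vectors, the "percentage of reads greater than or equal to each value" can be computed without any loop by using ranks. This is a sketch of one alternative, not one of the solutions posted in the thread; the function names and the read counts are made up.

```r
# Percentage of reads that are >= each element, without an explicit loop.
# rank(..., ties.method = "min") is 1 + the number of strictly smaller values,
# so n - rank + 1 counts the values that are greater than or equal.
cover_per_fast <- function(data) {
  n <- length(data)
  100 * (n - rank(data, ties.method = "min") + 1) / n
}

# Reference version with the same definition as sum(data >= data[i])
cover_per_loop <- function(data) {
  sapply(data, function(v) 100 * sum(data >= v) / length(data))
}

set.seed(1)
reads <- sample(10:342, 1000, replace = TRUE)   # made-up read counts
head(cover_per_fast(reads))
```

The loop compares every element with every other, so its cost grows quadratically with the vector length, while ranking only needs a sort.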
Thanks Martin! I will try it and let you guys know how it goes.
On Fri, Nov 5, 2010 at 9:42 AM, Martin Morgan wrote:
> On 11/05/2010 09:13 AM, Changbin Du wrote:
> > HI, Phil,
> >
> > I used the following codes and run it overnight for 15 hours, this
> morning,
0.02
> > identical(v_3,v)
> [1] TRUE
>
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
>
> > -Original Message-
> > From: r-help-boun...@r-project.org
> > [mailto:r-help-boun...@r-project.org] On Behalf Of Changbin Du
> > Sent: Frida
Hi, dear R community,
I used the following code this morning, but this afternoon I got the
following errors:
> x <- seq(0,10, by=1)
> y <- c(0.952, 0.947, 0.943, 0.941, 0.933, 0.932, 0.939, 0.932, 0.924,
0.918, 0.920) # missense
> z <- c(0.068, 0.082, 0.080, 0.099, 0.108, 0.107, 0.101, 0.1
er.nn))
Error in unlist(X, recursive = FALSE, use.names = FALSE) :
  negative length vectors are not allowed
Thanks so much!
Thanks Jim! I will split the data and run again.
On Thu, Dec 16, 2010 at 2:57 PM, Jim Holtman wrote:
> I think that your object exceeds the limit of 2^31 elements.
>
> Sent from my iPad
>
> On Dec 16, 2010, at 17:44, Changbin Du wrote:
>
> > Dear R community,
cv.fold<-function(i, size=3, rang=0.3){
cat('Fold ', i, '\n')
out.fold.c <-((i-1)*c.each.part +1):(i*c.each.part)
out.fold.n <-((i-1)*n.each.part +1):(i*n.each.part)
train.cv <- n.cc[-out.fold.c, c(2:2401, 2417)]
train.nv <- n.nn[-out.fold.n, c(2:2401, 2417)]
Thanks so much, David!
The following code works!
result.fun <- lapply(1:2, function(i) cv.fold(i, 3, 0.3))
On Tue, Sep 7, 2010 at 3:35 PM, David Winsemius wrote:
>
> On Sep 7, 2010, at 5:43 PM, Changbin Du wrote:
>
> cv.fold<-function(i, size=3, rang=0.3){
>>
use (as.factor(target) ~., data =your data, ...)
On Tue, May 4, 2010 at 12:07 PM, pdb wrote:
>
> Hi,
>
> I'm experimenting with random forests and want to perform a binary
> classification task.
> I've tried some of the sample codes in the help files and things run, but I
> get a message to
> #sort the data by predicted probability
> b.order<-bo.id.pred[(order(-predict)),]
> b.order[1:20,]
     gene_id    predict
43 637882902 0.07823997
53 638101634 0.66256490
61 639084581 0.08587504
41 637832824 0.02461066
25 637261662 0.11613879
22 637240022 0.06350477
62 639084582 0.02238538
63 639
2658
> 22 637240022 0.06350477
> 44 637943079 0.04532625
> 24 637261661 0.02561841
> 41 637832824 0.02461066
> 62 639084582 0.02238538
> 13 637047086 0.01493464
> 49 638072100 0.01391633
> 74 639787397 0.01283783
> >
> I would check your data. Do 'str
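One common cause of this symptom, assumed here rather than confirmed by the thread, is that the column being sorted is stored as a factor or character instead of a number. A sketch that reuses three gene_id/predict pairs from the output above, with predict deliberately stored as a factor:

```r
# Three rows from the output above; 'predict' is deliberately a factor here
bo.id.pred <- data.frame(
  gene_id = c(637882902, 638101634, 639084581),
  predict = factor(c("0.07823997", "0.66256490", "0.08587504"))
)
str(bo.id.pred)                      # shows that 'predict' is a factor

# Convert via character so the numeric values (not the level codes) are used
bo.id.pred$predict <- as.numeric(as.character(bo.id.pred$predict))

# Sort by predicted probability, largest first
b.order <- bo.id.pred[order(-bo.id.pred$predict), ]
b.order
```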
- Phil Spector
> Statistical Computing Facility
> Department of Statistics
> UC Berkeley
> spec...@stat.berkeley.edu
>
variable. I trained the model svm.fit
on the training data and want to predict out in the new data set hh.
Why are the probabilities 0 for both class 1 and class 0?
Thanks Steve!
I will try it and let you know how it goes.
On Wed, May 5, 2010 at 6:07 PM, Steve Lianoglou <
mailinglist.honey...@gmail.com> wrote:
> Hi Changbin,
>
> On Wed, May 5, 2010 at 6:46 PM, Changbin Du wrote:
> > svm.fit<-svm(as.factor(out) ~ ., data=all_h,
Thanks, Steve and David!
svm.fit<-svm(as.factor(out) ~ ., data=all_h, method="C-classification",
kernel="radial", cost=bestc, gamma=bestg, cross=10, probability=TRUE)
It works this time!
On Wed, May 5, 2010 at 6:24 PM, Changbin Du wrote:
> Thanks Steve!
>
>
Hi, dear R community,
How can I extract the variables actually used in tree construction? I want to
extract these variables and combine them with other variables as my features in the
next model-building step.
> printcp(fit.dimer)
Classification tree:
rpart(formula = outcome ~ ., data = p_df, method = "class")
V
> fit.dimer <- rpart(as.factor(out) ~ ., method="class", data=p_df)
>
> fit.dimer$frame[, "var"]
[1] NE WC TA WG WD WW WC
[11]CT FC YG QT
[21] NW DP DY SK
[31]
401 Levels: AA AC AD AE AF AG AH AI AK AL AM AN AP AQ AR AS AT A
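The frame$var column shown above lists the split variable of every node, with terminal nodes marked "<leaf>", so the variables actually used are the unique non-leaf entries. A sketch using the kyphosis data that ships with rpart as a stand-in for the poster's p_df:

```r
library(rpart)

# 'kyphosis' ships with rpart and stands in for the poster's p_df
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, method = "class")

# frame$var holds the split variable of every node; leaves are "<leaf>"
vars_used <- setdiff(unique(as.character(fit$frame$var)), "<leaf>")
vars_used
```

The resulting character vector can be used directly to subset columns, for example kyphosis[, vars_used, drop = FALSE].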
Thanks so much, David!
On Wed, May 12, 2010 at 2:52 PM, David Winsemius wrote:
>
> On May 12, 2010, at 5:31 PM, Changbin Du wrote:
>
> fit.dimer <- rpart(as.factor(out) ~ ., method="class", data=p_df)
>>>
>>> fit.dimer$frame[, "var"]
is this random decision tree, I don't know whether there is any package that can run it.
If you know, please let me know.
On Fri, May 14, 2010 at 10:23 AM, Shi, Tao wrote:
> Hi list,
>
> Is there a way in "rpart" to force the variables only used once when doing
> the splits?
>
> This is how the question cam
gave me errors.
Can someone help me with this?
Thanks, David!
Yes, I found it was just as you said. It works now after changing to numeric.
On Tue, May 18, 2010 at 1:53 PM, David Winsemius wrote:
>
> On May 18, 2010, at 4:32 PM, Changbin Du wrote:
>
> head(en.id.pr)
>>>
>>valid.gene_id b.pred rf.pred svm.pred
plot(svm.auc, col=2, main="ROC curves comparing classification performance\n
of six machine learning models")
legend(0.5, 0.6, c(ns, nb, nr, nt, nl,ne), 2:6, 9) # Draw a legend.
plot(bo.auc, col=3, add=T) # add=TRUE draws on the existing chart
plot(rf.auc, col=4, add=T)
plot(tree.auc, col=5, add=T
- Phil Spector
> Statistical Computing Facility
> Department of Statistics
> UC Berkeley
> spec...@stat.berkeley.edu
>
>
>
> On Wed, 19 May
Hi, dear R community,
I want to know how to select the optimal decision threshold from the ROC
curve. Which threshold gives the highest accuracy?
Thanks!
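One simple way to find the accuracy-maximising cut-off is to try every observed score as a threshold and compute the accuracy for each. A base-R sketch with made-up scores and labels; it is not tied to any particular ROC package.

```r
# Made-up scores and true classes (1 = positive)
scores <- c(0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9)
labels <- c(0,   0,   1,   0,   0,   1,   1,   1)

# Try every observed score as a cut-off: predict 1 when score >= threshold
thresholds <- sort(unique(scores))
accuracy   <- sapply(thresholds, function(t) mean((scores >= t) == labels))

# Keep the threshold with the highest accuracy (first one in case of ties)
best <- data.frame(threshold = thresholds, accuracy = accuracy)[which.max(accuracy), ]
best
```

Accuracy depends on the class balance, so a threshold chosen this way on one data set may not carry over to data with a different proportion of positives.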
F, fill=T)
> dim(gene_name)
[1] 1068    3
ount.fields("id_name_gh5.txt"))
> Regards Mohamed
>
>
>
>
> Changbin Du wrote:
>
> HI, Dear R community,
>>
>> My original file has 1932 lines, but when I read into R, it changed to
>> 1068
>> lines, how comes?
>>
>>
>>
:42 AM, Changbin Du wrote:
>
> HI, Dear R community,
>>
>> My original file has 1932 lines, but when I read into R, it changed to
>> 1068
>> lines, how comes?
>>
>
> We are being asked to investigate this quest, how?
>
> Have you looked at the last line to see
c...@nuuk:~/operon$ grep '^#' id_name_gh5.txt
c...@nuuk:~/operon$
No lines start with #.
On Tue, May 25, 2010 at 9:11 AM, Barry Rowlingson <
b.rowling...@lancaster.ac.uk> wrote:
> On Tue, May 25, 2010 at 4:42 PM, Changbin Du wrote:
> > HI, Dear R community,
> >
o
>
> tail(gene_name, 2)
>
> Come on, man, show some initiative.
>
> On May 25, 2010, at 12:12 PM, Changbin Du wrote:
>
> 644727344 ABC-2 type transporter ABC-2 type transporter
> 644727345 conserved hypothetical protein conserved hypothetical
> protein
ggest). I encounter this all the time. So try to be very thorough about
> your search (the first place I'll look for is the line where R stop reading.
> See if any thing strange there.)
>
> Also, changing "read.table" to "read.delim" often works.
>
> your search (the first place I'll look for is the line where R stop reading.
> See if any thing strange there.)
>
> Also, changing "read.table" to "read.delim" often works.
>
> ...Tao
>
>
>
>
>
> - Original Message
> > From
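A frequent reason for read.table() returning fewer rows than the file has lines is an unmatched quote character, such as an apostrophe in a gene name, because read.table() treats both " and ' as quotes by default while read.delim() only treats " as one. Whether that is the cause here is an assumption. The sketch below builds a small tab-separated file (the two names containing apostrophes are made up) and reads it with quoting and comment handling switched off.

```r
# Small tab-separated file; the names containing apostrophes are made up
tmp <- tempfile(fileext = ".txt")
writeLines(c("644727344\tABC-2 type transporter",
             "644727345\t5'-nucleotidase",
             "644727346\tconserved hypothetical protein",
             "644727347\t3'-5' exonuclease"), tmp)

n_lines <- length(readLines(tmp))

# Number of fields per line when no character is treated as a quote
fields <- count.fields(tmp, sep = "\t", quote = "", comment.char = "")

# Read with quoting and comment handling switched off
gene_name <- read.table(tmp, sep = "\t", quote = "", comment.char = "",
                        header = FALSE, stringsAsFactors = FALSE)
dim(gene_name)
```

Comparing n_lines with nrow(gene_name), and looking for lines where fields differs from the expected count, usually points to the offending line.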
valid.out 987
Fold 8
Dim of tree.pred 1128 2 length of valid.out 1128
Fold 9
Dim of tree.pred 1269 2 length of valid.out 1269
Fold 10
Dim of tree.pred 1410 2 length of valid.out 1410
Minsplit 5 Minbucket 5
10-cross validation is done!
If I use return, it will print on the screen, you still can no
ent to the global environment rather than the
> function's environment. In general this seems risky though as your
> function could be overwriting data in your main workspace without you
> knowing it.
>
> HTH,
>
> Josh
>
>
>
> On Wed, May 26, 2010 at 9:26 AM, Changb