This topic concerns reducing the number of independent variables. As we know, many
methods can do this; however, for pre-processing the independent variables, a
method like the sentence below can remove many variables. How can I
understand it?
What is a significant correlation at the 5% level, and what is the criterion?
Hello,
I am learning the caret package, and I want to use RFE to reduce the number of
features. I want to use RFE coupled with Random Forest (RFE+RF) to complete this
task. As we know, there are a number of pre-defined sets of functions, like
random forest (rfFuncs); however, I want to tune the parameters (mtry
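A rough sketch of tuning mtry inside caret's RFE, not the list's reply; it assumes a reasonably recent caret and placeholder objects x (predictor data frame) and y (response):
library(caret)
library(randomForest)
set.seed(1)
# plain RFE with the pre-defined random-forest functions (default mtry)
ctrl <- rfeControl(functions = rfFuncs, method = "cv", number = 5)
prof <- rfe(x, y, sizes = c(5, 10, 20, 50), rfeControl = ctrl)
# to tune mtry inside the selection loop, caretFuncs fits each subset via train()
# and passes extra arguments such as tuneGrid through "..."
ctrl2 <- rfeControl(functions = caretFuncs, method = "cv", number = 5)
prof2 <- rfe(x, y, sizes = c(5, 10, 20, 50), rfeControl = ctrl2,
             method = "rf", tuneGrid = data.frame(mtry = c(2, 4, 8)))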
I want to get a plot like this one:
http://n4.nabble.com/file/n1839303/%25E9%25A2%2591%25E7%258E%2587%25E5%2588%2586%25E5%25B8%2583%25E5%259B%25BE%25E6%25A0%2587%25E5%2587%2586.jpg
%E9%A2%91%E7%8E%87%E5%88%86%E5%B8%83%E5%9B%BE%E6%A0%87%E5%87%86.jpg
not this one: http://n4.nabble.com/file/n1839303/R.jp
Thanks for your help; I will have a try.
Thank you, I will try the barplot function.
Thanks for your reply. I just want to get a figure like y1.jpg using the
data from y1.txt.
From the figure I want to obtain the split point, as in y1.jpg, and
take 2.5 as the split point. That figure was drawn by someone else; I
just want to draw it with R, but I cannot, so I hope friends can help.
Thanks, it is OK now!
Usage
data(gasoline)
Format
A data frame with 60 observations on the following 2 variables.
octane
a numeric vector. The octane number.
NIR
a matrix with 401 columns. The NIR spectrum
and when I look at the gasoline data I see the column names below:
NIR.1686 nm NIR.1688 nm NIR.1690 nm NIR.1692 nm NIR.1694 nm NIR.1
Steve Lianoglou-6 wrote:
>
> Hi,
>
> On Oct 22, 2009, at 2:35 PM, bbslover wrote:
>
>> Usage
>> data(gasoline)
>> Format
>> A data frame with 60 observations on the following 2 variables.
>> octane
>> a numeric vector. The octane number.
>
I have read that one. I want to apply this method to my data, but I do not
know how to get my data into R.
James W. MacDonald wrote:
>
>
>
> bbslover wrote:
>>
>>
>> Steve Lianoglou-6 wrote:
>>> Hi,
>>>
>>> On Oct 22, 2009, at 2
I have tried it. paste() can add the wanted letters, but it cannot paste the column
names. Maybe I should study it harder.
Don MacQueen wrote:
>
> At 4:57 AM -0700 10/23/09, bbslover wrote:
>>Steve Lianoglou-6 wrote:
>>>
>>> Hi,
>>>
>>> On Oct 22, 2
Thank you, Don MacQueen, I will try it.
Don MacQueen wrote:
>
> At 4:57 AM -0700 10/23/09, bbslover wrote:
>>Steve Lianoglou-6 wrote:
>>>
>>> Hi,
>>>
>>> On Oct 22, 2009, at 2:35 PM, bbslover wrote:
>>>
>>>> Usage
>>
There are many R packages: yesterday there were 2031, but today 2033. How can I
know which packages were added or updated?
That is impressive. Thank you, Gabor Grothendieck, I got it.
Gabor Grothendieck wrote:
>
> Google for
> CRANberries aggregates
> and check first hit.
>
> On Sat, Oct 24, 2009 at 4:44 AM, bbslover wrote:
>>
>> there are many R packages, yesterday, 2031 but tod
>> str(df)
> 'data.frame': 5 obs. of 2 variables:
> $ x : int 1 2 3 4 5
> $ matrix.rnorm.10...5..2.: AsIs [1:5, 1:2] 0.187703.... -0.66264
> -0.82334 -0.37255 -0.28700 ...
>>
>
> Regards
> Petr
>
>
On my disk there is a file C:/a.csv that I want to read into R. Importantly, when
I use x=read.csv("C:/a.csv"), x is a data.frame, and I want it to
become a matrix. How can I do that?
Thank you!
Thank you for your help; it is a good way.
Steven Kang wrote:
>
> can try
>
> matrix.x <- as.matrix(x)
>
> On Mon, Nov 2, 2009 at 8:38 PM, bbslover wrote:
>
>>
>> In my disk C:/ have a a.csv file, I want to read it to R, importantly,
>> when
>
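Putting the reply together with the question, a minimal sketch (the path is the one from the question; data.matrix is an extra alternative, not part of the original answer):
x  <- read.csv("C:/a.csv")   # read.csv always returns a data.frame
m  <- as.matrix(x)           # numeric matrix if every column is numeric, otherwise character
m2 <- data.matrix(x)         # alternative: coerces every column (factor levels become codes) to numbers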
Hello,
My problem is this: after pre-processing the variables, 160 independent
variables and one dependent y remain. When I use the PLS method with 10
components, a good r2 can be obtained, but I do not know how to express
my equation with fewer variables and y. It would be better to
weight of
> each variable in the PC.
>
> HTH
>
> Rick
>
> --
> From: "bbslover"
> Sent: Wednesday, November 04, 2009 10:23 AM
> To:
> Subject: [R] variable selectin---reduce the numbers of initial variable
>
>>
>> http://www2.research.att.com/~volinsky/bma.html
>>
>> But of course, you must do what you think is better for your problem.
>> By the way what is the dimension of your problem?
>>
>> HTH,
>>
>> Rick
>> --
e.g.
a=
  a b c d e
1 1 1 3 1 1
2 1 2 3 4 5
3 1 3 3 8 3
4 1 4 3 3 5
5 1 1 3 1 1
I want to delete column a and column c, because they have the same value in
every row; then I want to get this data.frame:
b=
  b d e
1 1 1 1
2 2 4 5
3 3 8 3
4 4 3 5
5 1 1 1
My program is below:
a=c(1,2,1,1,1); b=c(1,2,3,4,1); c=c(3,4,3,3,3); d=c(1,2,3,5,1);
e=c(1,5,3,5,1)
data.f=data.frame(a,b,c,d,e)
origin.data<-data.f
cor.matrix<-cor(origin.data)
origin.cor<-cor.matrix
m<-0
# loop over the upper triangle of the correlation matrix
for(i in 1:(dim(cor.matrix)[1]-1))
{
for(j in (i+1):(dim(cor.matrix)[2]))
{
if (cor.matrix[i,j]>=0.95)   # the original post is truncated here; 0.95 is the cutoff from the reply below
  m<-m+1
}
}
r.matrix)[1])+1
>
> for column ids use modulus instead of integer divison.
>
> (which(cor.matrix >=0.95) %% dim(cor.matrix)[1])
>
> There are probably better ways than this.
>
> Nikhil
>
> but probably a better way to do this would be
>
> On 6 Nov 200
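For the two steps discussed in this thread (dropping columns that never vary, and locating the pairs whose correlation reaches the 0.95 cutoff mentioned in the reply), a short sketch using the example data from above:
a <- data.frame(a = c(1,1,1,1,1), b = c(1,2,3,4,1), c = c(3,3,3,3,3),
                d = c(1,4,8,3,1), e = c(1,5,3,5,1))
# 1. drop the columns that have the same value in every row (columns a and c here)
keep <- sapply(a, function(col) length(unique(col)) > 1)
b <- a[, keep]
# 2. on the remaining columns, list the pairs whose correlation is >= 0.95
cm <- cor(b)
which(abs(cm) >= 0.95 & upper.tri(cm), arr.ind = TRUE)   # row/column indices of the correlated pairs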
rm(list=ls())
yx.df<-read.csv("c:/MK-2-72.csv",sep=',',header=T,dec='.')
dim(yx.df)
#get X matrix
y<-yx.df[,1]
x<-yx.df[,2:643]
#convert to matrix
mat<-as.matrix(x)
#get row number
rownum<-nrow(mat)
#remove the constant parameters (columns whose values all equal the first entry)
mat1<-mat[,apply(mat,2,function(.col)!(all(.col[1]==.col[2:rownum])))]
OK, I understand what you mean; maybe PLS is better for my aim, but I have tried
that and it was also bad. The main question for me is how to select fewer variables
from the independents to fit the dependent. GA may be a good way, but I have not
learned it well.
Ben Bolker wrote:
>
> bbslover yeah.net&g
My code below is not right:
rm(list=ls())
#define data.frame
a=c(1,2,3,5,6); b=c(1,2,3,4,7); c=c(1,2,3,4,8); d=c(1,2,3,5,1);
e=c(1,2,3,5,7)
data.f=data.frame(a,b,c,d,e)
#backup data.f
origin.data<-data.f
#get correlation matrix
cor.matrix<-cor(origin.data)
#backup correlation matrix
Dear all,
I am learning the subselect package in R. Now I want to use a GA to select
some potent variables, but some questions puzzle me.
What I want to solve is this: I have one dependent column y and 219
columns of independent x. A total of 72 observations are contained in the
dataset. I want to
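The post is truncated here. As a rough, unverified sketch of what a subselect-based GA search could look like (the lmHmat()/genetic() interface and the "tau2" criterion are recalled from the package documentation and should be checked against ?genetic; x is a 72 x 219 predictor matrix and y the response):
library(subselect)
hm  <- lmHmat(x, y)                      # effect matrices for a linear model of y on x
gen <- genetic(hm$mat, kmin = 5, kmax = 10,
               H = hm$H, r = hm$r, criterion = "tau2")
gen$bestsets                             # best subset of each size found by the GA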
http://old.nabble.com/file/p26443595/Edragonr.txt Edragonr.txt
Hi all,
I have a 72*495 matrix; the first column is the response and the remaining
columns are independent variables. Finally, I want to select some of the independent
variables to fit y, but there are so many of them that the fit is not meaningful, so
Hi, all friends,
Please help me understand the sentences below:
"From this set, 858 columns not significantly correlated with the
response variable TBG at the 5% level were removed, leaving a set of 390
columns." and "the F-test's value for the one-parameter correlation with
the descriptor i
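A common reading of that criterion: for each column, test its correlation with the response and keep the column only if the test's p-value is below 0.05 (for a single predictor this F-test is equivalent to the t-test reported by cor.test, since F = t^2). A minimal sketch with placeholder objects X (descriptor matrix) and tbg (response):
pvals  <- apply(X, 2, function(col) cor.test(col, tbg)$p.value)
X.kept <- X[, pvals < 0.05]   # drop the columns not significant at the 5% level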
As known, SVM needs some parameters tuned, like cost, gamma and epsilon, to get
better performance, but one question appears: how can I monitor the
performance? Generally speaking, we choose the cross-validation MSE in the
training set, but it seems svm cannot return the cross-validation MSE
value; we
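One way to get a cross-validated error for svm is e1071's tune.svm(), whose performances table reports the mean cross-validation error (MSE for regression). A minimal sketch with placeholder x and y; the parameter grids are only illustrative:
library(e1071)
set.seed(1)
tuned <- tune.svm(x, y,
                  gamma = 2^(-5:0), cost = 2^(0:5), epsilon = c(0.01, 0.1, 0.5),
                  tunecontrol = tune.control(sampling = "cross", cross = 10))
tuned$performances      # parameter grid with the 10-fold cross-validation error for each setting
tuned$best.parameters
tuned$best.performance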
>
> http://www.jstatsoft.org/v28/i05/paper
>
> about the package.
>
> Max
>
> On Fri, Dec 18, 2009 at 12:26 PM, bbslover wrote:
>>
>> as known, svm need tune some parameters like cost,gamma and epsilon to
>> get
>> better performance,but one question appear,
ec 18, 2009 at 12:26 PM, bbslover <[hidden email]> wrote:
>
> as known, svm need tune some parameters like cost,gamma and epsilon to get
> better performance,but one question appear, how can i monitor the
> performance . generally speaking ,we chose the cross-validation MSE in
Hello, all,
I have a lot of independent variables and one dependent. Finally, I want to build
a model from them and predict the values of new samples, that is, regression.
Before that, I must remove some independents according to some criteria:
1. constant-value independents; 2. near-zero variance; 3
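For the first points on that (truncated) list, a short sketch using caret's filters; x is a placeholder predictor matrix and the cutoffs are only illustrative:
library(caret)
# 1 and 2: drop constant and near-zero-variance columns
nzv <- nearZeroVar(x)
if (length(nzv) > 0) x <- x[, -nzv]
# a common further step: drop one column of each highly correlated pair
high <- findCorrelation(cor(x), cutoff = 0.90)
if (length(high) > 0) x <- x[, -high]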
I want to split my whole dataset into a training set and a test set, build the model
on the training set, and validate the model using the test set. Now, how can I split my
dataset reasonably? Please give me a hand; it would be better to give me
some R code.
I have also seen some ways, like using SOM to project the whole ind
Thank you for all the help. It is helpful for me.
Max Kuhn wrote:
>
>> I noticed Max already pointed you to the caret package.
>>
>> Load the library and look at the help for the createFolds function, eg:
>>
>> library(caret)
>> ?createFolds
>
> I think that the createDataPartition function in care
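Following up on the createFolds/createDataPartition pointers above, a minimal sketch (dat and y are placeholders; for a numeric y the split is stratified on grouped quantiles of the response):
library(caret)
set.seed(1)
inTrain <- createDataPartition(y, p = 0.75, list = FALSE)
train.x <- dat[inTrain, ];  train.y <- y[inTrain]
test.x  <- dat[-inTrain, ]; test.y  <- y[-inTrain]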
I am learning the package "caret". After I run the "rfe" function, I get the
following error:
Error in `[.data.frame`(x, , retained, drop = FALSE) :
undefined columns selected
In addition: Warning message:
In predict.lm(object, x) :
prediction from a rank-deficient fit may be misleading
I
o unique solution, so many of the parameter
> estimates are NA.
>
> Either create a modified version of lmFuncs that suits your needs or
> remove variables prior to modeling (or try some other method that
> doesn't require more samples than predictors, such as the lasso or
> elasticne
http://n4.nabble.com/file/n998182/pca.jpg pca.jpg
http://n4.nabble.com/file/n998182/som.jpg som.jpg
http://n4.nabble.com/file/n998182/all%2Bindepents.xls all+indepents.xls
As we know, SOM is a good tool to cluster high-dimensional data into 2D and show it as
a 2D picture, just like in the attached pic
Now I am learning random forest and using the randomForest package. I can get
the OOB error rates and the test set error rate; now I want to get the training set
error rate. How can I do that?
pgp.rf<-randomForest(x.tr,y.tr,x.ts,y.ts,ntree=1e3,keep.forest=FALSE,do.trace=1e2)
Using this code I can get the OOB and t
late Error
Rates in the training set ?
From: bbslover
>
> now I am learining random forest and using random forest
> package, I can get
> the OOB error rates, and test set rate, now I want to get the
> training set
> error rate, how can I do?
>
> pgp.rf<-randomF
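One way to get the training-set error (which is distinct from the OOB estimate) is to keep the forest and predict back on the training data; a small sketch assuming the same x.tr/y.tr/x.ts/y.ts objects and a classification problem:
library(randomForest)
pgp.rf <- randomForest(x.tr, y.tr, xtest = x.ts, ytest = y.ts,
                       ntree = 1e3, keep.forest = TRUE, do.trace = 1e2)
# predict() with no newdata returns the OOB predictions; passing the training
# data gives the (usually optimistic) training-set predictions
train.pred <- predict(pgp.rf, x.tr)
mean(train.pred != y.tr)   # training-set error rate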
Hello,
I am learning randomForest. Now I want to boxplot MSE against mtry using 20
repeats of 5-fold cross-validation (using the median value), but I have no good
method to do it, only a poor one.
The randomForest package itself does not contain a cross-validation method, and the
caret package contains cross-vali
Sent: Thursday, January 14, 2010
To: bbslover
Subject: Re: [R] Help, How can I boxplot mse and mtry using 20 5-fold
cross-validation?
In caret, see ?trainControl. Use returnResamp = "all"
Max
On Wed, Jan 13, 2010 at 9:47 AM, bbslover <[hidden email]> wrot
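Expanding that hint into a sketch (assuming a reasonably recent caret; x and y are placeholders, the mtry grid is only illustrative, and RMSE is the regression metric caret reports in the resamples):
library(caret)
set.seed(1)
ctrl <- trainControl(method = "repeatedcv", number = 5, repeats = 20,
                     returnResamp = "all")
fit  <- train(x, y, method = "rf",
              tuneGrid = data.frame(mtry = c(2, 8, 32)),
              trControl = ctrl)
# one row per resample and mtry value; boxplot the resampled RMSE by mtry
boxplot(RMSE ~ mtry, data = fit$resample)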
http://r.789695.n4.nabble.com/file/n3031344/RSV.Rdata RSV.Rdata
I want to split my dataset into a training set and a test set using the
Kennard-Stone (KS) algorithm; luckily there is an R package, soil.spec, that
implements it.
But when I used it on my dataset it did not work. Who can help me find out
the reason?
http://r.789695.n4.nabble.com/file/n3032045/rsv1.txt rsv1.txt
I am very grateful for David's suggestion. Here I upload my dataset
"rsv1.txt", and again the question:
ks<-ken.sto(rsv1,per="TRUE",per.n=0.3,va="FALSE",sav="FALSE")
It does not work; all the results are NULL, and I do not know why.
http://r.789695.n4.nabble.com/file/n3060425/fig_1.png fig. 1
http://r.789695.n4.nabble.com/file/n3060425/fig_2.png fig. 2
I want a picture like the first one, where the axes cross at the origin,
while the second picture is what I get by default, with the origin
detached; but through pulli
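The text breaks off here. On the general question of making the axes meet at the origin rather than the default detached look, one common approach (not necessarily the one used in this thread) is the "internal" axis style, which drops R's 4% padding of the axis ranges; a small sketch assuming the data start at 0:
x <- 0:10
y <- x^2
plot(x, y)                                   # default: (0, 0) sits away from the corner
plot(x, y, xaxs = "i", yaxs = "i",
     xlim = c(0, 10), ylim = c(0, 100))      # ranges used exactly, so the axes cross at the origin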
Thanks, I succeeded.
kevin
Dear all,
My double for loop is below, but it is not very efficient. I hope
friends can give me a "vectorized" program to replace my code. Thanks.
x is a 202*263 matrix, that is, 202 samples and 263 independent variables.
num.compd<-nrow(x); # number of compounds
diss.all<-0
for( i in 1:nu
Thanks for your help, it is great. In addition: in the beginning the format
of x was a data.frame and my code ran very slowly; after your help I changed
x to a matrix and it is very quick. I am very grateful for your kind help, and
your code is very good!
kevin
Thanks for your help. I am sorry I do not fully understand your code, so I cannot
apply it correctly to my data. Here is the attachment with my data,
and what I want to compute is the equation in the Word document of the
attachment.
The code from Berend gets the answer I want.
Thank you, Berend.
It seems it is better to attach a PDF file to avoid garbled
text.
Yes, what I want to obtain is the Tanimoto coefficient, and your Wikipedia
link is about this coefficient. I also searched the R site for the Tanimoto coefficient
to learn more about it.
About your code, I have saved it and
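For reference, the pairwise Tanimoto coefficient T(a,b) = a.b / (a.a + b.b - a.b) can be computed for all 202 x 202 compound pairs without an explicit double loop (x is the compound matrix from the earlier post, one row per compound):
ab   <- x %*% t(x)                       # all pairwise dot products a.b
ss   <- rowSums(x^2)                     # squared norms a.a of each compound
tani <- ab / (outer(ss, ss, "+") - ab)   # Tanimoto similarity matrix
diss <- 1 - tani                         # corresponding dissimilarity, if that is what the loop accumulated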
Hello, all experts,
My major is computer-aided drug design (mainly QSAR).
Now my paper needs to be revised, and one reviewer asked me to do a genetic algorithm
coupled with the Gaussian process method (GA+GP).
My data:
training set: 191*106
test set: 73*106
Here, I need to use GA+GP to do variable selection
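The request ends here. One possible way to wire a GA wrapper around a Gaussian process regressor is sketched below; it assumes the GA package (binary chromosomes over the 106 descriptors) and kernlab's gausspr(), with placeholder objects train.x/train.y, and uses the negative RMSE of a simple internal hold-out as the fitness, which is only one of many reasonable choices:
library(GA)        # genetic algorithm
library(kernlab)   # gausspr(): Gaussian process regression
set.seed(1)
idx <- sample(nrow(train.x), round(0.8 * nrow(train.x)))   # internal split for the fitness
fitness <- function(bits) {
  if (sum(bits) < 2) return(-Inf)                # require at least two descriptors
  cols <- which(bits == 1)
  gp   <- gausspr(as.matrix(train.x[idx, cols]), train.y[idx])
  pred <- predict(gp, as.matrix(train.x[-idx, cols]))
  -sqrt(mean((train.y[-idx] - pred)^2))          # negative RMSE, since ga() maximizes
}
res <- ga(type = "binary", fitness = fitness, nBits = ncol(train.x),
          popSize = 50, maxiter = 100, run = 20)
which(res@solution[1, ] == 1)                    # indices of the selected descriptors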
1. Is there some criterion to estimate overfitting? e.g. R2 and Q2 in the
training set, as well as R2 in the test set; when does that mean overfitting? For
example, in my data I have R2=0.94 for the training set and R2=0.70 for the test
set; is that overfitting?
2. In this scatter plot, can one say this overfi
> a<-1:5
> b<-2:6
> plot(a,b)
Error in function (width, height, pointsize, record, rescale, xpinch, :
Graphics API version mismatch
Before, with R 2.10, plot() was OK. Now, with R 2.11.0, it does not work.
Thanks for your suggestion.
There is indeed much I need to learn. I will buy that good book.
kevin
Thank you, I have downloaded it and am studying it.
Many thanks. I can try to use a test set with 100 samples.
Another question is: how can I rationally split my data into a training set
and a test set (training set with 108 samples, and test set with 100 samples)?
As I know, the test set should have the same distribution as the training set, and
what met
Now it is OK. I uninstalled R 2.11.0, then deleted the old packages in the library,
and installed R 2.11.0 again. Now it works.
Thank you!