There is a gap (white space) between the plots when I draw the boxplot. I want
to reduce the space between the plots so that the boxplot takes up less
space. When I just adjust the "boxwex" value, the width of the individual boxes
increases/decreases but the white space is still there.
My code:
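A possible way to close such gaps, sketched here with made-up data for three groups (the object names `groups` and `pos` are illustrative, not from the post): "boxwex" only scales the box width, while the spacing comes from the box positions, which can be set with `at` and matched with `xlim`.

```r
# Made-up values standing in for the poster's data (assumption).
set.seed(1)
groups <- list(A = rnorm(20), B = rnorm(20), C = rnorm(20))

# Place the boxes 0.4 units apart instead of the default 1 unit,
# keep each box 0.3 units wide, and shrink the x-axis range to match.
pos <- c(1, 1.4, 1.8)
b <- boxplot(groups, at = pos, boxwex = 0.3,
             xlim = c(min(pos) - 0.3, max(pos) + 0.3))
```

Moving the values in `pos` closer together or further apart changes the white space without changing the box width.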
Hello everyone
I am using nested resampling in caret (5-fold outer and bootstrap inner
resampling) and by default, it shows the "Accuracy" metric. How can I use
it for the ROC/AUC metric?
My code is:
index <- createDataPartition(d$isKilled, p = .70, list = FALSE)
tr <- d[
218.9856 0.9141754 152.2099
#24extratrees 7 222.8540 0.9412682 152.4614
#24extratrees 8 228.1156 0.9423414 161.8456
#24extratrees 9 226.6182 0.9408306 160.5264
#24extratrees 10 226.92
I cannot run the "ranger" model with caret. I am only using the farff and
caret libraries and the following code:
boot <- trainControl(method = "cv", number=10)
I recently updated R, and now when I try to open RStudio I get:
The R session failed to start.
[No error available]
The R session process exited with code -1073740791.
[No errors emitted]
[No output emitted]
I have the following data and I need to use a boxplot which displays the
variables (RF, Ranger, SVM, KNN) with one color, variables (RF_boot,
Ranger_boot, SVM_boot, KNN_boot) with another color and the variables
(RF_LOO, SVM_LOO, Ranger_LOO, KNN_LOO) with another color.
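One way this could be done in base R, assuming the twelve variables are columns of a data frame (the `scores` data and the colour names below are placeholders): `boxplot()` accepts a vector of colours, one per box.

```r
set.seed(42)
model_names <- c("RF", "Ranger", "SVM", "KNN",
                 "RF_boot", "Ranger_boot", "SVM_boot", "KNN_boot",
                 "RF_LOO", "SVM_LOO", "Ranger_LOO", "KNN_LOO")

# Placeholder scores: 10 made-up values per variable (assumption).
scores <- as.data.frame(matrix(runif(10 * 12), nrow = 10,
                               dimnames = list(NULL, model_names)))

# One colour per family, chosen from the suffix of the column name.
box_cols <- ifelse(grepl("_boot$", model_names), "tomato",
            ifelse(grepl("_LOO$", model_names), "skyblue", "grey80"))

boxplot(scores, col = box_cols, las = 2)
```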
Why do I get the following error when the variable in my 'if' statement has
no missing values?
I checked with is.na(my variable) and it has no missing values.
Error in if (fraction <= 1) { : missing value where TRUE/FALSE needed
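This error means the condition itself evaluated to NA, which can happen even when the input data has no NA. A minimal sketch, assuming `fraction` is computed rather than read directly from the data (the 0/0 case is only an illustration, not a diagnosis of the original code):

```r
# 0/0 yields NaN, a typical way a missing value appears mid-calculation
# even though the input columns contain no NA (illustrative assumption).
fraction <- 0 / 0

# if (fraction <= 1) { ... }  # would stop: missing value where TRUE/FALSE needed
cond <- fraction <= 1         # NA, not TRUE or FALSE

# isTRUE() treats NA as FALSE, so the branch is skipped instead of failing.
result <- if (isTRUE(cond)) "within range" else "skipped: condition is NA"
print(result)
```

If the `if` sits inside a package function, checking the inputs passed to that function for zero-variance or empty columns may be a useful next step.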
Hello everyone
I want to know
(1) In which cases do we need to use set.seed while building ML models?
(2) Where exactly do we need to put the set.seed call, i.e.
when we split the data into train/test sets, or just before we train a model?
Thank you
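As a rough illustration (not a full answer): a seed matters for any step that draws random numbers, such as splitting, resampling, or fitting a model with random components, and it takes effect from the point where it is called. The split below uses `sample()` as a stand-in for a train/test split.

```r
# A seed makes any step that draws random numbers repeatable.
set.seed(123)                       # just before the random split
idx_a <- sample(seq_len(100), 70)   # stand-in for a train/test split

set.seed(123)                       # same seed, same call
idx_b <- sample(seq_len(100), 70)

idx_c <- sample(seq_len(100), 70)   # no reseed: continues the random stream

print(identical(idx_a, idx_b))      # TRUE
```

A common pattern is to call set.seed once before the split and again right before each training call, so that each step can be rerun on its own with the same result.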
Hi everyone
I use na.omit to remove NAs but it still gives me the error:
Error in if (fraction <= 1) { : missing value where TRUE/FALSE needed
My data is:
'data.frame': 340 obs. of 15 variables:
$ DepthTree: num 1 1 1 1 1 1 1 1 1 1 ...
$ NumSubclass : num 0 0 0 0 0 0 0 0 0 0 ...
$ McCabe
I have values like:
var <- c(0, 0, 0, 0, 0, 14, 0, 14, 0, 2, 3)
I want to show these values in a boxplot:
boxplot(var)
However, the boxplot shows only the zero values, and the values up to 14 are
shown as outliers. I want to show all my values within the boxplot; can I do that?
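One option, sketched with the values from the post: the `range` argument of `boxplot()` controls how far the whiskers reach, and `range = 0` extends them to the data extremes so that no point is drawn as an outlier. The `stripchart()` overlay is an optional extra.

```r
var <- c(0, 0, 0, 0, 0, 14, 0, 14, 0, 2, 3)

# Default rule: points beyond 1.5 * IQR from the box are drawn as outliers.
default_out <- boxplot.stats(var)$out

# range = 0 extends the whiskers to the data extremes, so nothing is
# flagged as an outlier.
b <- boxplot(var, range = 0)

# Optionally overlay the raw points so the repeated zeros stay visible.
stripchart(var, vertical = TRUE, method = "jitter", add = TRUE, pch = 19)
```

The box itself stays small because most values are zero; only the whisker changes.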
If I have to select specific rows and all columns of a data frame, I use:
rows <- df[1:12, ]
However, how do I select rows if the required rows are not in sequence, i.e.
if we need to select row numbers 1, 2, 5, 7, and 12?
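A minimal sketch, using a toy data frame in place of the real one: the row index can be any vector of row numbers, not only a range.

```r
df <- data.frame(id = 1:15, value = (1:15)^2)  # toy data frame (assumption)

# Pass a vector of row numbers instead of a range; the empty column
# index after the comma keeps all columns.
rows <- df[c(1, 2, 5, 7, 12), ]
print(rows$id)
```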
Hello everyone
While working to identify the correlated metrics in my data, I am
getting the following error. My data has no missing or Inf values, as all
other operations (model training etc.) could be done without any errors.
Error in hclust(as.dist(1 - abs(cor(data, method = cor_method))),
Thank you all. I installed vctrs and install_dependencies and now it's
Hi, suddenly the packages I installed are not working. It gives me the error:
Error: package or namespace load failed for ‘farff’ in loadNamespace(i,
c(lib.loc, .libPaths()), versionCheck = vI[[i]]):
namespace ‘rlang’ 0.4.7 is already loaded, but >= 0.4.10 is required
I have two files of my data: part1 and part2. I used part1 to train and
test my model and got the RMSE value. Now I need to compare my obtained
result (RMSE value) with unseen data, i.e. part2 of the data. How can I keep
the same training data from part1 and use part2 as the test data?
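A sketch of the general idea in base R, with toy data frames standing in for the two files and `lm()` standing in for the actual model (all names below are assumptions): fit on part1 only, then call `predict()` with part2 as `newdata` and compute the RMSE from those predictions.

```r
set.seed(1)
# Toy stand-ins for the two files (assumption): same columns in both.
make_part <- function(n) {
  x <- runif(n)
  data.frame(x = x, y = 3 * x + rnorm(n, sd = 0.1))
}
part1 <- make_part(100)
part2 <- make_part(40)

fit <- lm(y ~ x, data = part1)           # train on part1 only
pred <- predict(fit, newdata = part2)    # predict the unseen part2
rmse_part2 <- sqrt(mean((part2$y - pred)^2))
print(rmse_part2)
```

With a caret model the same pattern should apply: `predict(model, newdata = part2)` followed by an RMSE computed against the observed values in part2.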
Hi to everyone
I just wanted to know if the dalex package of the mlr3 provides the same
functions (i.e. variable importance) as provided by the scott-knott-esd
Hi, if I am using caret with the MAE metric, how can I get the median of the
MAE in an easy manner?
cart <- train(Result ~ ., data = tr,
              method = "rf",
              metric = "MAE",
              preProc = c("center", "scale", "nzv"),
              trControl = ctr)
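A possible approach: objects returned by `train()` keep the per-resample metrics in `$resample`, so the median can be taken directly from that column. The sketch below uses a hand-made list with invented values in place of a fitted model so that it runs without caret.

```r
# Stand-in for a fitted caret model: train() objects store one row of
# metrics per resample in `$resample` (values here are made up).
cart <- list(resample = data.frame(
  RMSE = c(3.1, 2.8, 3.6, 2.9, 3.3),
  Rsquared = c(0.81, 0.85, 0.78, 0.84, 0.80),
  MAE = c(2.4, 2.1, 2.9, 2.2, 2.6),
  Resample = paste0("Fold", 1:5)
))

median_mae <- median(cart$resample$MAE)
print(median_mae)
```

By default caret keeps these rows only for the final (best) tuning setting.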
Hi, I am using the caret package with a nested resampling method (i.e. 5
folds for the outer loop). I am getting the RMSE and MAE values, which are
caret's defaults.
My question is how can we implement the Magnitude of Relative Error and its
mean (MRE and MMRE) with caret. My code is the following:
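One route is a custom summary function passed to `trainControl()`. The sketch below is an assumption about how MMRE could be defined (mean of |obs - pred| / |obs|); the function name `mmreSummary` and the extra median variant `MdMRE` are illustrative, not part of caret.

```r
# Custom summary function in the shape caret expects: `data` has columns
# `obs` and `pred`; the result is a named numeric vector.
mmreSummary <- function(data, lev = NULL, model = NULL) {
  mre <- abs(data$obs - data$pred) / abs(data$obs)  # undefined when obs == 0
  c(MMRE = mean(mre), MdMRE = median(mre))
}

# Quick check on made-up values.
toy <- data.frame(obs = c(100, 200, 400), pred = c(110, 180, 400))
res <- mmreSummary(toy)
print(res)

# With caret it would be wired in roughly as follows (not run here):
# ctrl <- trainControl(method = "cv", number = 5, summaryFunction = mmreSummary)
# fit  <- train(Result ~ ., data = tr, method = "rf",
#               metric = "MMRE", maximize = FALSE, trControl = ctrl)
```

Setting `maximize = FALSE` matters because a smaller MMRE is better.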
Hello everyone
I'm using parameter optimization for svm in mlr3. I am setting the
parameters 'C' and 'type'.
search_space = paradox::ParamSet$new(
params = list(paradox::ParamInt$new("C", lower = 1, upper = 7),
paradox::ParamFct$new(id = "type", default = "eps-svr",
levels = c("eps-sv
Hi, I am using hyperparameter tuning for GBM but it gives me the
following error. (For some other datasets my code works, so I guess the
problem is the dataset size.)
Error in gbm.fit(x = x, y = y, offset = offset, distribution =
distribution, :
The data set is too small or the subsa
parameter value for 'type' is not set at all. Try setting 'type' to a value
that satisfies the condition.
I have set the parameter values as follows:
search_space = paradox::ParamSet$new(
params = list(paradox::ParamInt$new("C", lower = 1, upper = 7)))
ast 1, so searching
> below 1 is probably your issue.
> Also, logically, zero nearest neighbors doesn't seem to make a lot of
> sense.
> Pat
Thank you for your response.
Are you certain that k = 0 is a legitimate setting?
Since the default value of k is 1, I wanted to search between the values
of 0 to 3.
Milne, do you mean I have to provide both the lower and upper bounds
greater than 1 in order to get rid of this error?
I am using the mlr3 'fast nearest neighbor' learner, i.e. fnn.
Its parameter is 'k', which has a default value of 1. When I tune it
using random search, I set the parameter k as: lower = 0, upper = 3
But it gives an error message:
Error in self$assert(xs) :
Assertion on 'xs' failed: k: Element 1 is
I am using the mlr3 'fast nearest neighbor' learner, i.e. fnn.
Its parameter is k, which has a default value of 1. When I tune it using
random search, I set the parameter k as: lower = 0, upper = 3
But it gives an error message:
Error in self$assert(xs) :
Assertion on 'xs' failed: k: Element 1 is n
Hello everyone,
I want to evaluate the cart learner with different (i.e. ideally 10
different) resampling methods. My questions are:
(1) How should I specify the list of resampling methods?
(2) Should I specify the resampling lists in the beginning or in the
Hello everyone,
I performed a parameter optimization in mlr3 and then passed it to the
benchmark to compare the optimized and a baseline learner, but it gives an
error message and does not recognize the optimized (i.e. at) learner.
task=TaskRegr$new("df", df, target = "a
I have the following data from resampling:
svm <- svm$resample$RMSE
nn <- nn$resample$RMSE
We perform statistical tests like:
wilcox.test(svm, nn)
I have a question: can we perform the Scott-Knott ESD test here? If yes, how?
ct <- train(`Bug class` ~ ., data = tr, method = "pls", metric = "AUC",
            preProc = c("center", "scale", "nzv"), trControl = boot3)
I get the following error when I use the ROSE class balancing method, but
when I use other methods like SMOTE, up, or down, I do not get any error:
Something is wrong; all the ROC metric values are missing:
ROC Sens Spec
Min. : NA Min. : NA Min. : NA
1st Qu.: NA 1st Qu.: NA 1st Qu
I have a model NN, which has 10 values from 10 folds of test (ts) data
such as 0.1, 0.5, 0.3 etc.
And another model SVM, which also has this type of information. I usually
visualize it like:
boxplot(NN, SVM)
I have two questions:
(1) I want to ask how I can visualize them via ggplot?
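A sketch of the ggplot route, with invented fold values for both models: ggplot2 expects long-format data (one row per value plus a column naming the model), which `stack()` from base R can produce. The plotting part only runs if ggplot2 is installed.

```r
# Made-up per-fold scores for the two models (assumption).
NN  <- c(0.1, 0.5, 0.3, 0.4, 0.2, 0.6, 0.3, 0.5, 0.4, 0.2)
SVM <- c(0.2, 0.4, 0.3, 0.5, 0.3, 0.7, 0.4, 0.6, 0.5, 0.3)

# ggplot2 wants long format: one row per value, one column naming the model.
long <- stack(list(NN = NN, SVM = SVM))
names(long) <- c("score", "model")

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(long, aes(x = model, y = score)) + geom_boxplot()
  print(p)
}
```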
What is the problem in the code?
index <- createDataPartition(d$Defective, p = .70, list = FALSE)
tr <- d[index, ]
ts <- d[-index, ]
index_2 <- createFolds(tr$Defective, returnTrain = TRUE, list = TRUE)
ctrl <- trainControl(method = "boot", number=100, index = index_2,
If we have to perform 100 iterations of the out-of-sample bootstrap, do we
need to just enter the number 100 as:
trainControl (method="boot", number = 100)
I am using feature selection using AutoSpearman but I have no idea what to
do after the 'plotVarClus' function, which is just a graph and red bars are
going out of the input metrics.
My question is what is the next step to perform the feature selection
(exclude correlated metrics).
I used
If I have one dataset, then we can perform statistical tests as:
svm <- train(bug ~ ., data = tr,
rf <- train(bug ~ ., data = tr,
wilcox.test(svm$resample$RMSE, rf$resample$RMSE)
But if we have
I have a regression based data where I get the RMSE results as:
I want to know how I can make its values fall in the range 0-1.
I plotted a boxplot to show the RMSE values and used
ylim = c(0, 1), but the boxplot which works for RMSE values like 3500
I have a boxplot using the following code and I get the boxplots, but on
the x axis, instead of 1, 2, 3, how can I get my own labels like random,
grid, genetic, pso etc.?
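In base R this can be done with the `names` argument of `boxplot()`. A minimal sketch with made-up results for four tuning strategies (the `results` list is an assumption):

```r
set.seed(7)
# Made-up results for four tuning strategies (assumption).
results <- list(rnorm(10), rnorm(10), rnorm(10), rnorm(10))

# `names` replaces the default 1, 2, 3, ... labels under the boxes.
b <- boxplot(results, names = c("random", "grid", "genetic", "pso"))
```

Alternatively, naming the list elements themselves gives the same labels without the `names` argument.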
Hello to all.
I am a little confused; could you kindly suggest something?
I need to compare two algorithms, CART and NNET, in R. The results show that
NNET has a better MAE value, but I want to find out whether there is any significant
difference between the results of both algorithms. The dataset I us
I have continuous data, i.e. regression-based. If I have residuals for
algorithm 1 and algorithm 2 like this,
x = resid(alg1) and y = resid(alg2)
can we perform the Wilcoxon test as
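A hedged sketch of how such a comparison might look. Two assumptions are made here that are not in the post: the test is run on absolute residuals (raw least-squares residuals are centred near zero for both models, so their locations say little about accuracy), and it is paired because both vectors refer to the same observations. The models `alg1` and `alg2` below are toy `lm()` fits.

```r
set.seed(3)
# Toy data and two simple models standing in for alg1 and alg2 (assumption).
d <- data.frame(x = runif(50))
d$y <- 2 * d$x^2 + rnorm(50, sd = 0.1)
alg1 <- lm(y ~ x, data = d)
alg2 <- lm(y ~ poly(x, 2), data = d)

x <- abs(resid(alg1))
y <- abs(resid(alg2))

# paired = TRUE because both residual vectors refer to the same rows.
wt <- wilcox.test(x, y, paired = TRUE)
print(wt$p.value)
```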
I am using simulated annealing to tune the parameters of xgbTree for a
regression dataset. When I run the code to tune the parameters of SVM and
RF, it works, but when I run the same code for xgbTree, it stops and
gives the error:
Something is wrong; all the MAE metric values are missing:
I am using the following code to tune the 4 parameters of Gradient Boosting
algorithm using simulated annealing (optim). When I run the program, after a
few seconds it stops and displays the following error:
I point out here that the same code works for RF ( mtry parameter) and SVM
Hi, I am using Simulated Annealing to tune the parameters of the R.Forest.
The code I use is below:
My question is: did I do this correctly when I used the values 1:24 in the
optim function of the simulated annealing? I used 1:24 because my number of
trControl = ctrol)
scale", "zv"),
trControl = ctrol)
When I run my code, I get the following error and suddenly the execution
of the script stops. Where in my data is the problem?
Something is wrong; all the MAE metric values are missing:
Min. : NA Min. : NA Min. : NA
1st Qu.: NA 1st Qu.: NA 1s
Hi, I know nobody will respond to my query, as I asked a few in the past and
received no answer to any of my questions. However, I am asking here in
the hope that it will get a response.
I am using bayesian optimization to tune the parameter of mtry for random
forest but it gives me the error: Error in
I am using xgboost hyperparameter tuning for the value of rmse. I used the
following method, which returns two values: a mean, which is like - 1200.12
and Best value like - 960. I guess the first value is average value and the
second is the best rmse value returned by the algorithm.
Now if we have to
How are the hyperparameter settings obtained via adaptive resampling different from
the ones obtained from grid search or random search? Both grid and random
searches provide the best parameter values, and the same is achieved using
adaptive resampling.
Hello to all. I am new to the R language and have just read a few tutorials.
I have a question: we use the trainControl and train functions for k-fold cross
validation, and we don't need to predict the data separately using the predict
function for test data.
But what about when we have a separate test set? How can we use the
