Dear List,
Java Exception error while reading large data in R from DB using RJDBC.
I am trying to read a large dataset from a DB table (Vectorwise) using an RJDBC
connection.
I have tested the connection with a small dataset and was able to fetch DB
tables using the same connection (conn in my code).
Pleas
Hi,
I am new to using R for optimization problems. I have a set of communication
channels with limited capacity and two types of cost, fixed and variable.
Each channel has an expected gain for a single communication.
I want to determine the optimal number of communications for each channel
maximiz
Hi List,
I am new to R, so this may be simple.
I want to store a directory path as a parameter, to be used when
reading and writing data from CSV files.
How can I use dir, defined in the example below, when reading the
CSV file?
Example:
dir <- "C:/Users/Desktop" #location of
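One common way to combine a stored directory with a file name is file.path(). The sketch below is only an illustration under stated assumptions: tempdir() stands in for the poster's directory so the code runs on any machine, and the file name sample.csv and the variable name data_dir are hypothetical.

```r
# Directory stored once as a parameter; tempdir() stands in for the
# poster's "C:/Users/Desktop" so the example runs anywhere.
data_dir <- tempdir()

# Hypothetical file name, used only for illustration.
csv_path <- file.path(data_dir, "sample.csv")

# Write a small data frame, then read it back through the same path.
write.csv(data.frame(a = 1:3, b = c("x", "y", "z")), csv_path,
          row.names = FALSE)
dat <- read.csv(csv_path)
```

Naming the variable data_dir rather than dir avoids confusion with base R's dir() function.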
Hi List,
I am working on a large data frame (35000 records and 160 variables).
I am using redun() to drop variables before fitting a model.
V <- redun(~., data = data.frame, r2 = 0.8)
It takes an enormously long time to execute. Is there anything wrong in the
script?
Suggest an
Hi List,
Being new to R, I am trying to apply boot.stepAIC() for model selection by
bootstrapping the stepAIC() procedure. I have gone through the discussions in
various threads on variable selection methods. I understand the pros and
cons of the various methods, and am also going through the regression modelli
Dear R-users,
I would like to determine the probability of an event at a specific time using
a Cox model fit. On the development sample data I am able to get the
probability of an event at time point (t).
I need the probability score of an event at a specific time, using a scoring
dataset which will have o
Hello List
I am trying to create and assign variable names in a loop, but am not able to get
the expected variable names.
Here is the sample code:
n = 10
set.seed(1)
x1 = rnorm(n,0)
x2 = rnorm(n,0)
samp_data <- data.frame(x1,x2)
for( i in 1:3) {
label <- paste("score", i, sep="_")
assign(label
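A minimal sketch of one way to get names such as score_1, score_2, score_3, assuming the goal is to add columns to samp_data. The value assigned in each iteration (x1 multiplied by i) is a placeholder, because the right-hand side of the original assign() call is cut off.

```r
n <- 10
set.seed(1)
samp_data <- data.frame(x1 = rnorm(n, 0), x2 = rnorm(n, 0))

for (i in 1:3) {
  label <- paste("score", i, sep = "_")
  # Placeholder value (hypothetical): the original right-hand side is not shown.
  # [[<- with a character name creates or overwrites the column called `label`.
  samp_data[[label]] <- samp_data$x1 * i
}

names(samp_data)
```

By contrast, assign(label, value) creates a free-standing object in the current environment rather than a column of the data frame, which may be why the names did not appear where expected.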
Dear All,
I am new to R and have a question which might be easy.
I have a large dataset with more than 250 variables, and I am reducing the number
of variables with the redun function, as in the example below:
n <- 100
x1 <- runif(n)
x2 <- runif(n)
x3 <- x1 + x2 + runif(n)/10
x4 <- x1 + x2 + x3 + runif(n)/10
x5 <
Hi All,
I am working on a dataset in which some of the variables have more than
one observation with outliers.
I am using the sample script below:
library(outliers)
x1 <- c(10, 10, 11, 12, 13, 14, 14, 10, 11, 13, 12, 13, 10, 19, 18, 17,
10099, 10099, 10098)
outlier_tf1 = outlier(x1,l
Hi Michael,
Thanks for the help.
Yes, I have gone through the documentation for ?outlier. As it removes one
outlier at a time, and being new to R, I was wondering whether there is any function
available for removing multiple outliers without calling, say, rm.outlier
n times, because n is not finite he
Dear All,
I have got the limits for removing extreme values for each variable using
the following function.
f=function(x){quantile(x, c(0.25, 0.75),na.rm = TRUE) - matrix(IQR(x,na.rm =
TRUE) * c(1.5), nrow = 1) %*% c(-1, 1)}
#Example:
n <- 100
x1 <- runif(n)
x2 <- runif(n)
x3 <- x1 + x2 + runif(
Hi David,
Thanks for the reply,
f=function(x){quantile(x, c(0.25, 0.75),na.rm = TRUE) - matrix(IQR(x,na.rm =
TRUE) * c(1.5), nrow = 1) %*% c(-1, 1)}
Here the parameter 1.5 is set as an example in the above function; it
can be larger, maybe 3.0, after analyzing the actual data. Here expecta
When data contains both factor and numeric variables, how can I get quartiles
for all numeric variables?
n <- 100
x1 <- runif(n)
x2 <- runif(n)
x3 <- x1 + x2 + runif(n)/10
x4 <- x1 + x2 + x3 + runif(n)/10
x5 <- factor(sample(c('a','b','c'),n,replace=TRUE))
x6 <- factor(1*(x5=='a' | x5=='c'))
dat
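For the quartiles question above, one possible approach is to select the numeric columns first and then apply quantile() to each. This is a sketch under assumptions: the data frame name example_df, the seed, and the helper names num_cols and quartiles are all hypothetical, since the original code is cut off before the data frame is built.

```r
set.seed(1)  # seed added here only for reproducibility
n <- 100
x1 <- runif(n)
x2 <- runif(n)
x3 <- x1 + x2 + runif(n) / 10
x4 <- x1 + x2 + x3 + runif(n) / 10
x5 <- factor(sample(c("a", "b", "c"), n, replace = TRUE))
x6 <- factor(1 * (x5 == "a" | x5 == "c"))

# Hypothetical data frame; the original post is truncated at this point.
example_df <- data.frame(x1, x2, x3, x4, x5, x6)

# Logical vector marking which columns are numeric (factors are excluded).
num_cols <- vapply(example_df, is.numeric, logical(1))

# One column of quartiles per numeric variable.
quartiles <- sapply(example_df[num_cols], quantile,
                    probs = c(0.25, 0.5, 0.75), na.rm = TRUE)
quartiles
```

The result is a matrix with one row per requested probability and one column per numeric variable.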
I need to split data containing more than one variable into deciles using any one
variable. I am using the script below:
id <-c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20)
tot <-c(1230, 1230, 2345, 3456, 456, 4356, 123, 124, 987, 785, 5646, 345,
2345, 3456, 456, 4356, 123, 124, 987, 785)
data <-
Hi Experts,
I am new to R and am using a decision tree model to get segmentation rules.
A) Using behavioural data (attributes defining customer behaviour, for example
balances, number of accounts, etc.)
1. Clustering: Cluster behavioural data to suitable number of clusters
2. Decision Tree: Using rpart
Hi,
Thanks for the response. The code for each case is:
c_c_factor <- 0.001
min_obs_split <- 80
A)
fit <- rpart(segment ~., method="class",
control=rpart.control(minsplit=min_obs_split, cp=c_c_factor),
data=Beh_cluster_out)
B)
fit <- rpart(segment ~., method="class",
Hi Experts,
This may be a simple question. I want to create a new variable "seg" and assign
values to it based on conditions satisfied by each observation.
Here is an example:
##Below are the conditions
##if variable x2 gt 0 and x3 gt 200 then seg should take value 1,
##if variable x2 gt 100
I have got a solution using the within function, as below:
dd$Seg <- 1
dd <- within(dd, Seg[x2> 0 & x3> 200] <- 1)
dd <- within(dd, Seg[x2> 100 & x3> 300] <- 2)
dd <- within(dd, Seg[x2> 200 & x3> 400] <- 3)
dd <- within(dd, Seg[x2> 300 & x3> 500] <- 4)
Is there any better way of doing it?
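One possible alternative to the repeated within() calls is to keep the thresholds in vectors and loop over them. This is a sketch, not a definitive answer: the sample data frame and the names x2_cut and x3_cut are hypothetical, and the thresholds are copied from the code above.

```r
# Hypothetical sample data covering each segment.
dd <- data.frame(x2 = c(-5, 50, 150, 250, 350, 350),
                 x3 = c(100, 250, 350, 450, 550, 250))

# Thresholds taken from the within() calls above, one pair per segment.
x2_cut <- c(0, 100, 200, 300)
x3_cut <- c(200, 300, 400, 500)

dd$Seg <- 1  # default segment, as in the original
for (k in seq_along(x2_cut)) {
  # Later (stricter) conditions overwrite earlier ones, matching the
  # order of the original assignments.
  dd$Seg[dd$x2 > x2_cut[k] & dd$x3 > x3_cut[k]] <- k
}
dd
```

Adding a segment then means appending one value to each threshold vector rather than writing another line of code.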
Hi List,
I am reading a table from a Postgres database into an R session using RJDBC. The table
contains 150 columns and 20 rows.
Sample code is below; it works fine with smaller tables.
db_driver <- mydir$db_driver
db_jar_fi
Hi All,
This might be a simple question. I need to retrieve data for modelling from
databases. The date values change every time, so I cannot fix the date value in the
code; it has to be passed as a parameter.
When I pass the date as a parameter, it throws an error.
(ERROR: column "start_dt" does not exist
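The truncated error suggests, though the post does not confirm it, that the name start_dt reached the database as literal text instead of its value. A minimal sketch of building the query string in R so the date value is substituted and quoted; the table name my_table, the column name txn_date, and the date itself are hypothetical.

```r
# Hypothetical date parameter, set once per run.
start_dt <- as.Date("2013-01-31")

# Substitute the *value* of start_dt into the SQL text, inside single
# quotes, so the database sees a date literal and not a column name.
query <- sprintf(
  "SELECT * FROM my_table WHERE txn_date >= '%s'",
  format(start_dt, "%Y-%m-%d")
)
query
```

The resulting string could then be passed to dbGetQuery() with the existing connection. It may also be worth checking the RJDBC documentation for prepared-statement parameters, which avoid quoting by hand.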
Dear List,
A couple of issues while using functions from the “BCA” library:
1. I am trying to use the “lift.chart” function from the “BCA” library, but am facing
issues when using a model whose formula is passed as a formula object to
glm.
When the model formula is written as text, it works fine. In my case