What is the current best package for manipulating HDF5 data files?
I tried "hdf5" a long time ago, but I ran into memory problems. "h5r" is on
CRAN now, and "rhdf5" is part of Bioconductor.
Ideally, I'd like to read simple vectors or tables, either the entire thing
or a subset of rows. I don't ne
I'm trying to understand how the earth package treats linearly
dependent regressors. I was surprised when switching between two
linearly-dependent terms gave different results. Here's an example:
> library(earth)
> cars2 <- transform(cars, speed2=100-speed)
> earth(dist ~ speed, data=cars2)
Select
I'm having trouble calling randomForest::partialPlot programmatically.
It tries to use the name of the (R) variable as the data column name.
Example:
library(randomForest)
iris.rf <- randomForest(Species ~ ., data=iris, importance=TRUE,
proximity=TRUE)
partialPlot(iris.rf, iris, Sepal.Width)
syrvn writes:
> If I store the path in a variable/object and call the perl script again it
> does not run and I don't know how to overcome that issue.
>
> p1 <- "../path1"
> p2 <- "../path2"
> p3 <- "../path3"
>
> system("perl p1 p2 p3")
You want something like:
system(paste("perl", p1, p2, p
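A minimal sketch of that idea, assuming the three paths from the quoted message. R does not substitute variables inside a string literal, so the command has to be assembled from their values. The `shQuote` step is an addition that is not part of the original reply, and the `system` call is left commented out so the sketch runs without perl installed.

```r
p1 <- "../path1"
p2 <- "../path2"
p3 <- "../path3"

# Inside a string literal, "p1" is just the characters 'p' and '1'
literal <- "perl p1 p2 p3"

# paste() inserts the values of the variables, separated by spaces
cmd <- paste("perl", p1, p2, p3)

# Assumption: quoting each argument protects paths that contain spaces
cmd.quoted <- paste("perl", shQuote(p1), shQuote(p2), shQuote(p3))

print(literal)
print(cmd)
# system(cmd)  # uncomment to actually run the command
```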
B77S writes:
> That is odd, I noticed some weird sorting with merge() a while back too and
> always am careful with it now. Fortunately, sort=FALSE seems to work the
> way one would think most of the time.
Thanks for checking. Is this on a more recent version of R than 2.10.1?
(I'm half-hoping
`merge` lists sorted as if by character, not by the actual class of the
by-columns.
> tmp <- merge(data.frame(f=ordered(c("a","b","b","a","b"),
levels=c("b","a")),
x=1:5),
data.frame(f=ordered(c("a","b"),
Duncan Murdoch writes:
> On 07/09/2010 4:18 PM, Johann Hibschman wrote:
>> Going through that code, I settled on the following function to remove
>> all but the most needed components:
>>
>> ## Strip down a glm object, until it can only be used for prediction,
>
David Winsemius writes:
> Just tested my theory and it seems to be holding up. Took the example
> on the predict help page, set three of the variable length components
> not needed in the predict operations to NULL and the code still runs
> fine. It does not appear that either predict.glm or pred
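A rough sketch of the kind of check described above, under stated assumptions: the model, the `mtcars` data, and the choice of which three components to drop are all illustrative, since the quoted message does not name them. It only covers prediction on new data without standard errors.

```r
# Fit a small logistic regression on a built-in data set
fit <- glm(am ~ wt, data = mtcars, family = binomial)
new.data <- data.frame(wt = c(2.5, 3.0, 3.5))
p.full <- predict(fit, newdata = new.data, type = "response")

# Assumption: these per-observation components are not needed when
# predicting on new data (the original message does not list its three)
slim <- fit
for (component in c("residuals", "fitted.values", "linear.predictors")) {
  slim[[component]] <- NULL
}

# Predict again from the reduced object and compare with the full fit
p.slim <- predict(slim, newdata = new.data, type = "response")
print(all.equal(p.full, p.slim))

# Compare the in-memory size of the two objects
print(object.size(fit))
print(object.size(slim))
```

Other methods, such as `summary` or `residuals`, may well fail on the reduced object, so this is only suitable when prediction is the sole use.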
David Winsemius writes:
> On Sep 7, 2010, at 11:02 AM, Johann Hibschman wrote:
>> Even so, I would prefer to only save the coefficients
>
> Have you read through the Value section of glm's help page?
>
> ...and
>
> ?coef
I have; it's easy to get the coeffici
Is there any package that assists in saving and reconstituting glm and
nls fits without bringing along the accompanying data? A quick search
on CRAN didn't turn up anything.
If not, how do other people deal with saving the coefficients of model
fits?
For example, I've run a glm fit that has 23 c
Allan Engelhardt writes:
> ### Method 2
> ## Setup
> file <- paste("/proc", Sys.getpid(), "stat", sep = "/")
> what <- vector("list", 44); what[[23]] <- integer(0)
> ## In your logging routine
> vsz <- scan(file, what = what, quiet = TRUE)[[23]]/1024
> cat("Virtual size: ", vsz, "\n", sep = "")
jim holtman writes:
> ?memory.size
Only works on Windows. I guess I should have specified; this is on Linux.
Thanks,
Johann
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www
Is there any way to get the current memory used by R without running
gc()?
I'd like to include the memory usage in logging output, but calling gc()
to get that information takes long enough to be noticeable (about 6 s with
about 20 GB of RAM in use), which defeats the purpose of logging.
Thanks,
Johan
Duncan Murdoch writes:
> On 29/07/2010 6:18 PM, chipmaney wrote:
>>
>> -Why does R recognize '[' as a function?
>
> Because it is a function.
More explicitly, '[' is a string. sapply then calls match.fun to look
up that string to get the function named '['.
>> -Why does it need the quotes?
>
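The lookup described above can be sketched with a small, made-up list. The data are hypothetical; the point is only that the string "[" and the backtick-quoted name resolve to the same function.

```r
x <- list(c(10, 20, 30), c(40, 50, 60))  # hypothetical data

# match.fun() turns the string "[" into the function it names
f <- match.fun("[")
first.direct <- f(x[[1]], 2)

# sapply() performs the same lookup when given the string
second <- sapply(x, "[", 2)

# Backticks name the function directly, so no string lookup is involved
second.bt <- sapply(x, `[`, 2)

print(first.direct)
print(second)
```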
Greigiano Jose Alves writes:
> I am working on an article forecasting, which use the dynamic linear model,
> a model state space. I am wondering all the commands in R, to represent the
> linear dynamic model and Kalman filter.
> I am available for any questions.
There are a few libraries out the
I'd like to access data in my R session from elsewhere via HTTP. My
particular use case would be R on Linux, accessing data from Windows,
all on the same intranet.
Naively, when I say this, I imagine something like:
> theData <- big.calculation.returning.data.frame()
> startHttpServer(port=8675)
When I run predict.Arima in my code, I get warnings like:
Warning message:
In cbind(intercept = rep(1, n), xreg) :
number of rows of result is not a multiple of vector length (arg 1)
I think this is because I'm not running predict.Arima in the same
environment that I did the fit, so the d
prem_R writes:
> I'm running predictive analytics using R and to calibrate my model I
> used to adjust the variables used in the model and the problem happens
> here. R just runs out of memory. I tried garbage cleaning also.
I'm analyzing an 8 GB data set using R, so it can certainly handle large
to interpret those numbers.
Clearly, I'm being bitten by the floating-point representation, but
the only "complex" thing I did was to manually lag a time series by
assigning date <- date + 1/12, and I was hoping that the yearmon class
would apply some magic to normalize the repr
x.
Also, you will be overwriting the same file, called "c_num.csv", on
each iteration.
You should try something more like:
for (n in num) {
c.n <- c[c$b==n,]
write.csv(c.n, file=paste("c:/c_", n, ".csv", sep=""))
}
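To see why the file name matters, here is a small sketch that only builds the names and writes nothing. The values in `num` are hypothetical, and the directory prefix is left out.

```r
num <- c(1, 2, 3)  # hypothetical values

# A literal name is the same on every pass, so each write replaces the last
literal.names <- sapply(num, function(n) "c_num.csv")

# paste() builds a distinct name from each value of n
built.names <- sapply(num, function(n) paste("c_", n, ".csv", sep = ""))

print(unique(literal.names))
print(built.names)
```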
I hope that helps.
Cheers,
Johan
I'm using 2.10.0 on Linux (64 bit), and I just noticed that random
numbers are occasionally added to the text of names in vectors. It's
happened to me in two separate, long-running R sessions, but I can't
find a way to reproduce it in a smaller setting.
The code I'm using is
> diag.gam.2 <- mdl.r
I'm trying to plot the "marginals" of a fit: the aggregated value of
the actual and predicted vs. a cut/bucketed dimension. (The data set
is huge, so just plotting the raw points would be unintelligible.) I'd
also like to plot the number of points in each bucket (or, rather, the
sum of the weights