[R] best HDF5 package: h5r or rhdf5?

2012-11-27 Thread Johann Hibschman
What is the current best package for manipulating HDF5 data files? I tried "hdf5" a long time ago, but I ran into memory problems. "h5r" is on CRAN now, and "rhdf5" is part of bioconductor. Ideally, I'd like to read simple vectors or tables, either the entire thing or a subset of rows. I don't ne

[R] earth and linearly dependent regressors

2012-03-22 Thread Johann Hibschman
I'm trying to understand how the earth package treats linearly dependent regressors. I was surprised when switching between two linearly-dependent terms gave different results. Here's an example: > library(earth) > cars2 <- transform(cars, speed2=100-speed) > earth(dist ~ speed, data=cars2) Select

[R] substitute games with randomForest::partialPlot

2011-09-14 Thread Johann Hibschman
I'm having trouble calling randomForest::partialPlot programmatically. It tries to use the name of the (R) variable as the data column name. Example: library(randomForest) iris.rf <- randomForest(Species ~ ., data=iris, importance=TRUE, proximity=TRUE) partialPlot(iris.rf, iris, Sepal.Width)

Re: [R] R system command does not work with objects/variables

2011-08-23 Thread Johann Hibschman
syrvn writes: > If I store the path in a variable/object and call the perl script again it > does not run and I don't know how to overcome that issue. > > p1 <- "../path1" > p2 <- "../path2" > p3 <- "../path3" > > system("perl p1 p2 p3") You want something like: system(paste("perl", p1, p2, p
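
A minimal sketch of the point made in the reply above: inside a string literal, p1 is just the characters "p1", so the command line has to be assembled from the variables' values. The shQuote() step is an addition here, not part of the original reply, and the actual system() call is left commented out because it needs perl and the three paths to exist.

```r
p1 <- "../path1"
p2 <- "../path2"
p3 <- "../path3"

# paste() substitutes the values of p1, p2, p3 into the command string.
cmd <- paste("perl", p1, p2, p3)
print(cmd)

# Assumption (not in the original reply): quote each argument so that
# paths containing spaces survive the shell.
cmd_quoted <- paste("perl", shQuote(p1), shQuote(p2), shQuote(p3))
print(cmd_quoted)

# Not run here:
# system(cmd_quoted)
```

Building the string first and printing it makes it easy to check what the shell will actually receive before running it.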

Re: [R] unexpected sort order with merge

2011-04-07 Thread Johann Hibschman
B77S writes: > That is odd, I noticed some weird sorting with merge() a while back too and > always am careful with it now. Fortunately, sort=FALSE seems to work the > way one would think most of the time. Thanks for checking. Is this on a more recent version of R than 2.10.1? (I'm half-hoping

[R] unexpected sort order with merge

2011-04-06 Thread Johann Hibschman
`merge` lists sorted as if by character, not by the actual class of the by-columns. > tmp <- merge(data.frame(f=ordered(c("a","b","b","a","b"), levels=c("b","a")), x=1:5), data.frame(f=ordered(c("a","b"),
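
One possible workaround, sketched under assumptions: skip merge()'s own sorting with sort=FALSE and order the result by the factor explicitly. The second data frame below is hypothetical, since the one in the post is truncated, and the re-levelling step is purely defensive.

```r
# Ordered factor whose level order ("b" < "a") differs from alphabetical order.
lv <- c("b", "a")
left <- data.frame(f = ordered(c("a", "b", "b", "a", "b"), levels = lv), x = 1:5)

# Hypothetical second table (the original post's second data frame is cut off).
right <- data.frame(f = ordered(c("a", "b"), levels = lv), y = c(10, 20))

# Do not rely on merge() for row order; sort by the factor's level order afterwards.
res <- merge(left, right, by = "f", sort = FALSE)
res$f <- factor(as.character(res$f), levels = lv, ordered = TRUE)  # defensive re-levelling
res <- res[order(res$f), ]
print(res)
```

order() on a factor uses the underlying level codes, so rows with f == "b" come first here regardless of how merge() arranged them.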

Re: [R] Saving fits (glm, nls) without data

2010-09-07 Thread Johann Hibschman
Duncan Murdoch writes: > On 07/09/2010 4:18 PM, Johann Hibschman wrote: >> Going through that code, I settled on the following function to remove >> all but the most needed components: >> >> ## Strip down a glm object, until it can only be used for prediction, >

Re: [R] Saving fits (glm, nls) without data

2010-09-07 Thread Johann Hibschman
David Winsemius writes: > Just tested my theory and it seems to be holding up. Took the example > on the predict help page, set three of the variable length components > not needed in the predict operations to NULL and the code still runs > fine. It does not appear that either predict.glm or pred
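
A rough sketch of the experiment described above, not the function from the thread: fit a glm, set several per-observation components to NULL, and check that predict() on new data still gives the same answer. The model, the mtcars data, and the list of dropped components are assumptions chosen for illustration; the qr component is kept because predict() still uses it.

```r
# Illustrative fit on a built-in data set (an assumption, not the poster's model).
fit <- glm(carb ~ wt + hp, data = mtcars, family = poisson)

# Components assumed not to be needed by predict() with newdata and se.fit = FALSE.
drop <- c("residuals", "fitted.values", "effects", "linear.predictors",
          "weights", "prior.weights", "y", "data", "model")
slim <- fit
slim[drop] <- NULL   # assigning NULL removes the named list components

newdata <- head(mtcars)
p_full <- predict(fit,  newdata = newdata, type = "response")
p_slim <- predict(slim, newdata = newdata, type = "response")

print(all.equal(p_full, p_slim))
print(c(full = object.size(fit), slim = object.size(slim)))
```

Other methods such as summary() or residuals() are expected to break on the stripped object, so this only suits fits kept purely for prediction.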

Re: [R] Saving fits (glm, nls) without data

2010-09-07 Thread Johann Hibschman
David Winsemius writes: > On Sep 7, 2010, at 11:02 AM, Johann Hibschman wrote: >> Even so, I would prefer to only save the coefficients > > Have you read through the Value section of glm's help page? > > ...and > > ?coef I have; it's easy to get the coeffici

[R] Saving fits (glm, nls) without data

2010-09-07 Thread Johann Hibschman
Is there any package that assists in saving and reconstituting glm and nls fits without bringing along the accompanying data? A quick search on CRAN didn't turn up anything. If not, how do other people deal with saving the coefficients of model fits? For example, I've run a glm fit that has 23 c

Re: [R] memory use without running gc()

2010-08-10 Thread Johann Hibschman
Allan Engelhardt writes: > ### Method 2 > ## Setup > file <- paste("/proc", Sys.getpid(), "stat", sep = "/") > what <- vector("list", 44); what[[23]] <- integer(0) > ## In your logging routine > vsz <- scan(file, what = what, quiet = TRUE)[[23]]/1024 > cat("Virtual size: ", vsz, "\n", sep = "")

Re: [R] memory use without running gc()

2010-08-06 Thread Johann Hibschman
jim holtman writes: > ?memory.size Only works on Windows. I guess I should have specified; this is on Linux. Thanks, Johann

[R] memory use without running gc()

2010-08-06 Thread Johann Hibschman
Is there any way to get the current memory used by R without running gc()? I'd like to include the memory usage in logging output, but calling gc() to get that information takes long enough to be noticeable (~ 6 s with ~ 20 GB of ram in use), which goes against the point of logging. Thanks, Johan

Re: [R] Using '[' as a function

2010-08-03 Thread Johann Hibschman
Duncan Murdoch writes: > On 29/07/2010 6:18 PM, chipmaney wrote: >> >> -Why does R recognize '[' as a function? > > Because it is a function. More explicitly, '[' is a string. sapply then calls match.fun to look up that string to get the function named '['. >> -Why does it need the quotes? >
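
A small illustration of the explanation above, using made-up data: sapply() accepts either the string "[" (which match.fun() resolves to the function) or the function itself written with backticks, and both behave like an explicit anonymous function.

```r
xs <- list(c(10, 20, 30), c(40, 50, 60))

# match.fun() turns the string "[" into the subsetting function.
f <- match.fun("[")
print(f(xs[[1]], 2))

second_a <- sapply(xs, "[", 2)               # string, looked up via match.fun()
second_b <- sapply(xs, `[`, 2)               # the function object itself
second_c <- sapply(xs, function(v) v[2])     # equivalent explicit form

print(second_a)
```

The quotes (or backticks) are needed because a bare [ is not a syntactically valid name on its own.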

Re: [R] Kalman Filter

2010-05-30 Thread Johann Hibschman
Greigiano Jose Alves writes: > I am working on an article forecasting, which use the dynamic linear model, > a model state space. I am wondering all the commands in R, to represent the > linear dynamic model and Kalman filter. > I am available for any questions. There are a few libraries out the

[R] export R data as web service

2010-04-01 Thread Johann Hibschman
I'd like to access data in my R session from elsewhere via HTTP. My particular use case would be R on Linux, accessing data from Windows, all on the same intranet. Naively, when I say this, I imagine something like: > theData <- big.calculation.returning.data.frame() > startHttpServer(port=8675)

[R] predict.Arima: warnings from xreg magic

2010-03-31 Thread Johann Hibschman
When I run predict.Arima in my code, I get warnings like: Warning message: In cbind(intercept = rep(1, n), xreg) : number of rows of result is not a multiple of vector length (arg 1) I think this is because I'm not running predict.Arima in the same environment that I did the fit, so the d

Re: [R] R Memory Problem

2010-01-25 Thread Johann Hibschman
prem_R writes: > I'm running predictive analytics using R and to calibrate my model I > used to adjust the variables used in the model and the problem happens > here. R just runs out of memory. I tried garbage cleaning also. I'm analyzing an 8 GB data set using R, so it can certainly handle large

[R] zoo: bug with unique for yearmon

2009-11-09 Thread Johann Hibschman
to interpret those numbers. Clearly, I'm being bitten by the floating-point representation, but the only "complex" thing I did was to manually lag a time series by assigning date <- date + 1/12, and I was hoping that the yearmon class would apply some magic to normalize the repr
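
The floating-point issue can be sketched in base R alone, without zoo. This is not zoo's implementation: months are stored as plain doubles of the form year + (month - 1)/12, and normalise_month() is a hypothetical helper that snaps each value to the nearest whole month before comparing.

```r
# Months of 2009 as year + (month - 1)/12, stored as plain doubles.
months <- 2009 + (0:11) / 12
lagged <- months + 1 / 12               # manual lag by one month, as in the post

# Hypothetical helper: snap each value to the nearest whole month.
normalise_month <- function(x) round(x * 12) / 12

# The same 11 months (Feb..Dec 2009), computed in two different ways.
raw   <- c(lagged[1:11], months[2:12])
fixed <- normalise_month(raw)

print(length(unique(raw)))     # can exceed 11 if any pair differs in the last bits
print(length(unique(fixed)))   # 11
```

Rounding to an integer month count before dividing again guarantees that two routes to the same month produce bit-identical doubles, which is what unique() compares.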

Re: [R] Outputing multilple subsets

2009-11-08 Thread Johann Hibschman
x. Also, you will be overwriting the same file, called "c_num.csv", on each iteration. You should try something more like: for (n in num) { c.n <- c[c$b==n,] write.csv(c.n, file=paste("c:/c_", n, ".csv", sep="")) } I hope that helps. Cheers, Johan

[R] random text added to names (bug with 2.10.0?)

2009-11-03 Thread Johann Hibschman
I'm using 2.10.0 on Linux (64 bit), and I just noticed that random numbers are occasionally added to the text of names in vectors. It's happened to me in two separate, long-running R sessions, but I can't find a way to reproduce it in a smaller setting. The code I'm using is > diag.gam.2 <- mdl.r

[R] Plotting fit marginals, multiple plots on same x-axis

2009-10-08 Thread Johann Hibschman
I'm trying to plot the "marginals" of a fit: the aggregated value of the actual and predicted vs. a cut/bucketed dimension. (The data set is huge, so just plotting the raw points would be unintelligible.) I'd also like to plot the number of points in each bucket (or, rather, the sum of the weights