[R] Large data sets with R (binding to hadoop available?)

2008-08-21 Thread Avram Aelony
Dear R community, I find R fantastic and use R whenever I can for my data analytic needs. Certain data sets, however, are so large that other tools seem to be needed to pre-process data such that it can be brought into R for further analysis. Questions I have for the many expert contrib

Re: [R] Large data sets with R (binding to hadoop available?)

2008-08-29 Thread Avram Aelony
package, for instance. I am currently playing with parallelizing R computations via Hadoop. I haven't looked at PIG yet though. Rory -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Roland Rau Sent: 21 August 2008 20:04 To: Avram Aelony Cc: r

Re: [R] Memory issues in R

2009-04-27 Thread Avram Aelony
Others may have mentioned this, but you might try loading your data in a small database like MySQL and then pulling smaller portions of your data in via a package like RMySQL or RODBC. One approach might be to split the data file into smaller pieces outside of R, then read the smaller pie
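
A rough sketch of the "smaller pieces" idea in base R, without a database: read a delimited file a few rows at a time through an open connection and keep only a running summary. The file, its columns (id, value), and the chunk size of 4 are all made up for illustration; they are not from the original post.

```r
# Build a small demo file; a real file would be too large to load at once.
path <- tempfile(fileext = ".csv")
writeLines(c("id,value", paste(1:10, (1:10) * 2, sep = ",")), path)

con <- file(path, open = "r")
# The first line holds the column names.
header <- strsplit(readLines(con, n = 1), ",")[[1]]

total <- 0
repeat {
  # Pull the next chunk of raw lines; an empty result means end of file.
  lines <- readLines(con, n = 4)
  if (length(lines) == 0) break
  chunk <- read.csv(text = lines, header = FALSE, col.names = header)
  # Only the running sum is kept, never the whole data set.
  total <- total + sum(chunk$value)
}
close(con)
```

Only one chunk is in memory at any time, so the chunk size can be tuned to whatever the machine tolerates.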

[R] help converting for loop to vector operation

2009-04-29 Thread Avram Aelony
Dear List, I have a wrapper function that draws a graph that I'd like to use in a vector-like manner. The for-loop version I currently use is below. library(ggplot2) data(economics) h <- 600 w <- 800 #-- draw_metric_by_date <- functio

Re: [R] help converting for loop to vector operation

2009-04-29 Thread Avram Aelony
Dear List, Hadley offered the following solution: >library(plyr) >l_ply(2:6, draw_metric_by_date, df = economics, smooth = TRUE, BASEPATH = >basepath) Many thanks, Avram On Wednesday, April 29, 2009, at 12:59PM, "Avram Aelony" wrote: > >Dear List, > >
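
For readers without plyr, a base R call with the same shape is invisible(lapply(...)), which also runs a function once per index purely for its side effects. The sketch below assumes a stand-in function, draw_column, and the built-in mtcars data; the poster's draw_metric_by_date is not shown in full, so it is not reproduced here. It writes PDF files rather than PNGs so that it runs on any R installation.

```r
# Hypothetical output directory for the demo plots.
out_dir <- file.path(tempdir(), "column_plots")
dir.create(out_dir, showWarnings = FALSE)

# Stand-in wrapper: plot column i of df into its own file under basepath.
draw_column <- function(i, df, basepath) {
  out_file <- file.path(basepath, paste0(names(df)[i], ".pdf"))
  pdf(out_file)
  on.exit(dev.off())  # close the device even if plot() fails
  plot(df[[i]], type = "l", ylab = names(df)[i])
  invisible(out_file)
}

# Extra named arguments are passed through to every call, as with l_ply().
invisible(lapply(2:4, draw_column, df = mtcars, basepath = out_dir))

written <- list.files(out_dir, pattern = "\\.pdf$")
```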

[R] database table merging tips with R

2008-09-11 Thread Avram Aelony
Dear R list, What is the best way to efficiently marry an R dataset with a very large (Oracle) database table? The goal is to only return Oracle table rows that match IDs present in the R dataset. I have an R data frame with 2000 user IDs analogous to: r = data.frame(userid=round(runif(20
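
One possible approach, sketched here under stated assumptions, is to push the IDs into the query itself as an IN list. Oracle limits an IN list to 1000 expressions, so 2000 IDs need at least two queries. The table name user_table and the integer IDs below are placeholders, and the sketch only builds the SQL strings; sending them through whichever database package is in use is left out.

```r
# Placeholder IDs; integers avoid scientific notation when pasted into SQL.
ids <- 1:2000

# Split the IDs into groups of at most 1000 (Oracle's IN-list limit).
chunks <- split(ids, ceiling(seq_along(ids) / 1000))

# One query string per group; user_table is a hypothetical table name.
queries <- vapply(
  chunks,
  function(x) {
    sprintf("SELECT * FROM user_table WHERE userid IN (%s)",
            paste(x, collapse = ", "))
  },
  character(1)
)
```

The per-query results could then be combined with rbind(). For much larger ID sets, loading the IDs into a temporary table and joining on the database side is usually the better trade.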

Re: [R] database table merging tips with R

2008-09-11 Thread Avram Aelony
y the >> matching rows out. >> >> -Aaron >> >> On Thu, Sep 11, 2008 at 2:33 PM, Avram Aelony <[EMAIL PROTECTED]> wrote: >>> >>> Dear R list, >>> >>> What is the best way to efficiently marry an R dataset with a very large &

Re: [R] database table merging tips with R

2008-09-11 Thread Avram Aelony
ble(connection, "r_user_ids") > >Of course, I don't know whether the ODBC driver implements these >functions or not. (Is 'RODBC' built on DBI? Looks like Aaron and I >have been assuming that.) > >Coey > > > -Aaron > > > > On Thu, S

Re: [R] database table merging tips with R

2008-09-11 Thread Avram Aelony
. Avram On Thursday, September 11, 2008, at 02:19PM, "Coey Minear" <[EMAIL PROTECTED]> wrote: >Avram Aelony writes: > > > > I have not devoted time to setting up ROracle since binaries are > > not available and it seems to require some effort to compile (se

[R] box-whisker plot from pre-summarized data?

2008-09-29 Thread Avram Aelony
Hello, Although my summary descriptives are generated outside of R (dataset is huge), I would like to produce a box-whisker plot using bxp or perhaps a function from the ggplot2 library using the precomputed summaries. My dataset currently contains 10 rows (one row per week) with the fo

Re: [R] Doing a Task Without Using a For Loop

2008-10-14 Thread Avram Aelony
or perhaps... data1$NinYear <- with(data1, ave(ID, Year, FUN = length)) > unique(data1) ID Year NinYear 1 209 1971 2 3 213 1951 2 5 213 1953 20 20 213 1954 11 31 213 1955 2 33 234 1953 20 38 234 1958 2 40 234 1965 3 43 249 1952 2 A
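
A small self-contained illustration of the ave() idiom quoted above. The five-row data1 here is invented for the example and is not the poster's data. ave() returns a vector as long as its input, holding the group-level result repeated on every row of the group.

```r
# Toy data: two rows in 1971, two in 1951, one in 1953.
data1 <- data.frame(
  ID   = c(209, 209, 213, 213, 213),
  Year = c(1971, 1971, 1951, 1951, 1953)
)

# Count the rows in each Year and attach that count to every row.
data1$NinYear <- with(data1, ave(ID, Year, FUN = length))

# Drop rows that are now exact duplicates.
deduped <- unique(data1)
```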

Re: [R] png() and jpg() devices acting weird.

2008-12-21 Thread Avram Aelony
I experienced a similar problem generating PNGs on Linux and found that installing a missing font library corrected the situation. Hope this helps, Avram On Dec 19, 2008, at 12:46 PM, Jeroen Ooms wrote: I use a CentOS 5.2 VPS to generate graphs through an R web-application. I generate

[R] bottom legends in ggplot2 ?

2009-02-26 Thread Avram Aelony
Has anyone had success with producing legends to a qplot graph such that the legend is placed on the bottom, under the abscissa rather than to the right-hand side? The following doesn't move the legend: library(ggplot2) qplot(mpg, wt, data=mtcars, colour=cyl, gpar(legend.position=

Re: [R] ways to put multiple graphs on single page (using ggplot2)

2009-03-02 Thread Avram Aelony
I usually have a function like this: vplayout <- function(x, y) viewport(layout.pos.row=x, layout.pos.col=y) draw4 <- function(pngname, a,b,c,d,w,h) { png(pngname,width=w, height=h) grid.newpage() pushViewport(viewport(layout=grid.layout(2,2) ) ) print(a, vp=vplayout(1,1))
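
The layout mechanics in the function above can be tried with the grid package alone. The sketch below keeps the vplayout() helper from the post but fills each cell of the 2 x 2 layout with a text label instead of a ggplot2 object, and writes to a temporary PDF; the output file and the labels are illustrative assumptions.

```r
library(grid)

# Helper from the post: address one cell of the parent layout.
vplayout <- function(x, y) viewport(layout.pos.row = x, layout.pos.col = y)

out_file <- tempfile(fileext = ".pdf")
pdf(out_file, width = 6, height = 6)

grid.newpage()
# The pushed viewport carries the 2 x 2 layout that vplayout() refers to.
pushViewport(viewport(layout = grid.layout(2, 2)))

for (row in 1:2) {
  for (col in 1:2) {
    # With ggplot2, print(p, vp = vplayout(row, col)) would go here.
    grid.rect(vp = vplayout(row, col))
    grid.text(sprintf("panel %d,%d", row, col), vp = vplayout(row, col))
  }
}

popViewport()
dev.off()
```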

[R] Help with maps

2008-12-02 Thread Avram Aelony
A few questions about maps... (1) How can I find a listing of the internal data sets that map() from the maps library contains? For example, "usa", "county", "state", "nz" all work. Are there any others? (2) Is there an easier, more generalized way to produce this (http://www.ai.rug.nl/~hedder

Re: [R] Help with maps

2008-12-02 Thread Avram Aelony
On Tuesday, December 02, 2008, at 04:40PM, "hadley wickham" <[EMAIL PROTECTED]> wrote: >On Tue, Dec 2, 2008 at 6:21 PM, Avram Aelony <[EMAIL PROTECTED]> wrote: >> A few questions about maps... >> >> (1) How can I find a listing of the internal dat

Re: [R] Help with maps

2008-12-03 Thread Avram Aelony
aha!, thanks for this. Avram On Dec 2, 2008, at 5:38 PM, Ray Brownrigg wrote: On Wed, 03 Dec 2008, Ray Brownrigg wrote: The easiest way would be: map('world', regions="UK", xlim=c(-10, 5), ylim=c(48, 60)) But of course: map('world', regions=c("UK", "Ireland"), xlim=c(-10, 5), ylim=c(48,

[R] bigmemory with reshape?

2008-12-10 Thread Avram Aelony
Hello, I am running into memory boundaries and would like to try to make use of the bigmemory (or any other memory enabling) library. Can anyone help with suggestions as to how this might work? > library(reshape) > s <- melt( d[,1:62], id=c(1) ) Error: cannot allocate vector of size 16.0 Mb >