Hi there,
Sometimes download.file() fails to download the file and I would like to
remove the corresponding file.
The issue is that I am not able to do it and Windows complains that the
file is in use by another application.
I tried closeAllConnections() or unlink() before removing the file but
wit
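The cleanup being attempted can be sketched as follows; the URL is hypothetical, and the pattern simply mirrors the calls the poster mentions (tryCatch around download.file(), then closeAllConnections() and unlink() on failure):

```r
# Sketch of the cleanup pattern; the URL is hypothetical and will fail.
url <- "https://example.invalid/data.zip"
destfile <- tempfile(fileext = ".zip")

status <- tryCatch(
  download.file(url, destfile, mode = "wb"),
  warning = function(w) 1L,
  error   = function(e) 1L
)

if (status != 0L && file.exists(destfile)) {
  closeAllConnections()  # release any handle left open by the failed download
  unlink(destfile)
}
```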
Hi Harold,
Generally: you cannot beat data.table, unless you can represent your
data in a matrix (or array or vector). For some specific cases, Hervé's
suggestion might also be competitive.
Your problem is that you did not put any effort into reading at least part of
the very extensive documentati
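For reference, a minimal sketch of the kind of grouped aggregation data.table is recommended for here (assuming the data.table package is installed; the column names follow the thread's running example):

```r
library(data.table)

tmp <- data.table(id = rep(1:2000, each = 100), foo = rnorm(200000))
setkey(tmp, id)                       # index by id for fast keyed subsetting
res <- tmp[, .(m = mean(foo)), by = id]   # one grouped pass over the data
```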
> On Sep 28, 2016, at 9:49 AM, Greg Snow <538...@gmail.com> wrote:
>
> There are multiple ways of doing this, but here are a couple.
>
> To just test the fixed effect of treatment you can use the glm function:
>
> test <- read.table(text="
> replicate treatment n X
> 1 A 32 4
> 1 B 33 18
> 2 A
You can find an example of annotating lattice graphics with text anywhere
on the graphics device using the pagenum package. See the vignette here:
https://cran.r-project.org/web/packages/pagenum/vignettes/pagenum.html
The pagenum package uses the grid package to add viewports for the
annotation.
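Independent of pagenum, the underlying grid approach can be sketched directly: print the lattice plot, then write text anywhere on the device in normalized (npc) coordinates. The label and coordinates below are illustrative:

```r
library(lattice)
library(grid)

pdf(tf <- tempfile(fileext = ".pdf"))
print(xyplot(mpg ~ wt, data = mtcars))
# annotate anywhere on the graphics device in npc coordinates
grid.text("Page 1 of 3",
          x = unit(0.95, "npc"), y = unit(0.02, "npc"),
          just = c("right", "bottom"))
dev.off()
```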
On 09/28/2016 02:53 PM, Hervé Pagès wrote:
Hi,
I'm surprised nobody suggested split(). Splitting the data.frame
upfront is faster than repeatedly subsetting it:
tmp <- data.frame(id = rep(1:2, each = 10), foo = rnorm(20))
idList <- unique(tmp$id)
system.time(for (i in idList) tmp
> On Sep 27, 2016, at 8:11 PM, Karl Neergaard wrote:
>
> Thank you David for taking time to answer my not so helpful question.
>
I thought your question had sufficient detail for at least a reasonable guess
at an answer. When I first started using R I also thought that the gam function
would
"I'm surprised nobody suggested split()."
I did.
by() is a data frame oriented version of tapply(), which uses split().
Cheers,
Bert
Bert Gunter
"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom C
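The relationship Bert describes can be checked directly; all three compute the same group means on the thread's toy data:

```r
tmp <- data.frame(id = rep(1:3, each = 5), foo = rnorm(15))

res_tapply <- tapply(tmp$foo, tmp$id, mean)
res_by     <- by(tmp, tmp$id, function(d) mean(d$foo))   # data.frame-oriented
res_split  <- sapply(split(tmp$foo, tmp$id), mean)       # what both use underneath

all.equal(as.vector(res_tapply), as.vector(res_split))   # TRUE
```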
Hi,
I'm surprised nobody suggested split(). Splitting the data.frame
upfront is faster than repeatedly subsetting it:
tmp <- data.frame(id = rep(1:2, each = 10), foo = rnorm(20))
idList <- unique(tmp$id)
system.time(for (i in idList) tmp[which(tmp$id == i),])
# user system el
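Completing the idea: split once, then iterate over the resulting list instead of re-subsetting the data.frame on every pass:

```r
tmp <- data.frame(id = rep(1:2, each = 10), foo = rnorm(20))

pieces <- split(tmp, tmp$id)          # one subsetting pass instead of many
res <- lapply(pieces, function(d) mean(d$foo))
```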
I just modified the reproducible example a bit, so it's more
realistic. The function "mean" could easily be replaced by your analysis.
And here are some possible solutions:
tmp <- data.frame(id = rep(1:2000, each = 100), foo = rnorm(200000))
tmp <- tmp[sample(dim(tmp)[1]),] # shuffle the rows
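One plausible solution of the kind this message sets up (an assumption, since the original is cut off) is a single split() followed by vapply():

```r
tmp <- data.frame(id = rep(1:2000, each = 100), foo = rnorm(200000))
tmp <- tmp[sample(nrow(tmp)), ]       # shuffle the rows, as in the example

# group-wise mean via one split instead of repeated subsetting
res <- vapply(split(tmp$foo, tmp$id), mean, numeric(1))
```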
I regularly crunch through this amount of data with tidyverse. You can also
try the data.table package. Both are optimized for speed, as long as you
have the memory.
Dominik
On Wed, Sep 28, 2016 at 10:09 AM, Doran, Harold wrote:
> I have an extremely large data frame (~13 million rows) that rese
On Wed, 28 Sep 2016, "Doran, Harold" writes:
> I have an extremely large data frame (~13 million rows) that resembles
> the structure of the object tmp below in the reproducible code. In my
> real data, the variable, 'id' may or may not be ordered, but I think
> that is irrelevant.
>
> I have a p
Try changing:
v_list_of_files[v_file]
to:
v_list_of_files[[v_file]]
Also, are you sure you are not generating warnings? For example,
l = list()
l["iris"] = iris;
Also, you can change it to lapply(v_files, function(v_file){...})
Have a good one,
Jeremiah
On Wed, Sep 28, 2016 at 8:02 AM, wrote:
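The warning alluded to above can be demonstrated: single-bracket assignment treats a data.frame as a list of columns, so l["iris"] receives five items for one slot, while [[ stores the whole data.frame:

```r
l <- list()

# [ replaces one element with a 5-column data.frame -> warning,
# and only the first column ends up in the list
w <- tryCatch({ l["iris"] <- iris; NULL },
              warning = function(w) conditionMessage(w))

l2 <- list()
l2[["iris"]] <- iris   # stores the whole data.frame, no warning
```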
Note that for base R, by() is considerably faster than aggregate()
(both of which are *much* faster than the sapply() stuff); tapply() is
what is more appropriate here.
(for Constantin's example):
> system.time({
+ res4 <- aggregate(tmp$foo, by = list(id=tmp$id), FUN = mean)
+ })
user syste
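A runnable version of the comparison described above (timings will vary by machine, so none are quoted here):

```r
tmp <- data.frame(id = rep(1:2000, each = 100), foo = rnorm(200000))

system.time(res_agg <- aggregate(tmp$foo, by = list(id = tmp$id), FUN = mean))
system.time(res_by  <- by(tmp$foo, list(id = tmp$id), FUN = mean))
system.time(res_tap <- tapply(tmp$foo, tmp$id, mean))   # leanest of the three
```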
Many thanks. I also tried the filter function in dplyr and it was also much
slower than simply indexing the way the original code did.
system.time(replicate(500, filter(tmp, id == idList[1])))
I did this on the toy example as well as the real data, finding the same
(slower) result each time c
Don't do it this way. You are reinventing wheels.
1. Look at package dplyr, which has optimized functions to do exactly
this (break into subframes, calculate on subframes, reassemble). Note
also that dplyr is part of tidyverse. I use base R functionality for
this because I know it and it does wha
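A minimal sketch of the dplyr split-apply-combine idiom being recommended (assuming the dplyr package is installed):

```r
library(dplyr)

tmp <- data.frame(id = rep(1:2000, each = 100), foo = rnorm(200000))

res <- tmp %>%
  group_by(id) %>%            # break into subframes
  summarise(m = mean(foo))    # calculate on each, then reassemble
```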
Hello,
If you work with a matrix instead of a data.frame, it usually runs
faster, but your column vectors must all be numeric.
### Fast, but not fast enough
system.time(replicate(500, tmp[which(tmp$id == idList[1]),]))
user system elapsed
0.05    0.00    0.04
### Not fast at all, a
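The matrix version of the same subsetting can be sketched as follows; with an all-numeric matrix the logical test avoids data.frame overhead:

```r
m <- cbind(id = rep(1:2000, each = 100), foo = rnorm(200000))

# logical subsetting on a numeric matrix
system.time(replicate(500, m[m[, "id"] == 1, ]))
```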
There are multiple ways of doing this, but here are a couple.
To just test the fixed effect of treatment you can use the glm function:
test <- read.table(text="
replicate treatment n X
1 A 32 4
1 B 33 18
2 A 20 6
2 B 21 18
3 A 7 0
3 B 8 4
", header=TRUE)
fit1 <- glm( cbind(X,n-X) ~ treatment, da
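The call above is cut off; here is a runnable completion, assuming it ends with data = test and a binomial family (implied by the cbind(X, n - X) response):

```r
test <- read.table(text = "
replicate treatment n X
1 A 32 4
1 B 33 18
2 A 20 6
2 B 21 18
3 A 7 0
3 B 8 4
", header = TRUE)

# binomial GLM on successes/failures per group (family is an assumption)
fit1 <- glm(cbind(X, n - X) ~ treatment, data = test, family = binomial)
summary(fit1)
```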
Thank you very much. I don’t know tidyverse, I’ll look at that now. I did some
tests with data.table package, but it was much slower on my machine, see
examples below
tmp <- data.frame(id = rep(1:200, each = 10), foo = rnorm(2000))
idList <- unique(tmp$id)
system.time(replicate(500, tmp[which(
I have an extremely large data frame (~13 million rows) that resembles the
structure of the object tmp below in the reproducible code. In my real data,
the variable, 'id' may or may not be ordered, but I think that is irrelevant.
I have a process that requires subsetting the data by id and then
Hi All,
I need to read a bunch of Excel files and store them in R.
I decided to store the different Excel files in data.frames in a named
list where the names are the file names of each file (and that is
different from the sources as far as I can see):
-- cut --
# Sources:
# -
http://stackove
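A minimal sketch of reading a folder of Excel files into a named list; the readxl package and the "data" folder are assumptions, since the original code is cut off:

```r
library(readxl)

# "data" is a hypothetical folder of .xlsx files
v_files <- list.files("data", pattern = "\\.xlsx$", full.names = TRUE)
v_list_of_files <- lapply(v_files, read_excel)
names(v_list_of_files) <- basename(v_files)   # key each data.frame by file name
```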
Hi,
maybe the package vegan with its tutorials is a good starting point,
too...
http://cc.oulu.fi/~jarioksa/opetus/metodi/vegantutor.pdf
http://cc.oulu.fi/~jarioksa/opetus/metodi/sessio2.pdf
all the best,
Albin
On 22.09.2016 10:23 PM, David L Carlson wrote:
Looking at your data there ar