[R] Compression data to variable using gzcon

2025-01-31 Thread Jan van der Laan
I wanted to compress a raw vector into another raw vector. This problem is solved using memCompress. However before discovering this function, I first tried to get this working using gzcon. From the documentation it seems that this should be possible: > Compressed output will contain e
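A minimal sketch of the memCompress route mentioned in this message, assuming gzip compression; the input vector and variable names are made up for illustration:

```r
# Illustrative input: a raw vector with plenty of repetition (hypothetical data)
x <- charToRaw(paste(rep("hello world ", 100), collapse = ""))

# Compress raw -> raw entirely in memory (base R)
z <- memCompress(x, type = "gzip")

# Decompress again to check the round trip
y <- memDecompress(z, type = "gzip")

round_trip_ok <- identical(x, y)
```

Both functions take and return raw vectors, so no connection or temporary file is involved.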

Re: [R] Regarding Issue Running Parallel Computing on Linux RHEL version 8

2025-01-14 Thread Jan van der Laan
On 1/14/25 10:07, akshay kulkarni wrote: dear Ivan, THe present problem had been encountered by me also about 1 and a half years ago. You had solved the issue then. Can't we search this mail list according to some keywords? It helps people with problems already solved in t

Re: [R] Time zones in POSIClt objects

2024-10-11 Thread Jan van der Laan
On 10/11/24 11:56, Rui Barradas wrote: Hello, A way to have different time zones is to store t1 and t2 in list, which are vectors. Just not atomic vectors. I think it complicates what should be simple, but here it is. # create two lists t1 <- lapply(c("2024-01-01 12:30", "2024-01-01 12:30

Re: [R] Time zones in POSIClt objects

2024-10-11 Thread Jan van der Laan
Thanks, On 10/11/24 09:10, Ivan Krylov wrote: On Thu, 10 Oct 2024 17:16:52 +0200 Jan van der Laan wrote: This is where it is unclear to me what the purpose is of the `zone` element of the POSIXlt object. It does allow for registering a time zone per element. It just seems to be ignored. I

Re: [R] Time zones in POSIClt objects

2024-10-10 Thread Jan van der Laan
T and it would be fine to convert them to a single time zone if that is what it takes to work with them in R. So, I guess, I could split the vector in two, convert local time to GMT and combine them again (respecting the original order). Jan On October 10, 2024 6:46:19 AM PDT, Jan van

[R] Time zones in POSIClt objects

2024-10-10 Thread Jan van der Laan
It is not completely clear to me how time zones work with POSIXlt objects. For POSIXct, I can understand what happens: time is always stored in GMT, the `tzone` attribute only affects how the times are displayed. All computations etc. are done in GMT. POSIXlt objects have both a `tzone` att
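The POSIXct behaviour described here can be sketched as follows, assuming the time zone database knows "Europe/Amsterdam" (the zone and the variable names are only illustrative):

```r
# One instant, stored as seconds since the epoch
t_utc <- as.POSIXct("2024-01-01 12:30", tz = "UTC")

# Changing the tzone attribute changes only how the instant is displayed
t_ams <- t_utc
attr(t_ams, "tzone") <- "Europe/Amsterdam"

same_instant <- as.numeric(t_utc) == as.numeric(t_ams)
shown_utc <- format(t_utc, "%H:%M")
shown_ams <- format(t_ams, "%H:%M")
```

The underlying number is identical for both objects; only the formatted clock time differs.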

Re: [R] Removing polygons from shapefile of Scotland and Islands

2024-05-14 Thread Jan van der Laan
I believe mapshaper has functionality for removing small 'islands'. There is a web interface for mapshaper, but I see there is also an R-package (see https://search.r-project.org/CRAN/refmans/rmapshaper/html/ms_filter_islands.html for island removal). If you want to manually select which islan

Re: [R] Is it possible to get a downward pointing solid triangle plotting symbol in R?

2023-10-06 Thread Jan van der Laan
thing about these issues though I do hit problems exchanging things with my Spanish speaking colleagues).  Jan or anyone: any simple reassurance or pointers to resources I should best use for homework about these issues? TIA (again!) Chris On 06/10/2023 12:55, Jan van der Laan wrote: You are

Re: [R] Is it possible to get a downward pointing solid triangle plotting symbol in R?

2023-10-06 Thread Jan van der Laan
B" = "grey",   "C" = "green"),    labels = c("Deteriorated",   "No change",   "Improved")) +   scale_fill_manual(name = "Change"

Re: [R] Is it possible to get a downward pointing solid triangle plotting symbol in R?

2023-10-06 Thread Jan van der Laan
Does adding , show.legend = c("color"=TRUE, "fill"=FALSE) to the geom_point do what you want? Best, Jan On 06-10-2023 11:09, Chris Evans via R-help wrote: library(tidyverse) tibble(x = 2:9, y = 2:9, c = c(rep("A", 5), rep("B", 3))) -> tmpTibPoints tibble(x = c(1, 5, 5, 1), y = c(1, 1, 5, 5)

Re: [R] overlay shaded area in base r plot

2023-09-19 Thread Jan van der Laan
Shorter/simpler alternative for adding an alpha channel: adjustcolor("lightblue", alpha = 0.5) So I would use something like: # Open new plot; make sure limits are ok; but don't plot plot(0, 0, xlim=c(1,20), ylim = range(c(mean1+sd1, mean2+sd2, mean1-sd1, mean2-sd2)), type="n", las=1, xla

Re: [R] Obtaining R-squared from All Possible Combinations of Linear Models Fitted

2023-07-18 Thread Jan van der Laan
The dredge function has an `extra` argument to get other statistics: optional additional statistics to be included in the result, provided as functions, function names or a list of such (preferably named or quoted). As with the rank argument, each function must accept as an argument a fitted mo

Re: [R] Plotting directly to memory?

2023-05-28 Thread Jan van der Laan
Perhaps the ragg package? That has an `agg_capture` device "that lets you access the device buffer directly from your R session." https://github.com/r-lib/ragg HTH, Jan On 28-05-2023 13:46, Duncan Murdoch wrote: Is there a way to open a graphics device that plots entirely to an array or

Re: [R] nth kludge

2023-03-09 Thread Jan van der Laan
Hi Avi, list, Below an alternative suggestion: func <- function(a, b, c) { list(a, b, c) } 1:3 |> list(x = _) |> with(func(a, x, b)) Not sure if this is more readable than some of the other solutions, e.g. your solution, but you could make a variant of with more specific for this use case

Re: [R] foreign package: unable to read S-Plus objects

2023-01-17 Thread Jan van der Laan
You could try to see what stattransfer can make of it. They have a free version that imports only part of the data. You could use that to see if stattransfer would help and perhaps discover what format it is in. HTH Jan On 16-01-2023 23:22, Joseph Voelkel wrote: Dear foreign maintainers and

Re: [R] Reading very large text files into R

2022-09-29 Thread Jan van der Laan
You're sure the extra column is indeed an extra column? According to the documentation (https://artefacts.ceda.ac.uk/badc_datadocs/ukmo-midas/RH_Table.html) there should be 15 columns. Could it, for example, be that one of the columns contains records with commas? Jan On 29-09-2022 15:54

Re: [R] How to represent tree-structured values

2022-05-30 Thread Jan van der Laan
For visualising hierarchical data a treemap can also work well. For example, using the treemap package: n <- 1000 library(data.table) library(treemap) dta <- data.table(   level1 = sample(LETTERS[1:5], n, replace = TRUE),   level2 = sample(letters[1:5], n, replace = TRUE),   level3 = sample(1:

Re: [R] vectorization of loops in R

2021-11-17 Thread Jan van der Laan
Have a look at the base functions tapply and aggregate. For example see: - https://cran.r-project.org/doc/manuals/r-release/R-intro.html#The-function-tapply_0028_0029-and-ragged-arrays , - https://online.stat.psu.edu/stat484/lesson/9/9.2, - or ?tapply and ?aggregate. Also your current code se

Re: [R] Is there a hash data structure for R

2021-11-03 Thread Jan van der Laan
(listOfValues, parent = emptyenv()) Hope this helps! On Tue, Nov 2, 2021, 06:49 Yonghua Peng wrote: But for data.frame the colnames can be duplicated. Am I right? Regards. On Tue, Nov 2, 2021 at 6:29 PM Jan van der Laan wrote: True, but in a lot of cases where a python user might use a

Re: [R] Is there a hash data structure for R

2021-11-02 Thread Jan van der Laan
e a data.frame with multiple columns with the same name. But as Duncan Murdoch mentions you can usually control for that. Best, Jan On 02-11-2021 11:32, Yonghua Peng wrote: But for data.frame the colnames can be duplicated. Am I right? Regards. On Tue, Nov 2, 2021 at 6:29 PM Jan van der L

Re: [R] Is there a hash data structure for R

2021-11-02 Thread Jan van der Laan
True, but in a lot of cases where a python user might use a dict an R user will probably use a list; or when we are talking about arrays of dicts in python, the R solution will probably be a data.frame (with each dict field in a separate column). Jan On 02-11-2021 11:18, Eric Berger wro
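A rough sketch of the correspondence described here, using made-up field names; the environment variant at the end is an additional option for hash-style lookup, not something this message prescribes:

```r
# A Python-style dict as a named list
person <- list(name = "Ann", age = 30)
person[["city"]] <- "Utrecht"            # add a key
age <- person[["age"]]                   # look up a key
has_city <- "city" %in% names(person)    # test for a key

# An "array of dicts" as a data.frame, one column per field
people <- data.frame(name = c("Ann", "Bob"), age = c(30, 25))
second_age <- people$age[2]

# Hash-style lookup via an environment built from the list
lookup <- list2env(person, parent = emptyenv())
has_name <- exists("name", envir = lookup, inherits = FALSE)
```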

Re: [R] Getting different results with set.seed()

2021-08-19 Thread Jan van der Laan
What you could also try is check if the self coded functions use the random generator when defining them: starting_seed <- .Random.seed Step 1. Self-coded functions (these functions generate random numbers as well) # check if functions have modified the seed: all.equal(starting_seed, .Ra
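The check suggested here can be sketched like this; `quiet_fun` and `noisy_fun` are hypothetical stand-ins for the self-coded functions:

```r
set.seed(1)
starting_seed <- .Random.seed

# Hypothetical self-coded functions: one leaves the generator alone, one uses it
quiet_fun <- function(x) x + 1
noisy_fun <- function(n) runif(n)

invisible(quiet_fun(1))
unchanged_after_quiet <- identical(starting_seed, .Random.seed)

invisible(noisy_fun(1))
unchanged_after_noisy <- identical(starting_seed, .Random.seed)
```

Any step after which the comparison turns FALSE has consumed random numbers and will shift everything generated afterwards.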

Re: [R] Read fst files

2021-06-09 Thread Jan van der Laan
read_fst is from the package fst. The file format fst uses is a binary format designed to be fast to read. It is a column oriented format and compressed. So, to be able to work fst needs access to the file itself and won't accept a file connection as functions like read.table and variants ac

Re: [R] What is an alternative to expand.grid if create a long vector?

2021-04-20 Thread Jan van der Laan
This is an optimisation problem that you are trying to solve using a grid search. There are numerous methods for optimisation, see https://cran.r-project.org/web/views/Optimization.html for an overview for R. It really depends on the exact problem what method is appropriate. As Petr said h

Re: [R] What is an alternative to expand.grid if create a long vector?

2021-04-20 Thread Jan van der Laan
But even if you could have a generator that is superefficient and perform a calculation that is superfast the number of elements is ridiculously large. If we take 1 nanosec per element; the computation would still take: > (100^10)*1E-9/3600 [1] 2778 hours, or > (100^10)*1E-9/3600/24/

Re: [R] /usr/local/lib/R/site-library is not writable

2021-04-08 Thread Jan van der Laan
I would actually go a step in the other direction: per project libraries. For example by adding a .Rprofile file to your project directory. This ensures that everybody working on a project uses the same version of the packages (even on different machines e.g. on shared folders). This can g

Re: [R] Help with connection issue for R (just joined, leading R for our agency)

2020-12-15 Thread Jan van der Laan
Alejandra, If it was initially working ok, I would first check with the IT department if there has been a change to the configuration of the firewall, virus scanners, file system etc. as these can affect the performance of R-studio. R-studio uses a client-server setup on your machine, so a fi

Re: [R] saveRDS() and readRDS() Why?

2018-11-07 Thread Jan van der Laan
Are you sure you didn't do saveRDS("rawData", file = "rawData.rds") instead of saveRDS(rawData, file = "rawData.rds") ? This would explain the result you have under linux. In principle saveRDS and readRDS can be used to copy objects between R-sessions without losing information. What doe
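A small sketch of the difference between the two calls, written against a temporary file with a made-up data frame:

```r
rawData <- data.frame(id = 1:3, value = c(0.5, 1.5, 2.5))  # hypothetical data
path <- tempfile(fileext = ".rds")

# Intended call: the object itself is serialised
saveRDS(rawData, file = path)
restored <- readRDS(path)

# Suspected mistake: the character string "rawData" is serialised instead
saveRDS("rawData", file = path)
wrong <- readRDS(path)
```

In the second case readRDS returns a length-one character vector rather than the data.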

Re: [R] Plot a path

2018-11-01 Thread Jan van der Laan
Below a similar example, using sf and leaflet; plotting the trajectory on a background map. library(leaflet) library(sf) library(dplyr) # Generate example data gen_data <- function(id, n) { data.frame( id = id, date = 1:n, lat = runif(10, min = -90, max = 90), lon = runif(10,

Re: [R] Calculating just a single row of dissimilarity/distance matrix

2018-10-27 Thread Jan van der Laan
distance for the first 5 rows. but it did not work. Do you have any suggestion about it? On Fri, 26 Oct 2018 at 21:31, Jan van der Laan wrote: Using another implementation of the gower distance: library(gower) gower_dist(iris[1,], iris)

Re: [R] Calculating just a single row of dissimilarity/distance matrix

2018-10-26 Thread Jan van der Laan
Using another implementation of the gower distance: library(gower) gower_dist(iris[1,], iris) HTH, Jan On 26-10-18 15:07, Aerenbkts bkts wrote: I have a data-frame with 30k rows and 10 features. I would like to calculate distance matrix like below; gower_dist <- daisy(data-frame, metr

Re: [R] Erase content of dataframe in a single stroke

2018-09-27 Thread Jan van der Laan
Or testdf <- testdf[FALSE, ] or testdf <- testdf[numeric(0), ] which seems to be slightly faster. Best, Jan On 27-9-2018 at 10:32, PIKAL Petr wrote: Hm I would use testdf<-data.frame(A=c(1,2),B=c(2,3),C=c(3,4)) str(testdf) 'data.frame': 2 obs. of 3 variables: $ A: num 1 2 $ B:
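For illustration, both variants applied to the example data frame quoted in this message; the names `empty1` and `empty2` are only for this sketch:

```r
testdf <- data.frame(A = c(1, 2), B = c(2, 3), C = c(3, 4))

# Drop all rows but keep the columns and their types
empty1 <- testdf[FALSE, ]
empty2 <- testdf[numeric(0), ]
```

Either way the result has zero rows and the same column names and classes as the original.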

Re: [R] security using R at work

2018-08-09 Thread Jan van der Laan
You can also inadvertently transmit data to the internet using a package without being obviously 'stupid', e.g. by using a package that uses an external service for data processing. For example, some javascript visualisation libs can do that (not sure if those wrapped in R-packages do), or, for

Re: [R] Help understanding why glm and lrm.fit runs with my data, but lrm does not

2017-09-14 Thread Jan van der Laan
With lrm.fit you are fitting a completely different model. One of the things lrm does, is preparing the input for lrm.fit which in this case means that dummy variables are generated for categorical variables such as 'KILLIP'. The error message means that model did not converge after the maxi

Re: [R] Loading large .pxt and .asc datasets causes issues.

2016-02-23 Thread Jan van der Laan
First, the file does contain 302 columns; the variable layout (http://www.cdc.gov/brfss/annual_data/2006/varlayout_table_06.htm) contains 302 columns. So, reading the SAS file probably works correctly. Second, the read.asc function you use is for reading geographic raster files, not fixed wid

Re: [R] Coding systems.

2013-11-26 Thread Jan van der Laan
Could it be that your r-script is saved in a different encoding than the one used by R (which will probably be UTF8 since you're working on linux)? -- Jan gerald.j...@dgag.ca wrote: Hello, I am using R, 2.15.2, on a 64-bit Linux box. I run R through Emacs' ESS. R runs in a French,

Re: [R] Reading in csv data with ff package

2013-11-19 Thread Jan van der Laan
The following seems to work: data = read.csv.ffdf(x=NULL,file="data.csv",nrows=1001,first.rows = 500, next.rows = 1005,sep=",",colClasses = c("integer","factor","logical")) 'character' doesn't work because ff does not support character vectors. Character vector need to be stored as factors.

Re: [R] laf_open_fwf

2013-08-09 Thread Jan van der Laan
admin.ch -Original Message- From: Jan van der Laan [mailto:rh...@eoos.dds.nl] Sent: Friday, 9 August 2013 10:01 To: Kamenik Christian ASTRA Subject: Re: AW: AW: [R] laf_open_fwf Christian, It seems some of the lines in your file have additional characters at the end causing the line

Re: [R] laf_open_fwf

2013-08-08 Thread Jan van der Laan
89 Fax +41 31 323 43 21 christian.kame...@astra.admin.ch www.astra.admin.ch -Original Message- From: Jan van der Laan [mailto:rh...@eoos.dds.nl] Sent: Wednesday, 7 August 2013 20:57 To: r-help@r-project.org Cc: Kamenik Christian ASTRA Subject: Re: [R] laf_open_fwf Dear Christian

Re: [R] read.table.ffdf and fixed width files

2013-08-07 Thread Jan van der Laan
What probably is the problem is that read.table.ffdf uses the nrows argument to read the file in chunks. However, read.fwf doesn't use an nrows argument but an n argument. One (untested) solution is to write a wrapper around read.fwf and pass this wrapper to read.table.ffdf. Something like:

Re: [R] laf_open_fwf

2013-08-07 Thread Jan van der Laan
Dear Christian, Well... it shouldn't normally do that. The only way I can currently think of that might cause this problem is that the file has \r\n\r\n, which would mean that every line is followed by an empty line. Another cause might be (although I would not really expect the results you

Re: [R] How is a file descriptor stored ?

2013-08-07 Thread Jan van der Laan
I don't know how many files you are planning to open, but what you also might run into is the maximum number of connections, namely 125. See ?file. Jan mohan.radhakrish...@polarisft.com wrote: Hi, I thought that 'R' like java will allow me to store file names (keys) and file d

Re: [R] How to use character in C code?

2013-05-16 Thread Jan van der Laan
Characters in R are zero terminated (although I couldn't find that in the R extensions manual). So, you could use: void dealWithCharacter(char **chaine, int *size){ Rprintf("The string is '%s'\n", chaine[0]); } Jan On 05/10/2013 03:51 PM, cgenolin wrote: Hi the list, I include some C c

Re: [R] path reference problems in R 3.0.0

2013-04-28 Thread Jan van der Laan
Some colleagues ran into similar problems after migrating to windows 7. They could no longer install packages in certain network locations because the read only bit was set (which could be unset after which windows set it again). Perhaps the following helps: http://itexpertvoice.com/home/fi

Re: [R] Read big data (>3G ) methods ?

2013-04-27 Thread Jan van der Laan
I believe it was already mentioned, but I can recommend the LaF package (not completely impartial being the maintainer of LaF ;-) However, the speed differences between packages will not be very large. Eventually all packages will have to read in 6 GB of data and convert the text data to num

Re: [R] Running other programs from R

2013-03-16 Thread Jan van der Laan
Have a look at the system command: ?system HTH, Jan On 03/16/2013 10:09 PM, Sedat Sen wrote: Dear list, I want to run a statistical program (using its .exe file) from R by writing a script. I know there are some packages that call WinBUGS, Mplus etc. form R. I just want to call the .exe

Re: [R] HOw to achieve big vector times big dataframe in R?

2013-03-14 Thread Jan van der Laan
apply((t(as.matrix(b)) * a), 2, sum) should do what you want. Why this works; see, http://cran.r-project.org/doc/manuals/r-release/R-intro.html#The-recycling-rule and the paragraph before that. Jan Tammy Ma wrote: HI, I have the following question: Vector a with lenght 150 A B
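A scaled-down sketch of why the one-liner works, assuming `b` has one column per element of `a` (three columns here instead of 150; the values are made up):

```r
b <- data.frame(A = c(1, 2), B = c(3, 4), C = c(5, 6))
a <- c(10, 100, 1000)

# t(b) has one row per column of b; recycling multiplies row i by a[i]
weighted <- apply(t(as.matrix(b)) * a, 2, sum)

# The same result written as a matrix product
check <- as.vector(as.matrix(b) %*% a)
```

Each element of the result is the sum over columns of `b[row, j] * a[j]`, which is exactly what the matrix product computes.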

Re: [R] How to transpose it in a fast way?

2013-03-08 Thread Jan van der Laan
You could use the fact that scan reads the data rowwise, and the fact that arrays are stored columnwise: # generate a small example dataset exampl <- array(letters[1:25], dim=c(5,5)) write.table(exampl, file="example.dat", row.names=FALSE, col.names=FALSE, sep="\t", quote=FALSE) # and re
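The idea can be sketched end to end as below. This is one way to finish the example, writing to a temporary file instead of "example.dat"; it is not necessarily how the original message continued:

```r
# Small example dataset, as in the message
exampl <- array(letters[1:25], dim = c(5, 5))
path <- tempfile(fileext = ".dat")
write.table(exampl, file = path, row.names = FALSE, col.names = FALSE,
            sep = "\t", quote = FALSE)

# scan returns the values row by row ...
values <- scan(path, what = character(), sep = "\t", quiet = TRUE)

# ... and array() fills column by column, so the result is the transpose
transposed <- array(values, dim = c(5, 5))
```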

Re: [R] can not read table in dbReadTable

2012-11-02 Thread Jan van der Laan
I suspect it should be my.data.copy <- dbReadTable(con, "week42") (with con instead of tbs as first argument) Jan Tammy Ma wrote: tbs<-dbListTables(con) tbs [1] "lowend" "time" "week30" "week33" "week39" "week42" my.data.copy <- dbReadTable(tbs, "week42") Error in function (cla

Re: [R] Loop over several variables

2012-11-01 Thread Jan van der Laan
Or ti <- aggregate(dataframename[paste0("y", 1:3)], by=dataframename["aggregationvar"], sum,na.rm=TRUE) which gives you all results in one data.frame. Jan "MacQueen, Don" wrote: Many ways. Here is one: ### supposing you have y1, y2, and y3 in your data frame for (i in
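The aggregate call above, run on a small made-up data frame with the column names the message assumes:

```r
# Hypothetical data with the assumed column names
dataframename <- data.frame(
  aggregationvar = c("a", "a", "b", "b"),
  y1 = c(1, 2, 3, NA),
  y2 = c(10, 20, 30, 40),
  y3 = c(0.5, 0.5, NA, 1)
)

# Sum y1..y3 per group in one call; na.rm is passed on to sum
ti <- aggregate(dataframename[paste0("y", 1:3)],
                by = dataframename["aggregationvar"],
                sum, na.rm = TRUE)
```

The result has one row per level of `aggregationvar` and one column per summed variable.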

[R] Start R from bash/bat file and end in interactive mode

2012-11-01 Thread Jan van der Laan
I have an r-script (rook.R) that starts a Rook server. To prevent users from having to start R and type in source("rook.R"), I want to create a bash script and bat file that starts R and sources the script. However, to keep the Rook server running R should not close after running the scrip

Re: [R] own function: computing time

2012-10-10 Thread Jan van der Laan
Did not see a simple way to make it faster. However, this is a piece of code which can be made to run much faster in C. See below. I don't know if you are familiar with running c-code from R. If not, the official documentation is in the R Extensions manual. However, this is not the most easy

Re: [R] ffbase, help with %in%

2012-10-02 Thread Jan van der Laan
It doesn't seem possible to index an ff-vector using a logical ff-vector. You can use subset (also in ffbase) or first convert 'a' to a normal logical vector: library(ff) library(ffbase) data1 <- as.ffdf(data.frame(a = letters[1:10], b=1:10)) data2 <- as.ffdf(data.frame(a = letters[5:26

Re: [R] splitting a vector

2012-08-02 Thread Jan van der Laan
I come up with: runs <- function(numbers) { tmp <- diff(c(0, which(diff(numbers) <= 0), length(numbers))) split(numbers, rep(seq_along(tmp), tmp)) } Can't say it's elegant, but it seems to work runs(c(1:3, 1:4)) $`1` [1] 1 2 3 $`2` [1] 1 2 3 4 runs(c(1,1,1)) $`1` [1] 1 $`2` [1]
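The function from this message laid out with comments, as a sketch of how it splits a vector wherever the values stop increasing:

```r
runs <- function(numbers) {
  # Positions where the next value does not increase mark the end of a run
  ends <- c(0, which(diff(numbers) <= 0), length(numbers))
  # Differences between consecutive end positions are the run lengths
  tmp <- diff(ends)
  # Label every element with its run number and split on that label
  split(numbers, rep(seq_along(tmp), tmp))
}

increasing_runs <- runs(c(1:3, 1:4))
constant_runs <- runs(c(1, 1, 1))
```

A constant vector yields one run per element, because equal neighbours also end a run.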

Re: [R] ff package: reading selected columns from csv

2012-07-26 Thread Jan van der Laan
You probably have a character (which is converted to factor) or factor column with a large number of distinct values. All the levels of a factor are stored in memory in ff. Jan threshold wrote: *..plus I get the following message after reading the whole set (all 7 columns):* read.c

Re: [R] ff package: reading selected columns from csv

2012-07-26 Thread Jan van der Laan
Looking at the source code for read.table.ffdf what seems to happen is that when reading the first block of data by read.table (standard 1000 lines) the specified colClasses are used. In subsequent calls the types of the columns of the ffdf object are used as colClasses. In your case the

Re: [R] ff package: reading selected columns from csv

2012-07-26 Thread Jan van der Laan
Having had a quick look at the source code for read.table.ffdf, I suspect that using 'NULL' in the colClasses argument is not allowed. Could you try to see if you can use read.table.ffdf with specifying the colClasses for all columns (thereby reading in all columns in the file)? If that w

Re: [R] complexity of operations in R

2012-07-20 Thread Jan van der Laan
ALSE) } return(v$get()) } system.time(h3(1E5)) user system elapsed 22.846 0.536 23.407 system.time(h4(1E5)) user system elapsed 0.700 0.000 0.702 Jan Johan Henriksson wrote: On Thu, Jul 19, 2012 at 5:02 PM, Jan van der Laan wrote: Johan, Your 'list'

Re: [R] complexity of operations in R

2012-07-19 Thread Jan van der Laan
oject.org Subject: Re: [R] complexity of operations in R Hadley et. al: Indeed. And using a loop is a poor way to do it anyway. v <- as.list(rep(FALSE,dotot)) is way faster. -- Bert On Thu, Jul 19, 2012 at 8:50 AM, Hadley Wickham wrote: On Thu, Jul 19, 2012 at 8:02 AM, Jan van der Laan

Re: [R] complexity of operations in R

2012-07-19 Thread Jan van der Laan
AM, Jan van der Laan wrote: Johan, Your 'list' and 'array doubling' code can be written much more efficient. The following function is faster than your g and easier to read: g2 <- function(dotot) { v <- list() for (i in seq_len(dotot)) { v[[i]] <- FAL

Re: [R] complexity of operations in R

2012-07-19 Thread Jan van der Laan
On 07/19/2012 05:50 PM, Hadley Wickham wrote: On Thu, Jul 19, 2012 at 8:02 AM, Jan van der Laan wrote: The following function is faster than your g and easier to read: g2 <- function(dotot) { v <- list() for (i in seq_len(dotot)) { v[[i]] <- FALSE } } Except that you d

Re: [R] complexity of operations in R

2012-07-19 Thread Jan van der Laan
Johan, Your 'list' and 'array doubling' code can be written much more efficiently. The following function is faster than your g and easier to read: g2 <- function(dotot) { v <- list() for (i in seq_len(dotot)) { v[[i]] <- FALSE } } In the following line in your array doubling function
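A sketch comparing the loop with the vectorised form suggested elsewhere in this thread. Returning `v` from `g2` is an addition made here so the two results can be compared, and `g3` is a name chosen only for this sketch:

```r
# Loop version, extended to return the list it builds
g2 <- function(dotot) {
  v <- list()
  for (i in seq_len(dotot)) {
    v[[i]] <- FALSE
  }
  v
}

# Vectorised version: build the whole list in one step
g3 <- function(dotot) as.list(rep(FALSE, dotot))

same_result <- identical(g2(1000), g3(1000))
```

Timings can be compared with system.time(); they depend on the R version and machine, so none are quoted here.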

Re: [R] PPM to BMP converter

2012-05-09 Thread Jan van der Laan
I don't know if any R-packages exist that can do this, but you could install imagemagick (http://www.imagemagick.org), which provides command line tools for image manipulation and conversion, and call these from R using system. Something like: system("convert yourimage.ppm yourimage.bmp")

Re: [R] how to deduplicate records, e.g. using melt() and cast()

2012-05-07 Thread Jan van der Laan
using reshape: library(reshape) m <- melt(my.df, id.var="pathway", na.rm=T) cast(m, pathway~variable, sum, fill=NA) Jan On 05/07/2012 12:30 PM, Karl Brand wrote: Dimitris, Petra, Thank you! aggregate() is my lesson for today, not melt() | cast() Really appreciate the super fast help, Karl

Re: [R] Can't import this 4GB DATASET

2012-05-05 Thread Jan van der Laan
Perhaps you could contact the persons that supplied/created the file and ask them what the format of the file exactly is. That is probably the safest thing to do. If you are sure that the lines containing only whitespace are meaningless, then you could alter the previous code to make a copy

Re: [R] Can't import this 4GB DATASET

2012-05-04 Thread Jan van der Laan
OK, not all, but most lines have the same length. Perhaps you could write the lines with a different line size to a separate file to have a closer look at those lines. Modifying the previous code (again not tested): con <- file("dataset.txt", "rt") out <- file("strangelines.txt", "wt") #

Re: [R] Equality of multiple vectors

2012-05-04 Thread Jan van der Laan
or identical(vec1, vec2) && identical(vec2, vec3) Jan Petr Savicky wrote: On Fri, May 04, 2012 at 12:53:12AM -0700, aaurouss wrote: Hello, I'm writing a piece of code where I need to compare multiple same length vectors. I've gone through the basic functions like identical() or all(),

Re: [R] read.table() vs read.delim() any difference??

2012-05-04 Thread Jan van der Laan
read.delim calls read.table so any differences between the two are caused by differences in the default values of some of the parameters. Take a look at the help file ?read.table read.table uses white space as separator; read.delim tabs read.table uses " and ' as quotes; read.delim just " e
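To make the "only the defaults differ" point concrete, here is a sketch that reproduces read.delim with an explicit read.table call on a made-up tab-separated text; the argument values are the read.delim defaults listed in ?read.table:

```r
# Hypothetical tab-separated input with a space inside a field
txt <- "name\tcity\nJan\tDen Haag\nEmma\tNew York\n"

via_delim <- read.delim(text = txt)

# read.table with read.delim's defaults spelled out
via_table <- read.table(text = txt, header = TRUE, sep = "\t",
                        quote = "\"", fill = TRUE, comment.char = "")

same <- identical(via_delim, via_table)
```

With read.table's own defaults the space in "Den Haag" would count as a separator, which is the kind of difference the message describes.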

Re: [R] Can't import this 4GB DATASET

2012-05-04 Thread Jan van der Laan
read.table imports the company name "GREAT FALLS GAS CO" as four separate columns. I think that needs to be one column. I can imagine that further on in your file you will have another company name that does not consist of four words which would cause the error you observed. From your ou

Re: [R] Reading big files in chunks-ff package

2012-03-25 Thread Jan van der Laan
The 'normal' way of doing that with ff is to first convert your csv file completely to a ffdf object (which stores its data on disk so shouldn't give any memory problems). You can then use the chunk routine (see ?chunk) to divide your data in the required chunks. Untested so may contain err

Re: [R] Reading big files in chunks-ff package

2012-03-25 Thread Jan van der Laan
Your question is not completely clear. read.csv.ffdf automatically reads in the data in chunks. You don't have to do anything for that. You can specify the size of the chunks using the next.rows option. Jan On 03/24/2012 09:29 PM, Mav wrote: Hello! A question about reading large CSV files

Re: [R] Handling 8GB .txt file in R?

2012-03-25 Thread Jan van der Laan
What you could try to do is skip the first 5 lines. After that the file seems to be 'normal'. With read.table.ffdf you could try something like # open a connection to the file con <- file('yourfile', 'rt') # skip first 5 lines tmp <- readLines(con, n=5) # read the remainder using read.table.ff
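The skip-then-read pattern can be tried out in base R first. This sketch swaps in read.table for read.table.ffdf and uses a made-up temporary file, so it only illustrates the connection handling:

```r
# Hypothetical file: 5 lines of preamble followed by regular data
path <- tempfile(fileext = ".txt")
writeLines(c(paste("preamble line", 1:5), "1 2.5 a", "2 3.5 b"), path)

# Open a connection and consume the first 5 lines
con <- file(path, "rt")
skipped <- readLines(con, n = 5)

# Reading continues from the current position of the open connection
dat <- read.table(con, header = FALSE, col.names = c("id", "value", "label"))
close(con)
```

For plain read.table the `skip = 5` argument achieves the same; the connection approach is useful when the reader function only accepts a connection or file.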

Re: [R] check for data in a data.frame and return correspondent number

2012-03-14 Thread Jan van der Laan
Marianna, You can use merge for that (or match). Using merge: MyData <- data.frame( V1=c("red-j", "red-j", "red-j", "red-j", "red-j", "red-j"), V4=c(10.5032, 9.3749, 10.2167, 10.8200, 9.2831, 8.2838), redNew=c("appearance blood-n", "appearance ground-n", "appearance sea-n", "appeara

Re: [R] Reading in 9.6GB .DAT File - OK with 64-bit R?

2012-03-09 Thread Jan van der Laan
You could also have a look at the LaF package which is written to handle large text files: http://cran.r-project.org/web/packages/LaF/index.html Under the vignettes you'll find a manual. Note: LaF does not help you to fit 9GB of data in 4GB of memory, but it could help you reading your fi

Re: [R] Novice Alert!: odfWeave help!

2012-03-08 Thread Jan van der Laan
Step by step: 1. Create a new document in Open/LibreOffice 2. Copy/paste the following text into the document (as an example) <<>>= cat("Hello, world") @ 3. Save the file (e.g. "hello.odt") 4. Start R (if not already) shouldn't matter if it's plain R/RStudio 5. Change working directory to the fol

Re: [R] Week number from a date

2012-02-22 Thread Jan van der Laan
The suggestion below gives you week numbers with week 1 being the week containing the first monday of the year and weeks going from monday to sunday. There are other conventions. The ISO convention is that week 1 is the first week containing at least 4 days in the new year (week 1 of 2012
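The two conventions can be compared with format(), assuming the `%W` (Monday-based, week 1 starts at the first Monday) and `%V` (ISO 8601) conversion codes are supported on the platform; the dates are chosen around the start of 2012:

```r
# 2012-01-01 was a Sunday, 2012-01-02 a Monday
dates <- as.Date(c("2012-01-01", "2012-01-02"))

# Week 1 starts at the first Monday of the year; earlier days fall in week 00
monday_weeks <- format(dates, "%W")

# ISO 8601: week 1 is the first week with at least 4 days in the new year
iso_weeks <- format(dates, "%V")
```

Under the ISO rule 1 January 2012 still belongs to the last week of 2011, while the Monday-based count puts it in week 00 of 2012.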

Re: [R] Problems reading tab-delim files using read.table and read.delim

2012-02-08 Thread Jan van der Laan
I don't know if this completely solves your problem, but here are some arguments to read.table/read.delim you might try: row.names=FALSE fill=TRUE The details section also suggests using the colClasses argument as the number of columns is determined from the first 5 rows which may not be

Re: [R] Not generating line chart

2012-01-19 Thread Jan van der Laan
very much for the solution given. Still I am having one more question. I want both the graphs in single pdf and the legend should contain ACTTRT of individual REFID (Only two lines in legend) Can you solve it? Devarayalu -Original Message- From: Jan van der Laan [mailto:rh

Re: [R] Not generating line chart

2012-01-19 Thread Jan van der Laan
reply. But... Sorry still I am not getting by using print() with the following modified code. I am also attaching the raw datafile. par(mfrow=c(1,3)) #qplot(TIME1, BASCHGA, data=Orange1, geom= c("point", "line"), colour= ACTTRT) unique(Orange1$REFID) -> refid for

Re: [R] Not generating line chart

2012-01-19 Thread Jan van der Laan
Devarayalu, This is FAQ 7.22: http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-do-lattice_002ftrellis-graphics-not-work_003f use print(qplot()) Regards, Jan Sri krishna Devarayalu Balanagu schreef: Hi All, Can you please help me, why this code in not generating line chart? librar

Re: [R] R - Linux_SSH

2011-12-14 Thread Jan van der Laan
What I did in the past (not with R scripts) is to start my jobs using at (start the job at a specified time e.g. now) or batch (start the job when the cpu drops below ?%) at now "R CMD BATCH yourscript.R" or batch "R CMD BATCH yourscript.R" something like that, you'll have to look at the m

Re: [R] Generating input population for microsimulation

2011-12-14 Thread Jan van der Laan
n the future. Thanks again! Emma - Original Message - From: Jan van der Laan To: "r-help@r-project.org" Cc: Emma Thomas Sent: Wednesday, December 14, 2011 6:18 AM Subject: Re: [R] Generating input population for microsimulation Emma, If, as you say, each unit is the sa

Re: [R] Generating input population for microsimulation

2011-12-14 Thread Jan van der Laan
Emma, If, as you say, each unit is the same you can just repeat the units to obtain the required number of units. For example, unit_size <- 10 n_units <- 10 unit_id <- rep(1:n_units, each=unit_size) pid <- rep(1:unit_size, n_units) senior <- ifelse(pid <= 2, 1, 0) pop <- d

Re: [R] Read TXT file with variable separation

2011-11-29 Thread Jan van der Laan
Raphael, This looks like fixed width format which you can read with read.fwf. In fixed width format the columns are not separated by white space (or other characters), but are identified by the position in the file. So in your file, for example the first field looks to be contained in the
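A minimal read.fwf sketch on a made-up file; the widths and column names are assumptions for this example, not the layout of the file discussed in the thread:

```r
# Hypothetical fixed width file: id (3 chars), name (5 chars), age (2 chars)
path <- tempfile(fileext = ".txt")
writeLines(c("001Jan  25", "002Emma 31"), path)

# Columns are identified by position only; strip.white removes the padding
dat <- read.fwf(path, widths = c(3, 5, 2),
                col.names = c("id", "name", "age"),
                strip.white = TRUE)
```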

Re: [R] RV: Reporting a conflict between ADMB and Rtools on Windows systems

2011-11-17 Thread Jan van der Laan
I assume you use a command window to build your packages. One possible solution might be to leave out the path variables set by Rtools from your global path and to create a separate shortcut to cmd for building r-packages where you set your path as needed by R CMD build/check Something like

Re: [R] Reading a specific column of a csv file in a loop

2011-11-15 Thread Jan van der Laan
Yet another solution. This time using the LaF package: library(LaF) d<-c(1,4,7,8) P1 <- laf_open_csv("M1.csv", column_types=rep("double", 10), skip=1) P2 <- laf_open_csv("M2.csv", column_types=rep("double", 10), skip=1) for (i in d) { M<-data.frame(P1[, i],P2[, i]) } (The skip=1 is needed as l

[R] [R-pkgs] LaF 0.3: fast access to large ASCII files

2011-11-14 Thread Jan van der Laan
The LaF package provides methods for fast access to large ASCII files. Currently the following file formats are supported: * comma separated format (csv) and other separated formats and * fixed width format. It is assumed that the files are too large to fit into memory, although the package ca

Re: [R] Applying function to only numeric variable (plyr package?)

2011-10-12 Thread Jan van der Laan
plyr isn't necessary in this case. You can use the following: cols <- sapply(df, is.numeric) df[, cols] <- pct(df[,cols]) round (and therefore pct) accepts a data.frame and returns a data.frame with the same dimensions. If that hadn't been the case colwise might have been of help: librar
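A runnable sketch of the column-selection approach follows. The definition of `pct` is not shown in the snippet, so the version below is a hypothetical stand-in built on `round`, and the example data frame is made up.

```r
# Hypothetical stand-in for the poster's pct(): percentages with one decimal.
pct <- function(x) round(x * 100, 1)

df <- data.frame(name  = c("a", "b"),
                 share = c(0.1234, 0.5),
                 rate  = c(0.25, 0.3333),
                 stringsAsFactors = FALSE)

# Logical vector marking the numeric columns.
cols <- sapply(df, is.numeric)

# Apply pct to the numeric columns only; other columns are left untouched.
df[, cols] <- pct(df[, cols])
```

This works because arithmetic and `round` both accept a data frame of numeric columns and return one of the same shape.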

Re: [R] Chi-Square test and survey results

2011-10-12 Thread Jan van der Laan
George, Perhaps the site of the RISQ project (Representativity indicators for Survey Quality) might be of use: http://www.risq-project.eu/ . They also provide R-code to calculate their indicators. HTH, Jan Quoting ghe...@mathnmaps.com: An organization has asked me to comment on the val

Re: [R] Problem with .C

2011-10-06 Thread Jan van der Laan
Quoting Uwe Ligges : I don't agree that it's overkill -- you get to sidestep the whole `R CMD SHLIB ...` and `dyn.load` dance this way while you experiment with C(++) code "live" using the inline package. You need two additional packages now where you have to rely on the fact those are avai

Re: [R] Problem with .C

2011-10-06 Thread Jan van der Laan
An obvious reason might be that your second argument should be a pointer to int. As others have mentioned, you might want to have a look at Rcpp and/or inline. The documentation is good and I find it much easier to work with. For example, your example could be written as: library(Rcpp) l

Re: [R] Data import

2011-09-26 Thread Jan van der Laan
You can with the routines in the memisc library. You can open a file using spss.system.file and then import a subset using subset. Look in the help pages of spss.system.file for examples. HTH Jan On 09/25/2011 11:56 PM, sassorauk wrote: Is it possible to import only certain variables from a

Re: [R] R help on write.csv

2011-09-21 Thread Jan van der Laan
can append row wise, so that it all stacks up horizontally, the way you do it in xlswrite in matlab, where you can even specify the cell number from where you want to write. -Ashish *From:*R. Michael Weylandt [mailto:michael.weyla...@gmail.com] *Sent:* Thursday, September 22, 2011 1

Re: [R] R help on write.csv

2011-09-21 Thread Jan van der Laan
Michael, Your example doesn't seem to work. Append isn't passed on to the write.table call. You will need to add a Call$append <- append to the function. And even then there will be a problem with the headers that are repeated when appending. An easier solution is to use write.table dire
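The message is truncated, but it appears to recommend calling `write.table` itself. A minimal sketch of that idea, with made-up data and a temporary file, writes the header once and suppresses it on the appended block:

```r
path <- tempfile(fileext = ".csv")

first  <- data.frame(id = 1:2, value = c(0.5, 1.5))
second <- data.frame(id = 3:4, value = c(2.5, 3.5))

# First write: include the header row.
write.table(first, path, sep = ",", row.names = FALSE, col.names = TRUE)

# Later writes: append and leave the header out so it is not repeated.
write.table(second, path, sep = ",", row.names = FALSE,
            col.names = FALSE, append = TRUE)

combined <- read.csv(path)
```

Setting `col.names = FALSE` on the appended call is what avoids the repeated-header problem mentioned above.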

Re: [R] odfWeave: Combining multiple output statements in a function

2011-09-16 Thread Jan van der Laan
(as far as I can tell). Could you perhaps just tell me how I should combine the output of multiple odf* calls inside a function? Thanks again. Jan Quoting Max Kuhn : formatting.odf, page 7. The results are in formattingOut.odt On Thu, Sep 15, 2011 at 2:44 PM, Jan van der Laan wrote: M

Re: [R] Where to put tryCatch or similar in a very big for loop

2011-09-16 Thread Jan van der Laan
Laura, Perhaps the following example helps: nbstr <- 100 result <- numeric(nbstr) for (i in seq_len(nbstr)) { # set the default value for when the current bootstrap fails result[i] <- NA try({ # estimate your cox model here if (runif(1) < 0.1) stop("ERROR") result[i] <- i },
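The example above is cut off inside the `try` call. The sketch below shows the same pattern in complete form, but uses `tryCatch` (the function named in the thread title) instead of `try`; the random failure is only a placeholder for a model fit that sometimes errors.

```r
set.seed(1)                      # only to make the sketch reproducible
nbstr  <- 100
result <- rep(NA_real_, nbstr)   # NA is the default for failed bootstraps

for (i in seq_len(nbstr)) {
  result[i] <- tryCatch({
    # Placeholder for the real estimation step, which may fail.
    if (runif(1) < 0.1) stop("ERROR")
    i
  }, error = function(e) NA_real_)  # a failure yields NA and the loop goes on
}
```

Wrapping only the body of a single iteration keeps one failed bootstrap from aborting the whole loop.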

Re: [R] odfWeave: Combining multiple output statements in a function

2011-09-15 Thread Jan van der Laan
examples in the package directory that explain this. On Thu, Sep 15, 2011 at 8:16 AM, Jan van der Laan wrote: What is the correct way to combine multiple calls to odfCat, odfItemize, odfTable etc. inside a function? As an example let's say I have a function that needs to write two paragraphs of text and

[R] odfWeave: Combining multiple output statements in a function

2011-09-15 Thread Jan van der Laan
What is the correct way to combine multiple calls to odfCat, odfItemize, odfTable etc. inside a function? As an example let's say I have a function that needs to write two paragraphs of text and a list to the resulting odf-document (the real function has much more complex logic, but I don'
