Re: [R] R 2.10.0: Error in gsub/calloc

2009-11-05 Thread Richard R. Liu
[...] On Behalf Of Richard R. Liu Sent: Tuesday, November 03, 2009 11:32 AM To: Uwe Ligges Cc: r-help@r-project.org Subject: Re: [R] R 2.10.0: Error in gsub/calloc I apologize for not being clear. d is a character vector of length 158908. Each element in the vector has been designated by sentDetect (package: openN

Re: [R] R 2.10.0: Error in gsub/calloc

2009-11-03 Thread Richard R. Liu
I am using gsubfn 0.5-0. When I do not specify perl = TRUE, I now get the following error on the same document: Error in structure(.External("dotTcl", ..., PACKAGE = "tcltk"), class = "tclObj") : [tcl] bad index "1e+05": must be integer?[+-]integer? or end?[+-]integer?. Regards, Richard

Re: [R] R 2.10.0: Error in gsub/calloc

2009-11-03 Thread Prof Brian Ripley
r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Richard R. Liu Sent: Tuesday, November 03, 2009 3:00 PM To: Kenneth Roy Cabrera Torres Cc: r-help@r-project.org; Uwe Ligges Subject: Re: [R] R 2.10.0: Error in gsub/calloc Kenneth, Thanks for the hint. I downloa

Re: [R] R 2.10.0: Error in gsub/calloc

2009-11-03 Thread Gabor Grothendieck
Note that you don't need perl = T, since by default strapply uses Tcl regular expressions and they support \w. What happens if you omit the perl = T? Also, please specify the version of gsubfn you are using, and if it's not the latest, then try it with the latest version. On Tue, Nov 3, 2009 at 11:0

Re: [R] R 2.10.0: Error in gsub/calloc

2009-11-03 Thread William Dunlap
es utils datasets methods base loaded via a namespace (and not attached): [1] tcltk_2.10.0 Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com > -----Original Message----- > From: r-help-boun...@r-project.org > [mailto:r-help-boun...@r-project.org] On Behalf Of Richard R. L

Re: [R] R 2.10.0: Error in gsub/calloc

2009-11-03 Thread Richard R. Liu
Kenneth, Thanks for the hint. I downloaded and installed the latest patch, but to no avail. I can reproduce the error on a single sentence, the longest in the document. It contains 743,393 characters. It isn't a true sentence, but since it is more than three standard deviations longer

Re: [R] R 2.10.0: Error in gsub/calloc

2009-11-03 Thread Bert Gunter
works, it should be way faster than strapply() and should not have any memory allocation issues either. HTH. Bert Gunter Genentech Nonclinical Biostatistics -----Original Message----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Richard R. Liu Sent: Tuesday

Re: [R] R 2.10.0: Error in gsub/calloc

2009-11-03 Thread Richard R. Liu
I apologize for not being clear. d is a character vector of length 158908. Each element in the vector has been designated by sentDetect (package: openNLP) as a sentence. Some of these are really sentences. Others are merely groups of meaningless characters separated by white space. str

Re: [R] R 2.10.0: Error in gsub/calloc

2009-11-03 Thread Kenneth Roy Cabrera Torres
Try the patched version... Maybe it is the same problem I had with a large database when using gsub(). HTH On Tue, 03-11-2009 at 20:31 +0100, Richard R. Liu wrote: > I apologize for not being clear. d is a character vector of length > 158908. Each element in the vector has been designated by s

Re: [R] R 2.10.0: Error in gsub/calloc

2009-11-03 Thread Uwe Ligges
richard@pueo-owl.ch wrote: I'm running R 2.10.0 under Mac OS X 10.5.8; however, I don't think this is a Mac-specific problem. I have a very large (158,908 possible sentences, ca. 58 MB) plain text document d which I am trying to tokenize: t <- strapply(d, "\\w+", perl = T). I am encounte

[R] R 2.10.0: Error in gsub/calloc

2009-11-03 Thread richard . liu
I'm running R 2.10.0 under Mac OS X 10.5.8; however, I don't think this is a Mac-specific problem. I have a very large (158,908 possible sentences, ca. 58 MB) plain text document d which I am trying to tokenize: t <- strapply(d, "\\w+", perl = T). I am encountering the following error: Error in