Duncan Murdoch wrote:
> On 6/14/2007 10:49 AM, Jeffrey Horner wrote:
>> Hi,
>>
>> Here's a patch to the readChar manual page (R-trunk as of today) that
>> better clarifies readChar's return value.
>
> Your update is not right.  For example:
>
> x <- as.raw(32:96)
> readChar(x, nchars=rep(2,100))
>
> This returns a character vector of length 100, of which the first 32
> elements have 2 chars, the next one has 1, and the rest are "".
>
> So the length of nchars really does affect the length of the value.
>
> Now, I haven't looked at the code, but it's possible we could delete the
> "(which might be less than \code{length(nchars)})" remark, and if not,
> it would be useful to explain the situations in which the return value
> could be shorter than the nchars vector.
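
For reference, the raw-vector example quoted above can be inspected with a few extra calls. This is only a minimal sketch: the variable name res is an assumption introduced here, and the exact lengths reported may depend on the R version, so no output is shown.

```r
# Duncan's example: 65 bytes (codes 32..96), read two bytes at a time
x <- as.raw(32:96)
res <- readChar(x, nchars = rep(2, 100))

# Length of the result, to compare against length(nchars) = 100
print(length(res))

# Distribution of element sizes in bytes (2, 1, or 0 in the case described above)
print(table(nchar(res, type = "bytes")))

# Number of non-empty elements
print(sum(nzchar(res)))
```

Comparing length(res) with the count of non-empty elements separates the two effects under discussion: how many elements are returned, and how many of them actually contain data.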

Well, this is rather a misunderstanding on my part; I completely forgot
about vectorization. The manual page makes sense to me now.

But it still isn't clear when the return value can be shorter than
length(nchars). Consider a 101 byte text file in a non-multibyte
character locale:

f <- tempfile()
writeChar(paste(rep(seq(0,9),10),collapse=''),con=f)

and calling readChar() to read 100 bytes with length(nchars)=10:

> readChar(f,nchar=rep(10,10))
 [1] "0123456789" "0123456789" "0123456789" "0123456789" "0123456789"
 [6] "0123456789" "0123456789" "0123456789" "0123456789" "0123456789"

and readChar() reading the entire file with length(nchars)=11:

> readChar(f,nchar=rep(10,11))
 [1] "0123456789" "0123456789" "0123456789" "0123456789" "0123456789"
 [6] "0123456789" "0123456789" "0123456789" "0123456789" "0123456789"
[11] "\0"

but the following two outputs are confusing. readChar() with
length(nchars)>=12 returns a character vector of length 12:

> readChar(f,nchar=rep(10,12))
 [1] "0123456789" "0123456789" "0123456789" "0123456789" "0123456789"
 [6] "0123456789" "0123456789" "0123456789" "0123456789" "0123456789"
[11] "\0"         ""
> readChar(f,nchar=rep(10,13))
 [1] "0123456789" "0123456789" "0123456789" "0123456789" "0123456789"
 [6] "0123456789" "0123456789" "0123456789" "0123456789" "0123456789"
[11] "\0"         ""

It seems that the first time EOF is encountered on a read operation, an
empty string is returned, but on subsequent reads nothing is returned.
Is this intended behavior?

Jeff

>
> Duncan Murdoch
>
>
>> It could use some work as I'd
>> also like to add some text about using nchar() to find the length of
>> the string that readchar() returns, but I'm unsure which of
>> type="bytes" or type="chars" to mention. Is it type="chars"?
>>
>> Index: src/library/base/man/readChar.Rd
>> ===================================================================
>> --- src/library/base/man/readChar.Rd    (revision 41943)
>> +++ src/library/base/man/readChar.Rd    (working copy)
>> @@ -57,8 +57,8 @@
>>  }
>>
>>  \value{
>> -  For \code{readChar}, a character vector of length the number of
>> -  items read (which might be less than \code{length(nchars)}).
>> +  For \code{readChar}, a character vector of length 1 with the number
>> +  of characters less than or equal to nchars.
>>
>>    For \code{writeChar}, a raw vector (if \code{con} is a raw vector) or
>>    invisibly \code{NULL}.
>>
>>
>> Jeff
>
--
http://biostat.mc.vanderbilt.edu/JeffreyHorner

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel