On 12/02/2009 7:01 AM, Romain Francois wrote:
Hello,

Consider this file (/tmp/test.R) :

<file>
f <- function( x, y = 2 ){
   z <- x + y
   print( z )
}
</file>

I get this in R 2.7.2 :

 > p <- parse( "/tmp/test.R" )
 > str( attr( p, "srcref" ) )
List of 1
$ :Class 'srcref'  atomic [1:4] 1 1 4 1
 .. ..- attr(*, "srcfile")=Class 'srcfile' length 4 <environment>

and this in R-devel :

 > p <- parse( "/tmp/test.R" )
 > str( attr(p, "srcref") )
List of 1
$ :Class 'srcref'  atomic [1:6] 1 1 4 1 1 1
 .. ..- attr(*, "srcfile")=Class 'srcfile' <environment: 0x946b944>

What are the last two numbers?

The original design for srcref gave 4 entries: start line, start byte, stop line, stop byte. However, in multibyte strings, bytes don't correspond to columns, so error messages could often report a location that differs from what a user sees in an editor. To support the more useful error messages in R-devel, I added two more values: start column and stop column. With pure ASCII text these will be the same as start byte and stop byte; with UTF-8 text and non-ASCII characters they will be different. Other multibyte encodings are only supported if the platform can convert them to UTF-8 (and are not well tested; error reports would be welcome, as would suggestions for improving the performance).
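As a small illustration of why bytes and columns diverge, the sketch below counts both for a string containing one non-ASCII character. The string `s` is made up for the example; it is written with a Unicode escape so that R stores it as UTF-8.

```r
# Illustration only: 's' is a hypothetical string, not taken from the file above.
s <- "h\u00e9llo"   # the accented 'e' occupies two bytes in UTF-8

n_bytes <- nchar(s, type = "bytes")   # storage size, which the byte entries count
n_chars <- nchar(s, type = "chars")   # characters, which the column entries count

# For pure ASCII text the two counts agree.
ascii_same <- nchar("hello", type = "bytes") == nchar("hello", type = "chars")
```

Anything located after the accented character on the same line therefore has a byte offset one larger than its column.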

If you are using these for error reports, I recommend using the two new values. If you are trying to retrieve the text from the source file, use the originals.
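A minimal sketch of that recommendation follows, assuming a recent R in which parse() accepts keep.source. The labels in `loc` and the helpers `byte_sub` and `text_of` are names invented here for illustration, not part of R. Only the first six entries are read, since later R versions may append further values to a srcref.

```r
# Sketch only: the names in 'loc' are descriptive labels chosen for this
# example, and byte_sub()/text_of() are hypothetical helpers.
src <- c("f <- function( x, y = 2 ){",
         "   z <- x + y",
         "   print( z )",
         "}")
p   <- parse(text = src, keep.source = TRUE)
ref <- as.integer(attr(p, "srcref")[[1]])

# The first six entries, in the order described above.
loc <- setNames(ref[1:6],
                c("first_line", "first_byte", "last_line", "last_byte",
                  "first_col", "last_col"))

# Error reports: use the column entries, which match what an editor shows.
msg <- sprintf("expression spans %d:%d to %d:%d",
               loc[["first_line"]], loc[["first_col"]],
               loc[["last_line"]],  loc[["last_col"]])

# Text retrieval: use the byte entries and slice the raw bytes.
# Note that rawToChar() does not restore an encoding mark.
byte_sub <- function(line, from, to) {
  rawToChar(charToRaw(line)[from:to])
}

text_of <- function(lines, loc) {
  chunk <- lines[loc[["first_line"]]:loc[["last_line"]]]
  n <- length(chunk)
  # Trim the end of the last line first so single-line ranges stay correct.
  chunk[n] <- byte_sub(chunk[n], 1L, loc[["last_byte"]])
  chunk[1] <- byte_sub(chunk[1], loc[["first_byte"]],
                       nchar(chunk[1], type = "bytes"))
  chunk
}

recovered <- text_of(src, loc)
```

Because this input is pure ASCII, the column and byte entries coincide; with non-ASCII UTF-8 text only the byte entries are safe for slicing the source.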

Duncan Murdoch

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
