bug#16363: interactive use subject to compiler limitations
guile-2.0.9's compiler has some inconvenient restrictions, relative to its interpreter. Where the compiler is automatically applied to scripts, the restrictions aren't a serious problem, because if compilation fails then guile falls back to interpreting the script. But in an interactive REPL session, by default each form entered by the user is passed through the compiler, and if compilation fails then the error is signalled, with no fallback to interpretation.

As a test case, consider a form in which a procedure object appears. The compiler can't handle forms that directly reference a wide variety of object types, including procedures (both primitive and user-defined) and GOOPS objects. In the interpreter these objects simply self-evaluate, and it can be useful to reference them without the usual indirection through a named variable. Here I'll show what happens to such a form in a script and interactively, in guile 1.8 and 2.0:

$ cat t2
(cond-expand (guile-2 (eval-when (compile load eval) (fluid-set! read-eval? #t)))
             (else (fluid-set! read-eval? #t)))
(define (p x y) (#.+ x y))
(write (p 2 3))
(newline)
$ guile-1.8 t2
5
$ guile-2.0 --no-auto-compile t2
5
$ guile-2.0 t2
;;; note: auto-compilation is enabled, set GUILE_AUTO_COMPILE=0
;;; or pass the --no-auto-compile argument to disable.
;;; compiling /home/zefram/usr/guile/t2
;;; WARNING: compilation of /home/zefram/usr/guile/t2 failed:
;;; ERROR: build-constant-store: unrecognized object #
5
$ guile-1.8
guile> (fluid-set! read-eval? #t)
guile> (define (p x y) (#.+ x y))
guile> (p 2 3)
5
guile> ^D
$ guile-2.0
GNU Guile 2.0.9-deb+1-1
Copyright (C) 1995-2013 Free Software Foundation, Inc.

Guile comes with ABSOLUTELY NO WARRANTY; for details type `,show w'.
This program is free software, and you are welcome to redistribute it
under certain conditions; type `,show c' for details.

Enter `,help' for help.
scheme@(guile-user)> (fluid-set! read-eval? #t)
scheme@(guile-user)> (define (p x y) (#.+ x y))
While compiling expression:
ERROR: build-constant-store: unrecognized object #
scheme@(guile-user)> (p 2 3)
:3:0: In procedure #:3:0 ()>:
:3:0: In procedure #:3:0 ()>: Unbound variable: p

There is a workaround for this problem: the REPL's "interp" option controls whether forms go through the compiler or the interpreter. Hence:

scheme@(guile-user)> (fluid-set! read-eval? #t)
scheme@(guile-user)> (#.+ 2 3)
While compiling expression:
ERROR: build-constant-store: unrecognized object #
scheme@(guile-user)> ,o interp #t
scheme@(guile-user)> (#.+ 2 3)
$1 = 5

So the problem is merely that the REPL is broken *by default*. It should either default to the working mechanism, or fall back to it when compilation fails (as the file auto-compilation does).

Debian incarnation of this bug report:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=734108

-zefram
bug#16364: auto-compile noise can't be avoided by script
Guile 2.0.9 has a facility to automatically cache a compiled version of any Scheme source file that it loads, and it wants the world to know about it! If auto-compilation is enabled, which it is by default, then when guile loads a file (that was not already compiled) it emits a banner describing the auto-compilation. This interferes with the proper functionality of any program written as a guile script, by producing output that the program did not intend.

Working around this is tricky (discussed below). There's no straightforward way for a script to avoid the noise while being portable between guile versions 1.8 and 2.0. There's also no way to avoid the noise while actually getting the auto-compilation behaviour.

In my particular case, my script makes interesting use of the read-eval (#.) feature, which means that the compilation process actually can't work. This means that *every* time the script is run, not just the first time, guile emits the banner about auto-compilation, followed by a rather misleading warning/error about compilation failure. It's misleading because it then goes on to execute the script just fine. I can demonstrate this with a minimal test case (using read-eval in an uninteresting way, just making the compiler barf by not having applied eval-when to enable it):

$ cat t0
#!/usr/bin/guile -s
!#
(fluid-set! read-eval? #t)
(display #."hello world")
(newline)
$ guile-1.8 -s t0
hello world
$ guile-2.0 -s t0
;;; note: auto-compilation is enabled, set GUILE_AUTO_COMPILE=0
;;; or pass the --no-auto-compile argument to disable.
;;; compiling /home/zefram/usr/guile/t0
;;; WARNING: compilation of /home/zefram/usr/guile/t0 failed:
;;; ERROR: #. read expansion found and read-eval? is #f.
hello world
$

I can turn off the auto-compilation from within the script by using the --no-auto-compile option, but that breaks compatibility with 1.8:

$ cat t1
#!/usr/bin/guile \
--no-auto-compile -s
!#
(fluid-set! read-eval? #t)
(display #."hello world")
(newline)
$ guile-2.0 '\' t1
hello world
$ guile-1.8 '\' t1
guile-1.8: Unrecognized switch `--no-auto-compile'
Usage: guile-1.8 OPTION ...
Evaluate Scheme code, interactively or from a script.
...

Aside from the portability concern, turning off auto-compilation doesn't actually fix the problem. If a compiled version has previously been cached for the filename of a script being run, guile will consider using the cached version even if --no-auto-compile was supplied: the switch only controls the attempt to compile for the cache. If the cached compilation is up to date then it is used silently, which is OK. But if it's out of date, because the cache was for a different script that previously existed under the same name, then guile emits a banner saying that it's out of date (implying that the cached compilation is therefore not being used). So the script's visible behaviour is defiled even if it applies the option. Observe what happens to the second script in this sequence:

$ echo '(display "hello world\n")' >t10
$ guile-2.0 t10
;;; note: auto-compilation is enabled, set GUILE_AUTO_COMPILE=0
;;; or pass the --no-auto-compile argument to disable.
;;; compiling /home/zefram/usr/guile/t10
;;; compiled /home/zefram/.cache/guile/ccache/2.0-LE-8-2.0/home/zefram/usr/guile/t10.go
hello world
$ echo '(display "goodbye world\n")' >t10
$ guile-2.0 --no-auto-compile t10
;;; note: source file /home/zefram/usr/guile/t10
;;; newer than compiled /home/zefram/.cache/guile/ccache/2.0-LE-8-2.0/home/zefram/usr/guile/t10.go
goodbye world

I have, however, come up with a truly ugly workaround. The meta option system can be used to introduce a -c option that explicitly loads the script file via primitive-load, which does not attempt compilation. (Nor does it look at the compilation cache, so this even avoids the problem that --no-auto-compile runs into.) Running the script this way yields a different command line (visible through (program-arguments)) from the one that arrives when the script is run via -s, so if the script is to process its command line, for robustness it must pay attention to which way it was invoked. All together, this looks like:

$ cat t11
#!/usr/bin/guile \
-c (begin\ \ \ \ (define\ arg-hack\ #t)\ \ \ \ (primitive-load\ (cadr\ (program-arguments
!#
(define argv
  (if (false-if-exception arg-hack)
      (cdr (program-arguments))
      (program-arguments)))
(write argv)
(newline)
$ guile-1.6 '\' t11 a b c
("t11" "a" "b" "c")
$ guile-1.6 -s t11 a b c
("t11" "a" "b" "c")
$ guile-1.8 '\' t11 a b c
("t11" "a" "b" "c")
$ guile-1.8 -s t11 a b c
("t11" "a" "b" "c")
$ guile-2.0 '\' t11 a b c
("t11" "a
bug#16359: "guild list" lists nothing
"guild list" is meant to list the available subcommands within guild. It actually shows an empty list: $ GUILE=/usr/bin/guile-2.0 guild list Usage: guild COMMAND [ARGS] Run command-line scripts provided by GNU Guile and related programs. Commands: For help on a specific command, try "guild help COMMAND". Report guild bugs to bug-guile@gnu.org GNU Guile home page: <http://www.gnu.org/software/guile/> General help using GNU software: <http://www.gnu.org/gethelp/> For complete documentation, run: info guile 'Using Guile Tools' $ Subcommands mentioned in the guile documentation are actually available, despite not being listed. This is guile-2.0.9 on Debian. Debian incarnation of this bug report: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=734313 -zefram
bug#16361: compile cache confused about file identity
The automatic cache of compiled versions of scripts in guile-2.0.9 identifies scripts mainly by name, and partially by mtime. This is not actually sufficient: it is easily misled by a pathname that refers to different files at different times. Test case:

$ echo '(display "aaa\n")' >t13
$ echo '(display "bbb\n")' >t14
$ guile-2.0 t13
;;; note: auto-compilation is enabled, set GUILE_AUTO_COMPILE=0
;;; or pass the --no-auto-compile argument to disable.
;;; compiling /home/zefram/usr/guile/t13
;;; compiled /home/zefram/.cache/guile/ccache/2.0-LE-8-2.0/home/zefram/usr/guile/t13.go
aaa
$ mv t14 t13
$ guile-2.0 t13
aaa

You can see that the mtime is not fully used here: the cache is misapplied even if there is a delay of seconds between the creations of the two script files. The cache's mtime check will only notice a mismatch if the script currently seen under the supplied name was modified later than when the previous script was *compiled*.

Obviously, in this test case the cache could trivially distinguish the two script files by looking at the inode numbers. On its own the inode number isn't sufficient, but exact match on device, inode number, and mtime would be far superior to the current behaviour, only going wrong in the presence of deliberate timestamp manipulation. As a bonus, if the cache were actually *keyed* by inode number and device, rather than by pathname, it would retain the caching of compilation across renamings of the script.

Or, even better, the cache could be keyed by a cryptographic hash of the file contents. This would be immune even to timestamp manipulation, and would preserve the cached compilation even across the script being copied to a fresh file or being edited and reverted. This would be a cache worthy of the name. The only downside is the expense of computing the hash, but I expect this is small compared to the expense of compilation.

Debian incarnation of this bug report:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=734178

-zefram
bug#16357: insufficient print abbreviation in error messages
When guile is constructing error messages that display offending objects, in version 2.0.9 it never abbreviates long or deep structures. This can easily lead to pathologically long messages that take stupid amounts of time and memory to construct and to display. By contrast, guile-1.8 applies abbreviation at a reasonable level, and objects appearing in stack traces have reasonable abbreviation on both versions. Two very mild examples:

$ guile-1.8 --debug -c "(read (let aaa ((n 100) (v '())) (if (= n 0) v (aaa (- n 1) (cons n v)))))"
Backtrace:
In current input:
   1: 0* [read {(1 2 3 4 5 6 7 8 9 ...)}]

:1:1: In procedure read in expression (read (# 100 #)):
:1:1: Wrong type argument in position 1 (expecting open input port): (1 2 3 4 5 6 7 8 9 10 ...)
$ guile-2.0 --debug -c "(read (let aaa ((n 100) (v '())) (if (= n 0) v (aaa (- n 1) (cons n v)))))"
Backtrace:
In ice-9/boot-9.scm:
 157: 7 [catch #t # ...]
In unknown file:
   ?: 6 [apply-smob/1 #]
In ice-9/boot-9.scm:
  63: 5 [call-with-prompt prompt0 ...]
In ice-9/eval.scm:
 432: 4 [eval # #]
In unknown file:
   ?: 3 [call-with-input-string "(read (let aaa ((n 100) (v '())) (if (= n 0) v (aaa (- n 1) (cons n v)" ...]
In ice-9/command-line.scm:
 180: 2 [# #]
In unknown file:
   ?: 1 [eval (read (let aaa (# #) (if # v #))) #]
   ?: 0 [read (1 2 3 4 5 6 7 8 9 ...)]

ERROR: In procedure read:
ERROR: In procedure read: Wrong type argument in position 1 (expecting open input port): (1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100)
$ guile-1.8 --debug -c "(read (let aaa ((n 100) (v '())) (if (= n 0) v (aaa (- n 1) (cons v n)))))"
Backtrace:
In current input:
   1: 0* [read {(((# . 3) . 2) . 1)}]

:1:1: In procedure read in expression (read (# 100 #)):
:1:1: Wrong type argument in position 1 (expecting open input port): (((# . 7) . 6) . 5) . 4) . 3) . 2) . 1)
$ guile-2.0 --debug -c "(read (let aaa ((n 100) (v '())) (if (= n 0) v (aaa (- n 1) (cons v n)))))"
Backtrace:
In ice-9/boot-9.scm:
 157: 7 [catch #t # ...]
In unknown file:
   ?: 6 [apply-smob/1 #]
In ice-9/boot-9.scm:
  63: 5 [call-with-prompt prompt0 ...]
In ice-9/eval.scm:
 432: 4 [eval # #]
In unknown file:
   ?: 3 [call-with-input-string "(read (let aaa ((n 100) (v '())) (if (= n 0) v (aaa (- n 1) (cons v n)" ...]
In ice-9/command-line.scm:
 180: 2 [# #]
In unknown file:
   ?: 1 [eval (read (let aaa (# #) (if # v #))) #]
   ?: 0 [read (((# . 3) . 2) . 1)]

ERROR: In procedure read:
ERROR: In procedure read: Wrong type argument in position 1 (expecting open input port): () . 100) . 99) . 98) . 97) . 96) . 95) . 94) . 93) . 92) . 91) . 90) . 89) . 88) . 87) . 86) . 85) . 84) . 83) . 82) . 81) . 80) . 79) . 78) . 77) . 76) . 75) . 74) . 73) . 72) . 71) . 70) . 69) . 68) . 67) . 66) . 65) . 64) . 63) . 62) . 61) . 60) . 59) . 58) . 57) . 56) . 55) . 54) . 53) . 52) . 51) . 50) . 49) . 48) . 47) . 46) . 45) . 44) . 43) . 42) . 41) . 40) . 39) . 38) . 37) . 36) . 35) . 34) . 33) . 32) . 31) . 30) . 29) . 28) . 27) . 26) . 25) . 24) . 23) . 22) . 21) . 20) . 19) . 18) . 17) . 16) . 15) . 14) . 13) . 12) . 11) . 10) . 9) . 8) . 7) . 6) . 5) . 4) . 3) . 2) . 1)

Debian incarnation of this bug report:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=734128

-zefram
bug#16358: combinatorial explosion in elided stack trace
In guile 2.0.9, if an error is signalled in the interpreter, and the stack contains in a certain position an object whose unabbreviated print representation is very large, then the process of displaying the stack trace will take a huge amount of time and memory, pausing in the middle of output, even though the displayed stack trace doesn't actually show the object at all. Test case:

$ cat t6
(define bs (let aaa ((n 100) (v '())) (if (= n 0) v (aaa (- n 1) (cons v v)))))
(write (list bs (error "wibble")))
$ guile-2.0 --no-auto-compile t6
Backtrace:
In ice-9/boot-9.scm:
 157: 11 [catch #t # ...]
In unknown file:
   ?: 10 [apply-smob/1 #]
In ice-9/boot-9.scm:
  63: 9 [call-with-prompt prompt0 ...]
In ice-9/eval.scm:
 432: 8 [eval # #]
In ice-9/boot-9.scm:
2320: 7 [save-module-excursion #]
3968: 6 [#]
1645: 5 [%start-stack load-stack #]
1650: 4 [#]
In unknown file:
   ?: 3 [primitive-load "/home/zefram/usr/guile/t6"]
In ice-9/eval.scm:
 387: 2 ^Z
zsh: suspended  guile-2.0 --no-auto-compile t6
$ jobs -l
[1]  + 32574 suspended  guile-2.0 --no-auto-compile t6
$ ps vw 32574
  PID TTY      STAT  TIME  MAJFL   TRS     DRS     RSS %MEM COMMAND
32574 pts/5    T     0:36      0     3 2266300 1634556  9.9 guile-2.0 --no-auto-compile t6

With the test's size parameter at 100 as above, there is no realistic prospect of actually completing generation of the stack trace. For some range of values (about 25 on my machine) there will be a noticeable pause, after which the stack trace completes:

...
 387: 2 [eval # ()]
 387: 1 [eval # ()]
In unknown file:
   ?: 0 [scm-error misc-error #f "~A" ("wibble") #f]

It appears that it's generating the entire print representation of the object behind the scenes, though it then obviously throws it away. Experimentation with customising print methods for SRFI-9 record types shows that the delay and memory usage depend on the print representation per se, rather than on the amount of structure beneath the object. (A record-based cons-like type produces similar behaviour to the cons test when using the default print method that shows the content. Replacing it with a print method that emits a fixed string and doesn't recurse eliminates the delay entirely.)

If my test program is run in compiled form (via auto-compilation) then it doesn't exhibit the pause. Actually it gets optimised such that the problem object isn't anywhere near what the stack trace displays, so for a fair test the program needs to be tweaked. It can be arranged for the problem object to be directly mentioned in the stack trace, and there is still no pause: the object appears in a highly abbreviated form, such as

   2: 1 [vv ((# # # # ...) (# # # # ...) (# # # # ...) (# # # # ...) ...)]

For comparison, guile-1.8 never exhibits this problem. By default it doesn't emit a stack trace for a script, but it can be asked to do so via --debug. It then behaves like the compiled form of guile-2.0: there is no delay, and the object is shown in very abbreviated form.

Debian incarnation of this bug report:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=734132

-zefram
bug#16356: doc out of date about (integer? +inf.0)
The "Integers" node of the guile info document contains this gem (source in doc/ref/api-data.texi): (integer? +inf.0) => #t Actual guile-2.0.9 behaviour: scheme@(guile-user)> (integer? +inf.0) $16 = #f The doc example matches the behaviour of guile-1.8, which classifies +inf.0 and -inf.0 as integers, and +nan.0 as rational but not integer. guile-2.0 follows R6RS in treating all three of these values as real but not rational, and the "Reals and Rationals" node describes this accurately. Debian incarnation of this bug report: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=734323 Mathematically, infinities are not real, and NaN is, as the acronym says, not a number. The documentation could perhaps do with a note about the difference between mathematical terminology and Scheme terminology. I was rather surprised to find any discrepancy, as Scheme's numerical tower stands out among programming languages as being uniquely accurate in its use of mathematical terms. Scheme's concept of "real" more closely corresponds to the mathematical concept of "hyperreal", which includes infinities, although NaN doesn't fit. Scheme's "complex" is similarly extended relative to the mathematical complex numbers, but the mathematical term "hypercomplex" unfortunately refers to something quite different (quaternions and the like). -zefram
bug#16360: "guild help COMMAND" crashes
"guild help COMMAND" crashes for most existing guild subcommands. For example: $ GUILE=/usr/bin/guile-2.0 guild help frisk Usage: guild frisk OPTION... Show dependency information for a module. Backtrace: In ice-9/boot-9.scm: 157: 8 [catch #t # ...] In unknown file: ?: 7 [apply-smob/1 #] In ice-9/boot-9.scm: 63: 6 [call-with-prompt prompt0 ...] In ice-9/eval.scm: 432: 5 [eval # #] In /usr/bin/guild: 74: 4 [main ("/usr/bin/guild" "help" "frisk")] In scripts/help.scm: 181: 3 [main "frisk"] 155: 2 [show-help # #] In ice-9/boot-9.scm: 788: 1 [call-with-input-file #f ...] In unknown file: ?: 0 [open-file #f "r" #:encoding #f #:guess-encoding #f] ERROR: In procedure open-file: ERROR: Wrong type (expecting string): #f $ This is guile-2.0.9 on Debian. Debian incarnation of this bug report: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=734314 -zefram
bug#16365: (* 0 +inf.0) rationale is flawed
Commit 5e7918077a4015768a352ab19e4a8e94531bc8aa says:

    A note on the rationale for (* 0 +inf.0) being a NaN and not exact 0:
    The R6RS requires that (/ 0 0.0) return a NaN value, and that (/ 0.0)
    return +inf.0.  We would like (/ x y) to be the same as (* x (/ y)),

This identity doesn't actually hold. For example, on guile 2.0.9 with IEEE double flonums:

scheme@(guile-user)> (/ (expt 2.0 -20) (expt 2.0 -1026))
$36 = 6.857655085992111e302
scheme@(guile-user)> (* (expt 2.0 -20) (/ (expt 2.0 -1026)))
$37 = +inf.0

This case arises because the dynamic range of this flonum format is slightly asymmetric: 2^-1026 is representable, but 2^1026 overflows. So the rationale for (* 0 +inf.0) yielding +nan.0 is flawed.

As the supposed invariant and the rationale are not in the actual documentation (they are only mentioned in the commit log), this is not necessarily a bug. But it is worth thinking again about whether the case for adopting the flonum behaviour here is still stronger than the obvious case for the exact zero to predominate. (Mathematically, multiplying zero by an infinite number does yield zero; let alone multiplying it by a merely large finite number, which is what the flonum indefinite `infinity' really represents.)

-zefram
bug#16362: compiler disrespects referential integrity
The guile-2.0.9 compiler doesn't preserve the distinctness of mutable objects that are referenced in code via the read-eval (#.) facility. (I'm not mutating the code itself, only quoted objects.) The interpreter, and for comparison guile-1.8, do preserve object identity, allowing read-eval to be used to incorporate direct object references into code. Test case:

$ cat t9
(cond-expand (guile-2 (defmacro compile-time f `(eval-when (compile eval) ,@f)))
             (else (defmacro compile-time f `(begin ,@f))))
(compile-time (fluid-set! read-eval? #t))
(compile-time (define aaa (cons 1 2)))
(set-car! '#.aaa 5)
(write '#.aaa)
(newline)
(write '(1 . 2))
(newline)
$ guile-1.8 t9
(5 . 2)
(1 . 2)
$ guile-2.0 --no-auto-compile t9
(5 . 2)
(1 . 2)
$ guile-2.0 t9
;;; note: auto-compilation is enabled, set GUILE_AUTO_COMPILE=0
;;; or pass the --no-auto-compile argument to disable.
;;; compiling /home/zefram/usr/guile/t9
;;; compiled /home/zefram/.cache/guile/ccache/2.0-LE-8-2.0/home/zefram/usr/guile/t9.go
(5 . 2)
(5 . 2)
$ guile-2.0 t9
(5 . 2)
(5 . 2)

In the test case, the explicitly-constructed pair aaa is conflated with the pair literal (1 . 2), and so the runtime modification of aaa (which is correctly mutable) affects the literal.

This issue seems closely related to the problem described at <http://debbugs.gnu.org/cgi/bugreport.cgi?bug=11198>, wherein the compiler is entirely unable to handle code incorporating references to some kinds of object. In that case the failure mode is a compile-time error, so the problem can be worked around. The failure mode with pairs, silent misbehaviour, is a more serious problem. Between them, these problems break most of the interesting uses for read-eval, albeit only when using the compiler.

Debian incarnation of this bug report:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=734157

-zefram
bug#16362: compiler disrespects referential integrity
Mark H Weaver wrote:
>I'm sorry that you've written code that assumes that this is allowed,
>but in Scheme all literals are immutable.

It's not a literal: the object was not constructed by the action of the reader. It was constructed by non-literal means, and merely *passed through* the reader.

That's not to say your not-a-bug opinion is wrong, though. Scheme as defined by RnRS certainly doesn't support this kind of thing. It treats the print form of an expression as primary, and so doesn't like having anything unprintable in the object form.

>It worked by accident in Guile 1.8,

This is the bit that's really news to me. *Scheme* doesn't support it, but *Guile* is more than just Scheme, and I presumed that it was intentional that it took a more enlightened view of what constitutes an expression. If that was just an accident, then what you actually support ought to be documented. In principle it would also be a good idea to enforce this restriction in the interpreter, to avoid having this incompatibility between interpreter and compiler of the `same' implementation.

>but there's simply no way to support
>this robustly in an ahead-of-time compiler, which must serialize all
>literals to an object file.

Sure there is. The object in question is eminently serialisable: it contains only references to other serialisable data. All that needs to change is to distinguish between actual literal pairs (that can be merged) and non-literals whose distinct identity needs to be preserved. This might well be painful to add to your existing code, given the way you represent pairs. But that's a difficulty with the specific implementation, not an inherent limitation of compilation.

-zefram
bug#16363: interactive use subject to compiler limitations
Mark H Weaver wrote:
>that all code and literals be serialized, there's no sane way to support
>the semantics you seem to want.

We've addressed the semantics themselves on the other ticket, #16362. Accepting that the compiler semantics are preferred, there's still a problem within the scope of my intent for this ticket, #16363: the interactive behaviour doesn't match the behaviour of a script. The mismatch is a problem for development regardless of which set of semantics is correct.

As I mentioned in passing on the other ticket, you could fix this by enforcing the compiler restrictions in interpreting situations. A start on this would be for read-eval to refuse to accept any object without a readable print form, such as the procedure in my example on this ticket. For objects that do have a readable print form, such as the pair in #16362, it could break the referential identity by copying the object, as if by printing it to characters and reading it back.

If, on the other hand, you actually intend for the compiler and interpreter to have visibly different semantics, there's still the problem that the REPL approaches that difference in a different way from script execution. In that case, either the REPL should perform the same fallback that script execution does (as I originally suggested on this ticket), or script execution should not perform the fallback.

-zefram
bug#16362: compiler disrespects referential integrity
Mark H Weaver wrote:
>In Scheme terminology, an expression of the form (quote <datum>) is a
>literal.

Ah, sorry, I see your usage now. R6RS speaks of that kind of expression being a "literal expression". (Elsewhere it uses "literal" in the sense I was using it, referring to the readable representation of an object.) Section 5.10 "Storage model" says "It is desirable for constants (i.e. the values of literal expressions) to reside in read-only memory.". So in the Scheme model, whatever that <datum> in the expression is, it's a "constant". Of course, that's in the RnRS view of expressions that ignores the homoiconic representation. It's assuming that these "constants" will always be "literal" in the sense I was using.

>Where does it say in the documentation that this is allowed?

It doesn't: as far as I can see it doesn't document that aspect of the language at all. It would be nice if it did.

>To my mind, Guile documents itself as Scheme plus extensions,

I thought the documentation was attempting to document the language that Guile implements per se. It doesn't generally just refer to RnRS for the language definition; it actually tells you most of what it could have referred to RnRS for. For example, it fully describes tail recursion, without any reference to RnRS. It's good that it does this, and it would be good for it to be more complete in areas such as this where it's lacking. So maybe I got the wrong impression of the documentation's role.

As the documentation doesn't describe expressions in the RnRS character-based way, I got the impression that Guile had not necessarily adopted that restriction. As it doesn't describe expressions in the homoiconic way either, I interpreted it as silent on the issue, making experimentation appropriate to determine the intent. Maybe the documentation should have a note about its relationship to the Scheme language definition: say which things it tries to be authoritative about.

>cannot determine what extensions you can depend on by experiment.

Fair point, and I'm not bitter about my experiment turning out to have this limited applicability.

>Consider this: you serialize an object to one file, and then the same
>object to a second file.  Now you load them both in from a different
>Guile session.  How can the Guile loader know whether these two objects
>should have the same identity or be distinct?

That's an interesting case, and I suppose I wouldn't expect that to preserve identity. I also wouldn't expect you to serialise an I/O port. But the case I'm concerned about is a standalone script, being compiled as a whole, and the objects it's setting up at compile time are made of ordinary data. I think some of our difference of opinion here comes because you're mainly thinking of the compiler as something to apply to modules, so you expect to deal with many compiled files in one session, whereas I'm thinking about compilation of a program as a whole. Your viewpoint is the more general.

>For example, how do you correctly serialize a procedure produced by
>make-counter?

Assuming we're only serialising it to one file, it shouldn't be any more difficult than my test case with a mutable pair. The procedure object needs to contain a reference to the body expression and a reference to the lexical environment that it closed over. The lexical environment contains the binding of the symbol "n" to a variable, which contains some current numeric value. That variable is the basic mutable item whose identity needs to be maintained through serialisation.
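(For concreteness, make-counter here is assumed to be the usual closure-returning counter, along these lines, rather than anything more exotic:

(define (make-counter)
  (let ((n 0))
    (lambda ()
      (set! n (+ n 1))
      n)))

Each call to make-counter creates a fresh binding of n, and the returned procedure closes over it.)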
If we have multiple procedures generated by make-counter, they'll have distinct variables, and therefore distinct lexical environments, and therefore be distinct procedures, though they'll share bodies.

The only part of this that looks at all difficult to me is that you may have compiled the function body down to VM code, which is not exactly a normal Lisp object and needs its own serialisation arrangements. Presumably you already have that solved in order to compile code that contains function definitions. Aside from that it's all ordinary Lisp objects that look totally serialisable. What do you think is the difficult part?

-zefram
bug#16464: + folding differs between compiler and interpreter
The + procedure left-folds its arguments in interpreted code and right-folds its arguments in compiled code. This may or may not be a bug.

Obviously, with exact numbers the direction of folding makes no difference. But the difference is easily seen with flonums, as flonum addition is necessarily non-associative. For example, where flonums are IEEE doubles:

scheme@(guile-user)> ,o interp #f
scheme@(guile-user)> (+ 1.0 (expt 2.0 -53) (expt 2.0 -53))
$1 = 1.0000000000000002
scheme@(guile-user)> (+ (expt 2.0 -53) (expt 2.0 -53) 1.0)
$2 = 1.0
scheme@(guile-user)> ,o interp #t
scheme@(guile-user)> (+ 1.0 (expt 2.0 -53) (expt 2.0 -53))
$3 = 1.0
scheme@(guile-user)> (+ (expt 2.0 -53) (expt 2.0 -53) 1.0)
$4 = 1.0000000000000002

Compiler and interpreter agree when the order of operations is explicitly specified:

scheme@(guile-user)> (+ (+ 1.0 (expt 2.0 -53)) (expt 2.0 -53))
$5 = 1.0
scheme@(guile-user)> (+ 1.0 (+ (expt 2.0 -53) (expt 2.0 -53)))
$6 = 1.0000000000000002

If your flonums are not IEEE double then the exponent in the test case has to be adapted.

R5RS and the Guile documentation are both silent about the order of operations in cases like this. I do not regard either left-folding or right-folding per se as a bug. A portable Scheme program obviously can't rely on a particular behaviour. My concern here is that the compiler and interpreter don't match, making program behaviour inconsistent on what is notionally a single implementation. That mismatch may be a bug. I'm not aware of any statement either way on whether you regard such mismatches as bugs. (An explicit statement in the documentation would be most welcome.)

R6RS does have some guidance about the proper behaviour here. The description of the generic arithmetic operators doesn't go into such detail, just describing them as generic. It can be read as implying that the behaviour on flonums should match the behaviour of the flonum-specific fl+. The description of fl+ (libraries section 11.3 "Flonums") says it "should return the flonum that best approximates the mathematical sum". That suggests that it shouldn't use a fixed sequence of dyadic addition operations, and in my test case should return 1.0000000000000002 regardless of the order of operands. Obviously that's more difficult to achieve than just folding the argument list with dyadic addition.

Interestingly, fl+'s actual behaviour differs both from + and from the R6RS ideal. It left-folds in both compiled and interpreted code:

scheme@(guile-user)> (import (rnrs arithmetic flonums (6)))
scheme@(guile-user)> ,o interp #f
scheme@(guile-user)> (fl+ 1.0 (expt 2.0 -53) (expt 2.0 -53))
$7 = 1.0
scheme@(guile-user)> (fl+ (expt 2.0 -53) (expt 2.0 -53) 1.0)
$8 = 1.0000000000000002
scheme@(guile-user)> ,o interp #t
scheme@(guile-user)> (fl+ 1.0 (expt 2.0 -53) (expt 2.0 -53))
$9 = 1.0
scheme@(guile-user)> (fl+ (expt 2.0 -53) (expt 2.0 -53) 1.0)
$10 = 1.0000000000000002

fl+'s behaviour is not a bug. The R6RS ideal is clearly not mandatory, and the Guile documentation makes no stronger claim than that its fl+ conforms to R6RS. As it is consistent between compiler and interpreter, it is not subject to the concern that I'm raising in this ticket about the generic +.

-zefram
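P.S. As an illustration of that R6RS reading (a sketch only, not a proposal for how fl+ is actually implemented): summing the exact equivalents and rounding once at the end is insensitive to operand order.

(define (fl-sum . args)
  ;; Sum exactly, then round to a flonum once.
  (exact->inexact (apply + (map inexact->exact args))))

(fl-sum 1.0 (expt 2.0 -53) (expt 2.0 -53))  ; => 1.0000000000000002
(fl-sum (expt 2.0 -53) (expt 2.0 -53) 1.0)  ; => 1.0000000000000002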
bug#16364: auto-compile noise can't be avoided by script
Ludovic Courtes wrote:
>However, you can set the environment variable GUILE_AUTO_COMPILE=0.
>
>Do you think that would solve the problem?

It does not solve the problem. Firstly, it can't be done from the #! line at all, so the script can't do it early enough. It only works if it's already been set by the user, which is no good for what should be an internal detail of the program.

Secondly, it suffers the second problem that I noted with --no-auto-compile: if there's already a cached compilation then that'll be looked at, and if it's out of date then a "newer than" banner is emitted. With the environment variable set the cached version will never be updated, nor will it be deleted, so the banner then appears on every execution.

-zefram
bug#16361: compile cache confused about file identity
Mark H Weaver wrote:
>You could make the same complaint about 'make', 'rsync', or any number
>of other programs.

Not really. make does use this type of freshness check, but it's used in a specific situation where the freshness issue is immediately obvious and is part of the program's visible primary concern. That's quite unlike guile's compile cache, which as the name suggests is a cache. It's meant to be unobtrusive, and the cache semantics are not a direct part of the transaction that is ostensibly taking place, of running a program that happens to be written in Scheme. Those circumstances, of running an arbitrary program, are much broader than the circumstances in which make's freshness checks become relevant. make also gets a pass from having always worked this way, whereas guile used to not cache compilations. rsync, by contrast, does not use this type of freshness checking; I believe it uses a hash mechanism.

>It's true that a cryptographic hash would be more
>robust, but it would also be considerably more expensive in the common
>case where the .go file is already in the cache.
>
>I don't think it's worth paying this cost every time

OK, you can rule that suggestion out, but I think you have erred in jumping from that to wontfix on the general problem. You have not addressed my prior suggestion of identifying programs by exact match on device, inode number, and mtime. (File size could also be included.) This freshness check is very cheap, because it's just a few fixed-size fields from the stat structure, and you're already necessarily doing a stat on the program file. Using the identifying fields as the cache key even saves you a stat on the cached file. Although not quite as effective as a hash comparison, it would be a huge practical improvement over the current filename-and-inexact-mtime comparison.

-zefram
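P.S. To make the stat-based suggestion concrete, the key I have in mind amounts to something like this (a sketch only, using Guile's standard stat accessors, not integrated with the real cache code):

;; Identify a program file by fixed-size fields of its stat structure,
;; rather than by pathname and inexact mtime.
(define (source-identity-key filename)
  (let ((st (stat filename)))
    (list (stat:dev st) (stat:ino st) (stat:mtime st) (stat:size st))))

Two names for the same file yield equal keys, and a different file moved into place under the same name yields a different key.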
bug#20822: environment mangled by locale
When guile-2.0 is asked to read environment variables, via getenv, it always decodes the underlying octet string according to the current locale's nominal character encoding. This is a problem, because the environment variable's value is not necessarily encoded that way, and may not even be an encoding of a character string at all. The decoding is lossy, where the octet string isn't consistent with the character encoding, so the original octet string cannot be recovered from the mangled form. I don't see any Scheme interface that retrieves the environment without locale decoding.

The decoding is governed by the currently selected locale at the time that getenv is called, so this can be controlled to some extent by setlocale. However, this doesn't provide a way round the lossy decoding problem, because there is no guarantee of a cooperative locale being available (and especially being available under a predictable name). On my Debian system here, the "POSIX" and "C" locales' nominal character encoding is ASCII, so decoding under these locales results in all high-half octets being turned into question marks. Retrieving environment without calling setlocale at all also yields this lossy ASCII decode. Demos:

$ env - FOO=$'L\xc3\xa9on' guile-2.0 -c '(write (map char->integer (string->list (getenv "FOO")))) (newline)'
(76 63 63 111 110)
$ env - FOO=$'L\xc3\xa9on' guile-2.0 -c '(setlocale LC_ALL "POSIX") (write (map char->integer (string->list (getenv "FOO")))) (newline)'
(76 63 63 111 110)
$ env - FOO=$'L\xc3\xa9on' guile-2.0 -c '(setlocale LC_ALL "de_DE.utf8") (write (map char->integer (string->list (getenv "FOO")))) (newline)'
(76 233 111 110)
$ env - FOO=$'L\xc3\xa9on' guile-2.0 -c '(setlocale LC_ALL "de_DE.iso88591") (write (map char->integer (string->list (getenv "FOO")))) (newline)'
(76 195 169 111 110)

The actual data passed between processes is an octet string, and there really needs to be some reliable way to access that octet string.

There's an obvious parallel with reading data from an input port. If setlocale is called, then input is by default decoded according to locale, including the very lossy ASCII decode for C/POSIX. But if setlocale has not been called, then input is by default decoded according to ISO-8859-1, preserving the actual octets. It would probably be most sensible that, if setlocale hasn't been called, getenv should likewise decode according to ISO-8859-1. It might also be sensible to offer some explicit control over the encoding to be used with the environment, just as I/O ports have a concept of per-port selected encoding.

The same issue applies to other environment access functions too. For setenv the corresponding problem is the inability to *write* an arbitrary octet string to an environment variable. Obviously all the functions should have mutually consistent behaviour.

-zefram
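P.S. If the suggested ISO-8859-1 default were adopted, a program needing the raw octets could recover them along these lines (a sketch; it only works under that suggested decoding, not under the current locale-dependent one):

(use-modules (ice-9 iconv))
;; Re-encode the Latin-1-decoded value to get the original octets back.
(define (getenv-octets name)
  (let ((value (getenv name)))
    (and value (string->bytevector value "ISO-8859-1"))))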
bug#20823: argv mangled by locale
When guile-2.0 stores argv for later access via program-arguments, it sometimes decodes the underlying octet string according to the nominal character encoding of the locale suggested by the environment. This is a problem, because the arguments are not necessarily encoded that way, and may not even be encodings of character strings at all. The decoding is lossy, where the octet string isn't consistent with the character encoding, so the original octet string cannot be recovered from the mangled form. I don't see any Scheme interface that reliably retrieves the command line arguments without locale decoding.

The decoding doesn't follow the usual rules for locale control. It is not at all sensitive to setlocale, which is understandable due to the arguments being acquired before any of the actual program's code runs. Empirically, if the environment nominates no locale, "POSIX", or a non-existent locale, then argv is decoded according to ISO-8859-1, thus preserving the octets. If the environment nominates an extant locale other than "POSIX", then argv is decoded according to that locale's nominal character encoding. Demos:

$ env - guile-2.0 -c '(write (map char->integer (string->list (cadr (program-arguments))))) (newline)' $'L\xc3\xa9on'
(76 195 169 111 110)
$ env - LANG=C guile-2.0 -c '(write (map char->integer (string->list (cadr (program-arguments))))) (newline)' $'L\xc3\xa9on'
(76 63 63 111 110)
$ env - LANG=de_DE.utf8 guile-2.0 -c '(write (map char->integer (string->list (cadr (program-arguments))))) (newline)' $'L\xc3\xa9on'
(76 233 111 110)
$ env - LANG=de_DE.iso88591 guile-2.0 -c '(write (map char->integer (string->list (cadr (program-arguments))))) (newline)' $'L\xc3\xa9on'
(76 195 169 111 110)

The actual data passed between processes is an octet string, and there really needs to be some reliable way to access that octet string. My comments about resolution in bug#20822 "environment mangled by locale" mostly apply here too, with a slight change: it seems necessary to store the original octet strings and decode at the time program-arguments is called. With that change, the decoding can be responsive to setlocale (and in particular can reliably use ISO-8859-1 in the absence of setlocale).

-zefram
bug#21883: unnecessary bit shifting range limits
Not really outright bugs, but these responses are less than awesome:

$ guile -c '(write (logbit? (ash 1 100) 123))'
ERROR: Value out of range 0 to 18446744073709551615: 1267650600228229401496703205376
$ guile -c '(write (ash 0 (ash 1 100)))'
ERROR: Value out of range -9223372036854775808 to 9223372036854775807: 1267650600228229401496703205376
$ guile -c '(write (ash 123 (ash -1 100)))'
ERROR: Value out of range -9223372036854775808 to 9223372036854775807: -1267650600228229401496703205376

In all three cases, the theoretically-correct result of the expression is not only representable but easily computed. The functions could be improved to avoid failing in these cases, by adding logic amounting to:

(define (better-logbit? b v)
  (if (>= b (integer-length v))
      (< v 0)
      (logbit? b v)))

(define (better-ash v s)
  (cond ((= v 0) 0)
        ((<= s (- (integer-length v))) (if (< v 0) -1 0))
        (else (ash v s))))

-zefram
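P.S. For the record, with those definitions the three examples above would come out as follows (values worked out by hand, not obtained from a patched guile):

(better-logbit? (ash 1 100) 123)  => #f
(better-ash 0 (ash 1 100))        => 0
(better-ash 123 (ash -1 100))     => 0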
bug#21894: escape continuation doc wrong about reinvokability
The manual says

# Escape continuations are delimited continuations whose
# only use is to make a non-local exit--i.e., to escape from the current
# continuation.  Such continuations are invoked only once, and for this
# reason they are sometimes called "one-shot continuations".

O RLY?

scheme@(guile-user)> (use-modules (ice-9 control))
scheme@(guile-user)> (define cc #f)
scheme@(guile-user)> (list 'a (let/ec e (list 'b (e (call-with-current-continuation (lambda (c) (set! cc c) 0))))))
$1 = (a 0)
scheme@(guile-user)> (cc 1)
$2 = (a 1)
scheme@(guile-user)> (cc 2)
$3 = (a 2)

Clearly I have invoked this escape continuation, successfully, more than once. The semantics here are perfectly sensible; it's just the documentation that's off the mark, because it ignores how escape continuations interact with other kinds of continuation.

I suggest changing the "Such continuations are invoked only once" sentence to something like:

    Such continuations can only be invoked from within the dynamic extent
    of the call to which they will jump.  Because the jump ends that
    extent, if escape continuations are the only kind of continuations
    being used it is only possible to invoke an escape continuation at
    most once.  For this reason they are sometimes called "one-shot
    continuations", but that is a misnomer when other kinds of
    continuations are also in use.  Most kinds can reinstate a dynamic
    extent that has been exited, and if the extent of an escape
    continuation is reinstated then it can be invoked again to exit that
    extent again.  Conversely, an escape continuation cannot be invoked
    from a separate thread that has its own dynamic state not including
    the continuation's extent, even if the continuation's extent is
    still in progress in its original thread and the continuation has
    never been invoked.

-zefram
bug#21897: escape continuation passes barrier
scheme@(guile-user)> (use-modules (ice-9 control))
scheme@(guile-user)> (call/ec (lambda (c) (with-continuation-barrier (lambda () (c "through continuation"))) "c-w-b returned"))
$1 = "through continuation"

The continuation barrier works fine on call/cc continuations and on throw/catch, but doesn't block call/ec continuations. The manual doesn't mention any difference in behaviour for this case, nor can I see any obvious justification for it. The manual's statement that

# Thus, `with-continuation-barrier' returns exactly once.

is false in this case. I think a continuation barrier should block the use of the call/ec continuation.

-zefram
bug#21899: let/ec continuations not distinct under compiler
With guile 2.0.11:

scheme@(guile-user)> (use-modules (ice-9 control))
scheme@(guile-user)> (list 'a (let/ec ae (list 'b (let/ec be (be 2)))))
$1 = (a (b 2))
scheme@(guile-user)> (list 'a (let/ec ae (list 'b (let/ec be (ae 2)))))
$2 = (a (b 2))
scheme@(guile-user)> (list 'a (let/ec ae (list 'b (ae 2))))
$3 = (a 2)

The middle of these three cases is wrong: it attempts to invoke the outer escape continuation, but only goes as far as the target of the inner one, which it isn't using. It therefore produces the same result as the first case, which invokes the inner escape continuation. It ought to behave like the third case, which shows that the outer escape continuation can be successfully invoked when the unused inner continuation is not present.

The problem only affects let/ec, *not* call/ec:

scheme@(guile-user)> (list 'a (call/ec (lambda (ae) (list 'b (call/ec (lambda (be) (be 2)))))))
$4 = (a (b 2))
scheme@(guile-user)> (list 'a (call/ec (lambda (ae) (list 'b (call/ec (lambda (be) (ae 2)))))))
$5 = (a 2)
scheme@(guile-user)> (list 'a (call/ec (lambda (ae) (list 'b (ae 2)))))
$6 = (a 2)

It also only happens when compiling, not when interpreting:

scheme@(guile-user)> ,o interp #t
scheme@(guile-user)> (list 'a (let/ec ae (list 'b (let/ec be (be 2)))))
$7 = (a (b 2))
scheme@(guile-user)> (list 'a (let/ec ae (list 'b (let/ec be (ae 2)))))
$8 = (a 2)
scheme@(guile-user)> (list 'a (let/ec ae (list 'b (ae 2))))
$9 = (a 2)

-zefram
bug#21900: map is not continuation-safe
With Guile 2.0.11:

scheme@(guile-user)> (define cc #f)
scheme@(guile-user)> (map (lambda (v) (if (= v 0) (call/cc (lambda (c) (set! cc c) 0)) (+ v 1))) '(10 20 30 0 40 50 60))
$1 = (11 21 31 0 41 51 61)
scheme@(guile-user)> (cc 5)
$2 = (61 51 41 0 31 5 41 51 61)

It worked correctly in Guile 1.8.

-zefram
bug#21901: bit shift wrong on maximal right shift
With Guile 2.0.11:

scheme@(guile-user)> (ash 123 (ash -1 63))
$1 = 123

Correct result would of course be zero. Problem only occurs for exactly this shift distance: one bit less produces the right answer. Problem also occurs on Guile 1.8.8. Looking at the implementation, the problem is attributable to the negation of the shift distance, which in twos-complement fails to produce the expected positive result.

Note the resemblance to bug #14864, fixed in 2.0.10. This bug is of very similar form, but is distinct. The test cases of #14864 pass for me on the 2.0.11 that shows the problem with a 2^63 bit shift. My bug does occur with the rnrs bitwise-arithmetic-shift-right, which was used in #14864, as well as with ash.

-zefram
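P.S. To spell out the twos-complement wraparound in Scheme terms (an illustrative model of the C-level negation, not the actual implementation):

;; Model negation of a value held in a 64-bit twos-complement word.
(define (negate64 n)
  (let ((m (modulo (- n) (expt 2 64))))
    (if (>= m (expt 2 63)) (- m (expt 2 64)) m)))

(negate64 (ash -1 63))  ; => -9223372036854775808, not +2^63

The negated shift distance comes out unchanged, which is consistent with the unshifted result seen above.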
bug#21902: doc incorrectly describes Julian Date
The manual says, in the section "SRFI-19 Introduction",

# Also, for those not familiar with the terminology, a "Julian Day" is
# a real number which is a count of days and fraction of a day, in UTC,
# starting from -4713-01-01T12:00:00Z, ie. midday Monday 1 Jan 4713 B.C.

There are two errors in the first statement of the epoch for Julian Date, the one in ISO 8601 format.

The JD epoch is noon on 1 January 4713 BC *in the proleptic Julian calendar*. The ISO 8601 format is properly never used with the Julian calendar: ISO 8601 specifies the use of the Gregorian calendar, including proleptically where necessary (as it most certainly is here). On the proleptic Gregorian calendar, the JD epoch is noon on 24 November 4714 BC, and so the ISO 8601 expression should contain "-11-24".

The second error is in how the year is expressed in ISO 8601. The initial "-" does not mean the BC era; it means that the year number is negative. ISO 8601 specifies that the AD era is always used, with year numbers going negative where necessary; this arrangement is commonly known as "astronomical year numbering". So "0000" means 1 BC, "-0001" means 2 BC, and "-4713" means 4714 BC. So the "-4713" is not correct for the attempted expression of the Julian calendar date, but happens to be correct for the Gregorian calendar date.

Putting it together, a correct ISO 8601 expression for the Julian Date epoch is "-4713-11-24T12:00:00Z".

The word-based statement of the JD epoch is correct as far as it goes, but would benefit considerably from the addition of a clause stating that it is in the proleptic Julian calendar. (Generally, a clarification of which calendar is being used is helpful with the statement of any date prior to the UK's switch of calendar in 1752.)

The description of Modified Julian Date is essentially correct. However, there's a third problem: misuse of the term "UTC" for historical times. The description of Julian Date says it's counted "in UTC", and the statement of the MJD epoch describes its 1858 time as being specified in UTC. UTC is defined entirely by its relationship to TAI, which is defined by the operation of atomic clocks. TAI is therefore only defined for the period since the operation of the first caesium atomic clock in the middle of 1955. The UTC<->TAI relationship isn't actually defined even that far back: UTC begins at the beginning of 1961 (and that was not in the modern form with leap seconds). It is therefore incorrect to apply the term "UTC" to any time prior to 1961. These two references to UTC should instead be to "UT", the wider class of closely-matching time scales of which UTC is one representative.

Also, in the first sentence of this doc section, the phrase "universal time (UTC)" should be either "universal time (UT)" or (more likely) "coordinated universal time (UTC)".

-zefram
bug#21903: date->string duff ISO 8601 negative years
The date->string function from (srfi srfi-19), used on ISO 8601 formats "~1", "~4" and "~5", for years preceding AD 1, has an off-by-one error:

scheme@(guile-user)> (use-modules (srfi srfi-19))
scheme@(guile-user)> (date->string (julian-day->date 0 0) "~4")
$1 = "-4714-11-24T12:00:00Z"

The date in question, the JD epoch, is 24 November 4714 BC (in the proleptic Gregorian calendar). In ISO 8601 format, that year is properly represented as "-4713", not "-4714", because ISO 8601 uses the AD era exclusively. 4714 BC = AD -4713.

-zefram
bug#21904: date->string duff ISO 8601 format for non-4-digit years
The date->string function from (srfi srfi-19), used on ISO 8601 formats "~1", "~4", and "~5", gets the formatting of year numbers wrong when the year number doesn't have exactly four digits. There are multiple cases:

scheme@(guile-user)> (date->string (julian-day->date 1500000 0) "~1")
$1 = "-607-10-04"
scheme@(guile-user)> (date->string (julian-day->date 1700000 0) "~1")
$2 = "-59-05-05"
scheme@(guile-user)> (date->string (julian-day->date 1720000 0) "~1")
$3 = "-4-02-05"

For year numbers -999 to -1 inclusive, date->string is using the minimum number of digits to express the number, but ISO 8601 requires the use of at least four digits, with zero padding on the left. So one should write "-0059" rather than "-59", for example. Note that this range is also affected by the off-by-one error in the selection of the year number that I described in bug #21903, but that's not the subject of the present bug report. Here I'm concerned with how the number is represented in characters, not with how the year is represented numerically.

scheme@(guile-user)> (date->string (julian-day->date 1722000 0) "~1")
$4 = "2-07-29"
scheme@(guile-user)> (date->string (julian-day->date 1730000 0) "~1")
$5 = "24-06-23"
scheme@(guile-user)> (date->string (julian-day->date 2000000 0) "~1")
$6 = "763-09-18"

For year numbers 1 to 999 inclusive, again date->string is using the minimum number of digits to express the number, but ISO 8601 requires the use of at least four digits. If no leading "+" sign is used then the number must be exactly four digits, and that is the appropriate format to use in this situation. So one should write "0024" rather than "24", for example. The year number 0, representing the year 1 BC, logically also falls into this group, and should be represented textually as "0000". Currently this case doesn't arise in the function's output, because the off-by-one bug has it erroneously emit "-1" for that year.

scheme@(guile-user)> (date->string (julian-day->date 10000000 0) "~1")
$7 = "22666-12-20"
scheme@(guile-user)> (date->string (julian-day->date 100000000 0) "~1")
$8 = "269078-08-07"

For year numbers 10000 and above, it is necessary to use more than four digits for the year, and that's permitted, but ISO 8601 requires that more than four digits be preceded by a sign. For positive year numbers the sign must be "+". So one should write "+22666" rather than "22666", for example.

The formatting of year numbers for ISO 8601 purposes is currently only correct for numbers -1000 and lower (though the choice of number is off by one) and for year numbers 1000 to 9999 inclusive.

-zefram
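P.S. For concreteness, the formatting rule described above amounts to something like this (a sketch only; it deliberately ignores the separate year-numbering off-by-one of bug #21903):

(define (iso8601-year-string year)
  ;; At least four digits, zero-padded; a sign is mandatory for negative
  ;; years and for any year needing more than four digits.
  (let* ((mag (number->string (abs year)))
         (padded (if (< (string-length mag) 4)
                     (string-append (make-string (- 4 (string-length mag)) #\0) mag)
                     mag)))
    (cond ((negative? year) (string-append "-" padded))
          ((> (string-length padded) 4) (string-append "+" padded))
          (else padded))))

So (iso8601-year-string 24) => "0024", (iso8601-year-string -59) => "-0059", (iso8601-year-string 0) => "0000", and (iso8601-year-string 22666) => "+22666".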
bug#21906: julian-day->date negative input breakage
scheme@(guile-user)> (use-modules (srfi srfi-19))
scheme@(guile-user)> (julian-day->date 0 0)
$1 = #
scheme@(guile-user)> (julian-day->date -1 0)
$2 = #
scheme@(guile-user)> (julian-day->date -10 0)
$3 = #
scheme@(guile-user)> (julian-day->date -1000 0)
$4 = #

Observe the various erroneous field values: negative hour, negative day-of-month, zero month. These occur in general for various negative JD inputs. Not only should the conversion not produce these kinds of values, the date structure type probably ought to reject them if they get that far.

-zefram
bug#21907: date->string duff ISO 8601 zone format
scheme@(guile-user)> (use-modules (srfi srfi-19))
scheme@(guile-user)> (date->string (julian-day->date 2450000 3600) "~4")
$1 = "1995-10-09T13:00:00+0100"
scheme@(guile-user)> (date->string (julian-day->date 2450000 3630) "~4")
$2 = "1995-10-09T13:00:30+0100"

There are two problems here with date->string's representation of zone offsets for the ISO 8601 formats "~2" and "~4".

Firstly, because the time-of-day is represented in the extended format with colon separators, the zone offset must also be represented with colon separators. So the first "+0100" above should be "+01:00".

Secondly, the offset is being truncated to an integral minute, so the output doesn't fully represent the zone offset. More importantly, because the local time-of-day isn't being adjusted to match, it's not accurately representing the point in time. ISO 8601 doesn't permit a seconds component in the zone offset, so you have a choice of three not-entirely-satisfactory options. Firstly, you could round the zone offset and adjust the represented local time accordingly, so the 3630 conversion above would yield either "1995-10-09T13:00:00+01:00" or "1995-10-09T13:01:00+01:01". Secondly, you could use the obvious extension of the ISO 8601 format to a seconds component, outputting "1995-10-09T13:00:30+01:00:30". Or finally you could signal an error when trying to represent a zone offset that's not an integral minute.

Incidentally, for offsets of -1 to -59 seconds inclusive, the truncation isn't clearing the negative sign, so is producing the invalid output "-0000". The zero offset is required to be represented with a "+" sign. If you take the rounding option described above, anything that rounds to a zero-minutes offset must yield "+00:00" in the output.

-zefram
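P.S. A sketch of the colon-separated offset formatting, taking the signal-an-error option for sub-minute offsets (illustrative only, not the actual srfi-19 code):

(define (iso8601-zone-string offset)
  ;; offset is in seconds east of UTC
  (if (not (zero? (remainder offset 60)))
      (error "zone offset is not an integral minute:" offset)
      (let* ((mins (quotient (abs offset) 60))
             (pad2 (lambda (n)
                     (if (< n 10)
                         (string-append "0" (number->string n))
                         (number->string n)))))
        (string-append (if (negative? offset) "-" "+")
                       (pad2 (quotient mins 60)) ":" (pad2 (remainder mins 60))))))

(iso8601-zone-string 3600) => "+01:00", (iso8601-zone-string 0) => "+00:00", and (iso8601-zone-string 3630) signals an error.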
bug#21911: TAI-to-UTC conversion leaps at wrong time
Probing the TAI-to-UTC conversion offered by srfi-19's time-tai->date, in the minutes around the leap second in 2012:

scheme@(guile-user)> (use-modules (srfi srfi-19))
scheme@(guile-user)> (for-each (lambda (d) (write (list d (date->string (time-tai->date (add-duration (julian-day->time-tai 2456109) (make-time time-duration 0 d)) 0) "~4"))) (newline)) (list 43000 43160 43164 43165 43166 43167 43199 43200 43201 43202))
(43000 "2012-06-30T23:56:40Z")
(43160 "2012-06-30T23:59:20Z")
(43164 "2012-06-30T23:59:24Z")
(43165 "2012-06-30T23:59:25Z")
(43166 "2012-06-30T23:59:25Z")
(43167 "2012-06-30T23:59:26Z")
(43199 "2012-06-30T23:59:58Z")
(43200 "2012-06-30T23:59:59Z")
(43201 "2012-06-30T23:59:60Z")
(43202 "2012-07-01T00:00:01Z")

The julian-day->time-tai conversion is correct (the JD refers to 2012-06-30T12:00:00 UTC, which is 2012-06-30T12:00:34 TAI), and the duration addition works in a perfectly regular manner in TAI space. All the interesting stuff happens in the TAI-to-UTC conversion, between the time-tai structure and the date structure. The same thing happens if the conversion is performed by separate time-tai->time-utc and time-utc->date calls. The date->string part is correct and uninteresting.

The conversion is initially correct, minutes before midnight, but a discontinuity is seen 35 seconds before midnight. Outputs from then up to one second after midnight are one second slow. At one second after midnight it recovers. Because 35 seconds happens to be the TAI-UTC difference prevailing immediately after this leap second, I suspect that this is down to a time_t value (as used in the time-utc structure) for the moment of the leap being misinterpreted as a time-tai seconds value.

The UTC-to-TAI conversion is in better shape. As a result, time-tai->time-utc and time-utc->time-tai are not inverses during the 35-second erroneous period. Round-tripping through the two conversions produces an output not matching the input.

-zefram
bug#21912: TAI<->UTC conversion botches the unknown
Probing the existence of leap seconds on particular days, via srfi-19's TAI-to-UTC conversion. The methodology here is to take noon UT on the day of interest, convert to TAI, add 86400 seconds, then convert to UTC and display. The resulting time of day is 11:59:59 if there is a leap second that day, and 12:00:00 if there is not. scheme@(guile-user)> (use-modules (srfi srfi-19)) scheme@(guile-user)> (date->string (time-tai->date (add-duration (julian-day->time-tai 2455743) (make-time time-duration 0 86400)) 0) "~4") $1 = "2011-07-01T12:00:00Z" scheme@(guile-user)> (date->string (time-tai->date (add-duration (julian-day->time-tai 2456109) (make-time time-duration 0 86400)) 0) "~4") $2 = "2012-07-01T11:59:59Z" scheme@(guile-user)> (date->string (time-tai->date (add-duration (julian-day->time-tai 2457204) (make-time time-duration 0 86400)) 0) "~4") $3 = "2015-07-01T12:00:00Z" scheme@(guile-user)> (date->string (time-tai->date (add-duration (julian-day->time-tai 2457935) (make-time time-duration 0 86400)) 0) "~4") $4 = "2017-07-01T12:00:00Z" For 2011-06-30 it is correct that there was not a leap second, and for 2012-06-30 it is correct that there was. But for 2015-06-30 it says there was not a leap second, when in fact there was. For 2017-06-30 it says there will not be a leap second, when in fact it is not yet determined whether there will be. Really both of these errors come from the same cause. At the time this Guile 2.0.11 was released, the leap status of 2015-06-30 had not yet been determined. Both 2015 and 2017 fall within the future period beyond the scope of this Guile's static leap second knowledge. The bug is not that Guile doesn't know that there was a leap second in 2015. As the 2017 case illustrates, it's impossible for it to know all the leap second scheduling about which it can be asked. The bug is that Guile *thinks* it knows about all future leap seconds. It specifically thinks that there will be no leaps at all beyond the historically-scheduled ones that it knows about. Guile ought to be aware of how far its leap table extends, and signal an error when asked to perform a TAI<->UTC conversion that falls outside its scope. -zefram
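The probing methodology described above can be wrapped up as a small
helper; a minimal sketch, under the same assumptions (the input is the
Julian Day of noon UT on the day of interest):

  (use-modules (srfi srfi-19))

  ;; Returns #t if the TAI-to-UTC conversion believes the day starting
  ;; at noon UT on Julian Day jd ended with a leap second: adding 86400
  ;; TAI seconds then lands at 11:59:59 rather than 12:00:00 on the
  ;; following day.
  (define (leap-second-day? jd)
    (let* ((noon (julian-day->time-tai jd))
           (next (add-duration noon (make-time time-duration 0 86400)))
           (d (time-tai->date next 0)))
      (= 59 (date-minute d))))

  (leap-second-day? 2456109)  ; 2012-06-30: expected #t
  (leap-second-day? 2455743)  ; 2011-06-30: expected #f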
bug#21915: write inconsistent about #nil
The write function is inconsistent about whether it distinguishes
between #nil and ():

scheme@(guile-user)> '(#nil . a)
$1 = (#nil . a)
scheme@(guile-user)> '(a . #nil)
$2 = (a)

The latter behaviour, emitting #nil as if it were (), breaks the usual
write/read round-tripping, and the traditional correspondence between
equal? and matching of written representation. Admittedly those
standards are not absolute, nor is the extent to which they're expected
to hold documented, but #nil is clearly sufficiently atomic to be the
kind of value to which one would expect them to apply. For these
reasons, if a consistent behaviour is to be chosen, I think it should
be to consistently distinguish the values.

I think the behaviour should be consistent. The values should be
distinguished or not without regard to the context in which they arise
within an s-expression. Whatever is done, even if it's to endorse the
inconsistency, the behaviour should be documented, with rationale.

-zefram
bug#22033: time-utc format is lossy
In SRFI-19, round-tripping some UTC dates through the time-utc structure format, for the couple of seconds around a leap second: scheme@(guile-user)> (use-modules (srfi srfi-19)) scheme@(guile-user)> (define (tdate d) (write (list (date->string d "~4") (date->string (time-utc->date (date->time-utc d) 0) "~4"))) (newline)) scheme@(guile-user)> (tdate (make-date 0 59 59 23 30 6 2012 0)) ("2012-06-30T23:59:59Z" "2012-06-30T23:59:59Z") scheme@(guile-user)> (tdate (make-date 0 60 59 23 30 6 2012 0)) ("2012-06-30T23:59:60Z" "2012-06-30T23:59:60Z") scheme@(guile-user)> (tdate (make-date 0 0 0 0 1 7 2012 0)) ("2012-07-01T00:00:00Z" "2012-06-30T23:59:60Z") scheme@(guile-user)> (tdate (make-date 0 1 0 0 1 7 2012 0)) ("2012-07-01T00:00:01Z" "2012-07-01T00:00:01Z") Observe that the second immediately following the leap second, the first second of the following UTC day, isn't round-tripped correctly. It comes back as the leap second. These two seconds are perfectly distinct parts of the UTC time scale, and the time-utc format ought to preserve their distinction. -zefram
bug#22034: time-utc->date shows bogus zone-dependent leap second
time-utc->date seems to think that a leap second occurs at a different time in each time zone: scheme@(guile-user)> (use-modules (srfi srfi-19)) scheme@(guile-user)> (define (tdate d) (write (list (date->string d "~4") (date->string (time-utc->date (date->time-utc d) 3600) "~4"))) (newline)) scheme@(guile-user)> (tdate (make-date 0 59 59 22 30 6 2012 0)) ("2012-06-30T22:59:59Z" "2012-06-30T23:59:59+0100") scheme@(guile-user)> (tdate (make-date 0 0 0 23 30 6 2012 0)) ("2012-06-30T23:00:00Z" "2012-06-30T23:59:60+0100") scheme@(guile-user)> (tdate (make-date 0 1 0 23 30 6 2012 0)) ("2012-06-30T23:00:01Z" "2012-07-01T00:00:01+0100") These are three consecutive seconds that occur an hour before a genuine leap second (at 23:59:60Z). Observe that time-utc->date, applied to the middle second, describes it as a leap second happening at 23:59:60+01:00, which is bogus. Describing the same seconds on input as a date structure with a non-zero zone offset produces the same wrong output, and requesting output with a different zone offset changes which second is affected. The faulty output is always 23:59:60 in the output zone. Matching up with this, the actual leap second is never correctly described with a non-zero zone offset. It should be, for example, 00:59:60+01:00. However, probing for this side of the problem also runs into the round-tripping failure that I described in bug#22033. -zefram
bug#22901: drain-input doesn't decode
The documentation for drain-input says that it returns a string of characters, implying that the result is equivalent to what you'd get from calling read-char some number of times. In fact it differs in a significant respect: whereas read-char decodes input octets according to the port's selected encoding, drain-input ignores the selected encoding and always decodes according to ISO-8859-1 (thus preserving the octet values in character form). $ echo -n $'\1a\2b\3c' | guile-2.0 -c '(set-port-encoding! (current-input-port) "UCS-2BE") (write (port-encoding (current-input-port))) (newline) (write (map char->integer (let r ((l '\''())) (let ((c (read-char (current-input-port (if (eof-object? c) (reverse l) (r (cons c l))) (newline)' "UCS-2BE" (353 610 867) $ echo -n $'\1a\2b\3c' | guile-2.0 -c '(set-port-encoding! (current-input-port) "UCS-2BE") (write (port-encoding (current-input-port))) (newline) (peek-char (current-input-port)) (write (map char->integer (string->list (drain-input (current-input-port) (newline)' "UCS-2BE" (1 97 2 98 3 99) The practical upshot is that the input returned by drain-input can't be used in the same way as regular input from read-char. It can still be used if the code doing the reading is totally aware of the encoding, so that it can perform the decoding manually, but this seems a failure of abstraction. The value returned by drain-input ought to be coherent with the abstraction level at which it is specified. I can see that there is a reason for drain-input to avoid performing decoding: the problem that occurs if the buffer ends in the middle of a character. If drain-input is to return decoded characters then presumably in this case it would have to read further octets beyond the buffer contents, in an unbuffered manner, until it reaches a character boundary. If this is too unpalatable, perhaps drain-input should be permitted only on ports configured for single-octet character encodings. If, on the other hand, it is decided to endorse the current non-decoding behaviour, then the break of abstraction needs to be documented. -zefram
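Where the reading code does know the port's selected encoding, the
manual decode mentioned above can be written roughly as follows (a
sketch, assuming Guile 2.0's (ice-9 iconv) module and a drained buffer
that does not end in the middle of a multi-octet character):

  (use-modules (ice-9 iconv))

  ;; drain-input yields the buffered octets as latin-1 characters;
  ;; reinterpret them according to the port's selected encoding.
  (define (decode-drained port)
    (bytevector->string
     (string->bytevector (drain-input port) "ISO-8859-1")
     (port-encoding port)))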
bug#22902: GUILE_INSTALL_LOCALE not equivalent to setlocale
The documentation claims that setting GUILE_INSTALL_LOCALE=1 in the environment is equivalent to calling (setlocale LC_ALL "") at startup. Actually there is at least one difference: calling setlocale causes ports (both primordial and later-opened) to be initially configured for the locale's nominal character encoding, but setting the environment variable does not. Setting the environment variable leaves the port encoding at #f, functioning as ISO-8859-1, just as if locale had not been invoked at all. I do see some effects from setting the environment variable, specifically message strings affecting strftime. $ echo -n $'L\xc3\xa9on' | LANG=de_DE.UTF-8 guile-2.0 -c '(write (strftime "%c" (gmtime 10))) (newline) (write (port-encoding (current-input-port))) (newline) (write (map char->integer (let r ((l '\''())) (let ((c (read-char (current-input-port (if (eof-object? c) (reverse l) (r (cons c l))) (newline)' "Sun Sep 9 01:46:40 2001" #f (76 195 169 111 110) $ echo -n $'L\xc3\xa9on' | GUILE_INSTALL_LOCALE=1 LANG=de_DE.UTF-8 guile-2.0 -c '(write (strftime "%c" (gmtime 10))) (newline) (write (port-encoding (current-input-port))) (newline) (write (map char->integer (let r ((l '\''())) (let ((c (read-char (current-input-port (if (eof-object? c) (reverse l) (r (cons c l))) (newline)' "So 09 Sep 2001 01:46:40 GMT" #f (76 195 169 111 110) $ echo -n $'L\xc3\xa9on' | LANG=de_DE.UTF-8 guile-2.0 -c '(setlocale LC_ALL "") (write (strftime "%c" (gmtime 10))) (newline) (write (port-encoding (current-input-port))) (newline) (write (map char->integer (let r ((l '\''())) (let ((c (read-char (current-input-port (if (eof-object? c) (reverse l) (r (cons c l))) (newline)' "So 09 Sep 2001 01:46:40 GMT" "UTF-8" (76 233 111 110) In case anyone trawls the archives later investigating the usage of GUILE_INSTALL_LOCALE: I am not attempting to use it myself, despite the scenario implied by the above test cases. I think it's a bloody stupid mechanism, imposing on the program something that needs to be under the program's control, and which previously was. I'm actually investigating how to make programs cope with the unpredictable situation caused by this mechanism with the unpredictable environment setting. -zefram
bug#22905: GUILE_INSTALL_LOCALE produces unavoidable noise
GUILE_INSTALL_LOCALE=1 breaks some of the robustness of non-locale-using programs, marring their stderr output if the environment's locale settings are faulty. Suppose you have a program written in Guile Scheme that doesn't use any locale facilities. To be portable to the GUILE_INSTALL_LOCALE=1 situation (which the documentation threatens will become the default in Guile 2.2), it must be prepared to start up with some locale already selected, and reconfigure from there as required. Being a conscientious programmer, you are of course willing to add the (setlocale LC_ALL "C") and whatever other invocations are required to recover the non-locale state. But then this situation arises: $ LANG=wibble GUILE_INSTALL_LOCALE=1 guile-2.0 -c '(setlocale LC_ALL "C") (write "hi") (newline)' guile: warning: failed to install locale "hi" The warning shown goes to the program's stderr. It does not come from the program's setlocale call, which is succeeding and would signal a perfectly ordinary (catchable) exception if it failed. The warning comes from the implicit setlocale call triggered by GUILE_INSTALL_LOCALE=1, before the program gains control. As far as I can see, there is no way for the program to prevent the failing setlocale attempt or to muffle the warning. Or even to detect that this has happened. Guile should not be saying anything on the program's stderr. This is damaging the program's visible behaviour, and making it impossible to effectively port non-locale programs to new Guile versions. If Guile must attempt this implicit setlocale and continue to run the program if it fails, then it needs to keep quiet about the failure. This is no disadvantage to a program that actually wants to use the environmental locale, because the program is free to call (setlocale LC_ALL "") itself and handle its failure in whatever manner it finds appropriate. Indeed, any such program predating Guile 2.0 must already be performing that call itself, because the implicit setlocale didn't occur then. The same for any program portable to pre-2.0 Guiles. And on Guile 2.0+ such a program still really needs to perform the call itself, because it can't predict how GUILE_INSTALL_LOCALE will be set in the environment, so still can't rely on the implicit setlocale happening. However, if it is deemed to be essential that Guile attempt the implicit setlocale and gripe about its failure, then the message should not precede or otherwise mix with the actual program run. The message should be emitted *instead of* running the program, declaring the absolute incompatibility of the Guile framework with this environmental condition. -zefram
bug#22910: read-only setlocale has side effect
A call to setlocale with no second argument is documented to be a read-only operation, querying the current locale configuration. In fact it has a side effect of setting the encoding on primordial ports: $ guile-2.0 -c '(write (port-encoding (current-input-port))) (newline) (setlocale LC_TIME) (write (port-encoding (current-input-port))) (newline)' #f "ANSI_X3.4-1968" Observe that this occurs even if the locale reading operation is for a category unrelated to character encoding. The actual decoding behaviour of read-char is altered in accordance with the reported encoding. Non-primordial ports opened before or after the setlocale call are not affected. -zefram
bug#22910: read-only setlocale has side effect
Additional information: setlocale's side effect on primordial ports
happens even if the port's encoding has been individually set using
set-port-encoding!. This means that to maintain a specific encoding on
these ports (other than the locale's nominal encoding, which is likely
not to be binary compatible) it is necessary to set the encoding
repeatedly: before any I/O operation that might follow a setlocale
call. Since the read-only mode of setlocale has this effect, and
arbitrary library code might feel entitled to call setlocale for read
purposes without documenting that it does so, this really amounts to
setting the encoding before every I/O operation.

-zefram
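A minimal sketch of that workaround, assuming the program wants a
specific encoding (here ISO-8859-1) on the three standard ports:

  ;; Re-pin a chosen encoding on the standard ports; to be called
  ;; before any I/O that might follow a setlocale call, including one
  ;; made behind the program's back by library code.
  (define (repin-standard-port-encodings enc)
    (for-each (lambda (p) (set-port-encoding! p enc))
              (list (current-input-port)
                    (current-output-port)
                    (current-error-port))))

  (repin-standard-port-encodings "ISO-8859-1")
  (display "now safe to write\n")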
bug#20822: environment mangled by locale
I wrote: >There's an obvious parallel with reading data from an input port. >If setlocale is called, then input is by default decoded according >to locale, including the very lossy ASCII decode for C/POSIX. But if >setlocale has not been called, then input is by default decoded according >to ISO-8859-1, preserving the actual octets. It would probably be most >sensible that, if setlocale hasn't been called, getenv should likewise >decode according to ISO-8859-1. It might also be sensible to offer >some explicit control over the encoding to be used with the environment, >just as I/O ports have a concept of per-port selected encoding. In the light of what I've learned recently about Guile's locale handling, this needs some revision. What I thought was a well-defined "setlocale not called" state is a mirage. The encoding of ports is not reliably fixed at ISO-8859-1; per bug#22910 it can be affected by ostensibly read-only calls to setlocale, and seems to be only accidentally ISO-8859-1 until that's done. So that's not a good model. Due to the GUILE_INSTALL_LOCALE mechanism, a program wanting no locale selected can't just never call setlocale in write mode. So setlocale not having been called is not really available as a way to control anything. So it would seem to be necessary to use some explicit control of character encoding for environment access. (This must be control of encoding per se, not merely of which locale to use for environment access, because, as I noted in the original report, there's no guarantee of a locale with a suitable encoding.) This could be an optional parameter to the environment access functions, or a settable variable that takes precedence over locale to determine encoding for all environment access. The latter would match the encoding model used by ports. -zefram
bug#20823: argv mangled by locale
I wrote: >My comments about resolution in bug#20822 "environment mangled by locale" >mostly apply here too, The revised comments that I have just made on that ticket also apply here. Short version: "absence of setlocale" isn't a useful criterion, so explicit control of encoding will be necessary. -zefram
bug#22913: filenames mangled by locale
It seems that guile-2.0 applies locale encoding and decoding to pathnames being used in system calls. This radically breaks file access anywhere that the locale's character encoding is anything other than a simple 8-bit encoding such as ISO-8859-1. For example, in the default C locale with its nominal ASCII encoding, $ guile-2.0 -c '(open-file (list->string (map integer->char '\''(76 195 169 111 110))) "w")' $ echo L*n | od -tc 000 L ? ? o n \n 006 Those are literal question marks in the name of the file actually created, apparently arising as substitutions for the high-half octets in the requested filename. Existing files with names containing high-half octets can't be found (resulting in an ENOENT error message that shows the actually-existing filename), and new ones can't be created (actually being created under the mangled name instead). There's no warning or exception advising that the requested name can't be used, just this misbehaviour. The equivalent problem arises with decoding when filenames are received: $ echo foo > $'L\303\251on.txt' $ guile-2.0 -c '(define d (opendir ".")) (let r () (let ((n (readdir d))) (if (eof-object? n) #t (begin (if (eq? (car (reverse (string->list n))) #\t) (begin (write (map char->integer (string->list n))) (newline))) (r)' (76 63 63 111 110 46 116 120 116) Again no warning or exception, just incorrect data returned. To work around this would require the program to select a locale with a more accommodating nominal character encoding. As I've previously noted, there's no guarantee of such a locale existing. Thus the above behaviour is fatal to any attempt to write in Guile Scheme a program to operate on arbitrarily-named files. Guile even applies this mangling to the pathname of a script that it is to load: $ echo '(write "hi")(newline)' > $'L\303\251on.scm' $ guile-2.0 -s L*n.scm [big error message saying it couldn't find the file that exists] Obviously, even if a program could turn off the locale mangling in general, this instance of it occurs too early for the program to avoid. The guile framework itself has acquired the kind of 8-bit-cleanliness bug that it is imposing on the programs that it interprets. -zefram
bug#16357: insufficient print abbreviation in error messages
Andy Wingo wrote: >Thoughts? How was this managed in Guile 1.8? It seems that you need the truncated-print mechanism to be always available internally, but this doesn't require that it be always visible to the user. You can still require the full libraries to be loaded for the user to get access. Lazy loading sounds like a bad idea. Error handling is a bad place to attempt something that complex and failure-prone. -zefram
bug#16365: (* 0 +inf.0) rationale is flawed
Mark H Weaver wrote: > I also suspect that (/ 0 ) should be 0, >although that conflicts with R6RS. We should probably investigate the >rationale behind R6RS's decision to specify that (/ 0 0.0) returns a NaN >before changing that, though. I think R6RS makes sense for (/ 0 0.0). A flonum zero really represents a range of values including both small non-zero numbers and actual zero. The mathematical result of the division could therefore be either zero or undefined. To return zero for it would be picking a particular result, on the assumption that the flonum zero actually represented a non-zero value, and that's not justified. So to use the flonum behaviour seems the best thing available. (/ 0 3.5) is a different case. Here the mathematical result is an exact zero, and I'm surprised that R6RS specifies that this should be an inexact zero. This seems inconsistent with (* 1.0 0), for which it specifies that the result may be either 0 or 0.0. I'd also question R6RS in the related case of (/ 0.0 0). Mathematically this division is definitely an error, regardless of whether the dividend represents zero or a non-zero number. So it would make sense for this to raise an exception in the same manner as (/ 3 0) or (/ 0 0), rather than get flonum treatment as R6RS specifies. But deviating from R6RS, even with a good rationale for other behaviour, would be a bad idea. The questionable R6RS requirements are not crazy, just suboptimal. The case I originally raised, (* 0 +inf.0), is one for which R6RS offers the choice. -zefram
bug#20823: argv mangled by locale
Andy Wingo wrote: >I also don't >know whether to supply an optional "encoding" argument, and use that >encoding to decode the command line arguments. That, or something that just retrieves octets, is necessary. Decoding via the selected locale does not suffice, because there's no guarantee that there'll be a locale with a cooperative encoding. -zefram
bug#21899: let/ec continuations not distinct under compiler
Andy Wingo wrote: > ,opt (let* ((x (list 'a)) > (y (list 'a))) > (list x y)) > ;; -> > (let* ((x (list 'a)) (y x)) (list x y)) Wow, that's a scary level of wrongitude. It's specific to let* (or equivalent nested let forms), but really easy to trigger within that: scheme@(guile-user)> (let ((x (list 'a)) (y (list 'a))) (eq? x y)) $1 = #f scheme@(guile-user)> (let* ((x (list 'a)) (y (list 'a))) (eq? x y)) $2 = #t scheme@(guile-user)> (let ((x (list 'a))) (let ((y (list 'a))) (eq? x y))) $3 = #t -zefram
bug#21899: let/ec continuations not distinct under compiler
One more variant: scheme@(guile-user)> (let ((x (list 'a))) (eq? x (list 'a))) $1 = #t scheme@(guile-user)> ,opt (let ((x (list 'a))) (eq? x (list 'a))) $2 = (let ((x (list 'a))) (eq? x x)) -zefram
bug#21902: doc incorrectly describes Julian Date
Andy Wingo wrote:
>Would you like to propose a specific patch to the documentation?

Sure. Patch attached.

-zefram

--- a/doc/ref/srfi-modules.texi	2014-03-20 20:21:21.0 +0000
+++ b/doc/ref/srfi-modules.texi	2016-06-24 18:57:59.088243245 +0100
@@ -2461,8 +2461,8 @@
 @cindex UTC
 @cindex TAI
 This module implements time and date representations and calculations,
-in various time systems, including universal time (UTC) and atomic
-time (TAI).
+in various time systems, including Coordinated Universal Time (UTC)
+and International Atomic Time (TAI).
 
 For those not familiar with these time systems, TAI is based on a
 fixed length second derived from oscillations of certain atoms.  UTC
@@ -2494,18 +2494,12 @@
 @cindex julian day
 @cindex modified julian day
 Also, for those not familiar with the terminology, a @dfn{Julian Day}
-is a real number which is a count of days and fraction of a day, in
-UTC, starting from -4713-01-01T12:00:00Z, ie.@: midday Monday 1 Jan
-4713 B.C.  A @dfn{Modified Julian Day} is the same, but starting from
-1858-11-17T00:00:00Z, ie.@: midnight 17 November 1858 UTC.  That time
-is julian day 2400000.5.
-
-@c The SRFI-19 spec says -4714-11-24T12:00:00Z (November 24, -4714 at
-@c noon, UTC), but this is incorrect.  It looks like it might have
-@c arisen from the code incorrectly treating years a multiple of 100
-@c but not 400 prior to 1582 as non-leap years, where instead the Julian
-@c calendar should be used so all multiples of 4 before 1582 are leap
-@c years.
+is a real number which is a count of days and fraction of a day, in UT,
+starting from -4713-11-24T12:00:00Z, ie.@: midday UT on Monday 24 November
+4714 BC in the proleptic Gregorian calendar (1 January 4713 BC in the
+proleptic Julian calendar).  A @dfn{Modified Julian Day} is the same,
+but starting from 1858-11-17T00:00:00Z, ie.@: midnight UT on Wednesday
+17 November AD 1858.  That time is julian day 2400000.5.
 
 @node SRFI-19 Time
bug#20822: environment mangled by locale
Mark H Weaver wrote: > by convention they are >supposed to encoded in the locale encoding. This convention is bunk. The encoding aspect of the locale system is fundamentally broken: the model is that every string in the universe (every file content, filename, command line argument, etc.) is encoded in the same way, and the locale environment variable tells you which universe you're in. But in the real universe, files, filenames, and so on turn up encoded how their authors liked to encode them, and that's not always the same. In the real universe we have to cope with data that is not encoded in our preferred way. > If that convention is >violated, I don't see what a program could do about it. If the convention is violated, then there is some difficulty in presenting correctly-encoded (or even consistently-encoded) output to the user, but it is not insuperable. Perhaps the program knows by some non-locale means how a string is encoded, and can explicitly convert. Perhaps it doesn't know the real encoding, but can trust that the user will understand the octet string if it is passed through with neither decoding of input nor encoding for output. Or perhaps the program doesn't need to put the string into textual output at all, but only to use it some API or file format that's expecting an encodingless octet string. So there are many things a program can reasonably do about it, and which one to do depends on the application. >Can someone show me a realistic example of how this would be used in >practice? Looking specifically at environment variables: an environment variable could give the name of a file that is to be consulted under specified circumstances, and the right file may happen to have a name that is inconsistent with the encoding used by the user's terminal. (The filename is not required for output; it only needs to be passed as an uninterpreted octet string to the open(2) syscall.) An environment variable could specify a Unicode-using name of a language module to be loaded, while the user doesn't otherwise use Unicode, or doesn't use an encoding encompassing enough of it. (Name not required on output, again; will be either transformed into a filename or looked up in a file format that specifies its own encoding.) The program could be env(1), not interpreting the environment but needing to output the octets correctly. The program could be saving an uninterpreted environment, for a cron job to later run some other program with equivalent settings. -zefram
bug#22905: GUILE_INSTALL_LOCALE produces unavoidable noise
Andy Wingo wrote: >I believe this is consistent with other programs which call setlocale, >notably Perl and Bash. It is consistent with them, but the fact that others get it wrong isn't an excuse. >avoid the call to setlocale, and Guile offers the GUILE_INSTALL_LOCALE=0 >knob to do this. That knob is not available to the program. If you provide a knob that the program can control, independent of the environment, with backward compatibility to Guile 1.8, then we can consider the setlocale call avoidable. > Probably adding the suggestion to the warning is the right >thing; wdyt? No, that's not an improvement. Emitting a warning and then running the program anyway is fundamentally broken behaviour, and tweaking the content of the warning doesn't help. Some way for the program to detect that you've screwed up its output, so that it can decide to abort rather than continue with faulty output, would be another middle way. -zefram
bug#24186: setlocale can't be localised
In Guile 1.8 it was possible to localise the effect of a setlocale operation, but in Guile 2.0 it's no longer possible by natural use of the locale API. This loss of a useful facility is either a bug or something that needs to be discussed in the documentation. In Guile 1.8 one could perform a temporary setlocale for the execution of some piece of code, and revert its effect by another setlocale on unwind. This looks like: (define (call-with-locale cat newval body) (let ((oldval #f)) (dynamic-wind (lambda () (set! oldval (setlocale cat)) (setlocale cat newval)) body (lambda () (setlocale cat oldval) Some difficulty arises from this being temporally scoped, where dynamic or lexical scoping would be nicer, but in single-threaded programs it works pretty well. The C setlocale(3) API, after which Guile's setlocale is modelled, is obviously designed to enable this kind of mechanism: the read operation reports all relevant state, and the write operation with the old value sets it all back as it was. It is critical to this ability that the read operation does indeed report all the state that will be set. In Guile 2.0, the setlocale function no longer corresponds so closely to the C setlocale(3), and this critical guarantee has been lost. I have previously reported in bug#22910 that the setlocale read operation has a side effect on port encoding, and obviously that interferes with the above code, but actually there's still a problem if that's fixed. The setlocale *write* operation also affects port encoding (actually the default port encoding fluid and the encoding of currently-selected ports), and that seems to be an intentional change, but it also breaks the above code. The setlocale read operation doesn't report the encoding of the currently-selected ports, so doesn't represent everything that setlocale will set. The setlocale write operation is not even capable of setting the port encodings independently: it sets all three to the encoding nominated by the locale selected for LC_CTYPE purposes. I think adding this extra effect to setlocale was a mistake. It doesn't fit the locale API. If the extra effect is removed, that would resolve this problem. If you really want setlocale to have this effect, then something needs to be done to address the ability that has been lost. The documentation certainly needs to describe the effect on port encoding, which it currently doesn't. (There is a mention of some interaction with the %default-port-encoding fluid in the documentation of that fluid, but it doesn't match reality: it doesn't say that setlocale writes to the fluid.) It also ought to specifically warn that the setlocale save-and-restore dance that works in C doesn't work here. It should explain what needs to be done by library functions that want to achieve a localised locale change. Are they entirely forbidden to use setlocale? Are they expected to manually save and restore port encodings around setlocale calls? (This is complicated by set-port-encoding! not accepting #f as an encoding value, despite it actually being a permitted value for the encoding slot.) Some example code equivalent to the above call-with-locale would be useful. -zefram
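For concreteness, the Guile 1.8 usage pattern being described looks
like this (assuming the call-with-locale definition above and an
installed de_DE.UTF-8 locale):

  ;; Temporarily select a German LC_TIME for one strftime call, then
  ;; revert to whatever was in effect before.
  (call-with-locale LC_TIME "de_DE.UTF-8"
    (lambda ()
      (display (strftime "%c" (gmtime 0))) (newline)))

  (display (strftime "%c" (gmtime 0))) (newline)  ; previous locale again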
bug#22905: GUILE_INSTALL_LOCALE produces unavoidable noise
Andy Wingo wrote: >If you would like for me to work on your bugs then I would appreciate it >if you would keep things constructive. Thanks :) I'm sorry that that bit came across badly. I do appreciate your efforts. >Serious question tho: what sort of back-compatibility can there be with >a Guile that only supports latin-1 strings? I'd expect that almost any program that runs on Guile 1.8 ought to be portable, with only minimal modifications, to later versions of Guile. Obviously this wouldn't work the other way round: if a program relies on 2.0's non-Latin-1 strings then it can't be easily ported back to 1.8. But lots of programs work fine on 1.8, either not processing non-Latin-1 data or processing it in forms other than the builtin string type. Scheme was a good programming language long before Unicode came along. > What property is it that >you are going for here? In that bit, I'm going for it being possible for a program to run on both Guile 1.8 and Guile 2.N while avoiding the new locale warning from Guile 2.N. This should be a single program file, starting with a "#!/usr/bin/guile" line, where /usr/bin/guile may refer to either version of Guile. This would be especially relevant for a program originally written for Guile 1.8, but more generally is relevant for any program that doesn't need any of 2.0's new capabilities. The particular problem that arises is that a possible form for a warning-muffling switch would be a command-line switch that goes on the #! line. Any new switch of that nature wouldn't be recognised by Guile 1.8, and would cause an error when attempting to run the program on 1.8. >What about GUILE_INSTALL_LOCALE=require or something like that? In the environment? That's still not controllable by the program. The environment is the wrong place for any switch that needs to be the choice of the program. Whether to engage with the environmentally-suggested locale ought to be the choice of the program. >How would this work? I imagine a builtin function that returns a truth value saying whether the Guile framework has emitted a warning before running the program. Suppose it's called "program-running-with-unclean-output". Then those who particularly want clean output can write something like (when (program-running-with-unclean-output) (error "can't run after warnings")) This doesn't avoid the warning appearing, but does avoid treating a run marred by the warning as a successful program run. The program's checking code can easily be made portable back to Guile versions lacking the new function, by using cond-expand, false-if-exception, or other metaprogramming facilities. -zefram
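One portable form of that guard, as a minimal sketch
(program-running-with-unclean-output being the hypothetical function
proposed above, not an existing Guile binding):

  ;; plain one-armed `if', so that this also runs on Guile 1.8,
  ;; which lacks `when'
  (if (and (defined? 'program-running-with-unclean-output)
           (program-running-with-unclean-output))
      (error "can't run after warnings"))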
bug#24186: setlocale can't be localised
et from the current locale. You then have the fluid default to that value, and have setlocale not touch the fluid at all. This way, if the user doesn't touch the fluid but does call setlocale then the locale controls the encoding of new ports. But if the user does set the fluid (to something other than #:locale-at-open), indicating a desire to specifically control default port encoding, then setlocale doesn't clobber the user's choice. How does this sound to you? > But I don't think >it should change the encoding of already-open ports, should it? In a situation where setlocale is expected to deliberately side-effect the default port encoding fluid, I can't figure out whether to expect it to do more. I suppose on general principle it's less surprising for it to do less. It's certainly less work to work around it, where the side effects are unwanted. If you go with the #:locale-at-open plan that I described above, then setlocale should definitely not touch the encoding of already-open ports. Just so that it is localisable as originally designed. There's another way to get the best of both worlds. In addition to the #:locale-at-open value for the default port encoding fluid, there could also be some special encoding value for a port, #:locale-at-io, meaning to use whatever locale is in effect at the time of an I/O operation. #:locale-at-io is also a valid value for the fluid, which will be copied into a new port in the regular way. The stdin, stdout, and stderr ports that are automatically opened at program initialisation can be set to #:locale-at-io, and setlocale now doesn't directly set the encoding of any port. If the user calls setlocale without otherwise controlling port encoding then the locale controls the encoding of the primordial ports. I expect that's the effect that the setlocale code was aiming for, given that when setlocale is called it's too late to affect the opening of the primordial ports. -zefram
bug#24186: setlocale can't be localised
I wrote: >is my first time compiling a Guile myself. It's failing on a missing >library for which Debian supplies no package. Turns out there was a package. It was complaining about a lack of "bdw-gc", and Debian doesn't have anything of that name, but it does have it under the name "libgc". So I've now got 2.1.3 running. All of the code in my day-of-week-string-for-locale sketch works exactly the same on 2.1.3 as it did on 2.0. -zefram
bug#22905: GUILE_INSTALL_LOCALE produces unavoidable noise
Andy Wingo wrote: >#!/bin/sh >export FOO=bar >exec guile $0 "$@" >!# That introduces all the complexity of using another language interpreter, one I've chosen not to write my program in. I don't much fancy working round a gotcha by importing another series of gotchas. Fundamentally, it seems like an admission of defeat. With care it would work, but means that Guile is not itself the platform on which to write a Unix program. Maybe you're OK with the idea that Guile programs aren't meant to run in their own right. Would you be OK with documenting it? It also means that the Guile program isn't actually seeing the user's environment, and so doesn't accurately pass that environment through to anything that it runs in turn. Working around that would involve some hairy and error-prone shell code. >This is certainly possible to do. Actually I would guess that this >works: > > (setlocale LC_ALL "") That succeeds in signalling an error in any case where the environmental locale doesn't exist, but that's not really what I want. If the framework didn't perform an implicit setlocale, and so didn't mar my output, I don't then want to make things break. That approach is also totally specific to the setlocale warning. If program-running-with-unclean-output were to exist, it should also cover uncleanliness due to auto-compile banners (bug#16364). It would be the solution (though not a great one) to both problems. >Does any of this work for you? Shell script wrapper is the closest so far, but it's nasty. You haven't proposed any real solution. The really simple solution would be to remove this switch from the environment entirely, and remove the implicit setlocale from the startup sequence entirely. The environment was always the wrong place for the switch, and there's no benefit in the implicit setlocale being as early as it is. The decision on whether to engage with the user's locale is then made entirely by the program, as part of its ordinary execution. If it wants to use the user's locale, it executes (setlocale LC_ALL ""). If it wants non-default handling of errors, it executes that in the dynamic scope of whatever throw or catch handler it likes. If it doesn't want to use the user's locale, it doesn't execute that. Bonus: works identically on older Guile versions. If you won't go for the simple solution, then a proper solution that maintains the default implicit setlocale would be to have a switch in a magic comment in the program file. Something like "#!GUILE_INSTALL_LOCALE=0\n!#\n" immediately following the program's initial #!...!# block. This is ignored as a comment by older Guile versions. The semantic on newer versions would be that the setting given there (which may be 0 or 1) determines conclusively whether the implicit setlocale happens. The environment variable would take effect as it currently does only for programs not containing this kind of setting. -zefram
bug#20823: argv mangled by locale
Andy Wingo wrote: >I also don't >know whether to supply an optional "encoding" argument, and use that >encoding to decode the command line arguments. If you don't fancy the profusion of extra "encoding" parameters on argv access (this ticket), environment access (bug#20822), and all sorts of syscalls (bug#22913), you could bundle them all together in a fluid. This would be a bit like the %default-port-encoding fluid, but setlocale should absolutely not modify it. It should follow the scheme that I laid out in bug#24186: its value can be either a string naming an encoding, or #:locale-at-io meaning that whenever encoding is required the currently selected locale is consulted. There should also be a fluid determining the conversion strategy, like the existing %default-port-conversion-strategy. These two fluids together would control the encoding and decoding for all operations that currently apply the locale encoding to arbitrary data. (Decoding locale-supplied messages is a different matter.) -zefram
bug#24186: setlocale can't be localised
Ludovic Courtes wrote: >That wouldn't help with the "setlocale" issue you describe per se, but >this would address such use cases in a different way. > >WDYT? Yes, explicit locale objects and locale parameters to relevant functions are a good thing. In general, the model of a global locale state is broken, at least by threading, so some advance beyond the setlocale system is necessary. Note the new(er) "uselocale" system in libc, which gives a per-thread locale state, fixing the biggest problem with setlocale. Some form of that could also be mapped into Guile; it would be reasonable to have a fluid that determines the locale to use where not overridden by an explicit parameter. All of that is welcome, but, as you say, doesn't deal with the actual problem I identified with setlocale. One can expect that setlocale will continue to be used for the foreseeable future, and it needs to be shorn of its unwanted side effects. -zefram
bug#26149: SRFI-19 doc erroneously warns about Gregorian reform
The documentation, near the start of the section on SRFI-19, says !*Caution*: The current code in this module incorrectly extends the ! Gregorian calendar leap year rule back prior to the introduction of ! those reforms in 1582 (or the appropriate year in various countries). ! The Julian calendar was used prior to 1582, and there were 10 days ! skipped for the reform, but the code doesn't implement that. ! !This will be fixed some time. Until then calculations for 1583 ! onwards are correct, but prior to that any day/month/year and day of the ! week calculations are wrong. The statements that the code is incorrect in this behaviour are erroneous. SRFI-19 itself says # A Date object, which is distinct from all existing types, represents a # point in time as represented by the Gregorian calendar as well as by a # time zone. The code is thus correct in always using the Gregorian calendar in date structures. Per ISO 8601 it is also correct in always using the Gregorian calendar in string output in that standard's formats. SRFI-19 isn't explicit about the calendar used as the basis for the other string output formats, but since the formatting proceeds from a date structure it seems implied that they should use the same basis as the date structure. For string input it is explicit that the parseable numeric formats correspond directly to fields of the date structure. There is no part of SRFI-19 that looks like it is ever intended to use the Julian calendar. So the code should not be `fixed', and the statements about that and about incorrectness should be removed from the documentation. It is sensible to keep an explicit statement about the treatment of the Gregorian reform, but the decision to use the Gregorian calendar proleptically should be credited to SRFI-19 (the standard), not to the code. -zefram
bug#26151: date-year-day screws up leap days prior to AD 1
In SRFI-19, the date-year-day function is meant to return the ordinal day of the year for a date structure. This value is properly 1 for the first day of each calendar year, and on all other days 1 greater than the value for the preceding day. But the implementation occasionally has it repeat a value: scheme@(guile-user)> (use-modules (srfi srfi-19)) scheme@(guile-user)> (date-year-day (julian-day->date 1719657 0)) $1 = 59 scheme@(guile-user)> (date-year-day (julian-day->date 1719658 0)) $2 = 60 scheme@(guile-user)> (date-year-day (julian-day->date 1719659 0)) $3 = 60 and occasionally has it skip a value: scheme@(guile-user)> (date-year-day (julian-day->date 1720023 0)) $4 = 59 scheme@(guile-user)> (date-year-day (julian-day->date 1720024 0)) $5 = 61 These errors happen around the end of February in years preceding AD 1. In each leap year a value is repeated (ordinal values 1 too low from March to December), and in each year immediately following a leap year a value is skipped (ordinal values 1 too high from March to December). Looking at the code, the bug arises from confusion between astronomical year numbering (which leap-year? expects to receive) and the bizarre zero-skipping year numbering that the library uses in the date structure (which date-year-day passes, via year-day, to leap-year?). Since the subject's come up: that year numbering used in the date structures is surprising, and I'm not sure quite what to make of it. It matches AD year numbering for years AD 1 onwards, but then numbers AD 0 (1 BC) as -1, and numbers all earlier years in accordance with that. It's almost a straight linear numbering of years, except that it skips the number 0. (At least you've documented it.) This is not a convention that I've seen in real use anywhere else, and that weird exception to the linearity makes it a pain to use. It's likely to cause bugs in user code, along the lines of the library bug that I've reported above and the previously-reported bug#21903. However, I haven't reported the year numbering per se as a bug, because SRFI-19 doesn't actually say what numbering is to be used for the date-year slot. If I had implemented SRFI-19 myself, without reference to existing implementations, I would have implemented astronomical year numbering (consistent AD year numbering, extending linearly in both directions), as used in ISO 8601. This is the most conventional year numbering, and at a stretch one could read SRFI-19 as implying it, by using some AD year numbering and not saying to deviate from that scheme. But really the standard is silent on the issue. Since the signification of date-year is an interoperability issue, this silence is a problem, and it is troubling that you and I have reached different interpretations of the standard on this point. Where did you get the idea to use a non-linear year numbering? What's your opinion of SRFI-19's (lack of) text on this matter? You should consider the possibility of changing your implementation to use the conventional astronomical year numbering in this slot. -zefram
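For reference, the relationship between the two numbering schemes
described above fits in a couple of lines (a sketch; "skip-zero" is the
date-year convention used by the library, "astronomical" the ISO 8601
convention):

  (define (skip-zero->astronomical y)   ; e.g. -1 (meaning 1 BC) => 0
    (if (negative? y) (+ y 1) y))

  (define (astronomical->skip-zero y)   ; e.g. 0 (meaning 1 BC) => -1
    (if (positive? y) y (- y 1)))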
bug#26162: time-duration screws up negative durations
Computing a difference between two SRFI-19 times, using time-difference, produces sensible results if the result is positive, but often nonsense if it's negative: scheme@(guile-user)> (use-modules (srfi srfi-19)) scheme@(guile-user)> (time-difference (make-time time-tai 0 1) (make-time time-tai 1000 0)) $1 = # scheme@(guile-user)> (time-difference (make-time time-tai 1000 0) (make-time time-tai 0 1)) $2 = # The above is computing the same interval both ways round. The first time is correct, but the second is obviously not the negative of the first. The correct result for the second would be # or possibly, at a stretch, # (SRFI-19 isn't clear about which way it's meant to be normalised. Having the nanoseconds field always non-negative is less surprising and easier to maintain through computation.) -zefram
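A minimal sketch of the less surprising normalisation, working on plain
second/nanosecond pairs rather than SRFI-19 time objects:

  (define (normalize-duration secs nsecs)
    ;; fold everything into nanoseconds, then split with floor division
    ;; so that the nanoseconds part lands in [0, 999999999]
    (let ((total (+ (* secs 1000000000) nsecs)))
      (values (floor-quotient total 1000000000)
              (floor-remainder total 1000000000))))

  ;; (normalize-duration 0 -999999000) => -1 and 1000,
  ;; i.e. -0.999999 s expressed with a non-negative nanoseconds field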
bug#26163: time-difference doesn't detect error of differing time types
scheme@(guile-user)> (use-modules (srfi srfi-19)) scheme@(guile-user)> (time-difference (make-time time-tai 0 1) (make-time time-utc 0 1)) $1 = # SRFI-19 is explicit that it "is an error" if the arguments to time-difference are of different time types, and correspondingly the Guile documentation says the arguments "must be" of the same type. It would be very easy for time-difference to detect and signal this error. It's not absolutely a bug that it currently doesn't, but it would be a useful improvement if it did. -zefram
bug#26164: time-difference mishandles leap seconds
Computing the duration of the period between two UTC times, using SRFI-19 mechanisms: scheme@(guile-user)> (use-modules (srfi srfi-19)) scheme@(guile-user)> (define t0 (date->time-utc (make-date 0 59 59 23 30 6 2012 0))) scheme@(guile-user)> (define t1 (date->time-utc (make-date 0 1 0 0 1 7 2012 0))) scheme@(guile-user)> (time-difference t1 t0) $1 = # The two times are 2012-06-30T23:59:59 and 2012-07-01T00:00:01, so at first glance one would expect the duration to be 2 s as shown above, the two seconds being 23:59:59 and 00:00:00. But in fact there was a leap second 2012-06-30T23:59:60, so the duration of this period is actually 3 s. The SRFI-19 library is aware of this leap second, and will compute the duration correctly if it's translated into TAI: scheme@(guile-user)> (time-difference (time-utc->time-tai t1) (time-utc->time-tai t0)) $2 = # The original computation in UTC space should yield a result of 3 s, not the 2 s that it did. Since 1972, the seconds of UTC are of exactly the same duration as the seconds of TAI. (They're also phase-locked to TAI seconds.) Thus the period of three TAI seconds is also a period of three UTC seconds. It is not somehow squeezed into two UTC seconds. -zefram
bug#26165: date-week-day screws up prior to AD 1
Looking at day of the week, via SRFI-19's date-week-day: scheme@(guile-user)> (use-modules (srfi srfi-19)) scheme@(guile-user)> (julian-day->date 1721426 0) $1 = # scheme@(guile-user)> (date-week-day (julian-day->date 1721426 0)) $2 = 1 scheme@(guile-user)> (date-week-day (julian-day->date 1721425 0)) $3 = 6 The output for 0001-01-01, Monday, is correct. The preceding day is actually a Sunday, but Saturday was shown. Looking at the code, this bug arises for the same reason as the problem with date-year-day raised in bug#26151. The date-year value, of the weird zero-skipping year numbering, is passed to an algorithm that obviously expects astronomical year numbering. Looking at the code also reveals a second problem: the algorithm is written to perform divisions with quotient where it obviously needs modulo. This will manifest in erroneous computations for some earlier years once the above is fixed. -zefram
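For comparison, the day of the week can be computed directly from the
Julian Day Number without touching year numbering at all; a minimal
sketch, relying on the fact that JD 0 fell on a Monday and matching
date-week-day's 0 = Sunday convention:

  (define (jd->week-day jd)
    ;; 0 = Sunday ... 6 = Saturday; modulo handles negative jd correctly
    (modulo (+ jd 1) 7))

  ;; (jd->week-day 1721426) => 1, Monday 1 January AD 1
  ;; (jd->week-day 1721425) => 0, the preceding Sunday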
bug#26165: date-week-day screws up prior to AD 1
I wrote: >written to perform divisions with quotient where it obviously needs >modulo. Oops, thinko there. It needs floor-quotient, the quotient-like function that uses floor rounding. modulo is the *remainder*-like function that uses floor rounding. -zefram
bug#26182: cond-expand doc omits guile-2.2 feature
In Guile 2.2.0, the SRFI-0 (cond-expand) documentation says: ! The Guile core has the following features, ! ! guile ! guile-2 ;; starting from Guile 2.x ! r5rs ! srfi-0 ... As implemented in Guile 2.2.0, the unlisted feature guile-2.2 is also recognised by cond-expand. Since the documentation's list is otherwise complete, presumably it is intended to be a complete list, and the omission of this feature from the list is a mistake. In any case, it would be helpful to list it. -zefram
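A usage example, assuming the Guile 2.2.0 behaviour described above:

  (cond-expand
    (guile-2.2 (display "guile 2.2\n"))
    (guile-2   (display "guile 2.x, before 2.2\n"))
    (else      (display "guile 1.8 or another Scheme\n")))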
bug#26259: ~f SRFI-19 format broken for small nanoseconds values
The ~f format specifier in SRFI-19's date->string function is supposed
to produce a decimal string representation of the seconds and
nanoseconds portions of a date together:

scheme@(guile-user)> (use-modules (srfi srfi-19))
scheme@(guile-user)> (date->string (make-date 550000000 56 34 12 26 3 2017 0) "~f")
$1 = "56.55"

but it screws up for nanoseconds values in the range (0, 1000000),
i.e., for any time that lies strictly within the first millisecond of
a second:

scheme@(guile-user)> (date->string (make-date 550000 56 34 12 26 3 2017 0) "~f")
$2 = "56.5e-4"

Looks like the fractional seconds value is being formatted through a
mechanism that is not suitable for this purpose, which uses exponent
notation for sufficiently small values and thereby surprises the
date->string code.

Note that just assembling the seconds+fraction value and putting the
whole thing through the same formatter, as opposed to putting the
fractional part through on its own, would fix the above test cases,
and any others with non-zero integer seconds, but would leave the bug
unfixed for the case where the integer seconds value is zero. Fixing
this requires not using any formatting mechanism that would ever resort
to exponent notation for values in the relevant range.

-zefram
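One exponent-free approach, as a minimal sketch (not the library's
code; sec is assumed to be an exact integer and ns an exact nanosecond
count in [0, 999999999]): pad the nanosecond field to nine digits and
trim trailing zeros.

  (use-modules (srfi srfi-13))   ; string-pad, string-trim-right

  (define (seconds+fraction->string sec ns)
    (if (zero? ns)
        (number->string sec)
        (string-append (number->string sec) "."
                       (string-trim-right
                        (string-pad (number->string ns) 9 #\0)
                        #\0))))

  ;; (seconds+fraction->string 56 550000000) => "56.55"
  ;; (seconds+fraction->string 56 550000)    => "56.00055"
  ;; (seconds+fraction->string 0 550000)     => "0.00055"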
bug#26260: ~f SRFI-19 format specifier mishandles one-digit seconds value
The ~f format specifier for SRFI-19's date->string is documented as: #~f seconds and fractional seconds, with locale # decimal point, eg. `5.2' Let's test that example: scheme@(guile-user)> (use-modules (srfi srfi-19)) scheme@(guile-user)> (date->string (make-date 2 5 34 12 26 3 2017 0) "~f") $1 = "05.2" That's not the documented format: the doc and the SRFI itself show "5.2" with no leading padding, but actual behaviour is to zero pad. There is much that is ambiguous in the SRFI's specification of ~f, but with that example it does at least seem clear that there should be no padding there. -zefram
bug#26261: ~N mishandles small nanoseconds value
The ~N format specifier in SRFI-19's date->string is documented to show
the nanoseconds value, with zero padding. The documentation explicates
further by showing as an example a string of nine zeroes. In fact the
implementation only pads to seven digits, and so produces incorrect
output for any nanoseconds value in the range [0, 1):

scheme@(guile-user)> (use-modules (srfi srfi-19))
scheme@(guile-user)> (date->string (make-date 0 5 34 12 26 3 2017 0) "~N")
$1 = "000"
scheme@(guile-user)> (date->string (make-date 2 5 34 12 26 3 2017 0) "~N")
$2 = "002"
scheme@(guile-user)> (date->string (make-date 200 5 34 12 26 3 2017 0) "~N")
$3 = "200"
scheme@(guile-user)> (date->string (make-date 20 5 34 12 26 3 2017 0) "~N")
$4 = "020"
scheme@(guile-user)> (date->string (make-date 5 34 12 26 3 2017 0) "~N")
$5 = ""
scheme@(guile-user)> (date->string (make-date 2 5 34 12 26 3 2017 0) "~N")
$6 = "2"

The padding clearly has to be to the full nine digits.

-zefram
bug#26329: monotonic time not supplied by current-time
The SRFI-19 current-time function can return several flavours of the current time: scheme@(guile-user)> (use-modules (srfi srfi-19)) scheme@(guile-user)> (current-time time-utc) $1 = # scheme@(guile-user)> (current-time time-tai) $2 = # scheme@(guile-user)> (current-time time-monotonic) $3 = # The last of these three is erroneous: a time structure of type time-monotonic was requested and must be returned, but instead the type is time-tai. Although the implementation gives these two time types numerically identical behaviour, it does treat them as nominally distinct in other operations: scheme@(guile-user)> (eqv? time-tai time-monotonic) $4 = #f scheme@(guile-user)> (julian-day->time-tai 245) $5 = # scheme@(guile-user)> (julian-day->time-monotonic 245) $6 = # -zefram
bug#26149: SRFI-19 doc erroneously warns about Gregorian reform
Andy Wingo wrote: >This makes sense to me, FWIW. Patch attached. -zefram >From 444703940983d559935c4dd2a2c89d7888c67119 Mon Sep 17 00:00:00 2001 From: Zefram Date: Wed, 19 Apr 2017 17:08:30 +0100 Subject: [PATCH] correct note about Gregorian reform in SRFI-19 SRFI-19 specifies proleptic use of the Gregorian calendar, so it was incorrect of the documentation to describe the code as erroneous in doing so. Rewrite the caution more neutrally, and move it to the section about the "date" structure, where it seems most relevant. --- doc/ref/srfi-modules.texi | 21 ++--- 1 file changed, 10 insertions(+), 11 deletions(-) diff --git a/doc/ref/srfi-modules.texi b/doc/ref/srfi-modules.texi index 95509b2..3d44156 100644 --- a/doc/ref/srfi-modules.texi +++ b/doc/ref/srfi-modules.texi @@ -2383,17 +2383,6 @@ functions and variables described here are provided by (use-modules (srfi srfi-19)) @end example -@strong{Caution}: The current code in this module incorrectly extends -the Gregorian calendar leap year rule back prior to the introduction -of those reforms in 1582 (or the appropriate year in various -countries). The Julian calendar was used prior to 1582, and there -were 10 days skipped for the reform, but the code doesn't implement -that. - -This will be fixed some time. Until then calculations for 1583 -onwards are correct, but prior to that any day/month/year and day of -the week calculations are wrong. - @menu * SRFI-19 Introduction:: * SRFI-19 Time:: @@ -2593,6 +2582,16 @@ The fields are year, month, day, hour, minute, second, nanoseconds and timezone. A date object is immutable, its fields can be read but they cannot be modified once the object is created. +Historically, the Gregorian calendar was only used from the latter part +of the year 1582 onwards, and not until even later in many countries. +Prior to that most countries used the Julian calendar. SRFI-19 does +not deal with the Julian calendar at all, and so does not reflect this +historical calendar reform. Instead it projects the Gregorian calendar +back proleptically as far as necessary. When dealing with historical +data, especially prior to the British Empire's adoption of the Gregorian +calendar in 1752, one should be mindful of which calendar is used in +each context, and apply non-SRFI-19 facilities to convert where necessary. + @defun date? obj Return @code{#t} if @var{obj} is a date object, or @code{#f} if not. @end defun -- 2.1.4
bug#26164: time-difference mishandles leap seconds
Andy Wingo wrote: >Makes sense to me. Would you like to submit a patch and test case? This particular bug has interactions with other bugs that make me uncomfortable about attempting to fix it right now. The right way to fix this is especially influenced by the approach taken to bug#22033 and to the bug regarding pre-1972 UTC. The latter I haven't even reported yet because it's difficult to formulate in the presence of some of the other UTC-related bugs such as bug#21911 and bug#21912. So I think this is one to postpone until some of those are out of the way. -zefram
bug#26163: time-difference doesn't detect error of differing time types
Patch attached. -zefram >From 6f9d9b355233b578eb3ce13549c8fdc9d7fb8364 Mon Sep 17 00:00:00 2001 From: Zefram Date: Wed, 19 Apr 2017 19:02:13 +0100 Subject: [PATCH] signal error of time-difference on differing types It is an error to apply SRFI-19's time-difference to time structures of differing time types. Detect and signal the error. --- module/srfi/srfi-19.scm | 14 -- 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/module/srfi/srfi-19.scm b/module/srfi/srfi-19.scm index c6a55a2..8da711f 100644 --- a/module/srfi/srfi-19.scm +++ b/module/srfi/srfi-19.scm @@ -413,12 +413,14 @@ ;; -- Time arithmetic (define (time-difference! time1 time2) - (let ((sec-diff (- (time-second time1) (time-second time2))) -(nsec-diff (- (time-nanosecond time1) (time-nanosecond time2 -(set-time-type! time1 time-duration) -(set-time-second! time1 sec-diff) -(set-time-nanosecond! time1 nsec-diff) -(time-normalize! time1))) + (if (not (eq? (time-type time1) (time-type time2))) + (time-error 'time-difference 'incompatible-time-types time2) + (let ((sec-diff (- (time-second time1) (time-second time2))) + (nsec-diff (- (time-nanosecond time1) (time-nanosecond time2 + (set-time-type! time1 time-duration) + (set-time-second! time1 sec-diff) + (set-time-nanosecond! time1 nsec-diff) + (time-normalize! time1 (define (time-difference time1 time2) (let ((result (copy-time time1))) -- 2.1.4
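[With this patch applied, the effect is along these lines (illustrative, not an actual transcript):]

  (use-modules (srfi srfi-19))
  (time-difference (current-time time-utc) (current-time time-utc))
  ;; => a time-duration object, as before
  (time-difference (current-time time-utc) (current-time time-tai))
  ;; => error signalled: time-difference, incompatible-time-types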
bug#21907: date->string duff ISO 8601 zone format
A sequence of two patches is attached. The first fixes the ~2/~4 bug, signalling an error for any unrepresentable offset. The second is a bonus patch, which fixes related problems in ~z, the RFC 822 zone format specifier. Prior to the patch, ~z outputs "Z" for UT, which would be correct for ISO 8601 format but is deprecated (along with all the other single-letter syntax) for RFC 822. The patch changes that to the approved "+". ~z also had exactly the same problems as ~2/~4 regarding unrepresentable offsets, so the patch fixes them in the same way. I could report the ~z problems in a separate ticket if you like. Beware that the second of these patches has some textual dependence on the first, so trying to handle them separately might just be confusing. -zefram >From e6db0e40e5464591df204f9d07e66b3d7853c0d7 Mon Sep 17 00:00:00 2001 From: Zefram Date: Wed, 19 Apr 2017 21:50:39 +0100 Subject: [PATCH 1/2] fix SRFI-19's ISO 8601 zone output formats The ISO 8601 timezone formats offered by SRFI-19's date->string function, in the ~2 and ~4 format specifiers, were erroneously in the basic format despite juxtaposition with extended-format date and time. Fix that by switching them to extended format. This incidentally means that the ISO 8601 zone format is no longer implemented as identical to the RFC 822 zone format (~z), so stop documenting them in terms of ~z. The same format specifiers also made too much of an attempt to display zone offsets that are not representable in ISO 8601 format. They would truncate an offset that is not an integral number of minutes, thus producing inaccurate output. The truncation of an offset in the range (-60, 0) yielded a non-conforming "-". An offset of 100 hours or more (in either direction) resulted in non-conforming extra digits. In all of these cases, signal as an error that the zone offset is not representable. --- doc/ref/srfi-modules.texi | 4 ++-- module/srfi/srfi-19.scm | 22 +++--- 2 files changed, 21 insertions(+), 5 deletions(-) diff --git a/doc/ref/srfi-modules.texi b/doc/ref/srfi-modules.texi index ec3bb20..da7850f 100644 --- a/doc/ref/srfi-modules.texi +++ b/doc/ref/srfi-modules.texi @@ -2818,9 +2818,9 @@ with locale decimal point, eg.@: @samp{5.2} @item @nicode{~z} @tab time zone, RFC-822 style @item @nicode{~Z} @tab time zone symbol (not currently implemented) @item @nicode{~1} @tab ISO-8601 date, @samp{~Y-~m-~d} -@item @nicode{~2} @tab ISO-8601 time+zone, @samp{~H:~M:~S~z} +@item @nicode{~2} @tab ISO-8601 time+zone, @samp{~3} plus zone @item @nicode{~3} @tab ISO-8601 time, @samp{~H:~M:~S} -@item @nicode{~4} @tab ISO-8601 date/time+zone, @samp{~Y-~m-~dT~H:~M:~S~z} +@item @nicode{~4} @tab ISO-8601 date/time+zone, @samp{~5} plus zone @item @nicode{~5} @tab ISO-8601 date/time, @samp{~Y-~m-~dT~H:~M:~S} @end multitable @end defun diff --git a/module/srfi/srfi-19.scm b/module/srfi/srfi-19.scm index f09ec7a..ed88242 100644 --- a/module/srfi/srfi-19.scm +++ b/module/srfi/srfi-19.scm @@ -152,7 +152,6 @@ (define locale-date-time-format "~a ~b ~d ~H:~M:~S~z ~Y") (define locale-short-date-format "~m/~d/~y") (define locale-time-format "~H:~M:~S") -(define iso-8601-date-time-format "~Y-~m-~dT~H:~M:~S~z") ;;-- Miscellaneous Constants. ;;-- only the tai-epoch-in-jd might need changing if @@ -970,6 +969,21 @@ (display (padding hours #\0 2) port) (display (padding minutes #\0 2) port +(define (iso-8601-tz-print offset port) + (let* ((neg? (negative? 
offset)) + (all-secs (abs offset)) + (seconds (remainder all-secs 60)) + (all-mins (quotient all-secs 60)) + (minutes (remainder all-mins 60)) + (hours (quotient all-mins 60))) +(if (or (not (= seconds 0)) (> hours 99)) + (time-error 'date-printer 'unrepresentable-zone-offset offset) + (begin + (display (if neg? #\- #\+) port) +(display (padding hours #\0 2) port) + (display #\: port) +(display (padding minutes #\0 2) port) + ;; A table of output formatting directives. ;; the first time is the format char. ;; the second is a procedure that takes the date, a padding character @@ -1119,11 +1133,13 @@ (cons #\1 (lambda (date pad-with port) (display (date->string date "~Y-~m-~d") port))) (cons #\2 (lambda (date pad-with port) - (display (date->string date "~H:~M:~S~z") port))) + (display (date->string date "~3") port) + (iso-8601-tz-print (date-zone-offset date) port))) (cons #\3 (lambda (date pad-with port) (display (date->string date "~H:~M:~S") port))) (cons #\4 (lambda (date pad-with port) - (display (date->string date "~Y-~m-~dT~H:~M:~S~z") port))) + (display (date->string date "~5") port) + (iso-8601-
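[An illustration of the intended effect of the two patches, ignoring the separate ~Y changes proposed in later reports (hypothetical transcript, not taken from an actual session):]

  (use-modules (srfi srfi-19))
  (date->string (make-date 0 0 34 12 26 3 2017 3600) "~4")
  ;; before: "2017-03-26T12:34:00+0100"   (basic-format zone)
  ;; after:  "2017-03-26T12:34:00+01:00"  (extended-format zone)
  (date->string (make-date 0 0 34 12 26 3 2017 30) "~4")
  ;; after: an error is signalled, since a 30-second offset is not a
  ;; whole number of minutes and so has no ISO 8601 representation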
bug#26570: GC_is_heap_ptr() dep for 2.2.1
Compilation of 2.2.1 fails for me, producing a lot of warnings about implicit declaration of GC_is_heap_ptr(), and ultimately

  CCLD     guile
./.libs/libguile-2.2.so: undefined reference to `GC_is_heap_ptr'
collect2: error: ld returned 1 exit status
Makefile:2439: recipe for target 'guile' failed
make[3]: *** [guile] Error 1

At a guess, maybe this is supposed to be supplied by libgc. But I have the version of libgc that README says is required (7.2), and configure was happy with it. Maybe a higher version is now required, and README and configure need updating? -zefram
bug#21904: date->string duff ISO 8601 format for non-4-digit years
A patch to fix this is attached. The ISO 8601 date formats were implemented by using the ~Y formatter for the year portion, but SRFI-19 doesn't require ~Y to follow ISO 8601, so this raises the question of whether ~Y should. It could be fixed by changing ~Y to conform to ISO 8601, retaining the existing factoring of the formatters. Or a separate internal formatting function could be instituted to do ISO 8601 year formatting, with ~1 et al using that and ~Y left unchanged. I chose the former strategy, partly because the funny non-linear year number doesn't seem a useful thing to support in date->string at all, but more strongly because it's useful to have access to ISO 8601 year formatting on its own. There isn't any other format specifier for that job; it looks like SRFI-19 imagines that ~Y will fill that need. -zefram >From 43dfb5fabc9debb80f87b17d82a1adde356e547c Mon Sep 17 00:00:00 2001 From: Zefram Date: Thu, 20 Apr 2017 00:42:54 +0100 Subject: [PATCH 1/2] fix SRFI-19's ISO 8601 year syntax The ISO 8601 date formats offered by SRFI-19's date->string function were emitting incorrect syntax for most years. At least four digits of year must be given, but it wasn't padding shorter numbers. And any number with more than four digits requires a leading sign, but this was being omitted for positive numbers. These problems are now fixed. The ISO 8601 date formats were formerly implemented in terms of the ~Y format, which was not specified to be an ISO 8601 format. The fix is achieved by altering ~Y to behave in the ISO 8601 manner, and ~Y is now documented to conform to ISO 8601. Doing it this way means that ISO 8601 year numbering is available in isolation, which is a useful facility not otherwise available. --- doc/ref/srfi-modules.texi | 1 + module/srfi/srfi-19.scm | 5 - 2 files changed, 5 insertions(+), 1 deletion(-) diff --git a/doc/ref/srfi-modules.texi b/doc/ref/srfi-modules.texi index da7850f..8a5f1a0 100644 --- a/doc/ref/srfi-modules.texi +++ b/doc/ref/srfi-modules.texi @@ -2815,6 +2815,7 @@ with locale decimal point, eg.@: @samp{5.2} @item @nicode{~y} @tab year, two digits, @samp{00} to @samp{99} @item @nicode{~Y} @tab year, full, eg.@: @samp{2003} +(in ISO 8601 format, though SRFI-19 doesn't specify so) @item @nicode{~z} @tab time zone, RFC-822 style @item @nicode{~Z} @tab time zone symbol (not currently implemented) @item @nicode{~1} @tab ISO-8601 date, @samp{~Y-~m-~d} diff --git a/module/srfi/srfi-19.scm b/module/srfi/srfi-19.scm index 4b8445f..d4308bb 100644 --- a/module/srfi/srfi-19.scm +++ b/module/srfi/srfi-19.scm @@ -1128,7 +1128,10 @@ 2) port))) (cons #\Y (lambda (date pad-with port) - (display (date-year date) port))) + (let ((y (date-year date))) + (cond ((negative? y) (display #\- port)) + ((>= y 1) (display #\+ port))) + (display (padding (abs y) #\0 4) port (cons #\z (lambda (date pad-with port) (rfc-822-tz-print (date-zone-offset date) port))) (cons #\Z (lambda (date pad-with port) -- 2.1.4
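[A hypothetical before/after illustration of the ~Y change (not an actual transcript):]

  (use-modules (srfi srfi-19))
  (date->string (make-date 0 0 0 12 1 1 987 0) "~1")
  ;; before: "987-01-01"    (fewer than four year digits)
  ;; after:  "+0987-01-01"  (padded to four digits, explicitly signed)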
bug#21904: date->string duff ISO 8601 format for non-4-digit years
I wrote: >I chose the former strategy, partly because the funny non-linear year >number doesn't seem a useful thing to support in date->string at all, Sorry, this comment is misplaced. It relates to bug#21903; the choice about ~Y applies to both of these bugs. -zefram
bug#21903: date->string duff ISO 8601 negative years
A patch to fix this is attached. It applies on top of my patch for bug#21904. The choice that I described for that bug about whether to change ~Y or to have a separate ISO 8601 year formatter actually applies to both bugs, and the comment that I made there about exposing the non-linear year numbering is really only about this bug. -zefram >From 3d39f1dfa0e210282db48a9af828646d7e9acef3 Mon Sep 17 00:00:00 2001 From: Zefram Date: Thu, 20 Apr 2017 00:53:40 +0100 Subject: [PATCH 2/2] fix SRFI-19's ISO 8601 year numbering The ISO 8601 date formats offered by SRFI-19's date->string function were emitting incorrect year numbers for years preceding AD 1. It was following the non-linear numbering that the library uses in the date structure, rather than the standard astronomical year numbering required by ISO 8601. This is now fixed. As with the preceding fix for the syntax of year numbers, the fix is actually applied to the ~Y format, which SRFI-19 doesn't require to follow ISO 8601. --- module/srfi/srfi-19.scm | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/module/srfi/srfi-19.scm b/module/srfi/srfi-19.scm index d4308bb..0e56c31 100644 --- a/module/srfi/srfi-19.scm +++ b/module/srfi/srfi-19.scm @@ -1128,7 +1128,8 @@ 2) port))) (cons #\Y (lambda (date pad-with port) - (let ((y (date-year date))) + (let* ((yy (date-year date)) + (y (if (negative? yy) (+ yy 1) yy))) (cond ((negative? y) (display #\- port)) ((>= y 1) (display #\+ port))) (display (padding (abs y) #\0 4) port -- 2.1.4
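[A hypothetical illustration of this patch stacked on the previous one, for a date before AD 1; the date structure numbers 1 BC as year -1, with no year 0:]

  (date->string (make-date 0 0 0 12 1 1 -1 0) "~1")
  ;; with only the previous patch: "-0001-01-01"  (date-structure numbering)
  ;; with this patch as well:      "0000-01-01"   (astronomical year 0 = 1 BC)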
bug#26632: TAI<->UTC conversion botches 1961 to 1971
The SRFI-19 library gets TAI<->UTC conversions badly wrong in the years 1961 to 1971 (inclusive). This has to be examined somewhat indirectly, because SRFI-19 doesn't offer any way to display a TAI time in its conventional form as a date-like structure, nor to input a TAI time from such a structure. SRFI-19's date structure, as implemented, is always interpreted according to UTC. The only operations supported on TAI time structures are conversions to and from the various forms of UTC, conversions to and from the less-useful `monotonic' time, and arithmetic operations. Thus the erroneous TAI<->UTC conversions only come out through arithmetic operations in TAI space. One must also be careful to avoid unrelated bugs such as bug#21911.

First I'll consider an ordinary day in 1967:

scheme@(guile-user)> (use-modules (srfi srfi-19))
scheme@(guile-user)> (time-difference
... (time-utc->time-tai (date->time-utc (make-date 0 0 0 0 15 3 1967 0)))
... (time-utc->time-tai (date->time-utc (make-date 0 0 0 0 14 3 1967 0))))
$1 = #

This takes the start and end of 1967-03-14, as judged by UTC, converts both of these times to TAI, and asks for the duration of that TAI interval. It's asking how many TAI seconds long that UTC day was. As described in <http://maia.usno.navy.mil/ser7/tai-utc.dat>, there was no UTC leap on that day, but throughout 1967 UTC had a frequency offset from TAI such that each UTC second lasted exactly 1.00000003 TAI seconds. The correct answer to the above question is therefore exactly 86400.002592 s. The answer shown above, of 86400.00 s, is incorrect.

If time-tai->time-utc is applied to the times in the above example, it accurately inverts what time-utc->time-tai did. It is good that the conversions are mutually consistent, but in this case it means they are both wrong.

Second, I'll consider a less ordinary day:

scheme@(guile-user)> (time-difference
... (time-utc->time-tai (date->time-utc (make-date 0 0 0 12 1 2 1968 0)))
... (time-utc->time-tai (date->time-utc (make-date 0 0 0 12 31 1 1968 0))))
$2 = #

This time the period considered is from noon 1968-01-31 to noon 1968-02-01. The same frequency offset described above applies throughout this period. The additional complication here is that at the end of 1968-01-31 there was a leap of -0.1 (TAI) seconds. The true duration of this day is therefore exactly 86399.902592 s. The answer shown above, of 86400.00 s, is incorrect in two ways, accounting for neither the frequency offset nor the leap. Once again, time-tai->time-utc accurately inverts the incorrect time-utc->time-tai.

The failure to handle UTC's leaps in this era is not specific to the relatively unusual negative leaps: it's equally clueless about the positive leaps. The full extent of the conversion errors, integrated across the entire "rubber seconds" era from 1961-01-01 to 1972-01-01, is a little over 8.5 seconds of TAI.

This bug influences bug#26164, regarding time arithmetic in UTC. If one were to ignore the rubber seconds era, an obvious way to correct UTC time arithmetic would be to convert to TAI and do the arithmetic there. That handles UTC leaps correctly. But with rubber seconds it would still be wrong. In the rubber seconds era the number of UTC seconds in an interval differs from the number of TAI seconds. -zefram
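[The expected figures can be checked with exact rational arithmetic in Guile (plain calculation, independent of SRFI-19): with each UTC second lasting 1.00000003 TAI seconds,]

  (exact->inexact (* 86400 100000003/100000000))
  ;; => 86400.002592        (86400 UTC seconds, no leap)
  (exact->inexact (- (* 86400 100000003/100000000) 1/10))
  ;; => 86399.902592        (86400 UTC seconds plus a -0.1 s leap)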
bug#26633: TAI<->UTC conversion botches pre-1961 era
Asking SRFI-19 to perform a UTC-to-TAI conversion for an ordinary day in 1960:

scheme@(guile-user)> (use-modules (srfi srfi-19))
scheme@(guile-user)> (time-utc->time-tai (date->time-utc (make-date 0 0 0 12 14 3 1960 0)))
$1 = #

The answer given is incorrect. Unlike previous conversion bugs where it was necessary to perform some arithmetic to reveal that the conversion had gone wrong, in this case the answer can be declared wrong without any detailed interpretation of the TAI time structure. It is incorrect for this conversion to return any specific TAI time, upon which arithmetic could be performed, because UTC is not defined for any time prior to 1961. The only sane behaviour is for the conversion to signal an error. The same goes for time-tai->time-utc, which at present accurately inverts time-utc->time-tai for the above time. -zefram
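[A sketch of the kind of guard being asked for, assuming (as the other patches in this series suggest) that the library counts time-utc seconds from 1970-01-01 and has a time-error helper; both are assumptions about the implementation's internals:]

  ;; 1961-01-01T00:00:00 UTC is 3287 days, i.e. 283996800 seconds,
  ;; before 1970-01-01.
  (define utc-defined-from -283996800)

  (define (check-utc-defined caller time)
    (if (< (time-second time) utc-defined-from)
        (time-error caller 'utc-undefined-before-1961 time)))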
bug#22033: time-utc format is lossy
I wrote:
> These two seconds are perfectly
>distinct parts of the UTC time scale, and the time-utc format ought to
>preserve their distinction.

This is a problematic goal. At the time I wrote the bug report I didn't have a satisfactory idea of how to achieve it, but I think I've come up with one now.

The essential problem is that the SRFI-19 time structure expects to encapsulate a scalar value -- as it says, a count of seconds since some epoch -- but there is no natural scalar representation of a UTC time. Because of the irregularity imposed by its leaps, the natural representation of a UTC time is a two-part structure, consisting of an integer identifying the day and a fractional count of seconds elapsed within the day. Because UTC days contain differing numbers of seconds, this is a variable-radix system. SRFI-19 doesn't offer any structure that has this simple form. The only structure that it describes as separating representation of the day from time of day is the date structure, which splits up the time representation much more and has the complication of the timezone offset.

The present approach of the library is to squeeze a UTC time into the time structure by converting the variable-radix value into a scalar by using a fixed radix of 86400. This has the advantage of producing a scalar, and of the scalar behaving continuously on most UTC days, but the major downside of being lossy, aliasing some UTC times. The scalar also isn't really a count of seconds since an epoch, as SRFI-19 expects, breaking arithmetic on it. It looks rather as though this part of SRFI-19 was written expecting this sort of transformation of UTC, but conflictingly expecting it to serve as an unambiguous encoding and as a genuine count of seconds since an epoch.

A simple workaround would be to create a scalar in the same kind of way but using a larger fixed radix: minimally 86401, or more roundly 131072. This means we have a scalar value that fits easily into the time structure, and unambiguously encodes all UTC times. But it's still not a count of seconds since an epoch, and it's appreciably less like such a count because it's no longer continuous across (most) UTC day ends.

Since the time structure has separate fields for seconds and nanoseconds, it would be possible to borrow a trick sometimes used with the Unix struct timespec: extending the nanoseconds range to represent leap seconds. This would be mostly like the present arrangement, with the seconds count increasing by 86400 per UTC day, but with a leap second unambiguously represented by the seconds count of the preceding second and a nanoseconds count in the range [1000000000, 2000000000). This fixes the ambiguity, but retains all the other downsides of the present badly-behaved scalar, and adds the substantial downside of breaking expectations of normalisation.

The alternative to all of those hacks is to produce a continuous scalar value that genuinely counts the seconds of UTC. This is feasible. It would have a distinct representation for all points on the UTC time scale. By being a true scalar value it would fully meet SRFI-19's description of the time structure, would be represented in normalised fashion, and would support arithmetic operations on the seconds of UTC (fixing bug#26164 with no extra effort). The downside is that this is an unusual and somewhat surprising arrangement. I've never previously seen a linear count of UTC seconds brought out as a product of any time library.
It would mean that a time-utc structure is not an encoding of a UTC time as normally understood: the date structure would serve that purpose, and a time-utc would instead have a hybrid meaning halfway between what we usually think of as UTC and TAI times. In the leap-seconds era (1972 onwards), the scalar value in a time-utc would be a constant offset from the scalar value in the corresponding time-tai. This implies that conversion operations would be in a different place from where they are now. Whereas currently date/time-utc conversions are almost purely arithmetical and time-utc/time-tai conversions involve the leap second table, instead date/time-utc conversions would require the leap second table and time-utc/time-tai conversions would be purely arithmetical for the leap-seconds era. (Frequency offsets would come into the time-utc/time-tai conversions, for times in the rubber-seconds era.) I'm pretty sure that this actually-linear treatment of time-utc is not what the author of SRFI-19 envisioned. But it fits the actual words of the standard better than anything else I can imagine, and would fix a bunch of problems that otherwise look painful. I reckon this is the best way forward. What do you think? If you like it, I could work up a patch. -zefram
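[To make the struct-timespec-style option mentioned above concrete, here is a hypothetical encoding of the leap second at the end of 2016 (illustrative values only; this is not what the current library produces):]

  ;; 2016-12-31T23:59:59 UTC, normalised as usual:
  (make-time time-utc 0 1483228799)
  ;; 2016-12-31T23:59:60 UTC (the inserted leap second), with the
  ;; nanoseconds field pushed into [1000000000, 2000000000):
  (make-time time-utc 1000000000 1483228799)
  ;; 2017-01-01T00:00:00 UTC, the following second:
  (make-time time-utc 0 1483228800)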
bug#26164: time-difference mishandles leap seconds
Mark H Weaver wrote:
>You seem to be assuming that SRFI-19 durations should _always_ represent
>intervals of TAI time.

No, that is not my position. Although SRFI-19 isn't entirely explicit on this point, it is in the nature of the problem space that a duration may be measured on any time scale, and it seems to be implied that time-difference will determine the duration on the time scale of its inputs. Indeed, if the duration were always to be determined on one specific scale then it would not be necessary for time-difference to require its two inputs to be of the same time type.

With respect to UTC, my position is that time-difference on inputs of type time-utc should determine the duration as measured in UTC seconds. For times since 1972 this is always the same as the duration in TAI seconds (elaborated further below). For 1961 to 1971 UTC durations and TAI durations differ, and that's the subject of my bug#26632. Note that in that bug report I explicitly converted time-utc->time-tai where I wanted to determine a TAI duration.

> every UTC day has
>exactly 86400 UTC seconds,

No, that's not how UTC works. There are some time scales derived from UTC that have exactly 86400 seconds for each UTC day, such as Markus Kuhn's UTC-SLS, or that have exactly 86400 seconds per UTC day in the long run, such as Google's "leap smear". But SRFI-19 doesn't refer to any of those, it refers to UTC. The true UTC has a variable number of seconds per day *as judged by UTC clocks*: the days are not merely different lengths as judged by TAI.

The variable number of UTC seconds per day is the source of the famous "23:59:60" notation. On a day with a positive leap second, the first second of the day is centred on 00:00:00.5, the 86400th second is centred on 23:59:59.5, and the 86401st second is centred on 23:59:60.5. These are 86401 distinct seconds counted by UTC, each with a distinct label. On a day with a negative leap second, UTC only counts 86399 seconds: the time-of-day labels never reach 23:59:59.

It is intrinsic to the definition of UTC that durations (measured in seconds) don't match up regularly with time of day. It's just like the way that intervals measured in days don't match up regularly with day of month: the way to think about a day of UTC is a lot like the way one thinks about a month of the Gregorian calendar. (Though there's an important difference in that we know the lengths of Gregorian months arbitrarily far in advance but only know UTC day lengths months in advance.) Wanting to avoid all that irregularity is the motivation to use UTC-SLS and the like.

>Having said all of this, I should admit that I'm not an expert on time
>standards,

I am.

Incidentally, there's an aspect of the present bug report that's different in the pre-1972 era. time-difference correctly shows a duration of exactly 86400 seconds on the UTC scale for an ordinary day in that era, such as 1967-03-14 of which I examined the TAI duration in bug#26632. But it incorrectly shows the same duration for a day with a leap. That's the same error that it makes for post-1972 leaps, but there's a difference in that the duration of the leap (as judged in UTC) is non-integral, being derived from a non-integral number of TAI seconds and also affected by the frequency offset. For example, the UTC duration of 1968-01-31 (also examined in bug#26632) was exactly 8639990259200/100000003 seconds (roughly 86399.900000003 s). This runs into trouble with SRFI-19's insistence that the nanosecond field of a time object only contain an integer. -zefram
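[The exact figure quoted above can be reproduced in Guile with exact rational arithmetic: take the TAI length of the day computed in bug#26632 and divide by the length of a UTC second in that era.]

  (/ (- (* 86400 100000003/100000000) 1/10)
     100000003/100000000)
  ;; => 8639990259200/100000003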
bug#26632: TAI<->UTC conversion botches 1961 to 1971
Mark H Weaver wrote: >patch adds the TAI-UTC tables for 1961-1971 and uses them to implement >TAI<->UTC conversions over that time range with nanosecond accuracy. On a quick inspection of the code, that looks good. >I'm vaguely concerned about violating widely-held assumptions, >e.g. that UTC runs at the same rate as TAI If an application assumes that for pre-1972 times, then the application is broken. Note that any application currently using the srfi-19 library for pre-1972 TAI<->UTC conversions already has a bigger problem, in that it's getting false answers from the library. It's hard to see how fixing the library could make any previously-working program stop working. > which might cause some code on top of Guile to misbehave if >the system clock is set pre-1972, If the system clock is incorrect by decades, there will be many other problems to deal with. >I'm curious to hear opinions on this. My view is that this change should definitely be applied. But it's also worth thinking about what the alternative is, if the correct conversions are somehow too shocking for innocent programs to be exposed to them. Making no change isn't a realistic option: the library is producing false answers, which are no use to anyone. It's clearly a bug in the library, and needs to be addressed somehow. The only other defensible option would be to declare pre-1972 UTC out of scope for the library, having attempted conversions signal an error. That would have to be documented, and it seems like it would still amount to a deviation from the requirements of SRFI-19. -zefram