bug#16363: interactive use subject to compiler limitations
guile-2.0.9's compiler has some inconvenient restrictions, relative to its interpreter. Where the compiler is automatically applied to scripts, the restrictions aren't a serious problem, because if compilation fails then guile falls back to interpreting the script. But in an interactive REPL session, by default each form entered by the user is passed through the compiler, and if compilation fails then the error is signalled, with no fallback to interpretation.

As a test case, consider a form in which a procedure object appears. The compiler can't handle forms that directly reference a wide variety of object types, including procedures (both primitive and user-defined) and GOOPS objects. In the interpreter these objects simply self-evaluate, and it can be useful to reference them without the usual indirection through a named variable. Here I'll show what happens to such a form in a script and interactively, in guile 1.8 and 2.0:

$ cat t2
(cond-expand (guile-2 (eval-when (compile load eval) (fluid-set! read-eval? #t)))
             (else (fluid-set! read-eval? #t)))
(define (p x y) (#.+ x y))
(write (p 2 3))
(newline)
$ guile-1.8 t2
5
$ guile-2.0 --no-auto-compile t2
5
$ guile-2.0 t2
;;; note: auto-compilation is enabled, set GUILE_AUTO_COMPILE=0
;;; or pass the --no-auto-compile argument to disable.
;;; compiling /home/zefram/usr/guile/t2
;;; WARNING: compilation of /home/zefram/usr/guile/t2 failed:
;;; ERROR: build-constant-store: unrecognized object #
5
$ guile-1.8
guile> (fluid-set! read-eval? #t)
guile> (define (p x y) (#.+ x y))
guile> (p 2 3)
5
guile> ^D
$ guile-2.0
GNU Guile 2.0.9-deb+1-1
Copyright (C) 1995-2013 Free Software Foundation, Inc.

Guile comes with ABSOLUTELY NO WARRANTY; for details type `,show w'.
This program is free software, and you are welcome to redistribute it
under certain conditions; type `,show c' for details.

Enter `,help' for help.
scheme@(guile-user)> (fluid-set! read-eval? #t)
scheme@(guile-user)> (define (p x y) (#.+ x y))
While compiling expression:
ERROR: build-constant-store: unrecognized object #
scheme@(guile-user)> (p 2 3)
:3:0: In procedure #:3:0 ()>:
:3:0: In procedure #:3:0 ()>: Unbound variable: p

There is a workaround for this problem: the REPL's "interp" option controls whether forms go through the compiler or the interpreter. Hence:

scheme@(guile-user)> (fluid-set! read-eval? #t)
scheme@(guile-user)> (#.+ 2 3)
While compiling expression:
ERROR: build-constant-store: unrecognized object #
scheme@(guile-user)> ,o interp #t
scheme@(guile-user)> (#.+ 2 3)
$1 = 5

So the problem is merely that the REPL is broken *by default*. It should either default to the working mechanism, or fall back to it when compilation fails (as the file auto-compilation does).

Debian incarnation of this bug report:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=734108

-zefram
bug#16364: auto-compile noise can't be avoided by script
Guile 2.0.9 has a facility to automatically cache a compiled version of any Scheme source file that it loads, and it wants the world to know about it! If auto-compilation is enabled, which it is by default, then when guile loads a file (that was not already compiled) it emits a banner describing the auto-compilation. This interferes with the proper functionality of any program written as a guile script, by producing output that the program did not intend.

Working around this is tricky (discussed below). There's no straightforward way for a script to avoid the noise while being portable between guile versions 1.8 and 2.0. There's also no way to avoid the noise while actually getting the auto-compilation behaviour.

In my particular case, my script makes interesting use of the read-eval (#.) feature, which means that the compilation process actually can't work. This means that *every* time the script is run, not just the first time, guile emits the banner about auto-compilation, followed by a rather misleading warning/error about compilation failure. It's misleading because it then goes on to execute the script just fine. I can demonstrate this with a minimal test case (using read-eval in an uninteresting way, just making the compiler barf by not having applied eval-when to enable it):

$ cat t0
#!/usr/bin/guile -s
!#
(fluid-set! read-eval? #t)
(display #."hello world")
(newline)
$ guile-1.8 -s t0
hello world
$ guile-2.0 -s t0
;;; note: auto-compilation is enabled, set GUILE_AUTO_COMPILE=0
;;; or pass the --no-auto-compile argument to disable.
;;; compiling /home/zefram/usr/guile/t0
;;; WARNING: compilation of /home/zefram/usr/guile/t0 failed:
;;; ERROR: #. read expansion found and read-eval? is #f.
hello world
$

I can turn off the auto-compilation from within the script by using the --no-auto-compile option, but that breaks compatibility with 1.8:

$ cat t1
#!/usr/bin/guile \
--no-auto-compile -s
!#
(fluid-set! read-eval? #t)
(display #."hello world")
(newline)
$ guile-2.0 '\' t1
hello world
$ guile-1.8 '\' t1
guile-1.8: Unrecognized switch `--no-auto-compile'
Usage: guile-1.8 OPTION ...
Evaluate Scheme code, interactively or from a script.
...

Aside from the portability concern, turning off auto-compilation doesn't actually fix the problem. If a compiled version has previously been cached for the filename of a script being run, guile will consider using the cached version even if --no-auto-compile was supplied: the switch only controls the attempt to compile for the cache. If the cached compilation is up to date then it is used silently, which is OK. But if it's out of date, because the cache was for a different script that previously existed under the same name, then guile emits a banner saying that it's out of date (implying that the cached compilation is therefore not being used). So the script's visible behaviour is defiled even if it applies the option. Observe what happens to the second script in this sequence:

$ echo '(display "hello world\n")' >t10
$ guile-2.0 t10
;;; note: auto-compilation is enabled, set GUILE_AUTO_COMPILE=0
;;; or pass the --no-auto-compile argument to disable.
;;; compiling /home/zefram/usr/guile/t10
;;; compiled /home/zefram/.cache/guile/ccache/2.0-LE-8-2.0/home/zefram/usr/guile/t10.go
hello world
$ echo '(display "goodbye world\n")' >t10
$ guile-2.0 --no-auto-compile t10
;;; note: source file /home/zefram/usr/guile/t10
;;; newer than compiled /home/zefram/.cache/guile/ccache/2.0-LE-8-2.0/home/zefram/usr/guile/t10.go
goodbye world

I have, however, come up with a truly ugly workaround. The meta option system can be used to introduce a -c option that explicitly loads the script file via primitive-load, which does not attempt compilation. (Nor does it look at the compilation cache, so this even avoids the problem that --no-auto-compile runs into.) Running the script this way yields a different command line (visible through (program-arguments)) from the one that arrives when the script is run via -s, so if the script is to process its command line, for robustness it must pay attention to which way it was invoked. All together, this looks like:

$ cat t11
#!/usr/bin/guile \
-c (begin\ \ \ \ (define\ arg-hack\ #t)\ \ \ \ (primitive-load\ (cadr\ (program-arguments
!#
(define argv
  (if (false-if-exception arg-hack)
      (cdr (program-arguments))
      (program-arguments)))
(write argv)
(newline)
$ guile-1.6 '\' t11 a b c
("t11" "a" "b" "c")
$ guile-1.6 -s t11 a b c
("t11" "a" "b" "c")
$ guile-1.8 '\' t11 a b c
("t11" "a" "b" "c")
$ guile-1.8 -s t11 a b c
("t11" "a" "b" "c")
$ guile-2.0 '\' t11 a b c
("t11" "a
bug#16359: "guild list" lists nothing
"guild list" is meant to list the available subcommands within guild. It actually shows an empty list: $ GUILE=/usr/bin/guile-2.0 guild list Usage: guild COMMAND [ARGS] Run command-line scripts provided by GNU Guile and related programs. Commands: For help on a specific command, try "guild help COMMAND". Report guild bugs to bug-guile@gnu.org GNU Guile home page: <http://www.gnu.org/software/guile/> General help using GNU software: <http://www.gnu.org/gethelp/> For complete documentation, run: info guile 'Using Guile Tools' $ Subcommands mentioned in the guile documentation are actually available, despite not being listed. This is guile-2.0.9 on Debian. Debian incarnation of this bug report: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=734313 -zefram
bug#16361: compile cache confused about file identity
The automatic cache of compiled versions of scripts in guile-2.0.9 identifies scripts mainly by name, and partially by mtime. This is not actually sufficient: it is easily misled by a pathname that refers to different files at different times. Test case:

$ echo '(display "aaa\n")' >t13
$ echo '(display "bbb\n")' >t14
$ guile-2.0 t13
;;; note: auto-compilation is enabled, set GUILE_AUTO_COMPILE=0
;;; or pass the --no-auto-compile argument to disable.
;;; compiling /home/zefram/usr/guile/t13
;;; compiled /home/zefram/.cache/guile/ccache/2.0-LE-8-2.0/home/zefram/usr/guile/t13.go
aaa
$ mv t14 t13
$ guile-2.0 t13
aaa

You can see that the mtime is not fully used here: the cache is misapplied even if there is a delay of seconds between the creations of the two script files. The cache's mtime check will only notice a mismatch if the script currently seen under the supplied name was modified later than when the previous script was *compiled*.

Obviously, in this test case the cache could trivially distinguish the two script files by looking at the inode numbers. On its own the inode number isn't sufficient, but exact match on device, inode number, and mtime would be far superior to the current behaviour, only going wrong in the presence of deliberate timestamp manipulation. As a bonus, if the cache were actually *keyed* by inode number and device, rather than by pathname, it would retain the caching of compilation across renamings of the script.

Or, even better, the cache could be keyed by a cryptographic hash of the file contents. This would be immune even to timestamp manipulation, and would preserve the cached compilation even across the script being copied to a fresh file or being edited and reverted. This would be a cache worthy of the name. The only downside is the expense of computing the hash, but I expect this is small compared to the expense of compilation.

Debian incarnation of this bug report:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=734178

-zefram
bug#16357: insufficient print abbreviation in error messages
When guile is constructing error messages that display offending objects, in version 2.0.9 it never abbreviates long or deep structures. This can easily lead to pathologically long messages that take stupid amounts of time and memory to construct and to display. By contrast, guile-1.8 applies abbreviation at a reasonable level, and objects appearing in stack traces have reasonable abbreviation on both versions. Two very mild examples:

$ guile-1.8 --debug -c "(read (let aaa ((n 100) (v '())) (if (= n 0) v (aaa (- n 1) (cons n v)))))"
Backtrace:
In current input:
   1: 0* [read {(1 2 3 4 5 6 7 8 9 ...)}]

:1:1: In procedure read in expression (read (# 100 #)):
:1:1: Wrong type argument in position 1 (expecting open input port): (1 2 3 4 5 6 7 8 9 10 ...)
$ guile-2.0 --debug -c "(read (let aaa ((n 100) (v '())) (if (= n 0) v (aaa (- n 1) (cons n v)))))"
Backtrace:
In ice-9/boot-9.scm:
 157: 7 [catch #t # ...]
In unknown file:
   ?: 6 [apply-smob/1 #]
In ice-9/boot-9.scm:
  63: 5 [call-with-prompt prompt0 ...]
In ice-9/eval.scm:
 432: 4 [eval # #]
In unknown file:
   ?: 3 [call-with-input-string "(read (let aaa ((n 100) (v '())) (if (= n 0) v (aaa (- n 1) (cons n v)" ...]
In ice-9/command-line.scm:
 180: 2 [# #]
In unknown file:
   ?: 1 [eval (read (let aaa (# #) (if # v #))) #]
   ?: 0 [read (1 2 3 4 5 6 7 8 9 ...)]

ERROR: In procedure read:
ERROR: In procedure read: Wrong type argument in position 1 (expecting open input port): (1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100)
$ guile-1.8 --debug -c "(read (let aaa ((n 100) (v '())) (if (= n 0) v (aaa (- n 1) (cons v n)))))"
Backtrace:
In current input:
   1: 0* [read {(((# . 3) . 2) . 1)}]

:1:1: In procedure read in expression (read (# 100 #)):
:1:1: Wrong type argument in position 1 (expecting open input port): (((# . 7) . 6) . 5) . 4) . 3) . 2) . 1)
$ guile-2.0 --debug -c "(read (let aaa ((n 100) (v '())) (if (= n 0) v (aaa (- n 1) (cons v n)))))"
Backtrace:
In ice-9/boot-9.scm:
 157: 7 [catch #t # ...]
In unknown file:
   ?: 6 [apply-smob/1 #]
In ice-9/boot-9.scm:
  63: 5 [call-with-prompt prompt0 ...]
In ice-9/eval.scm:
 432: 4 [eval # #]
In unknown file:
   ?: 3 [call-with-input-string "(read (let aaa ((n 100) (v '())) (if (= n 0) v (aaa (- n 1) (cons v n)" ...]
In ice-9/command-line.scm:
 180: 2 [# #]
In unknown file:
   ?: 1 [eval (read (let aaa (# #) (if # v #))) #]
   ?: 0 [read (((# . 3) . 2) . 1)]

ERROR: In procedure read:
ERROR: In procedure read: Wrong type argument in position 1 (expecting open input port): () . 100) . 99) . 98) . 97) . 96) . 95) . 94) . 93) . 92) . 91) . 90) . 89) . 88) . 87) . 86) . 85) . 84) . 83) . 82) . 81) . 80) . 79) . 78) . 77) . 76) . 75) . 74) . 73) . 72) . 71) . 70) . 69) . 68) . 67) . 66) . 65) . 64) . 63) . 62) . 61) . 60) . 59) . 58) . 57) . 56) . 55) . 54) . 53) . 52) . 51) . 50) . 49) . 48) . 47) . 46) . 45) . 44) . 43) . 42) . 41) . 40) . 39) . 38) . 37) . 36) . 35) . 34) . 33) . 32) . 31) . 30) . 29) . 28) . 27) . 26) . 25) . 24) . 23) . 22) . 21) . 20) . 19) . 18) . 17) . 16) . 15) . 14) . 13) . 12) . 11) . 10) . 9) . 8) . 7) . 6) . 5) . 4) . 3) . 2) . 1)

Debian incarnation of this bug report:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=734128

-zefram
bug#16358: combinatorial explosion in elided stack trace
In guile 2.0.9, if an error is signalled in the interpreter, and the stack contains in a certain position an object whose unabbreviated print representation is very large, then the process of displaying the stack trace will take a huge amount of time and memory, pausing in the middle of output, even though the displayed stack trace doesn't actually show the object at all. Test case:

$ cat t6
(define bs (let aaa ((n 100) (v '())) (if (= n 0) v (aaa (- n 1) (cons v v)))))
(write (list bs (error "wibble")))
$ guile-2.0 --no-auto-compile t6
Backtrace:
In ice-9/boot-9.scm:
 157: 11 [catch #t # ...]
In unknown file:
   ?: 10 [apply-smob/1 #]
In ice-9/boot-9.scm:
  63: 9 [call-with-prompt prompt0 ...]
In ice-9/eval.scm:
 432: 8 [eval # #]
In ice-9/boot-9.scm:
2320: 7 [save-module-excursion #]
3968: 6 [#]
1645: 5 [%start-stack load-stack #]
1650: 4 [#]
In unknown file:
   ?: 3 [primitive-load "/home/zefram/usr/guile/t6"]
In ice-9/eval.scm:
 387: 2 ^Z
zsh: suspended  guile-2.0 --no-auto-compile t6
$ jobs -l
[1]  + 32574 suspended  guile-2.0 --no-auto-compile t6
$ ps vw 32574
  PID TTY      STAT  TIME  MAJFL   TRS     DRS     RSS %MEM COMMAND
32574 pts/5    T     0:36      0     3 2266300 1634556  9.9 guile-2.0 --no-auto-compile t6

With the test's size parameter at 100 as above, there is no realistic prospect of actually completing generation of the stack trace. For some range of values (about 25 on my machine) there will be a noticeable pause, after which the stack trace completes:

...
 387: 2 [eval # ()]
 387: 1 [eval # ()]
In unknown file:
   ?: 0 [scm-error misc-error #f "~A" ("wibble") #f]

It appears that it's generating the entire print representation of the object behind the scenes, though it then obviously throws it away. Experimentation with customising print methods for SRFI-9 record types shows that the delay and memory usage depend on the print representation per se, rather than on the amount of structure beneath the object. (A record-based cons-like type produces similar behaviour to the cons test when using the default print method that shows the content. Replacing it with a print method that emits a fixed string and doesn't recurse eliminates the delay entirely.)

If my test program is run in compiled form (via auto-compilation) then it doesn't exhibit the pause. Actually it gets optimised such that the problem object isn't anywhere near what the stack trace displays, so for a fair test the program needs to be tweaked. It can be arranged for the problem object to be directly mentioned in the stack trace, and there is still no pause: the object appears in a highly abbreviated form, such as

   2: 1 [vv ((# # # # ...) (# # # # ...) (# # # # ...) (# # # # ...) ...)]

For comparison, guile-1.8 never exhibits this problem. By default it doesn't emit a stack trace for a script, but it can be asked to do so via --debug. It then behaves like the compiled form of guile-2.0: there is no delay, and the object is shown in very abbreviated form.

Debian incarnation of this bug report:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=734132

-zefram
bug#16356: doc out of date about (integer? +inf.0)
The "Integers" node of the guile info document contains this gem (source in doc/ref/api-data.texi): (integer? +inf.0) => #t Actual guile-2.0.9 behaviour: scheme@(guile-user)> (integer? +inf.0) $16 = #f The doc example matches the behaviour of guile-1.8, which classifies +inf.0 and -inf.0 as integers, and +nan.0 as rational but not integer. guile-2.0 follows R6RS in treating all three of these values as real but not rational, and the "Reals and Rationals" node describes this accurately. Debian incarnation of this bug report: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=734323 Mathematically, infinities are not real, and NaN is, as the acronym says, not a number. The documentation could perhaps do with a note about the difference between mathematical terminology and Scheme terminology. I was rather surprised to find any discrepancy, as Scheme's numerical tower stands out among programming languages as being uniquely accurate in its use of mathematical terms. Scheme's concept of "real" more closely corresponds to the mathematical concept of "hyperreal", which includes infinities, although NaN doesn't fit. Scheme's "complex" is similarly extended relative to the mathematical complex numbers, but the mathematical term "hypercomplex" unfortunately refers to something quite different (quaternions and the like). -zefram
bug#16360: "guild help COMMAND" crashes
"guild help COMMAND" crashes for most existing guild subcommands. For example: $ GUILE=/usr/bin/guile-2.0 guild help frisk Usage: guild frisk OPTION... Show dependency information for a module. Backtrace: In ice-9/boot-9.scm: 157: 8 [catch #t # ...] In unknown file: ?: 7 [apply-smob/1 #] In ice-9/boot-9.scm: 63: 6 [call-with-prompt prompt0 ...] In ice-9/eval.scm: 432: 5 [eval # #] In /usr/bin/guild: 74: 4 [main ("/usr/bin/guild" "help" "frisk")] In scripts/help.scm: 181: 3 [main "frisk"] 155: 2 [show-help # #] In ice-9/boot-9.scm: 788: 1 [call-with-input-file #f ...] In unknown file: ?: 0 [open-file #f "r" #:encoding #f #:guess-encoding #f] ERROR: In procedure open-file: ERROR: Wrong type (expecting string): #f $ This is guile-2.0.9 on Debian. Debian incarnation of this bug report: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=734314 -zefram
bug#16365: (* 0 +inf.0) rationale is flawed
Commit 5e7918077a4015768a352ab19e4a8e94531bc8aa says:

    A note on the rationale for (* 0 +inf.0) being a NaN and not exact 0:
    The R6RS requires that (/ 0 0.0) return a NaN value, and that (/ 0.0)
    return +inf.0.  We would like (/ x y) to be the same as (* x (/ y)),

This identity doesn't actually hold. For example, on guile 2.0.9 with IEEE double flonums:

scheme@(guile-user)> (/ (expt 2.0 -20) (expt 2.0 -1026))
$36 = 6.857655085992111e302
scheme@(guile-user)> (* (expt 2.0 -20) (/ (expt 2.0 -1026)))
$37 = +inf.0

This case arises because the dynamic range of this flonum format is slightly asymmetric: 2^-1026 is representable, but 2^1026 overflows. So the rationale for (* 0 +inf.0) yielding +nan.0 is flawed.

As the supposed invariant and the rationale are not in the actual documentation (they are only mentioned in the commit log), this is not necessarily a bug. But it is worth thinking again about whether the case for adopting the flonum behaviour here is still stronger than the obvious case for the exact zero to predominate. (Mathematically, multiplying zero by an infinite number does yield zero; let alone multiplying it by a merely large finite number, which is what the flonum indefinite `infinity' really represents.)

-zefram
bug#16362: compiler disrespects referential integrity
The guile-2.0.9 compiler doesn't preserve the distinctness of mutable objects that are referenced in code via the read-eval (#.) facility. (I'm not mutating the code itself, only quoted objects.) The interpreter, and for comparison guile-1.8, do preserve object identity, allowing read-eval to be used to incorporate direct object references into code. Test case:

$ cat t9
(cond-expand (guile-2 (defmacro compile-time f `(eval-when (compile eval) ,@f)))
             (else (defmacro compile-time f `(begin ,@f))))
(compile-time (fluid-set! read-eval? #t))
(compile-time (define aaa (cons 1 2)))
(set-car! '#.aaa 5)
(write '#.aaa)
(newline)
(write '(1 . 2))
(newline)
$ guile-1.8 t9
(5 . 2)
(1 . 2)
$ guile-2.0 --no-auto-compile t9
(5 . 2)
(1 . 2)
$ guile-2.0 t9
;;; note: auto-compilation is enabled, set GUILE_AUTO_COMPILE=0
;;; or pass the --no-auto-compile argument to disable.
;;; compiling /home/zefram/usr/guile/t9
;;; compiled /home/zefram/.cache/guile/ccache/2.0-LE-8-2.0/home/zefram/usr/guile/t9.go
(5 . 2)
(5 . 2)
$ guile-2.0 t9
(5 . 2)
(5 . 2)

In the test case, the explicitly-constructed pair aaa is conflated with the pair literal (1 . 2), and so the runtime modification of aaa (which is correctly mutable) affects the literal.

This issue seems closely related to the problem described at <http://debbugs.gnu.org/cgi/bugreport.cgi?bug=11198>, wherein the compiler is entirely unable to handle code incorporating references to some kinds of object. In that case the failure mode is a compile-time error, so the problem can be worked around. The failure mode with pairs, silent misbehaviour, is a more serious problem. Between them, these problems break most of the interesting uses for read-eval, albeit only when using the compiler.

Debian incarnation of this bug report:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=734157

-zefram
bug#16362: compiler disrespects referential integrity
Mark H Weaver wrote:
>I'm sorry that you've written code that assumes that this is allowed,
>but in Scheme all literals are immutable.

It's not a literal: the object was not constructed by the action of the reader. It was constructed by non-literal means, and merely *passed through* the reader.

That's not to say your not-a-bug opinion is wrong, though. Scheme as defined by RnRS certainly doesn't support this kind of thing. It treats the print form of an expression as primary, and so doesn't like having anything unprintable in the object form.

>It worked by accident in Guile 1.8,

This is the bit that's really news to me. *Scheme* doesn't support it, but *Guile* is more than just Scheme, and I presumed that it was intentional that it took a more enlightened view of what constitutes an expression. If that was just an accident, then what you actually support ought to be documented. In principle it would also be a good idea to enforce this restriction in the interpreter, to avoid having this incompatibility between interpreter and compiler of the `same' implementation.

>but there's simply no way to support
>this robustly in an ahead-of-time compiler, which must serialize all
>literals to an object file.

Sure there is. The object in question is eminently serialisable: it contains only references to other serialisable data. All that needs to change is to distinguish between actual literal pairs (that can be merged) and non-literals whose distinct identity needs to be preserved. This might well be painful to add to your existing code, given the way you represent pairs. But that's a difficulty with the specific implementation, not an inherent limitation of compilation.

-zefram
bug#16363: interactive use subject to compiler limitations
Mark H Weaver wrote:
>that all code and literals be serialized, there's no sane way to support
>the semantics you seem to want.

We've addressed the semantics themselves on the other ticket, #16362. Accepting that the compiler semantics are preferred, there's still a problem within the scope of my intent for this ticket, #16363: the interactive behaviour doesn't match the behaviour of a script. The mismatch is a problem for development regardless of which set of semantics is correct.

As I mentioned in passing on the other ticket, you could fix this by enforcing the compiler restrictions in interpreting situations. A start on this would be for read-eval to refuse to accept any object without a readable print form, such as the procedure in my example on this ticket. For objects that do have a readable print form, such as the pair in #16362, it could break the referential identity by copying the object, as if by printing it to characters and reading it back.

If, on the other hand, you actually intend for the compiler and interpreter to have visibly different semantics, there's still the problem that the REPL approaches that difference in a different way from script execution. In that case, either the REPL should perform the same fallback that script execution does (as I originally suggested on this ticket), or script execution should not perform the fallback.

-zefram
bug#16362: compiler disrespects referential integrity
Mark H Weaver wrote:
>In Scheme terminology, an expression of the form (quote <datum>) is a
>literal.

Ah, sorry, I see your usage now. R6RS speaks of that kind of expression being a "literal expression". (Elsewhere it uses "literal" in the sense I was using it, referring to the readable representation of an object.) Section 5.10 "Storage model" says "It is desirable for constants (i.e. the values of literal expressions) to reside in read-only memory.". So in the Scheme model, whatever that <datum> in the expression is, it's a "constant". Of course, that's in the RnRS view of expressions that ignores the homoiconic representation. It's assuming that these "constants" will always be "literal" in the sense I was using.

>Where does it say in the documentation that this is allowed?

It doesn't: as far as I can see it doesn't document that aspect of the language at all. It would be nice if it did.

>To my mind, Guile documents itself as Scheme plus extensions,

I thought the documentation was attempting to document the language that Guile implements per se. It doesn't generally just refer to RnRS for the language definition; it actually tells you most of what it could have referred to RnRS for. For example, it fully describes tail recursion, without any reference to RnRS. It's good that it does this, and it would be good for it to be more complete in areas such as this where it's lacking. So maybe I got the wrong impression of the documentation's role.

As the documentation doesn't describe expressions in the RnRS character-based way, I got the impression that Guile had not necessarily adopted that restriction. As it doesn't describe expressions in the homoiconic way either, I interpreted it as silent on the issue, making experimentation appropriate to determine the intent. Maybe the documentation should have a note about its relationship to the Scheme language definition: say which things it tries to be authoritative about.

>cannot determine what extensions you can depend on by experiment.

Fair point, and I'm not bitter about my experiment turning out to have this limited applicability.

>Consider this: you serialize an object to one file, and then the same
>object to a second file.  Now you load them both in from a different
>Guile session.  How can the Guile loader know whether these two objects
>should have the same identity or be distinct?

That's an interesting case, and I suppose I wouldn't expect that to preserve identity. I also wouldn't expect you to serialise an I/O port. But the case I'm concerned about is a standalone script, being compiled as a whole, and the objects it's setting up at compile time are made of ordinary data. I think some of our difference of opinion here comes because you're mainly thinking of the compiler as something to apply to modules, so you expect to deal with many compiled files in one session, whereas I'm thinking about compilation of a program as a whole. Your viewpoint is the more general.

>For example, how do you correctly serialize a procedure produced by
>make-counter?

Assuming we're only serialising it to one file, it shouldn't be any more difficult than my test case with a mutable pair. The procedure object needs to contain a reference to the body expression and a reference to the lexical environment that it closed over. The lexical environment contains the binding of the symbol "n" to a variable, which contains some current numeric value. That variable is the basic mutable item whose identity needs to be maintained through serialisation.
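(For concreteness, make-counter here is assumed to be the usual closure-returning counter, along these lines, rather than anything more exotic:

(define (make-counter)
  (let ((n 0))
    (lambda ()
      (set! n (+ n 1))
      n)))

Each call to make-counter creates a fresh binding of n, and the returned procedure closes over it.)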
If we have multiple procedures generated by make-counter, they'll have distinct variables, and therefore distinct lexical environments, and therefore be distinct procedures, though they'll share bodies.

The only part of this that looks at all difficult to me is that you may have compiled the function body down to VM code, which is not exactly a normal Lisp object and needs its own serialisation arrangements. Presumably you already have that solved in order to compile code that contains function definitions. Aside from that it's all ordinary Lisp objects that look totally serialisable. What do you think is the difficult part?

-zefram
bug#16464: + folding differs between compiler and interpreter
The + procedure left-folds its arguments in interpreted code and right-folds its arguments in compiled code. This may or may not be a bug.

Obviously, with exact numbers the direction of folding makes no difference. But the difference is easily seen with flonums, as flonum addition is necessarily non-associative. For example, where flonums are IEEE doubles:

scheme@(guile-user)> ,o interp #f
scheme@(guile-user)> (+ 1.0 (expt 2.0 -53) (expt 2.0 -53))
$1 = 1.0000000000000002
scheme@(guile-user)> (+ (expt 2.0 -53) (expt 2.0 -53) 1.0)
$2 = 1.0
scheme@(guile-user)> ,o interp #t
scheme@(guile-user)> (+ 1.0 (expt 2.0 -53) (expt 2.0 -53))
$3 = 1.0
scheme@(guile-user)> (+ (expt 2.0 -53) (expt 2.0 -53) 1.0)
$4 = 1.0000000000000002

Compiler and interpreter agree when the order of operations is explicitly specified:

scheme@(guile-user)> (+ (+ 1.0 (expt 2.0 -53)) (expt 2.0 -53))
$5 = 1.0
scheme@(guile-user)> (+ 1.0 (+ (expt 2.0 -53) (expt 2.0 -53)))
$6 = 1.0000000000000002

If your flonums are not IEEE double then the exponent in the test case has to be adapted.

R5RS and the Guile documentation are both silent about the order of operations in cases like this. I do not regard either left-folding or right-folding per se as a bug. A portable Scheme program obviously can't rely on a particular behaviour. My concern here is that the compiler and interpreter don't match, making program behaviour inconsistent on what is notionally a single implementation. That mismatch may be a bug. I'm not aware of any statement either way on whether you regard such mismatches as bugs. (An explicit statement in the documentation would be most welcome.)

R6RS does have some guidance about the proper behaviour here. The description of the generic arithmetic operators doesn't go into such detail, just describing them as generic. It can be read as implying that the behaviour on flonums should match the behaviour of the flonum-specific fl+. The description of fl+ (libraries section 11.3 "Flonums") says it "should return the flonum that best approximates the mathematical sum". That suggests that it shouldn't use a fixed sequence of dyadic addition operations, and in my test case should return 1.0000000000000002 regardless of the order of operands. Obviously that's more difficult to achieve than just folding the argument list with dyadic addition.

Interestingly, fl+'s actual behaviour differs both from + and from the R6RS ideal. It left-folds in both compiled and interpreted code:

scheme@(guile-user)> (import (rnrs arithmetic flonums (6)))
scheme@(guile-user)> ,o interp #f
scheme@(guile-user)> (fl+ 1.0 (expt 2.0 -53) (expt 2.0 -53))
$7 = 1.0
scheme@(guile-user)> (fl+ (expt 2.0 -53) (expt 2.0 -53) 1.0)
$8 = 1.0000000000000002
scheme@(guile-user)> ,o interp #t
scheme@(guile-user)> (fl+ 1.0 (expt 2.0 -53) (expt 2.0 -53))
$9 = 1.0
scheme@(guile-user)> (fl+ (expt 2.0 -53) (expt 2.0 -53) 1.0)
$10 = 1.0000000000000002

fl+'s behaviour is not a bug. The R6RS ideal is clearly not mandatory, and the Guile documentation makes no stronger claim than that its fl+ conforms to R6RS. As it is consistent between compiler and interpreter, it is not subject to the concern that I'm raising in this ticket about the generic +.

-zefram
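P.S. As an illustration of that R6RS reading (a sketch only, not a proposal for how fl+ is actually implemented): summing the exact equivalents and rounding once at the end is insensitive to operand order.

(define (fl-sum . args)
  ;; Sum exactly, then round to a flonum once.
  (exact->inexact (apply + (map inexact->exact args))))

(fl-sum 1.0 (expt 2.0 -53) (expt 2.0 -53))  ; => 1.0000000000000002
(fl-sum (expt 2.0 -53) (expt 2.0 -53) 1.0)  ; => 1.0000000000000002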
bug#16364: auto-compile noise can't be avoided by script
Ludovic Courtes wrote:
>However, you can set the environment variable GUILE_AUTO_COMPILE=0.
>
>Do you think that would solve the problem?

It does not solve the problem. Firstly, it can't be done from the #! line at all, so the script can't do it early enough. It only works if it's already been set by the user, which is no good for what should be an internal detail of the program.

Secondly, it suffers the second problem that I noted with --no-auto-compile: if there's already a cached compilation then that'll be looked at, and if it's out of date then a "newer than" banner is emitted. With the environment variable set the cached version will never be updated, nor will it be deleted, so the banner then appears on every execution.

-zefram
bug#16361: compile cache confused about file identity
Mark H Weaver wrote:
>You could make the same complaint about 'make', 'rsync', or any number
>of other programs.

Not really. make does use this type of freshness check, but it's used in a specific situation where the freshness issue is immediately obvious and is part of the program's visible primary concern. That's quite unlike guile's compile cache, which as the name suggests is a cache. It's meant to be unobtrusive, and the cache semantics are not a direct part of the transaction that is ostensibly taking place, of running a program that happens to be written in Scheme. Those circumstances, of running an arbitrary program, are much broader than the circumstances in which make's freshness checks become relevant. make also gets a pass from having always worked this way, whereas guile used to not cache compilations. rsync, by contrast, does not use this type of freshness checking; I believe it uses a hash mechanism.

>It's true that a cryptographic hash would be more
>robust, but it would also be considerably more expensive in the common
>case where the .go file is already in the cache.
>
>I don't think it's worth paying this cost every time

OK, you can rule that suggestion out, but I think you have erred in jumping from that to wontfix on the general problem. You have not addressed my prior suggestion of identifying programs by exact match on device, inode number, and mtime. (File size could also be included.) This freshness check is very cheap, because it's just a few fixed-size fields from the stat structure, and you're already necessarily doing a stat on the program file. Using the identifying fields as the cache key even saves you a stat on the cached file. Although not quite as effective as a hash comparison, it would be a huge practical improvement over the current filename-and-inexact-mtime comparison.

-zefram
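P.S. To make the stat-based suggestion concrete, the key I have in mind amounts to something like this (a sketch only, using Guile's standard stat accessors, not integrated with the real cache code):

;; Identify a program file by fixed-size fields of its stat structure,
;; rather than by pathname and inexact mtime.
(define (source-identity-key filename)
  (let ((st (stat filename)))
    (list (stat:dev st) (stat:ino st) (stat:mtime st) (stat:size st))))

Two names for the same file yield equal keys, and a different file moved into place under the same name yields a different key.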
bug#20822: environment mangled by locale
When guile-2.0 is asked to read environment variables, via getenv, it always decodes the underlying octet string according to the current locale's nominal character encoding. This is a problem, because the environment variable's value is not necessarily encoded that way, and may not even be an encoding of a character string at all. The decoding is lossy, where the octet string isn't consistent with the character encoding, so the original octet string cannot be recovered from the mangled form. I don't see any Scheme interface that retrieves the environment without locale decoding.

The decoding is governed by the currently selected locale at the time that getenv is called, so this can be controlled to some extent by setlocale. However, this doesn't provide a way round the lossy decoding problem, because there is no guarantee of a cooperative locale being available (and especially being available under a predictable name). On my Debian system here, the "POSIX" and "C" locales' nominal character encoding is ASCII, so decoding under these locales results in all high-half octets being turned into question marks. Retrieving environment without calling setlocale at all also yields this lossy ASCII decode. Demos:

$ env - FOO=$'L\xc3\xa9on' guile-2.0 -c '(write (map char->integer (string->list (getenv "FOO")))) (newline)'
(76 63 63 111 110)
$ env - FOO=$'L\xc3\xa9on' guile-2.0 -c '(setlocale LC_ALL "POSIX") (write (map char->integer (string->list (getenv "FOO")))) (newline)'
(76 63 63 111 110)
$ env - FOO=$'L\xc3\xa9on' guile-2.0 -c '(setlocale LC_ALL "de_DE.utf8") (write (map char->integer (string->list (getenv "FOO")))) (newline)'
(76 233 111 110)
$ env - FOO=$'L\xc3\xa9on' guile-2.0 -c '(setlocale LC_ALL "de_DE.iso88591") (write (map char->integer (string->list (getenv "FOO")))) (newline)'
(76 195 169 111 110)

The actual data passed between processes is an octet string, and there really needs to be some reliable way to access that octet string.

There's an obvious parallel with reading data from an input port. If setlocale is called, then input is by default decoded according to locale, including the very lossy ASCII decode for C/POSIX. But if setlocale has not been called, then input is by default decoded according to ISO-8859-1, preserving the actual octets. It would probably be most sensible that, if setlocale hasn't been called, getenv should likewise decode according to ISO-8859-1. It might also be sensible to offer some explicit control over the encoding to be used with the environment, just as I/O ports have a concept of per-port selected encoding.

The same issue applies to other environment access functions too. For setenv the corresponding problem is the inability to *write* an arbitrary octet string to an environment variable. Obviously all the functions should have mutually consistent behaviour.

-zefram
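P.S. If the suggested ISO-8859-1 default were adopted, a program needing the raw octets could recover them along these lines (a sketch; it only works under that suggested decoding, not under the current locale-dependent one):

(use-modules (ice-9 iconv))
;; Re-encode the Latin-1-decoded value to get the original octets back.
(define (getenv-octets name)
  (let ((value (getenv name)))
    (and value (string->bytevector value "ISO-8859-1"))))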
bug#20823: argv mangled by locale
When guile-2.0 stores argv for later access via program-arguments, it sometimes decodes the underlying octet string according to the nominal character encoding of the locale suggested by the environment. This is a problem, because the arguments are not necessarily encoded that way, and may not even be encodings of character strings at all. The decoding is lossy, where the octet string isn't consistent with the character encoding, so the original octet string cannot be recovered from the mangled form. I don't see any Scheme interface that reliably retrieves the command line arguments without locale decoding.

The decoding doesn't follow the usual rules for locale control. It is not at all sensitive to setlocale, which is understandable due to the arguments being acquired before any of the actual program's code runs. Empirically, if the environment nominates no locale, "POSIX", or a non-existent locale, then argv is decoded according to ISO-8859-1, thus preserving the octets. If the environment nominates an extant locale other than "POSIX", then argv is decoded according to that locale's nominal character encoding. Demos:

$ env - guile-2.0 -c '(write (map char->integer (string->list (cadr (program-arguments))))) (newline)' $'L\xc3\xa9on'
(76 195 169 111 110)
$ env - LANG=C guile-2.0 -c '(write (map char->integer (string->list (cadr (program-arguments))))) (newline)' $'L\xc3\xa9on'
(76 63 63 111 110)
$ env - LANG=de_DE.utf8 guile-2.0 -c '(write (map char->integer (string->list (cadr (program-arguments))))) (newline)' $'L\xc3\xa9on'
(76 233 111 110)
$ env - LANG=de_DE.iso88591 guile-2.0 -c '(write (map char->integer (string->list (cadr (program-arguments))))) (newline)' $'L\xc3\xa9on'
(76 195 169 111 110)

The actual data passed between processes is an octet string, and there really needs to be some reliable way to access that octet string. My comments about resolution in bug#20822 "environment mangled by locale" mostly apply here too, with a slight change: it seems necessary to store the original octet strings and decode at the time program-arguments is called. With that change, the decoding can be responsive to setlocale (and in particular can reliably use ISO-8859-1 in the absence of setlocale).

-zefram
bug#21883: unnecessary bit shifting range limits
Not really outright bugs, but these responses are less than awesome:

$ guile -c '(write (logbit? (ash 1 100) 123))'
ERROR: Value out of range 0 to 18446744073709551615: 1267650600228229401496703205376
$ guile -c '(write (ash 0 (ash 1 100)))'
ERROR: Value out of range -9223372036854775808 to 9223372036854775807: 1267650600228229401496703205376
$ guile -c '(write (ash 123 (ash -1 100)))'
ERROR: Value out of range -9223372036854775808 to 9223372036854775807: -1267650600228229401496703205376

In all three cases, the theoretically-correct result of the expression is not only representable but easily computed. The functions could be improved to avoid failing in these cases, by adding logic amounting to:

(define (better-logbit? b v)
  (if (>= b (integer-length v))
      (< v 0)
      (logbit? b v)))

(define (better-ash v s)
  (cond ((= v 0) 0)
        ((<= s (- (integer-length v))) (if (< v 0) -1 0))
        (else (ash v s))))

-zefram
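P.S. For the record, with those definitions the three examples above would come out as follows (values worked out by hand, not obtained from a patched guile):

(better-logbit? (ash 1 100) 123)  => #f
(better-ash 0 (ash 1 100))        => 0
(better-ash 123 (ash -1 100))     => 0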
bug#21894: escape continuation doc wrong about reinvokability
The manual says

# Escape continuations are delimited continuations whose
# only use is to make a non-local exit--i.e., to escape from the current
# continuation.  Such continuations are invoked only once, and for this
# reason they are sometimes called "one-shot continuations".

O RLY?

scheme@(guile-user)> (use-modules (ice-9 control))
scheme@(guile-user)> (define cc #f)
scheme@(guile-user)> (list 'a (let/ec e (list 'b (e (call-with-current-continuation (lambda (c) (set! cc c) 0))))))
$1 = (a 0)
scheme@(guile-user)> (cc 1)
$2 = (a 1)
scheme@(guile-user)> (cc 2)
$3 = (a 2)

Clearly I have invoked this escape continuation, successfully, more than once. The semantics here are perfectly sensible; it's just the documentation that's off the mark, because it ignores how escape continuations interact with other kinds of continuation.

I suggest changing the "Such continuations are invoked only once" sentence to something like:

    Such continuations can only be invoked from within the dynamic extent
    of the call to which they will jump.  Because the jump ends that
    extent, if escape continuations are the only kind of continuations
    being used it is only possible to invoke an escape continuation at
    most once.  For this reason they are sometimes called "one-shot
    continuations", but that is a misnomer when other kinds of
    continuations are also in use.  Most kinds can reinstate a dynamic
    extent that has been exited, and if the extent of an escape
    continuation is reinstated then it can be invoked again to exit that
    extent again.  Conversely, an escape continuation cannot be invoked
    from a separate thread that has its own dynamic state not including
    the continuation's extent, even if the continuation's extent is
    still in progress in its original thread and the continuation has
    never been invoked.

-zefram
bug#21897: escape continuation passes barrier
scheme@(guile-user)> (use-modules (ice-9 control))
scheme@(guile-user)> (call/ec (lambda (c) (with-continuation-barrier (lambda () (c "through continuation"))) "c-w-b returned"))
$1 = "through continuation"

The continuation barrier works fine on call/cc continuations and on throw/catch, but doesn't block call/ec continuations. The manual doesn't mention any difference in behaviour for this case, nor can I see any obvious justification for it. The manual's statement that

# Thus, `with-continuation-barrier' returns exactly once.

is false in this case. I think a continuation barrier should block the use of the call/ec continuation.

-zefram
bug#21899: let/ec continuations not distinct under compiler
With guile 2.0.11:

scheme@(guile-user)> (use-modules (ice-9 control))
scheme@(guile-user)> (list 'a (let/ec ae (list 'b (let/ec be (be 2)))))
$1 = (a (b 2))
scheme@(guile-user)> (list 'a (let/ec ae (list 'b (let/ec be (ae 2)))))
$2 = (a (b 2))
scheme@(guile-user)> (list 'a (let/ec ae (list 'b (ae 2))))
$3 = (a 2)

The middle of these three cases is wrong: it attempts to invoke the outer escape continuation, but only goes as far as the target of the inner one, which it isn't using. It therefore produces the same result as the first case, which invokes the inner escape continuation. It ought to behave like the third case, which shows that the outer escape continuation can be successfully invoked when the unused inner continuation is not present.

The problem only affects let/ec, *not* call/ec:

scheme@(guile-user)> (list 'a (call/ec (lambda (ae) (list 'b (call/ec (lambda (be) (be 2)))))))
$4 = (a (b 2))
scheme@(guile-user)> (list 'a (call/ec (lambda (ae) (list 'b (call/ec (lambda (be) (ae 2)))))))
$5 = (a 2)
scheme@(guile-user)> (list 'a (call/ec (lambda (ae) (list 'b (ae 2)))))
$6 = (a 2)

It also only happens when compiling, not when interpreting:

scheme@(guile-user)> ,o interp #t
scheme@(guile-user)> (list 'a (let/ec ae (list 'b (let/ec be (be 2)))))
$7 = (a (b 2))
scheme@(guile-user)> (list 'a (let/ec ae (list 'b (let/ec be (ae 2)))))
$8 = (a 2)
scheme@(guile-user)> (list 'a (let/ec ae (list 'b (ae 2))))
$9 = (a 2)

-zefram
bug#21900: map is not continuation-safe
With Guile 2.0.11:

scheme@(guile-user)> (define cc #f)
scheme@(guile-user)> (map (lambda (v) (if (= v 0) (call/cc (lambda (c) (set! cc c) 0)) (+ v 1))) '(10 20 30 0 40 50 60))
$1 = (11 21 31 0 41 51 61)
scheme@(guile-user)> (cc 5)
$2 = (61 51 41 0 31 5 41 51 61)

It worked correctly in Guile 1.8.

-zefram
bug#21901: bit shift wrong on maximal right shift
With Guile 2.0.11:

scheme@(guile-user)> (ash 123 (ash -1 63))
$1 = 123

Correct result would of course be zero. Problem only occurs for exactly this shift distance: one bit less produces the right answer. Problem also occurs on Guile 1.8.8. Looking at the implementation, the problem is attributable to the negation of the shift distance, which in twos-complement fails to produce the expected positive result.

Note the resemblance to bug #14864, fixed in 2.0.10. This bug is of very similar form, but is distinct. The test cases of #14864 pass for me on the 2.0.11 that shows the problem with a 2^63 bit shift. My bug does occur with the rnrs bitwise-arithmetic-shift-right, which was used in #14864, as well as with ash.

-zefram
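P.S. To spell out the twos-complement wraparound in Scheme terms (an illustrative model of the C-level negation, not the actual implementation):

;; Model negation of a value held in a 64-bit twos-complement word.
(define (negate64 n)
  (let ((m (modulo (- n) (expt 2 64))))
    (if (>= m (expt 2 63)) (- m (expt 2 64)) m)))

(negate64 (ash -1 63))  ; => -9223372036854775808, not +2^63

The negated shift distance comes out unchanged, which is consistent with the unshifted result seen above.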
bug#21902: doc incorrectly describes Julian Date
The manual says, in the section "SRFI-19 Introduction",

# Also, for those not familiar with the terminology, a "Julian Day" is
# a real number which is a count of days and fraction of a day, in UTC,
# starting from -4713-01-01T12:00:00Z, ie. midday Monday 1 Jan 4713 B.C.

There are two errors in the first statement of the epoch for Julian Date, the one in ISO 8601 format.

The JD epoch is noon on 1 January 4713 BC *in the proleptic Julian calendar*. The ISO 8601 format is properly never used with the Julian calendar: ISO 8601 specifies the use of the Gregorian calendar, including proleptically where necessary (as it most certainly is here). On the proleptic Gregorian calendar, the JD epoch is noon on 24 November 4714 BC, and so the ISO 8601 expression should contain "-11-24".

The second error is in how the year is expressed in ISO 8601. The initial "-" does not mean the BC era; it means that the year number is negative. ISO 8601 specifies that the AD era is always used, with year numbers going negative where necessary; this arrangement is commonly known as "astronomical year numbering". So "0000" means 1 BC, "-0001" means 2 BC, and "-4713" means 4714 BC. So the "-4713" is not correct for the attempted expression of the Julian calendar date, but happens to be correct for the Gregorian calendar date.

Putting it together, a correct ISO 8601 expression for the Julian Date epoch is "-4713-11-24T12:00:00Z".

The word-based statement of the JD epoch is correct as far as it goes, but would benefit considerably from the addition of a clause stating that it is in the proleptic Julian calendar. (Generally, a clarification of which calendar is being used is helpful with the statement of any date prior to the UK's switch of calendar in 1752.)

The description of Modified Julian Date is essentially correct. However, there's a third problem: misuse of the term "UTC" for historical times. The description of Julian Date says it's counted "in UTC", and the statement of the MJD epoch describes its 1858 time as being specified in UTC. UTC is defined entirely by its relationship to TAI, which is defined by the operation of atomic clocks. TAI is therefore only defined for the period since the operation of the first caesium atomic clock in the middle of 1955. The UTC<->TAI relationship isn't actually defined even that far back: UTC begins at the beginning of 1961 (and that was not in the modern form with leap seconds). It is therefore incorrect to apply the term "UTC" to any time prior to 1961. These two references to UTC should instead be to "UT", the wider class of closely-matching time scales of which UTC is one representative.

Also, in the first sentence of this doc section, the phrase "universal time (UTC)" should be either "universal time (UT)" or (more likely) "coordinated universal time (UTC)".

-zefram
bug#21903: date->string duff ISO 8601 negative years
The date->string function from (srfi srfi-19), used on ISO 8601 formats "~1", "~4" and "~5", for years preceding AD 1, has an off-by-one error:

scheme@(guile-user)> (use-modules (srfi srfi-19))
scheme@(guile-user)> (date->string (julian-day->date 0 0) "~4")
$1 = "-4714-11-24T12:00:00Z"

The date in question, the JD epoch, is 24 November 4714 BC (in the proleptic Gregorian calendar). In ISO 8601 format, that year is properly represented as "-4713", not "-4714", because ISO 8601 uses the AD era exclusively. 4714 BC = AD -4713.

-zefram
bug#21904: date->string duff ISO 8601 format for non-4-digit years
The date->string function from (srfi srfi-19), used on ISO 8601 formats "~1", "~4", and "~5", gets the formatting of year numbers wrong when the year number doesn't have exactly four digits. There are multiple cases:

scheme@(guile-user)> (date->string (julian-day->date 1500000 0) "~1")
$1 = "-607-10-04"
scheme@(guile-user)> (date->string (julian-day->date 1700000 0) "~1")
$2 = "-59-05-05"
scheme@(guile-user)> (date->string (julian-day->date 1720000 0) "~1")
$3 = "-4-02-05"

For year numbers -999 to -1 inclusive, date->string is using the minimum number of digits to express the number, but ISO 8601 requires the use of at least four digits, with zero padding on the left. So one should write "-0059" rather than "-59", for example. Note that this range is also affected by the off-by-one error in the selection of the year number that I described in bug #21903, but that's not the subject of the present bug report. Here I'm concerned with how the number is represented in characters, not with how the year is represented numerically.

scheme@(guile-user)> (date->string (julian-day->date 1722000 0) "~1")
$4 = "2-07-29"
scheme@(guile-user)> (date->string (julian-day->date 1730000 0) "~1")
$5 = "24-06-23"
scheme@(guile-user)> (date->string (julian-day->date 2000000 0) "~1")
$6 = "763-09-18"

For year numbers 1 to 999 inclusive, again date->string is using the minimum number of digits to express the number, but ISO 8601 requires the use of at least four digits. If no leading "+" sign is used then the number must be exactly four digits, and that is the appropriate format to use in this situation. So one should write "0024" rather than "24", for example. The year number 0, representing the year 1 BC, logically also falls into this group, and should be represented textually as "0000". Currently this case doesn't arise in the function's output, because the off-by-one bug has it erroneously emit "-1" for that year.

scheme@(guile-user)> (date->string (julian-day->date 10000000 0) "~1")
$7 = "22666-12-20"
scheme@(guile-user)> (date->string (julian-day->date 100000000 0) "~1")
$8 = "269078-08-07"

For year numbers 10000 and above, it is necessary to use more than four digits for the year, and that's permitted, but ISO 8601 requires that more than four digits be preceded by a sign. For positive year numbers the sign must be "+". So one should write "+22666" rather than "22666", for example.

The formatting of year numbers for ISO 8601 purposes is currently only correct for numbers -1000 and lower (though the choice of number is off by one) and for year numbers 1000 to 9999 inclusive.

-zefram
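P.S. For concreteness, the formatting rule described above amounts to something like this (a sketch only; it deliberately ignores the separate year-numbering off-by-one of bug #21903):

(define (iso8601-year-string year)
  ;; At least four digits, zero-padded; a sign is mandatory for negative
  ;; years and for any year needing more than four digits.
  (let* ((mag (number->string (abs year)))
         (padded (if (< (string-length mag) 4)
                     (string-append (make-string (- 4 (string-length mag)) #\0) mag)
                     mag)))
    (cond ((negative? year) (string-append "-" padded))
          ((> (string-length padded) 4) (string-append "+" padded))
          (else padded))))

So (iso8601-year-string 24) => "0024", (iso8601-year-string -59) => "-0059", (iso8601-year-string 0) => "0000", and (iso8601-year-string 22666) => "+22666".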
bug#21906: julian-day->date negative input breakage
scheme@(guile-user)> (use-modules (srfi srfi-19))
scheme@(guile-user)> (julian-day->date 0 0)
$1 = #
scheme@(guile-user)> (julian-day->date -1 0)
$2 = #
scheme@(guile-user)> (julian-day->date -10 0)
$3 = #
scheme@(guile-user)> (julian-day->date -1000 0)
$4 = #

Observe the various erroneous field values: negative hour, negative day-of-month, zero month. These occur in general for various negative JD inputs. Not only should the conversion not produce these kinds of values, the date structure type probably ought to reject them if they get that far.

-zefram
bug#21907: date->string duff ISO 8601 zone format
scheme@(guile-user)> (use-modules (srfi srfi-19))
scheme@(guile-user)> (date->string (julian-day->date 2450000 3600) "~4")
$1 = "1995-10-09T13:00:00+0100"
scheme@(guile-user)> (date->string (julian-day->date 2450000 3630) "~4")
$2 = "1995-10-09T13:00:30+0100"

There are two problems here with date->string's representation of zone offsets for the ISO 8601 formats "~2" and "~4".

Firstly, because the time-of-day is represented in the extended format with colon separators, the zone offset must also be represented with colon separators. So the first "+0100" above should be "+01:00".

Secondly, the offset is being truncated to an integral minute, so the output doesn't fully represent the zone offset. More importantly, because the local time-of-day isn't being adjusted to match, it's not accurately representing the point in time. ISO 8601 doesn't permit a seconds component in the zone offset, so you have a choice of three not-entirely-satisfactory options. Firstly, you could round the zone offset and adjust the represented local time accordingly, so the 3630 conversion above would yield either "1995-10-09T13:00:00+01:00" or "1995-10-09T13:01:00+01:01". Secondly, you could use the obvious extension of the ISO 8601 format to a seconds component, outputting "1995-10-09T13:00:30+01:00:30". Or finally you could signal an error when trying to represent a zone offset that's not an integral minute.

Incidentally, for offsets of -1 to -59 seconds inclusive, the truncation isn't clearing the negative sign, so is producing the invalid output "-0000". The zero offset is required to be represented with a "+" sign. If you take the rounding option described above, anything that rounds to a zero-minutes offset must yield "+00:00" in the output.

-zefram
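P.S. A sketch of the colon-separated offset formatting, taking the signal-an-error option for sub-minute offsets (illustrative only, not the actual srfi-19 code):

(define (iso8601-zone-string offset)
  ;; offset is in seconds east of UTC
  (if (not (zero? (remainder offset 60)))
      (error "zone offset is not an integral minute:" offset)
      (let* ((mins (quotient (abs offset) 60))
             (pad2 (lambda (n)
                     (if (< n 10)
                         (string-append "0" (number->string n))
                         (number->string n)))))
        (string-append (if (negative? offset) "-" "+")
                       (pad2 (quotient mins 60)) ":" (pad2 (remainder mins 60))))))

(iso8601-zone-string 3600) => "+01:00", (iso8601-zone-string 0) => "+00:00", and (iso8601-zone-string 3630) signals an error.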
bug#21911: TAI-to-UTC conversion leaps at wrong time
Probing the TAI-to-UTC conversion offered by srfi-19's time-tai->date, in the minutes around the leap second in 2012:

scheme@(guile-user)> (use-modules (srfi srfi-19))
scheme@(guile-user)> (for-each (lambda (d) (write (list d (date->string (time-tai->date (add-duration (julian-day->time-tai 2456109) (make-time time-duration 0 d)) 0) "~4"))) (newline)) (list 43000 43160 43164 43165 43166 43167 43199 43200 43201 43202))
(43000 "2012-06-30T23:56:40Z")
(43160 "2012-06-30T23:59:20Z")
(43164 "2012-06-30T23:59:24Z")
(43165 "2012-06-30T23:59:25Z")
(43166 "2012-06-30T23:59:25Z")
(43167 "2012-06-30T23:59:26Z")
(43199 "2012-06-30T23:59:58Z")
(43200 "2012-06-30T23:59:59Z")
(43201 "2012-06-30T23:59:60Z")
(43202 "2012-07-01T00:00:01Z")

The julian-day->time-tai conversion is correct (the JD refers to 2012-06-30T12:00:00 UTC, which is 2012-06-30T12:00:34 TAI), and the duration addition works in a perfectly regular manner in TAI space. All the interesting stuff happens in the TAI-to-UTC conversion, between the time-tai structure and the date structure. The same thing happens if the conversion is performed by separate time-tai->time-utc and time-utc->date calls. The date->string part is correct and uninteresting.

The conversion is initially correct, minutes before midnight, but a discontinuity is seen 35 seconds before midnight. Outputs from then up to one second after midnight are one second slow. At one second after midnight it recovers. Because 35 seconds happens to be the TAI-UTC difference prevailing immediately after this leap second, I suspect that this is down to a time_t value (as used in the time-utc structure) for the moment of the leap being misinterpreted as a time-tai seconds value.

The UTC-to-TAI conversion is in better shape. As a result, time-tai->time-utc and time-utc->time-tai are not inverses during the 35-second erroneous period. Round-tripping through the two conversions produces an output not matching the input.

-zefram
bug#21912: TAI<->UTC conversion botches the unknown
Probing the existence of leap seconds on particular days, via srfi-19's TAI-to-UTC conversion. The methodology here is to take noon UT on the day of interest, convert to TAI, add 86400 seconds, then convert to UTC and display. The resulting time of day is 11:59:59 if there is a leap second that day, and 12:00:00 if there is not. scheme@(guile-user)> (use-modules (srfi srfi-19)) scheme@(guile-user)> (date->string (time-tai->date (add-duration (julian-day->time-tai 2455743) (make-time time-duration 0 86400)) 0) "~4") $1 = "2011-07-01T12:00:00Z" scheme@(guile-user)> (date->string (time-tai->date (add-duration (julian-day->time-tai 2456109) (make-time time-duration 0 86400)) 0) "~4") $2 = "2012-07-01T11:59:59Z" scheme@(guile-user)> (date->string (time-tai->date (add-duration (julian-day->time-tai 2457204) (make-time time-duration 0 86400)) 0) "~4") $3 = "2015-07-01T12:00:00Z" scheme@(guile-user)> (date->string (time-tai->date (add-duration (julian-day->time-tai 2457935) (make-time time-duration 0 86400)) 0) "~4") $4 = "2017-07-01T12:00:00Z" For 2011-06-30 it is correct that there was not a leap second, and for 2012-06-30 it is correct that there was. But for 2015-06-30 it says there was not a leap second, when in fact there was. For 2017-06-30 it says there will not be a leap second, when in fact it is not yet determined whether there will be. Really both of these errors come from the same cause. At the time this Guile 2.0.11 was released, the leap status of 2015-06-30 had not yet been determined. Both 2015 and 2017 fall within the future period beyond the scope of this Guile's static leap second knowledge. The bug is not that Guile doesn't know that there was a leap second in 2015. As the 2017 case illustrates, it's impossible for it to know all the leap second scheduling about which it can be asked. The bug is that Guile *thinks* it knows about all future leap seconds. It specifically thinks that there will be no leaps at all beyond the historically-scheduled ones that it knows about. Guile ought to be aware of how far its leap table extends, and signal an error when asked to perform a TAI<->UTC conversion that falls outside its scope. -zefram
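The probing methodology described above can be wrapped up as a small
helper; a minimal sketch, under the same assumptions (the input is the
Julian Day of noon UT on the day of interest):

  (use-modules (srfi srfi-19))

  ;; Returns #t if the TAI-to-UTC conversion believes the day starting
  ;; at noon UT on Julian Day jd ended with a leap second: adding 86400
  ;; TAI seconds then lands at 11:59:59 rather than 12:00:00 on the
  ;; following day.
  (define (leap-second-day? jd)
    (let* ((noon (julian-day->time-tai jd))
           (next (add-duration noon (make-time time-duration 0 86400)))
           (d (time-tai->date next 0)))
      (= 59 (date-minute d))))

  (leap-second-day? 2456109)  ; 2012-06-30: expected #t
  (leap-second-day? 2455743)  ; 2011-06-30: expected #f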
bug#21915: write inconsistent about #nil
The write function is inconsistent about whether it distinguishes
between #nil and ():

scheme@(guile-user)> '(#nil . a)
$1 = (#nil . a)
scheme@(guile-user)> '(a . #nil)
$2 = (a)

The latter behaviour, emitting #nil as if it were (), breaks the usual
write/read round-tripping, and the traditional correspondence between
equal? and matching of written representation. Admittedly those
standards are not absolute, nor is the extent to which they're expected
to hold documented, but #nil is clearly sufficiently atomic to be the
kind of value to which one would expect them to apply. For these
reasons, if a consistent behaviour is to be chosen, I think it should
be to consistently distinguish the values.

I think the behaviour should be consistent. The values should be
distinguished or not without regard to the context in which they arise
within an s-expression. Whatever is done, even if it's to endorse the
inconsistency, the behaviour should be documented, with rationale.

-zefram
bug#22033: time-utc format is lossy
In SRFI-19, round-tripping some UTC dates through the time-utc structure format, for the couple of seconds around a leap second: scheme@(guile-user)> (use-modules (srfi srfi-19)) scheme@(guile-user)> (define (tdate d) (write (list (date->string d "~4") (date->string (time-utc->date (date->time-utc d) 0) "~4"))) (newline)) scheme@(guile-user)> (tdate (make-date 0 59 59 23 30 6 2012 0)) ("2012-06-30T23:59:59Z" "2012-06-30T23:59:59Z") scheme@(guile-user)> (tdate (make-date 0 60 59 23 30 6 2012 0)) ("2012-06-30T23:59:60Z" "2012-06-30T23:59:60Z") scheme@(guile-user)> (tdate (make-date 0 0 0 0 1 7 2012 0)) ("2012-07-01T00:00:00Z" "2012-06-30T23:59:60Z") scheme@(guile-user)> (tdate (make-date 0 1 0 0 1 7 2012 0)) ("2012-07-01T00:00:01Z" "2012-07-01T00:00:01Z") Observe that the second immediately following the leap second, the first second of the following UTC day, isn't round-tripped correctly. It comes back as the leap second. These two seconds are perfectly distinct parts of the UTC time scale, and the time-utc format ought to preserve their distinction. -zefram
bug#22034: time-utc->date shows bogus zone-dependent leap second
time-utc->date seems to think that a leap second occurs at a different time in each time zone: scheme@(guile-user)> (use-modules (srfi srfi-19)) scheme@(guile-user)> (define (tdate d) (write (list (date->string d "~4") (date->string (time-utc->date (date->time-utc d) 3600) "~4"))) (newline)) scheme@(guile-user)> (tdate (make-date 0 59 59 22 30 6 2012 0)) ("2012-06-30T22:59:59Z" "2012-06-30T23:59:59+0100") scheme@(guile-user)> (tdate (make-date 0 0 0 23 30 6 2012 0)) ("2012-06-30T23:00:00Z" "2012-06-30T23:59:60+0100") scheme@(guile-user)> (tdate (make-date 0 1 0 23 30 6 2012 0)) ("2012-06-30T23:00:01Z" "2012-07-01T00:00:01+0100") These are three consecutive seconds that occur an hour before a genuine leap second (at 23:59:60Z). Observe that time-utc->date, applied to the middle second, describes it as a leap second happening at 23:59:60+01:00, which is bogus. Describing the same seconds on input as a date structure with a non-zero zone offset produces the same wrong output, and requesting output with a different zone offset changes which second is affected. The faulty output is always 23:59:60 in the output zone. Matching up with this, the actual leap second is never correctly described with a non-zero zone offset. It should be, for example, 00:59:60+01:00. However, probing for this side of the problem also runs into the round-tripping failure that I described in bug#22033. -zefram
bug#22901: drain-input doesn't decode
The documentation for drain-input says that it returns a string of characters, implying that the result is equivalent to what you'd get from calling read-char some number of times. In fact it differs in a significant respect: whereas read-char decodes input octets according to the port's selected encoding, drain-input ignores the selected encoding and always decodes according to ISO-8859-1 (thus preserving the octet values in character form). $ echo -n $'\1a\2b\3c' | guile-2.0 -c '(set-port-encoding! (current-input-port) "UCS-2BE") (write (port-encoding (current-input-port))) (newline) (write (map char->integer (let r ((l '\''())) (let ((c (read-char (current-input-port (if (eof-object? c) (reverse l) (r (cons c l))) (newline)' "UCS-2BE" (353 610 867) $ echo -n $'\1a\2b\3c' | guile-2.0 -c '(set-port-encoding! (current-input-port) "UCS-2BE") (write (port-encoding (current-input-port))) (newline) (peek-char (current-input-port)) (write (map char->integer (string->list (drain-input (current-input-port) (newline)' "UCS-2BE" (1 97 2 98 3 99) The practical upshot is that the input returned by drain-input can't be used in the same way as regular input from read-char. It can still be used if the code doing the reading is totally aware of the encoding, so that it can perform the decoding manually, but this seems a failure of abstraction. The value returned by drain-input ought to be coherent with the abstraction level at which it is specified. I can see that there is a reason for drain-input to avoid performing decoding: the problem that occurs if the buffer ends in the middle of a character. If drain-input is to return decoded characters then presumably in this case it would have to read further octets beyond the buffer contents, in an unbuffered manner, until it reaches a character boundary. If this is too unpalatable, perhaps drain-input should be permitted only on ports configured for single-octet character encodings. If, on the other hand, it is decided to endorse the current non-decoding behaviour, then the break of abstraction needs to be documented. -zefram
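Where the reading code does know the port's selected encoding, the
manual decode mentioned above can be written roughly as follows (a
sketch, assuming Guile 2.0's (ice-9 iconv) module and a drained buffer
that does not end in the middle of a multi-octet character):

  (use-modules (ice-9 iconv))

  ;; drain-input yields the buffered octets as latin-1 characters;
  ;; reinterpret them according to the port's selected encoding.
  (define (decode-drained port)
    (bytevector->string
     (string->bytevector (drain-input port) "ISO-8859-1")
     (port-encoding port)))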
bug#22902: GUILE_INSTALL_LOCALE not equivalent to setlocale
The documentation claims that setting GUILE_INSTALL_LOCALE=1 in the environment is equivalent to calling (setlocale LC_ALL "") at startup. Actually there is at least one difference: calling setlocale causes ports (both primordial and later-opened) to be initially configured for the locale's nominal character encoding, but setting the environment variable does not. Setting the environment variable leaves the port encoding at #f, functioning as ISO-8859-1, just as if locale had not been invoked at all. I do see some effects from setting the environment variable, specifically message strings affecting strftime. $ echo -n $'L\xc3\xa9on' | LANG=de_DE.UTF-8 guile-2.0 -c '(write (strftime "%c" (gmtime 10))) (newline) (write (port-encoding (current-input-port))) (newline) (write (map char->integer (let r ((l '\''())) (let ((c (read-char (current-input-port (if (eof-object? c) (reverse l) (r (cons c l))) (newline)' "Sun Sep 9 01:46:40 2001" #f (76 195 169 111 110) $ echo -n $'L\xc3\xa9on' | GUILE_INSTALL_LOCALE=1 LANG=de_DE.UTF-8 guile-2.0 -c '(write (strftime "%c" (gmtime 10))) (newline) (write (port-encoding (current-input-port))) (newline) (write (map char->integer (let r ((l '\''())) (let ((c (read-char (current-input-port (if (eof-object? c) (reverse l) (r (cons c l))) (newline)' "So 09 Sep 2001 01:46:40 GMT" #f (76 195 169 111 110) $ echo -n $'L\xc3\xa9on' | LANG=de_DE.UTF-8 guile-2.0 -c '(setlocale LC_ALL "") (write (strftime "%c" (gmtime 10))) (newline) (write (port-encoding (current-input-port))) (newline) (write (map char->integer (let r ((l '\''())) (let ((c (read-char (current-input-port (if (eof-object? c) (reverse l) (r (cons c l))) (newline)' "So 09 Sep 2001 01:46:40 GMT" "UTF-8" (76 233 111 110) In case anyone trawls the archives later investigating the usage of GUILE_INSTALL_LOCALE: I am not attempting to use it myself, despite the scenario implied by the above test cases. I think it's a bloody stupid mechanism, imposing on the program something that needs to be under the program's control, and which previously was. I'm actually investigating how to make programs cope with the unpredictable situation caused by this mechanism with the unpredictable environment setting. -zefram
bug#22905: GUILE_INSTALL_LOCALE produces unavoidable noise
GUILE_INSTALL_LOCALE=1 breaks some of the robustness of non-locale-using programs, marring their stderr output if the environment's locale settings are faulty. Suppose you have a program written in Guile Scheme that doesn't use any locale facilities. To be portable to the GUILE_INSTALL_LOCALE=1 situation (which the documentation threatens will become the default in Guile 2.2), it must be prepared to start up with some locale already selected, and reconfigure from there as required. Being a conscientious programmer, you are of course willing to add the (setlocale LC_ALL "C") and whatever other invocations are required to recover the non-locale state. But then this situation arises: $ LANG=wibble GUILE_INSTALL_LOCALE=1 guile-2.0 -c '(setlocale LC_ALL "C") (write "hi") (newline)' guile: warning: failed to install locale "hi" The warning shown goes to the program's stderr. It does not come from the program's setlocale call, which is succeeding and would signal a perfectly ordinary (catchable) exception if it failed. The warning comes from the implicit setlocale call triggered by GUILE_INSTALL_LOCALE=1, before the program gains control. As far as I can see, there is no way for the program to prevent the failing setlocale attempt or to muffle the warning. Or even to detect that this has happened. Guile should not be saying anything on the program's stderr. This is damaging the program's visible behaviour, and making it impossible to effectively port non-locale programs to new Guile versions. If Guile must attempt this implicit setlocale and continue to run the program if it fails, then it needs to keep quiet about the failure. This is no disadvantage to a program that actually wants to use the environmental locale, because the program is free to call (setlocale LC_ALL "") itself and handle its failure in whatever manner it finds appropriate. Indeed, any such program predating Guile 2.0 must already be performing that call itself, because the implicit setlocale didn't occur then. The same for any program portable to pre-2.0 Guiles. And on Guile 2.0+ such a program still really needs to perform the call itself, because it can't predict how GUILE_INSTALL_LOCALE will be set in the environment, so still can't rely on the implicit setlocale happening. However, if it is deemed to be essential that Guile attempt the implicit setlocale and gripe about its failure, then the message should not precede or otherwise mix with the actual program run. The message should be emitted *instead of* running the program, declaring the absolute incompatibility of the Guile framework with this environmental condition. -zefram
bug#22910: read-only setlocale has side effect
A call to setlocale with no second argument is documented to be a read-only operation, querying the current locale configuration. In fact it has a side effect of setting the encoding on primordial ports: $ guile-2.0 -c '(write (port-encoding (current-input-port))) (newline) (setlocale LC_TIME) (write (port-encoding (current-input-port))) (newline)' #f "ANSI_X3.4-1968" Observe that this occurs even if the locale reading operation is for a category unrelated to character encoding. The actual decoding behaviour of read-char is altered in accordance with the reported encoding. Non-primordial ports opened before or after the setlocale call are not affected. -zefram
bug#22910: read-only setlocale has side effect
Additional information: setlocale's side effect on primordial ports
happens even if the port's encoding has been individually set using
set-port-encoding!. This means that to maintain a specific encoding on
these ports (other than the locale's nominal encoding, which is likely
not to be binary compatible) it is necessary to set the encoding
repeatedly: before any I/O operation that might follow a setlocale
call. Since the read-only mode of setlocale has this effect, and
arbitrary library code might feel entitled to call setlocale for read
purposes without documenting that it does so, this really amounts to
setting the encoding before every I/O operation.

-zefram
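A minimal sketch of that workaround, assuming the program wants a
specific encoding (here ISO-8859-1) on the three standard ports:

  ;; Re-pin a chosen encoding on the standard ports; to be called
  ;; before any I/O that might follow a setlocale call, including one
  ;; made behind the program's back by library code.
  (define (repin-standard-port-encodings enc)
    (for-each (lambda (p) (set-port-encoding! p enc))
              (list (current-input-port)
                    (current-output-port)
                    (current-error-port))))

  (repin-standard-port-encodings "ISO-8859-1")
  (display "now safe to write\n")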
bug#20822: environment mangled by locale
I wrote: >There's an obvious parallel with reading data from an input port. >If setlocale is called, then input is by default decoded according >to locale, including the very lossy ASCII decode for C/POSIX. But if >setlocale has not been called, then input is by default decoded according >to ISO-8859-1, preserving the actual octets. It would probably be most >sensible that, if setlocale hasn't been called, getenv should likewise >decode according to ISO-8859-1. It might also be sensible to offer >some explicit control over the encoding to be used with the environment, >just as I/O ports have a concept of per-port selected encoding. In the light of what I've learned recently about Guile's locale handling, this needs some revision. What I thought was a well-defined "setlocale not called" state is a mirage. The encoding of ports is not reliably fixed at ISO-8859-1; per bug#22910 it can be affected by ostensibly read-only calls to setlocale, and seems to be only accidentally ISO-8859-1 until that's done. So that's not a good model. Due to the GUILE_INSTALL_LOCALE mechanism, a program wanting no locale selected can't just never call setlocale in write mode. So setlocale not having been called is not really available as a way to control anything. So it would seem to be necessary to use some explicit control of character encoding for environment access. (This must be control of encoding per se, not merely of which locale to use for environment access, because, as I noted in the original report, there's no guarantee of a locale with a suitable encoding.) This could be an optional parameter to the environment access functions, or a settable variable that takes precedence over locale to determine encoding for all environment access. The latter would match the encoding model used by ports. -zefram
bug#20823: argv mangled by locale
I wrote: >My comments about resolution in bug#20822 "environment mangled by locale" >mostly apply here too, The revised comments that I have just made on that ticket also apply here. Short version: "absence of setlocale" isn't a useful criterion, so explicit control of encoding will be necessary. -zefram
bug#22913: filenames mangled by locale
It seems that guile-2.0 applies locale encoding and decoding to pathnames being used in system calls. This radically breaks file access anywhere that the locale's character encoding is anything other than a simple 8-bit encoding such as ISO-8859-1. For example, in the default C locale with its nominal ASCII encoding, $ guile-2.0 -c '(open-file (list->string (map integer->char '\''(76 195 169 111 110))) "w")' $ echo L*n | od -tc 000 L ? ? o n \n 006 Those are literal question marks in the name of the file actually created, apparently arising as substitutions for the high-half octets in the requested filename. Existing files with names containing high-half octets can't be found (resulting in an ENOENT error message that shows the actually-existing filename), and new ones can't be created (actually being created under the mangled name instead). There's no warning or exception advising that the requested name can't be used, just this misbehaviour. The equivalent problem arises with decoding when filenames are received: $ echo foo > $'L\303\251on.txt' $ guile-2.0 -c '(define d (opendir ".")) (let r () (let ((n (readdir d))) (if (eof-object? n) #t (begin (if (eq? (car (reverse (string->list n))) #\t) (begin (write (map char->integer (string->list n))) (newline))) (r)' (76 63 63 111 110 46 116 120 116) Again no warning or exception, just incorrect data returned. To work around this would require the program to select a locale with a more accommodating nominal character encoding. As I've previously noted, there's no guarantee of such a locale existing. Thus the above behaviour is fatal to any attempt to write in Guile Scheme a program to operate on arbitrarily-named files. Guile even applies this mangling to the pathname of a script that it is to load: $ echo '(write "hi")(newline)' > $'L\303\251on.scm' $ guile-2.0 -s L*n.scm [big error message saying it couldn't find the file that exists] Obviously, even if a program could turn off the locale mangling in general, this instance of it occurs too early for the program to avoid. The guile framework itself has acquired the kind of 8-bit-cleanliness bug that it is imposing on the programs that it interprets. -zefram
bug#16357: insufficient print abbreviation in error messages
Andy Wingo wrote: >Thoughts? How was this managed in Guile 1.8? It seems that you need the truncated-print mechanism to be always available internally, but this doesn't require that it be always visible to the user. You can still require the full libraries to be loaded for the user to get access. Lazy loading sounds like a bad idea. Error handling is a bad place to attempt something that complex and failure-prone. -zefram
bug#16365: (* 0 +inf.0) rationale is flawed
Mark H Weaver wrote: > I also suspect that (/ 0 ) should be 0, >although that conflicts with R6RS. We should probably investigate the >rationale behind R6RS's decision to specify that (/ 0 0.0) returns a NaN >before changing that, though. I think R6RS makes sense for (/ 0 0.0). A flonum zero really represents a range of values including both small non-zero numbers and actual zero. The mathematical result of the division could therefore be either zero or undefined. To return zero for it would be picking a particular result, on the assumption that the flonum zero actually represented a non-zero value, and that's not justified. So to use the flonum behaviour seems the best thing available. (/ 0 3.5) is a different case. Here the mathematical result is an exact zero, and I'm surprised that R6RS specifies that this should be an inexact zero. This seems inconsistent with (* 1.0 0), for which it specifies that the result may be either 0 or 0.0. I'd also question R6RS in the related case of (/ 0.0 0). Mathematically this division is definitely an error, regardless of whether the dividend represents zero or a non-zero number. So it would make sense for this to raise an exception in the same manner as (/ 3 0) or (/ 0 0), rather than get flonum treatment as R6RS specifies. But deviating from R6RS, even with a good rationale for other behaviour, would be a bad idea. The questionable R6RS requirements are not crazy, just suboptimal. The case I originally raised, (* 0 +inf.0), is one for which R6RS offers the choice. -zefram
bug#20823: argv mangled by locale
Andy Wingo wrote: >I also don't >know whether to supply an optional "encoding" argument, and use that >encoding to decode the command line arguments. That, or something that just retrieves octets, is necessary. Decoding via the selected locale does not suffice, because there's no guarantee that there'll be a locale with a cooperative encoding. -zefram
bug#21899: let/ec continuations not distinct under compiler
Andy Wingo wrote: > ,opt (let* ((x (list 'a)) > (y (list 'a))) > (list x y)) > ;; -> > (let* ((x (list 'a)) (y x)) (list x y)) Wow, that's a scary level of wrongitude. It's specific to let* (or equivalent nested let forms), but really easy to trigger within that: scheme@(guile-user)> (let ((x (list 'a)) (y (list 'a))) (eq? x y)) $1 = #f scheme@(guile-user)> (let* ((x (list 'a)) (y (list 'a))) (eq? x y)) $2 = #t scheme@(guile-user)> (let ((x (list 'a))) (let ((y (list 'a))) (eq? x y))) $3 = #t -zefram
bug#21899: let/ec continuations not distinct under compiler
One more variant: scheme@(guile-user)> (let ((x (list 'a))) (eq? x (list 'a))) $1 = #t scheme@(guile-user)> ,opt (let ((x (list 'a))) (eq? x (list 'a))) $2 = (let ((x (list 'a))) (eq? x x)) -zefram
bug#21902: doc incorrectly describes Julian Date
Andy Wingo wrote:
>Would you like to propose a specific patch to the documentation?

Sure. Patch attached.

-zefram

--- a/doc/ref/srfi-modules.texi	2014-03-20 20:21:21.0 +0000
+++ b/doc/ref/srfi-modules.texi	2016-06-24 18:57:59.088243245 +0100
@@ -2461,8 +2461,8 @@
 @cindex UTC
 @cindex TAI
 This module implements time and date representations and calculations,
-in various time systems, including universal time (UTC) and atomic
-time (TAI).
+in various time systems, including Coordinated Universal Time (UTC)
+and International Atomic Time (TAI).
 
 For those not familiar with these time systems, TAI is based on a
 fixed length second derived from oscillations of certain atoms.  UTC
@@ -2494,18 +2494,12 @@
 @cindex julian day
 @cindex modified julian day
 Also, for those not familiar with the terminology, a @dfn{Julian Day}
-is a real number which is a count of days and fraction of a day, in
-UTC, starting from -4713-01-01T12:00:00Z, ie.@: midday Monday 1 Jan
-4713 B.C.  A @dfn{Modified Julian Day} is the same, but starting from
-1858-11-17T00:00:00Z, ie.@: midnight 17 November 1858 UTC.  That time
-is julian day 2400000.5.
-
-@c The SRFI-19 spec says -4714-11-24T12:00:00Z (November 24, -4714 at
-@c noon, UTC), but this is incorrect.  It looks like it might have
-@c arisen from the code incorrectly treating years a multiple of 100
-@c but not 400 prior to 1582 as non-leap years, where instead the Julian
-@c calendar should be used so all multiples of 4 before 1582 are leap
-@c years.
+is a real number which is a count of days and fraction of a day, in UT,
+starting from -4713-11-24T12:00:00Z, ie.@: midday UT on Monday 24 November
+4714 BC in the proleptic Gregorian calendar (1 January 4713 BC in the
+proleptic Julian calendar).  A @dfn{Modified Julian Day} is the same,
+but starting from 1858-11-17T00:00:00Z, ie.@: midnight UT on Wednesday
+17 November AD 1858.  That time is julian day 2400000.5.
 
 @node SRFI-19 Time
bug#20822: environment mangled by locale
Mark H Weaver wrote: > by convention they are >supposed to encoded in the locale encoding. This convention is bunk. The encoding aspect of the locale system is fundamentally broken: the model is that every string in the universe (every file content, filename, command line argument, etc.) is encoded in the same way, and the locale environment variable tells you which universe you're in. But in the real universe, files, filenames, and so on turn up encoded how their authors liked to encode them, and that's not always the same. In the real universe we have to cope with data that is not encoded in our preferred way. > If that convention is >violated, I don't see what a program could do about it. If the convention is violated, then there is some difficulty in presenting correctly-encoded (or even consistently-encoded) output to the user, but it is not insuperable. Perhaps the program knows by some non-locale means how a string is encoded, and can explicitly convert. Perhaps it doesn't know the real encoding, but can trust that the user will understand the octet string if it is passed through with neither decoding of input nor encoding for output. Or perhaps the program doesn't need to put the string into textual output at all, but only to use it some API or file format that's expecting an encodingless octet string. So there are many things a program can reasonably do about it, and which one to do depends on the application. >Can someone show me a realistic example of how this would be used in >practice? Looking specifically at environment variables: an environment variable could give the name of a file that is to be consulted under specified circumstances, and the right file may happen to have a name that is inconsistent with the encoding used by the user's terminal. (The filename is not required for output; it only needs to be passed as an uninterpreted octet string to the open(2) syscall.) An environment variable could specify a Unicode-using name of a language module to be loaded, while the user doesn't otherwise use Unicode, or doesn't use an encoding encompassing enough of it. (Name not required on output, again; will be either transformed into a filename or looked up in a file format that specifies its own encoding.) The program could be env(1), not interpreting the environment but needing to output the octets correctly. The program could be saving an uninterpreted environment, for a cron job to later run some other program with equivalent settings. -zefram
bug#22905: GUILE_INSTALL_LOCALE produces unavoidable noise
Andy Wingo wrote: >I believe this is consistent with other programs which call setlocale, >notably Perl and Bash. It is consistent with them, but the fact that others get it wrong isn't an excuse. >avoid the call to setlocale, and Guile offers the GUILE_INSTALL_LOCALE=0 >knob to do this. That knob is not available to the program. If you provide a knob that the program can control, independent of the environment, with backward compatibility to Guile 1.8, then we can consider the setlocale call avoidable. > Probably adding the suggestion to the warning is the right >thing; wdyt? No, that's not an improvement. Emitting a warning and then running the program anyway is fundamentally broken behaviour, and tweaking the content of the warning doesn't help. Some way for the program to detect that you've screwed up its output, so that it can decide to abort rather than continue with faulty output, would be another middle way. -zefram
bug#24186: setlocale can't be localised
In Guile 1.8 it was possible to localise the effect of a setlocale operation, but in Guile 2.0 it's no longer possible by natural use of the locale API. This loss of a useful facility is either a bug or something that needs to be discussed in the documentation. In Guile 1.8 one could perform a temporary setlocale for the execution of some piece of code, and revert its effect by another setlocale on unwind. This looks like: (define (call-with-locale cat newval body) (let ((oldval #f)) (dynamic-wind (lambda () (set! oldval (setlocale cat)) (setlocale cat newval)) body (lambda () (setlocale cat oldval) Some difficulty arises from this being temporally scoped, where dynamic or lexical scoping would be nicer, but in single-threaded programs it works pretty well. The C setlocale(3) API, after which Guile's setlocale is modelled, is obviously designed to enable this kind of mechanism: the read operation reports all relevant state, and the write operation with the old value sets it all back as it was. It is critical to this ability that the read operation does indeed report all the state that will be set. In Guile 2.0, the setlocale function no longer corresponds so closely to the C setlocale(3), and this critical guarantee has been lost. I have previously reported in bug#22910 that the setlocale read operation has a side effect on port encoding, and obviously that interferes with the above code, but actually there's still a problem if that's fixed. The setlocale *write* operation also affects port encoding (actually the default port encoding fluid and the encoding of currently-selected ports), and that seems to be an intentional change, but it also breaks the above code. The setlocale read operation doesn't report the encoding of the currently-selected ports, so doesn't represent everything that setlocale will set. The setlocale write operation is not even capable of setting the port encodings independently: it sets all three to the encoding nominated by the locale selected for LC_CTYPE purposes. I think adding this extra effect to setlocale was a mistake. It doesn't fit the locale API. If the extra effect is removed, that would resolve this problem. If you really want setlocale to have this effect, then something needs to be done to address the ability that has been lost. The documentation certainly needs to describe the effect on port encoding, which it currently doesn't. (There is a mention of some interaction with the %default-port-encoding fluid in the documentation of that fluid, but it doesn't match reality: it doesn't say that setlocale writes to the fluid.) It also ought to specifically warn that the setlocale save-and-restore dance that works in C doesn't work here. It should explain what needs to be done by library functions that want to achieve a localised locale change. Are they entirely forbidden to use setlocale? Are they expected to manually save and restore port encodings around setlocale calls? (This is complicated by set-port-encoding! not accepting #f as an encoding value, despite it actually being a permitted value for the encoding slot.) Some example code equivalent to the above call-with-locale would be useful. -zefram
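For concreteness, the Guile 1.8 usage pattern being described looks
like this (assuming the call-with-locale definition above and an
installed de_DE.UTF-8 locale):

  ;; Temporarily select a German LC_TIME for one strftime call, then
  ;; revert to whatever was in effect before.
  (call-with-locale LC_TIME "de_DE.UTF-8"
    (lambda ()
      (display (strftime "%c" (gmtime 0))) (newline)))

  (display (strftime "%c" (gmtime 0))) (newline)  ; previous locale again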
bug#22905: GUILE_INSTALL_LOCALE produces unavoidable noise
Andy Wingo wrote: >If you would like for me to work on your bugs then I would appreciate it >if you would keep things constructive. Thanks :) I'm sorry that that bit came across badly. I do appreciate your efforts. >Serious question tho: what sort of back-compatibility can there be with >a Guile that only supports latin-1 strings? I'd expect that almost any program that runs on Guile 1.8 ought to be portable, with only minimal modifications, to later versions of Guile. Obviously this wouldn't work the other way round: if a program relies on 2.0's non-Latin-1 strings then it can't be easily ported back to 1.8. But lots of programs work fine on 1.8, either not processing non-Latin-1 data or processing it in forms other than the builtin string type. Scheme was a good programming language long before Unicode came along. > What property is it that >you are going for here? In that bit, I'm going for it being possible for a program to run on both Guile 1.8 and Guile 2.N while avoiding the new locale warning from Guile 2.N. This should be a single program file, starting with a "#!/usr/bin/guile" line, where /usr/bin/guile may refer to either version of Guile. This would be especially relevant for a program originally written for Guile 1.8, but more generally is relevant for any program that doesn't need any of 2.0's new capabilities. The particular problem that arises is that a possible form for a warning-muffling switch would be a command-line switch that goes on the #! line. Any new switch of that nature wouldn't be recognised by Guile 1.8, and would cause an error when attempting to run the program on 1.8. >What about GUILE_INSTALL_LOCALE=require or something like that? In the environment? That's still not controllable by the program. The environment is the wrong place for any switch that needs to be the choice of the program. Whether to engage with the environmentally-suggested locale ought to be the choice of the program. >How would this work? I imagine a builtin function that returns a truth value saying whether the Guile framework has emitted a warning before running the program. Suppose it's called "program-running-with-unclean-output". Then those who particularly want clean output can write something like (when (program-running-with-unclean-output) (error "can't run after warnings")) This doesn't avoid the warning appearing, but does avoid treating a run marred by the warning as a successful program run. The program's checking code can easily be made portable back to Guile versions lacking the new function, by using cond-expand, false-if-exception, or other metaprogramming facilities. -zefram
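One portable form of that guard, as a minimal sketch
(program-running-with-unclean-output being the hypothetical function
proposed above, not an existing Guile binding):

  ;; plain one-armed `if', so that this also runs on Guile 1.8,
  ;; which lacks `when'
  (if (and (defined? 'program-running-with-unclean-output)
           (program-running-with-unclean-output))
      (error "can't run after warnings"))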
bug#24186: setlocale can't be localised
et from the current locale. You then have the fluid default to that value, and have setlocale not touch the fluid at all. This way, if the user doesn't touch the fluid but does call setlocale then the locale controls the encoding of new ports. But if the user does set the fluid (to something other than #:locale-at-open), indicating a desire to specifically control default port encoding, then setlocale doesn't clobber the user's choice. How does this sound to you? > But I don't think >it should change the encoding of already-open ports, should it? In a situation where setlocale is expected to deliberately side-effect the default port encoding fluid, I can't figure out whether to expect it to do more. I suppose on general principle it's less surprising for it to do less. It's certainly less work to work around it, where the side effects are unwanted. If you go with the #:locale-at-open plan that I described above, then setlocale should definitely not touch the encoding of already-open ports. Just so that it is localisable as originally designed. There's another way to get the best of both worlds. In addition to the #:locale-at-open value for the default port encoding fluid, there could also be some special encoding value for a port, #:locale-at-io, meaning to use whatever locale is in effect at the time of an I/O operation. #:locale-at-io is also a valid value for the fluid, which will be copied into a new port in the regular way. The stdin, stdout, and stderr ports that are automatically opened at program initialisation can be set to #:locale-at-io, and setlocale now doesn't directly set the encoding of any port. If the user calls setlocale without otherwise controlling port encoding then the locale controls the encoding of the primordial ports. I expect that's the effect that the setlocale code was aiming for, given that when setlocale is called it's too late to affect the opening of the primordial ports. -zefram
bug#24186: setlocale can't be localised
I wrote: >is my first time compiling a Guile myself. It's failing on a missing >library for which Debian supplies no package. Turns out there was a package. It was complaining about a lack of "bdw-gc", and Debian doesn't have anything of that name, but it does have it under the name "libgc". So I've now got 2.1.3 running. All of the code in my day-of-week-string-for-locale sketch works exactly the same on 2.1.3 as it did on 2.0. -zefram
bug#22905: GUILE_INSTALL_LOCALE produces unavoidable noise
Andy Wingo wrote: >#!/bin/sh >export FOO=bar >exec guile $0 "$@" >!# That introduces all the complexity of using another language interpreter, one I've chosen not to write my program in. I don't much fancy working round a gotcha by importing another series of gotchas. Fundamentally, it seems like an admission of defeat. With care it would work, but means that Guile is not itself the platform on which to write a Unix program. Maybe you're OK with the idea that Guile programs aren't meant to run in their own right. Would you be OK with documenting it? It also means that the Guile program isn't actually seeing the user's environment, and so doesn't accurately pass that environment through to anything that it runs in turn. Working around that would involve some hairy and error-prone shell code. >This is certainly possible to do. Actually I would guess that this >works: > > (setlocale LC_ALL "") That succeeds in signalling an error in any case where the environmental locale doesn't exist, but that's not really what I want. If the framework didn't perform an implicit setlocale, and so didn't mar my output, I don't then want to make things break. That approach is also totally specific to the setlocale warning. If program-running-with-unclean-output were to exist, it should also cover uncleanliness due to auto-compile banners (bug#16364). It would be the solution (though not a great one) to both problems. >Does any of this work for you? Shell script wrapper is the closest so far, but it's nasty. You haven't proposed any real solution. The really simple solution would be to remove this switch from the environment entirely, and remove the implicit setlocale from the startup sequence entirely. The environment was always the wrong place for the switch, and there's no benefit in the implicit setlocale being as early as it is. The decision on whether to engage with the user's locale is then made entirely by the program, as part of its ordinary execution. If it wants to use the user's locale, it executes (setlocale LC_ALL ""). If it wants non-default handling of errors, it executes that in the dynamic scope of whatever throw or catch handler it likes. If it doesn't want to use the user's locale, it doesn't execute that. Bonus: works identically on older Guile versions. If you won't go for the simple solution, then a proper solution that maintains the default implicit setlocale would be to have a switch in a magic comment in the program file. Something like "#!GUILE_INSTALL_LOCALE=0\n!#\n" immediately following the program's initial #!...!# block. This is ignored as a comment by older Guile versions. The semantic on newer versions would be that the setting given there (which may be 0 or 1) determines conclusively whether the implicit setlocale happens. The environment variable would take effect as it currently does only for programs not containing this kind of setting. -zefram
bug#20823: argv mangled by locale
Andy Wingo wrote: >I also don't >know whether to supply an optional "encoding" argument, and use that >encoding to decode the command line arguments. If you don't fancy the profusion of extra "encoding" parameters on argv access (this ticket), environment access (bug#20822), and all sorts of syscalls (bug#22913), you could bundle them all together in a fluid. This would be a bit like the %default-port-encoding fluid, but setlocale should absolutely not modify it. It should follow the scheme that I laid out in bug#24186: its value can be either a string naming an encoding, or #:locale-at-io meaning that whenever encoding is required the currently selected locale is consulted. There should also be a fluid determining the conversion strategy, like the existing %default-port-conversion-strategy. These two fluids together would control the encoding and decoding for all operations that currently apply the locale encoding to arbitrary data. (Decoding locale-supplied messages is a different matter.) -zefram
bug#24186: setlocale can't be localised
Ludovic Courtes wrote: >That wouldn't help with the "setlocale" issue you describe per se, but >this would address such use cases in a different way. > >WDYT? Yes, explicit locale objects and locale parameters to relevant functions are a good thing. In general, the model of a global locale state is broken, at least by threading, so some advance beyond the setlocale system is necessary. Note the new(er) "uselocale" system in libc, which gives a per-thread locale state, fixing the biggest problem with setlocale. Some form of that could also be mapped into Guile; it would be reasonable to have a fluid that determines the locale to use where not overridden by an explicit parameter. All of that is welcome, but, as you say, doesn't deal with the actual problem I identified with setlocale. One can expect that setlocale will continue to be used for the foreseeable future, and it needs to be shorn of its unwanted side effects. -zefram
bug#26149: SRFI-19 doc erroneously warns about Gregorian reform
The documentation, near the start of the section on SRFI-19, says !*Caution*: The current code in this module incorrectly extends the ! Gregorian calendar leap year rule back prior to the introduction of ! those reforms in 1582 (or the appropriate year in various countries). ! The Julian calendar was used prior to 1582, and there were 10 days ! skipped for the reform, but the code doesn't implement that. ! !This will be fixed some time. Until then calculations for 1583 ! onwards are correct, but prior to that any day/month/year and day of the ! week calculations are wrong. The statements that the code is incorrect in this behaviour are erroneous. SRFI-19 itself says # A Date object, which is distinct from all existing types, represents a # point in time as represented by the Gregorian calendar as well as by a # time zone. The code is thus correct in always using the Gregorian calendar in date structures. Per ISO 8601 it is also correct in always using the Gregorian calendar in string output in that standard's formats. SRFI-19 isn't explicit about the calendar used as the basis for the other string output formats, but since the formatting proceeds from a date structure it seems implied that they should use the same basis as the date structure. For string input it is explicit that the parseable numeric formats correspond directly to fields of the date structure. There is no part of SRFI-19 that looks like it is ever intended to use the Julian calendar. So the code should not be `fixed', and the statements about that and about incorrectness should be removed from the documentation. It is sensible to keep an explicit statement about the treatment of the Gregorian reform, but the decision to use the Gregorian calendar proleptically should be credited to SRFI-19 (the standard), not to the code. -zefram
bug#26151: date-year-day screws up leap days prior to AD 1
In SRFI-19, the date-year-day function is meant to return the ordinal day of the year for a date structure. This value is properly 1 for the first day of each calendar year, and on all other days 1 greater than the value for the preceding day. But the implementation occasionally has it repeat a value: scheme@(guile-user)> (use-modules (srfi srfi-19)) scheme@(guile-user)> (date-year-day (julian-day->date 1719657 0)) $1 = 59 scheme@(guile-user)> (date-year-day (julian-day->date 1719658 0)) $2 = 60 scheme@(guile-user)> (date-year-day (julian-day->date 1719659 0)) $3 = 60 and occasionally has it skip a value: scheme@(guile-user)> (date-year-day (julian-day->date 1720023 0)) $4 = 59 scheme@(guile-user)> (date-year-day (julian-day->date 1720024 0)) $5 = 61 These errors happen around the end of February in years preceding AD 1. In each leap year a value is repeated (ordinal values 1 too low from March to December), and in each year immediately following a leap year a value is skipped (ordinal values 1 too high from March to December). Looking at the code, the bug arises from confusion between astronomical year numbering (which leap-year? expects to receive) and the bizarre zero-skipping year numbering that the library uses in the date structure (which date-year-day passes, via year-day, to leap-year?). Since the subject's come up: that year numbering used in the date structures is surprising, and I'm not sure quite what to make of it. It matches AD year numbering for years AD 1 onwards, but then numbers AD 0 (1 BC) as -1, and numbers all earlier years in accordance with that. It's almost a straight linear numbering of years, except that it skips the number 0. (At least you've documented it.) This is not a convention that I've seen in real use anywhere else, and that weird exception to the linearity makes it a pain to use. It's likely to cause bugs in user code, along the lines of the library bug that I've reported above and the previously-reported bug#21903. However, I haven't reported the year numbering per se as a bug, because SRFI-19 doesn't actually say what numbering is to be used for the date-year slot. If I had implemented SRFI-19 myself, without reference to existing implementations, I would have implemented astronomical year numbering (consistent AD year numbering, extending linearly in both directions), as used in ISO 8601. This is the most conventional year numbering, and at a stretch one could read SRFI-19 as implying it, by using some AD year numbering and not saying to deviate from that scheme. But really the standard is silent on the issue. Since the signification of date-year is an interoperability issue, this silence is a problem, and it is troubling that you and I have reached different interpretations of the standard on this point. Where did you get the idea to use a non-linear year numbering? What's your opinion of SRFI-19's (lack of) text on this matter? You should consider the possibility of changing your implementation to use the conventional astronomical year numbering in this slot. -zefram
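For reference, the relationship between the two numbering schemes
described above fits in a couple of lines (a sketch; "skip-zero" is the
date-year convention used by the library, "astronomical" the ISO 8601
convention):

  (define (skip-zero->astronomical y)   ; e.g. -1 (meaning 1 BC) => 0
    (if (negative? y) (+ y 1) y))

  (define (astronomical->skip-zero y)   ; e.g. 0 (meaning 1 BC) => -1
    (if (positive? y) y (- y 1)))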
bug#26162: time-duration screws up negative durations
Computing a difference between two SRFI-19 times, using time-difference, produces sensible results if the result is positive, but often nonsense if it's negative: scheme@(guile-user)> (use-modules (srfi srfi-19)) scheme@(guile-user)> (time-difference (make-time time-tai 0 1) (make-time time-tai 1000 0)) $1 = # scheme@(guile-user)> (time-difference (make-time time-tai 1000 0) (make-time time-tai 0 1)) $2 = # The above is computing the same interval both ways round. The first time is correct, but the second is obviously not the negative of the first. The correct result for the second would be # or possibly, at a stretch, # (SRFI-19 isn't clear about which way it's meant to be normalised. Having the nanoseconds field always non-negative is less surprising and easier to maintain through computation.) -zefram
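A minimal sketch of the less surprising normalisation, working on plain
second/nanosecond pairs rather than SRFI-19 time objects:

  (define (normalize-duration secs nsecs)
    ;; fold everything into nanoseconds, then split with floor division
    ;; so that the nanoseconds part lands in [0, 999999999]
    (let ((total (+ (* secs 1000000000) nsecs)))
      (values (floor-quotient total 1000000000)
              (floor-remainder total 1000000000))))

  ;; (normalize-duration 0 -999999000) => -1 and 1000,
  ;; i.e. -0.999999 s expressed with a non-negative nanoseconds field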
bug#26163: time-difference doesn't detect error of differing time types
scheme@(guile-user)> (use-modules (srfi srfi-19)) scheme@(guile-user)> (time-difference (make-time time-tai 0 1) (make-time time-utc 0 1)) $1 = # SRFI-19 is explicit that it "is an error" if the arguments to time-difference are of different time types, and correspondingly the Guile documentation says the arguments "must be" of the same type. It would be very easy for time-difference to detect and signal this error. It's not absolutely a bug that it currently doesn't, but it would be a useful improvement if it did. -zefram
bug#26164: time-difference mishandles leap seconds
Computing the duration of the period between two UTC times, using SRFI-19 mechanisms: scheme@(guile-user)> (use-modules (srfi srfi-19)) scheme@(guile-user)> (define t0 (date->time-utc (make-date 0 59 59 23 30 6 2012 0))) scheme@(guile-user)> (define t1 (date->time-utc (make-date 0 1 0 0 1 7 2012 0))) scheme@(guile-user)> (time-difference t1 t0) $1 = # The two times are 2012-06-30T23:59:59 and 2012-07-01T00:00:01, so at first glance one would expect the duration to be 2 s as shown above, the two seconds being 23:59:59 and 00:00:00. But in fact there was a leap second 2012-06-30T23:59:60, so the duration of this period is actually 3 s. The SRFI-19 library is aware of this leap second, and will compute the duration correctly if it's translated into TAI: scheme@(guile-user)> (time-difference (time-utc->time-tai t1) (time-utc->time-tai t0)) $2 = # The original computation in UTC space should yield a result of 3 s, not the 2 s that it did. Since 1972, the seconds of UTC are of exactly the same duration as the seconds of TAI. (They're also phase-locked to TAI seconds.) Thus the period of three TAI seconds is also a period of three UTC seconds. It is not somehow squeezed into two UTC seconds. -zefram
bug#26165: date-week-day screws up prior to AD 1
Looking at day of the week, via SRFI-19's date-week-day: scheme@(guile-user)> (use-modules (srfi srfi-19)) scheme@(guile-user)> (julian-day->date 1721426 0) $1 = # scheme@(guile-user)> (date-week-day (julian-day->date 1721426 0)) $2 = 1 scheme@(guile-user)> (date-week-day (julian-day->date 1721425 0)) $3 = 6 The output for 0001-01-01, Monday, is correct. The preceding day is actually a Sunday, but Saturday was shown. Looking at the code, this bug arises for the same reason as the problem with date-year-day raised in bug#26151. The date-year value, of the weird zero-skipping year numbering, is passed to an algorithm that obviously expects astronomical year numbering. Looking at the code also reveals a second problem: the algorithm is written to perform divisions with quotient where it obviously needs modulo. This will manifest in erroneous computations for some earlier years once the above is fixed. -zefram
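For comparison, the day of the week can be computed directly from the
Julian Day Number without touching year numbering at all; a minimal
sketch, relying on the fact that JD 0 fell on a Monday and matching
date-week-day's 0 = Sunday convention:

  (define (jd->week-day jd)
    ;; 0 = Sunday ... 6 = Saturday; modulo handles negative jd correctly
    (modulo (+ jd 1) 7))

  ;; (jd->week-day 1721426) => 1, Monday 1 January AD 1
  ;; (jd->week-day 1721425) => 0, the preceding Sunday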
bug#26165: date-week-day screws up prior to AD 1
I wrote: >written to perform divisions with quotient where it obviously needs >modulo. Oops, thinko there. It needs floor-quotient, the quotient-like function that uses floor rounding. modulo is the *remainder*-like function that uses floor rounding. -zefram
bug#26182: cond-expand doc omits guile-2.2 feature
In Guile 2.2.0, the SRFI-0 (cond-expand) documentation says: ! The Guile core has the following features, ! ! guile ! guile-2 ;; starting from Guile 2.x ! r5rs ! srfi-0 ... As implemented in Guile 2.2.0, the unlisted feature guile-2.2 is also recognised by cond-expand. Since the documentation's list is otherwise complete, presumably it is intended to be a complete list, and the omission of this feature from the list is a mistake. In any case, it would be helpful to list it. -zefram
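A usage example, assuming the Guile 2.2.0 behaviour described above:

  (cond-expand
    (guile-2.2 (display "guile 2.2\n"))
    (guile-2   (display "guile 2.x, before 2.2\n"))
    (else      (display "guile 1.8 or another Scheme\n")))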
bug#26259: ~f SRFI-19 format broken for small nanoseconds values
The ~f format specifier in SRFI-19's date->string function is supposed
to produce a decimal string representation of the seconds and
nanoseconds portions of a date together:

scheme@(guile-user)> (use-modules (srfi srfi-19))
scheme@(guile-user)> (date->string (make-date 550000000 56 34 12 26 3 2017 0) "~f")
$1 = "56.55"

but it screws up for nanoseconds values in the range (0, 1000000),
i.e., for any time that lies strictly within the first millisecond of
a second:

scheme@(guile-user)> (date->string (make-date 550000 56 34 12 26 3 2017 0) "~f")
$2 = "56.5e-4"

Looks like the fractional seconds value is being formatted through a
mechanism that is not suitable for this purpose, which uses exponent
notation for sufficiently small values and thereby surprises the
date->string code.

Note that just assembling the seconds+fraction value and putting the
whole thing through the same formatter, as opposed to putting the
fractional part through on its own, would fix the above test cases,
and any others with non-zero integer seconds, but would leave the bug
unfixed for the case where the integer seconds value is zero. Fixing
this requires not using any formatting mechanism that would ever resort
to exponent notation for values in the relevant range.

-zefram
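One exponent-free approach, as a minimal sketch (not the library's
code; sec is assumed to be an exact integer and ns an exact nanosecond
count in [0, 999999999]): pad the nanosecond field to nine digits and
trim trailing zeros.

  (use-modules (srfi srfi-13))   ; string-pad, string-trim-right

  (define (seconds+fraction->string sec ns)
    (if (zero? ns)
        (number->string sec)
        (string-append (number->string sec) "."
                       (string-trim-right
                        (string-pad (number->string ns) 9 #\0)
                        #\0))))

  ;; (seconds+fraction->string 56 550000000) => "56.55"
  ;; (seconds+fraction->string 56 550000)    => "56.00055"
  ;; (seconds+fraction->string 0 550000)     => "0.00055"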
bug#26260: ~f SRFI-19 format specifier mishandles one-digit seconds value
The ~f format specifier for SRFI-19's date->string is documented as: #~f seconds and fractional seconds, with locale # decimal point, eg. `5.2' Let's test that example: scheme@(guile-user)> (use-modules (srfi srfi-19)) scheme@(guile-user)> (date->string (make-date 2 5 34 12 26 3 2017 0) "~f") $1 = "05.2" That's not the documented format: the doc and the SRFI itself show "5.2" with no leading padding, but actual behaviour is to zero pad. There is much that is ambiguous in the SRFI's specification of ~f, but with that example it does at least seem clear that there should be no padding there. -zefram
bug#26261: ~N mishandles small nanoseconds value
The ~N format specifier in SRFI-19's date->string is documented to show
the nanoseconds value, with zero padding. The documentation explicates
further by showing as an example a string of nine zeroes. In fact the
implementation only pads to seven digits, and so produces incorrect
output for any nanoseconds value in the range [0, 1):

scheme@(guile-user)> (use-modules (srfi srfi-19))
scheme@(guile-user)> (date->string (make-date 0 5 34 12 26 3 2017 0) "~N")
$1 = "000"
scheme@(guile-user)> (date->string (make-date 2 5 34 12 26 3 2017 0) "~N")
$2 = "002"
scheme@(guile-user)> (date->string (make-date 200 5 34 12 26 3 2017 0) "~N")
$3 = "200"
scheme@(guile-user)> (date->string (make-date 20 5 34 12 26 3 2017 0) "~N")
$4 = "020"
scheme@(guile-user)> (date->string (make-date 5 34 12 26 3 2017 0) "~N")
$5 = ""
scheme@(guile-user)> (date->string (make-date 2 5 34 12 26 3 2017 0) "~N")
$6 = "2"

The padding clearly has to be to the full nine digits.

-zefram
bug#26329: monotonic time not supplied by current-time
The SRFI-19 current-time function can return several flavours of the current time: scheme@(guile-user)> (use-modules (srfi srfi-19)) scheme@(guile-user)> (current-time time-utc) $1 = # scheme@(guile-user)> (current-time time-tai) $2 = # scheme@(guile-user)> (current-time time-monotonic) $3 = # The last of these three is erroneous: a time structure of type time-monotonic was requested and must be returned, but instead the type is time-tai. Although the implementation gives these two time types numerically identical behaviour, it does treat them as nominally distinct in other operations: scheme@(guile-user)> (eqv? time-tai time-monotonic) $4 = #f scheme@(guile-user)> (julian-day->time-tai 245) $5 = # scheme@(guile-user)> (julian-day->time-monotonic 245) $6 = # -zefram
bug#26149: SRFI-19 doc erroneously warns about Gregorian reform
Andy Wingo wrote: >This makes sense to me, FWIW. Patch attached. -zefram >From 444703940983d559935c4dd2a2c89d7888c67119 Mon Sep 17 00:00:00 2001 From: Zefram Date: Wed, 19 Apr 2017 17:08:30 +0100 Subject: [PATCH] correct note about Gregorian reform in SRFI-19 SRFI-19 specifies proleptic use of the Gregorian calendar, so it was incorrect of the documentation to describe the code as erroneous in doing so. Rewrite the caution more neutrally, and move it to the section about the "date" structure, where it seems most relevant. --- doc/ref/srfi-modules.texi | 21 ++--- 1 file changed, 10 insertions(+), 11 deletions(-) diff --git a/doc/ref/srfi-modules.texi b/doc/ref/srfi-modules.texi index 95509b2..3d44156 100644 --- a/doc/ref/srfi-modules.texi +++ b/doc/ref/srfi-modules.texi @@ -2383,17 +2383,6 @@ functions and variables described here are provided by (use-modules (srfi srfi-19)) @end example -@strong{Caution}: The current code in this module incorrectly extends -the Gregorian calendar leap year rule back prior to the introduction -of those reforms in 1582 (or the appropriate year in various -countries). The Julian calendar was used prior to 1582, and there -were 10 days skipped for the reform, but the code doesn't implement -that. - -This will be fixed some time. Until then calculations for 1583 -onwards are correct, but prior to that any day/month/year and day of -the week calculations are wrong. - @menu * SRFI-19 Introduction:: * SRFI-19 Time:: @@ -2593,6 +2582,16 @@ The fields are year, month, day, hour, minute, second, nanoseconds and timezone. A date object is immutable, its fields can be read but they cannot be modified once the object is created. +Historically, the Gregorian calendar was only used from the latter part +of the year 1582 onwards, and not until even later in many countries. +Prior to that most countries used the Julian calendar. SRFI-19 does +not deal with the Julian calendar at all, and so does not reflect this +historical calendar reform. Instead it projects the Gregorian calendar +back proleptically as far as necessary. When dealing with historical +data, especially prior to the British Empire's adoption of the Gregorian +calendar in 1752, one should be mindful of which calendar is used in +each context, and apply non-SRFI-19 facilities to convert where necessary. + @defun date? obj Return @code{#t} if @var{obj} is a date object, or @code{#f} if not. @end defun -- 2.1.4
bug#26164: time-difference mishandles leap seconds
Andy Wingo wrote: >Makes sense to me. Would you like to submit a patch and test case? This particular bug has interactions with other bugs that make me uncomfortable about attempting to fix it right now. The right way to fix this is especially influenced by the approach taken to bug#22033 and to the bug regarding pre-1972 UTC. The latter I haven't even reported yet because it's difficult to formulate in the presence of some of the other UTC-related bugs such as bug#21911 and bug#21912. So I think this is one to postpone until some of those are out of the way. -zefram
bug#26163: time-difference doesn't detect error of differing time types
Patch attached. -zefram >From 6f9d9b355233b578eb3ce13549c8fdc9d7fb8364 Mon Sep 17 00:00:00 2001 From: Zefram Date: Wed, 19 Apr 2017 19:02:13 +0100 Subject: [PATCH] signal error of time-difference on differing types It is an error to apply SRFI-19's time-difference to time structures of differing time types. Detect and signal the error. --- module/srfi/srfi-19.scm | 14 -- 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/module/srfi/srfi-19.scm b/module/srfi/srfi-19.scm index c6a55a2..8da711f 100644 --- a/module/srfi/srfi-19.scm +++ b/module/srfi/srfi-19.scm @@ -413,12 +413,14 @@ ;; -- Time arithmetic (define (time-difference! time1 time2) - (let ((sec-diff (- (time-second time1) (time-second time2))) -(nsec-diff (- (time-nanosecond time1) (time-nanosecond time2 -(set-time-type! time1 time-duration) -(set-time-second! time1 sec-diff) -(set-time-nanosecond! time1 nsec-diff) -(time-normalize! time1))) + (if (not (eq? (time-type time1) (time-type time2))) + (time-error 'time-difference 'incompatible-time-types time2) + (let ((sec-diff (- (time-second time1) (time-second time2))) + (nsec-diff (- (time-nanosecond time1) (time-nanosecond time2 + (set-time-type! time1 time-duration) + (set-time-second! time1 sec-diff) + (set-time-nanosecond! time1 nsec-diff) + (time-normalize! time1 (define (time-difference time1 time2) (let ((result (copy-time time1))) -- 2.1.4
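[With this patch applied, the effect is along these lines (illustrative, not an actual transcript):]

  (use-modules (srfi srfi-19))
  (time-difference (current-time time-utc) (current-time time-utc))
  ;; => a time-duration object, as before
  (time-difference (current-time time-utc) (current-time time-tai))
  ;; => error signalled: time-difference, incompatible-time-types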
bug#21907: date->string duff ISO 8601 zone format
A sequence of two patches is attached. The first fixes the ~2/~4 bug, signalling an error for any unrepresentable offset. The second is a bonus patch, which fixes related problems in ~z, the RFC 822 zone format specifier. Prior to the patch, ~z outputs "Z" for UT, which would be correct for ISO 8601 format but is deprecated (along with all the other single-letter syntax) for RFC 822. The patch changes that to the approved "+". ~z also had exactly the same problems as ~2/~4 regarding unrepresentable offsets, so the patch fixes them in the same way. I could report the ~z problems in a separate ticket if you like. Beware that the second of these patches has some textual dependence on the first, so trying to handle them separately might just be confusing. -zefram >From e6db0e40e5464591df204f9d07e66b3d7853c0d7 Mon Sep 17 00:00:00 2001 From: Zefram Date: Wed, 19 Apr 2017 21:50:39 +0100 Subject: [PATCH 1/2] fix SRFI-19's ISO 8601 zone output formats The ISO 8601 timezone formats offered by SRFI-19's date->string function, in the ~2 and ~4 format specifiers, were erroneously in the basic format despite juxtaposition with extended-format date and time. Fix that by switching them to extended format. This incidentally means that the ISO 8601 zone format is no longer implemented as identical to the RFC 822 zone format (~z), so stop documenting them in terms of ~z. The same format specifiers also made too much of an attempt to display zone offsets that are not representable in ISO 8601 format. They would truncate an offset that is not an integral number of minutes, thus producing inaccurate output. The truncation of an offset in the range (-60, 0) yielded a non-conforming "-". An offset of 100 hours or more (in either direction) resulted in non-conforming extra digits. In all of these cases, signal as an error that the zone offset is not representable. --- doc/ref/srfi-modules.texi | 4 ++-- module/srfi/srfi-19.scm | 22 +++--- 2 files changed, 21 insertions(+), 5 deletions(-) diff --git a/doc/ref/srfi-modules.texi b/doc/ref/srfi-modules.texi index ec3bb20..da7850f 100644 --- a/doc/ref/srfi-modules.texi +++ b/doc/ref/srfi-modules.texi @@ -2818,9 +2818,9 @@ with locale decimal point, eg.@: @samp{5.2} @item @nicode{~z} @tab time zone, RFC-822 style @item @nicode{~Z} @tab time zone symbol (not currently implemented) @item @nicode{~1} @tab ISO-8601 date, @samp{~Y-~m-~d} -@item @nicode{~2} @tab ISO-8601 time+zone, @samp{~H:~M:~S~z} +@item @nicode{~2} @tab ISO-8601 time+zone, @samp{~3} plus zone @item @nicode{~3} @tab ISO-8601 time, @samp{~H:~M:~S} -@item @nicode{~4} @tab ISO-8601 date/time+zone, @samp{~Y-~m-~dT~H:~M:~S~z} +@item @nicode{~4} @tab ISO-8601 date/time+zone, @samp{~5} plus zone @item @nicode{~5} @tab ISO-8601 date/time, @samp{~Y-~m-~dT~H:~M:~S} @end multitable @end defun diff --git a/module/srfi/srfi-19.scm b/module/srfi/srfi-19.scm index f09ec7a..ed88242 100644 --- a/module/srfi/srfi-19.scm +++ b/module/srfi/srfi-19.scm @@ -152,7 +152,6 @@ (define locale-date-time-format "~a ~b ~d ~H:~M:~S~z ~Y") (define locale-short-date-format "~m/~d/~y") (define locale-time-format "~H:~M:~S") -(define iso-8601-date-time-format "~Y-~m-~dT~H:~M:~S~z") ;;-- Miscellaneous Constants. ;;-- only the tai-epoch-in-jd might need changing if @@ -970,6 +969,21 @@ (display (padding hours #\0 2) port) (display (padding minutes #\0 2) port +(define (iso-8601-tz-print offset port) + (let* ((neg? (negative? 
offset)) + (all-secs (abs offset)) + (seconds (remainder all-secs 60)) + (all-mins (quotient all-secs 60)) + (minutes (remainder all-mins 60)) + (hours (quotient all-mins 60))) +(if (or (not (= seconds 0)) (> hours 99)) + (time-error 'date-printer 'unrepresentable-zone-offset offset) + (begin + (display (if neg? #\- #\+) port) +(display (padding hours #\0 2) port) + (display #\: port) +(display (padding minutes #\0 2) port) + ;; A table of output formatting directives. ;; the first time is the format char. ;; the second is a procedure that takes the date, a padding character @@ -1119,11 +1133,13 @@ (cons #\1 (lambda (date pad-with port) (display (date->string date "~Y-~m-~d") port))) (cons #\2 (lambda (date pad-with port) - (display (date->string date "~H:~M:~S~z") port))) + (display (date->string date "~3") port) + (iso-8601-tz-print (date-zone-offset date) port))) (cons #\3 (lambda (date pad-with port) (display (date->string date "~H:~M:~S") port))) (cons #\4 (lambda (date pad-with port) - (display (date->string date "~Y-~m-~dT~H:~M:~S~z") port))) + (display (date->string date "~5") port) + (iso-8601-
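[An illustration of the intended effect of the two patches, ignoring the separate ~Y changes proposed in later reports (hypothetical transcript, not taken from an actual session):]

  (use-modules (srfi srfi-19))
  (date->string (make-date 0 0 34 12 26 3 2017 3600) "~4")
  ;; before: "2017-03-26T12:34:00+0100"   (basic-format zone)
  ;; after:  "2017-03-26T12:34:00+01:00"  (extended-format zone)
  (date->string (make-date 0 0 34 12 26 3 2017 30) "~4")
  ;; after: an error is signalled, since a 30-second offset is not a
  ;; whole number of minutes and so has no ISO 8601 representation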
bug#26570: GC_is_heap_ptr() dep for 2.2.1
Compilation of 2.2.1 fails for me, producing a lot of warnings about implicit declaration of GC_is_heap_ptr(), and ultimately

  CCLD     guile
./.libs/libguile-2.2.so: undefined reference to `GC_is_heap_ptr'
collect2: error: ld returned 1 exit status
Makefile:2439: recipe for target 'guile' failed
make[3]: *** [guile] Error 1

At a guess, maybe this is supposed to be supplied by libgc. But I have the version of libgc that README says is required (7.2), and configure was happy with it. Maybe a higher version is now required, and README and configure need updating? -zefram
bug#21904: date->string duff ISO 8601 format for non-4-digit years
A patch to fix this is attached. The ISO 8601 date formats were implemented by using the ~Y formatter for the year portion, but SRFI-19 doesn't require ~Y to follow ISO 8601, so this raises the question of whether ~Y should. It could be fixed by changing ~Y to conform to ISO 8601, retaining the existing factoring of the formatters. Or a separate internal formatting function could be instituted to do ISO 8601 year formatting, with ~1 et al using that and ~Y left unchanged. I chose the former strategy, partly because the funny non-linear year number doesn't seem a useful thing to support in date->string at all, but more strongly because it's useful to have access to ISO 8601 year formatting on its own. There isn't any other format specifier for that job; it looks like SRFI-19 imagines that ~Y will fill that need. -zefram >From 43dfb5fabc9debb80f87b17d82a1adde356e547c Mon Sep 17 00:00:00 2001 From: Zefram Date: Thu, 20 Apr 2017 00:42:54 +0100 Subject: [PATCH 1/2] fix SRFI-19's ISO 8601 year syntax The ISO 8601 date formats offered by SRFI-19's date->string function were emitting incorrect syntax for most years. At least four digits of year must be given, but it wasn't padding shorter numbers. And any number with more than four digits requires a leading sign, but this was being omitted for positive numbers. These problems are now fixed. The ISO 8601 date formats were formerly implemented in terms of the ~Y format, which was not specified to be an ISO 8601 format. The fix is achieved by altering ~Y to behave in the ISO 8601 manner, and ~Y is now documented to conform to ISO 8601. Doing it this way means that ISO 8601 year numbering is available in isolation, which is a useful facility not otherwise available. --- doc/ref/srfi-modules.texi | 1 + module/srfi/srfi-19.scm | 5 - 2 files changed, 5 insertions(+), 1 deletion(-) diff --git a/doc/ref/srfi-modules.texi b/doc/ref/srfi-modules.texi index da7850f..8a5f1a0 100644 --- a/doc/ref/srfi-modules.texi +++ b/doc/ref/srfi-modules.texi @@ -2815,6 +2815,7 @@ with locale decimal point, eg.@: @samp{5.2} @item @nicode{~y} @tab year, two digits, @samp{00} to @samp{99} @item @nicode{~Y} @tab year, full, eg.@: @samp{2003} +(in ISO 8601 format, though SRFI-19 doesn't specify so) @item @nicode{~z} @tab time zone, RFC-822 style @item @nicode{~Z} @tab time zone symbol (not currently implemented) @item @nicode{~1} @tab ISO-8601 date, @samp{~Y-~m-~d} diff --git a/module/srfi/srfi-19.scm b/module/srfi/srfi-19.scm index 4b8445f..d4308bb 100644 --- a/module/srfi/srfi-19.scm +++ b/module/srfi/srfi-19.scm @@ -1128,7 +1128,10 @@ 2) port))) (cons #\Y (lambda (date pad-with port) - (display (date-year date) port))) + (let ((y (date-year date))) + (cond ((negative? y) (display #\- port)) + ((>= y 1) (display #\+ port))) + (display (padding (abs y) #\0 4) port (cons #\z (lambda (date pad-with port) (rfc-822-tz-print (date-zone-offset date) port))) (cons #\Z (lambda (date pad-with port) -- 2.1.4
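[A hypothetical before/after illustration of the ~Y change (not an actual transcript):]

  (use-modules (srfi srfi-19))
  (date->string (make-date 0 0 0 12 1 1 987 0) "~1")
  ;; before: "987-01-01"    (fewer than four year digits)
  ;; after:  "+0987-01-01"  (padded to four digits, explicitly signed)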
bug#21904: date->string duff ISO 8601 format for non-4-digit years
I wrote: >I chose the former strategy, partly because the funny non-linear year >number doesn't seem a useful thing to support in date->string at all, Sorry, this comment is misplaced. It relates to bug#21903; the choice about ~Y applies to both of these bugs. -zefram
bug#21903: date->string duff ISO 8601 negative years
A patch to fix this is attached. It applies on top of my patch for bug#21904. The choice that I described for that bug about whether to change ~Y or to have a separate ISO 8601 year formatter actually applies to both bugs, and the comment that I made there about exposing the non-linear year numbering is really only about this bug. -zefram >From 3d39f1dfa0e210282db48a9af828646d7e9acef3 Mon Sep 17 00:00:00 2001 From: Zefram Date: Thu, 20 Apr 2017 00:53:40 +0100 Subject: [PATCH 2/2] fix SRFI-19's ISO 8601 year numbering The ISO 8601 date formats offered by SRFI-19's date->string function were emitting incorrect year numbers for years preceding AD 1. It was following the non-linear numbering that the library uses in the date structure, rather than the standard astronomical year numbering required by ISO 8601. This is now fixed. As with the preceding fix for the syntax of year numbers, the fix is actually applied to the ~Y format, which SRFI-19 doesn't require to follow ISO 8601. --- module/srfi/srfi-19.scm | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/module/srfi/srfi-19.scm b/module/srfi/srfi-19.scm index d4308bb..0e56c31 100644 --- a/module/srfi/srfi-19.scm +++ b/module/srfi/srfi-19.scm @@ -1128,7 +1128,8 @@ 2) port))) (cons #\Y (lambda (date pad-with port) - (let ((y (date-year date))) + (let* ((yy (date-year date)) + (y (if (negative? yy) (+ yy 1) yy))) (cond ((negative? y) (display #\- port)) ((>= y 1) (display #\+ port))) (display (padding (abs y) #\0 4) port -- 2.1.4
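[A hypothetical illustration of this patch stacked on the previous one, for a date before AD 1; the date structure numbers 1 BC as year -1, with no year 0:]

  (date->string (make-date 0 0 0 12 1 1 -1 0) "~1")
  ;; with only the previous patch: "-0001-01-01"  (date-structure numbering)
  ;; with this patch as well:      "0000-01-01"   (astronomical year 0 = 1 BC)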
bug#26632: TAI<->UTC conversion botches 1961 to 1971
The SRFI-19 library gets TAI<->UTC conversions badly wrong in the years 1961 to 1971 (inclusive). This has to be examined somewhat indirectly, because SRFI-19 doesn't offer any way to display a TAI time in its conventional form as a date-like structure, nor to input a TAI time from such a structure. SRFI-19's date structure, as implemented, is always interpreted according to UTC. The only operations supported on TAI time structures are conversions to and from the various forms of UTC, conversions to and from the less-useful `monotonic' time, and arithmetic operations. Thus the erroneous TAI<->UTC conversions only come out through arithmetic operations in TAI space. One must also be careful to avoid unrelated bugs such as bug#21911.

First I'll consider an ordinary day in 1967:

scheme@(guile-user)> (use-modules (srfi srfi-19))
scheme@(guile-user)> (time-difference
... (time-utc->time-tai (date->time-utc (make-date 0 0 0 0 15 3 1967 0)))
... (time-utc->time-tai (date->time-utc (make-date 0 0 0 0 14 3 1967 0))))
$1 = #

This takes the start and end of 1967-03-14, as judged by UTC, converts both of these times to TAI, and asks for the duration of that TAI interval. It's asking how many TAI seconds long that UTC day was. As described in <http://maia.usno.navy.mil/ser7/tai-utc.dat>, there was no UTC leap on that day, but throughout 1967 UTC had a frequency offset from TAI such that each UTC second lasted exactly 1.00000003 TAI seconds. The correct answer to the above question is therefore exactly 86400.002592 s. The answer shown above, of 86400.00 s, is incorrect.

If time-tai->time-utc is applied to the times in the above example, it accurately inverts what time-utc->time-tai did. It is good that the conversions are mutually consistent, but in this case it means they are both wrong.

Second, I'll consider a less ordinary day:

scheme@(guile-user)> (time-difference
... (time-utc->time-tai (date->time-utc (make-date 0 0 0 12 1 2 1968 0)))
... (time-utc->time-tai (date->time-utc (make-date 0 0 0 12 31 1 1968 0))))
$2 = #

This time the period considered is from noon 1968-01-31 to noon 1968-02-01. The same frequency offset described above applies throughout this period. The additional complication here is that at the end of 1968-01-31 there was a leap of -0.1 (TAI) seconds. The true duration of this day is therefore exactly 86399.902592 s. The answer shown above, of 86400.00 s, is incorrect in two ways, accounting for neither the frequency offset nor the leap. Once again, time-tai->time-utc accurately inverts the incorrect time-utc->time-tai.

The failure to handle UTC's leaps in this era is not specific to the relatively unusual negative leaps: it's equally clueless about the positive leaps. The full extent of the conversion errors, integrated across the entire "rubber seconds" era from 1961-01-01 to 1972-01-01, is a little over 8.5 seconds of TAI.

This bug influences bug#26164, regarding time arithmetic in UTC. If one were to ignore the rubber seconds era, an obvious way to correct UTC time arithmetic would be to convert to TAI and do the arithmetic there. That handles UTC leaps correctly. But with rubber seconds it would still be wrong. In the rubber seconds era the number of UTC seconds in an interval differs from the number of TAI seconds. -zefram
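[The expected figures can be checked with exact rational arithmetic in Guile (plain calculation, independent of SRFI-19): with each UTC second lasting 1.00000003 TAI seconds,]

  (exact->inexact (* 86400 100000003/100000000))
  ;; => 86400.002592        (86400 UTC seconds, no leap)
  (exact->inexact (- (* 86400 100000003/100000000) 1/10))
  ;; => 86399.902592        (86400 UTC seconds plus a -0.1 s leap)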
bug#26633: TAI<->UTC conversion botches pre-1961 era
Asking SRFI-19 to perform a UTC-to-TAI conversion for an ordinary day in 1960:

scheme@(guile-user)> (use-modules (srfi srfi-19))
scheme@(guile-user)> (time-utc->time-tai (date->time-utc (make-date 0 0 0 12 14 3 1960 0)))
$1 = #

The answer given is incorrect. Unlike previous conversion bugs where it was necessary to perform some arithmetic to reveal that the conversion had gone wrong, in this case the answer can be declared wrong without any detailed interpretation of the TAI time structure. It is incorrect for this conversion to return any specific TAI time, upon which arithmetic could be performed, because UTC is not defined for any time prior to 1961. The only sane behaviour is for the conversion to signal an error. The same goes for time-tai->time-utc, which at present accurately inverts time-utc->time-tai for the above time. -zefram
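[A sketch of the kind of guard being asked for, assuming (as the other patches in this series suggest) that the library counts time-utc seconds from 1970-01-01 and has a time-error helper; both are assumptions about the implementation's internals:]

  ;; 1961-01-01T00:00:00 UTC is 3287 days, i.e. 283996800 seconds,
  ;; before 1970-01-01.
  (define utc-defined-from -283996800)

  (define (check-utc-defined caller time)
    (if (< (time-second time) utc-defined-from)
        (time-error caller 'utc-undefined-before-1961 time)))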
bug#22033: time-utc format is lossy
I wrote:
> These two seconds are perfectly
>distinct parts of the UTC time scale, and the time-utc format ought to
>preserve their distinction.

This is a problematic goal. At the time I wrote the bug report I didn't have a satisfactory idea of how to achieve it, but I think I've come up with one now.

The essential problem is that the SRFI-19 time structure expects to encapsulate a scalar value -- as it says, a count of seconds since some epoch -- but there is no natural scalar representation of a UTC time. Because of the irregularity imposed by its leaps, the natural representation of a UTC time is a two-part structure, consisting of an integer identifying the day and a fractional count of seconds elapsed within the day. Because UTC days contain differing numbers of seconds, this is a variable-radix system. SRFI-19 doesn't offer any structure that has this simple form. The only structure that it describes as separating representation of the day from time of day is the date structure, which splits up the time representation much more and has the complication of the timezone offset.

The present approach of the library is to squeeze a UTC time into the time structure by converting the variable-radix value into a scalar by using a fixed radix of 86400. This has the advantage of producing a scalar, and of the scalar behaving continuously on most UTC days, but the major downside of being lossy, aliasing some UTC times. The scalar also isn't really a count of seconds since an epoch, as SRFI-19 expects, breaking arithmetic on it. It looks rather as though this part of SRFI-19 was written expecting this sort of transformation of UTC, but conflictingly expecting it to serve as an unambiguous encoding and as a genuine count of seconds since an epoch.

A simple workaround would be to create a scalar in the same kind of way but using a larger fixed radix: minimally 86401, or more roundly 131072. This means we have a scalar value that fits easily into the time structure, and unambiguously encodes all UTC times. But it's still not a count of seconds since an epoch, and it's appreciably less like such a count because it's no longer continuous across (most) UTC day ends.

Since the time structure has separate fields for seconds and nanoseconds, it would be possible to borrow a trick sometimes used with the Unix struct timespec: extending the nanoseconds range to represent leap seconds. This would be mostly like the present arrangement, with the seconds count increasing by 86400 per UTC day, but with a leap second unambiguously represented by the seconds count of the preceding second and a nanoseconds count in the range [1000000000, 2000000000). This fixes the ambiguity, but retains all the other downsides of the present badly-behaved scalar, and adds the substantial downside of breaking expectations of normalisation.

The alternative to all of those hacks is to produce a continuous scalar value that genuinely counts the seconds of UTC. This is feasible. It would have a distinct representation for all points on the UTC time scale. By being a true scalar value it would fully meet SRFI-19's description of the time structure, would be represented in normalised fashion, and would support arithmetic operations on the seconds of UTC (fixing bug#26164 with no extra effort). The downside is that this is an unusual and somewhat surprising arrangement. I've never previously seen a linear count of UTC seconds brought out as a product of any time library.
It would mean that a time-utc structure is not an encoding of a UTC time as normally understood: the date structure would serve that purpose, and a time-utc would instead have a hybrid meaning halfway between what we usually think of as UTC and TAI times. In the leap-seconds era (1972 onwards), the scalar value in a time-utc would be a constant offset from the scalar value in the corresponding time-tai. This implies that conversion operations would be in a different place from where they are now. Whereas currently date/time-utc conversions are almost purely arithmetical and time-utc/time-tai conversions involve the leap second table, instead date/time-utc conversions would require the leap second table and time-utc/time-tai conversions would be purely arithmetical for the leap-seconds era. (Frequency offsets would come into the time-utc/time-tai conversions, for times in the rubber-seconds era.) I'm pretty sure that this actually-linear treatment of time-utc is not what the author of SRFI-19 envisioned. But it fits the actual words of the standard better than anything else I can imagine, and would fix a bunch of problems that otherwise look painful. I reckon this is the best way forward. What do you think? If you like it, I could work up a patch. -zefram
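[To make the struct-timespec-style option mentioned above concrete, here is a hypothetical encoding of the leap second at the end of 2016 (illustrative values only; this is not what the current library produces):]

  ;; 2016-12-31T23:59:59 UTC, normalised as usual:
  (make-time time-utc 0 1483228799)
  ;; 2016-12-31T23:59:60 UTC (the inserted leap second), with the
  ;; nanoseconds field pushed into [1000000000, 2000000000):
  (make-time time-utc 1000000000 1483228799)
  ;; 2017-01-01T00:00:00 UTC, the following second:
  (make-time time-utc 0 1483228800)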
bug#26164: time-difference mishandles leap seconds
Mark H Weaver wrote:
>You seem to be assuming that SRFI-19 durations should _always_ represent
>intervals of TAI time.

No, that is not my position. Although SRFI-19 isn't entirely explicit on this point, it is in the nature of the problem space that a duration may be measured on any time scale, and it seems to be implied that time-difference will determine the duration on the time scale of its inputs. Indeed, if the duration were always to be determined on one specific scale then it would not be necessary for time-difference to require its two inputs to be of the same time type.

With respect to UTC, my position is that time-difference on inputs of type time-utc should determine the duration as measured in UTC seconds. For times since 1972 this is always the same as the duration in TAI seconds (elaborated further below). For 1961 to 1971 UTC durations and TAI durations differ, and that's the subject of my bug#26632. Note that in that bug report I explicitly converted time-utc->time-tai where I wanted to determine a TAI duration.

> every UTC day has
>exactly 86400 UTC seconds,

No, that's not how UTC works. There are some time scales derived from UTC that have exactly 86400 seconds for each UTC day, such as Markus Kuhn's UTC-SLS, or that have exactly 86400 seconds per UTC day in the long run, such as Google's "leap smear". But SRFI-19 doesn't refer to any of those, it refers to UTC. The true UTC has a variable number of seconds per day *as judged by UTC clocks*: the days are not merely different lengths as judged by TAI.

The variable number of UTC seconds per day is the source of the famous "23:59:60" notation. On a day with a positive leap second, the first second of the day is centred on 00:00:00.5, the 86400th second is centred on 23:59:59.5, and the 86401st second is centred on 23:59:60.5. These are 86401 distinct seconds counted by UTC, each with a distinct label. On a day with a negative leap second, UTC only counts 86399 seconds: the time-of-day labels never reach 23:59:59.

It is intrinsic to the definition of UTC that durations (measured in seconds) don't match up regularly with time of day. It's just like the way that intervals measured in days don't match up regularly with day of month: the way to think about a day of UTC is a lot like the way one thinks about a month of the Gregorian calendar. (Though there's an important difference in that we know the lengths of Gregorian months arbitrarily far in advance but only know UTC day lengths months in advance.) Wanting to avoid all that irregularity is the motivation to use UTC-SLS and the like.

>Having said all of this, I should admit that I'm not an expert on time
>standards,

I am.

Incidentally, there's an aspect of the present bug report that's different in the pre-1972 era. time-difference correctly shows a duration of exactly 86400 seconds on the UTC scale for an ordinary day in that era, such as 1967-03-14 of which I examined the TAI duration in bug#26632. But it incorrectly shows the same duration for a day with a leap. That's the same error that it makes for post-1972 leaps, but there's a difference in that the duration of the leap (as judged in UTC) is non-integral, being derived from a non-integral number of TAI seconds and also affected by the frequency offset. For example, the UTC duration of 1968-01-31 (also examined in bug#26632) was exactly 8639990259200/100000003 seconds (roughly 86399.900000003 s). This runs into trouble with SRFI-19's insistence that the nanosecond field of a time object only contain an integer. -zefram
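[The exact figure quoted above can be reproduced in Guile with exact rational arithmetic: take the TAI length of the day computed in bug#26632 and divide by the length of a UTC second in that era.]

  (/ (- (* 86400 100000003/100000000) 1/10)
     100000003/100000000)
  ;; => 8639990259200/100000003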
bug#26632: TAI<->UTC conversion botches 1961 to 1971
Mark H Weaver wrote: >patch adds the TAI-UTC tables for 1961-1971 and uses them to implement >TAI<->UTC conversions over that time range with nanosecond accuracy. On a quick inspection of the code, that looks good. >I'm vaguely concerned about violating widely-held assumptions, >e.g. that UTC runs at the same rate as TAI If an application assumes that for pre-1972 times, then the application is broken. Note that any application currently using the srfi-19 library for pre-1972 TAI<->UTC conversions already has a bigger problem, in that it's getting false answers from the library. It's hard to see how fixing the library could make any previously-working program stop working. > which might cause some code on top of Guile to misbehave if >the system clock is set pre-1972, If the system clock is incorrect by decades, there will be many other problems to deal with. >I'm curious to hear opinions on this. My view is that this change should definitely be applied. But it's also worth thinking about what the alternative is, if the correct conversions are somehow too shocking for innocent programs to be exposed to them. Making no change isn't a realistic option: the library is producing false answers, which are no use to anyone. It's clearly a bug in the library, and needs to be addressed somehow. The only other defensible option would be to declare pre-1972 UTC out of scope for the library, having attempted conversions signal an error. That would have to be documented, and it seems like it would still amount to a deviation from the requirements of SRFI-19. -zefram