bug#75998: [guile-lib] html->sxml does not decode entities in attributes

2025-02-01 Thread tomas
On Sat, Feb 01, 2025 at 09:10:04PM +0100, Tomas Volf wrote:
> 
> Hello,
> 
> I think I found a bug in the htmlprag module in guile-lib.  When parsing
> attributes, the values are not properly decoded:
> 
> --8<---cut here---start->8---
> scheme@(guile-user)> ,use (htmlprag)
> scheme@(guile-user)> (html->sxml "")
> $1 = (*TOP* (hr (@ (aaa "bbb&quot;ccc'ddd"))))
> scheme@(guile-user)> (html->sxml "")
> $2 = (*TOP* (a (@ (href "a&amp;b"))))
> --8<---cut here---end--->8---
> 
> I think that $1 should be "bbb\"ccc'ddd" and $2 should be "a&b".

Ouch. Have you contacted Oleg Kiselyov about it? He's usually pretty
responsive and very friendly.

> The annoying part is that this cannot really be changed now, because
> people (me included) already have workarounds in place, and
> automatically decoding now would lead to double decoding.
> 
> I see a few ways forward:
> 
> 1. Document the current behavior and keep it as it is.
> 2. Add argument #:decode-attributes, defaulting to #f, to the relevant
>    procedures, so that people can opt into the fixed behavior.
> 3. Introduce parameter %decode-attributes, so that people can opt into
>    the fixed behavior.
> 
> I am sure there are also other approaches possible.

If it were me, I'd take 2.

Cheers
-- 
tomás




bug#75998: [guile-lib] html->sxml does not decode entities in attributes

2025-02-01 Thread Tomas Volf


Hello,

I think I found a bug in the htmlprag module in guile-lib.  When parsing
attributes, the values are not properly decoded:

--8<---cut here---start->8---
scheme@(guile-user)> ,use (htmlprag)
scheme@(guile-user)> (html->sxml "")
$1 = (*TOP* (hr (@ (aaa "bbb&quot;ccc'ddd"))))
scheme@(guile-user)> (html->sxml "")
$2 = (*TOP* (a (@ (href "a&amp;b"))))
--8<---cut here---end--->8---

I think that $1 should be "bbb\"ccc'ddd" and $2 should be "a&b".

The annoying part is that this cannot really be changed now, because
people (me included) already have workarounds in place, and
automatically decoding now would lead to double decoding.

I see a few ways forward:

1. Document the current behavior and keep it as it is.
2. Add argument #:decode-attributes, defaulting to #f, to the relevant
   procedures, so that people can opt into the fixed behavior.
3. Introduce parameter %decode-attributes, so that people can opt into
   the fixed behavior.

I am sure there are also other approaches possible.
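
For reference, the kind of workaround mentioned above can be sketched
roughly as follows.  This is only a minimal sketch, not code from
guile-lib: the names decode-entities and decode-attribute-values are
hypothetical, and it assumes that only the five predefined entities
(amp, quot, apos, lt, gt) need handling.  It post-processes an SXML
tree and decodes string attribute values inside (@ ...) lists.

```scheme
(use-modules (ice-9 match)
             (ice-9 regex))

;; Assumption: only the five predefined entities are decoded here.
(define %entities
  '(("amp" . "&") ("quot" . "\"") ("apos" . "'") ("lt" . "<") ("gt" . ">")))

;; Hypothetical helper: decode entity references in STR in a single
;; left-to-right pass.  Unknown entities are left untouched.
(define (decode-entities str)
  (regexp-substitute/global
   #f "&([a-z]+);" str
   'pre
   (lambda (m)
     (or (assoc-ref %entities (match:substring m 1))
         (match:substring m 0)))
   'post))

;; Hypothetical helper: walk an SXML tree and decode the string values
;; found in attribute lists, leaving everything else as it is.
(define (decode-attribute-values tree)
  (match tree
    (('@ attrs ...)
     (cons '@ (map (match-lambda
                     ((name (? string? value))
                      (list name (decode-entities value)))
                     (other other))
                   attrs)))
    ((children ...) (map decode-attribute-values children))
    (leaf leaf)))

(decode-attribute-values '(*TOP* (a (@ (href "a&amp;b")))))
```

Running such a pass over a tree that has already been decoded would
decode it a second time (for example "a&amp;amp;b" would end up as
"a&b" instead of "a&amp;b"), which is the double decoding concern
described above.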

Have a nice day,
Tomas

-- 
There are only two hard things in Computer Science:
cache invalidation, naming things and off-by-one errors.





bug#75997: (ice-9 match): warning: unused variable `failure'

2025-02-01 Thread Tomas Volf

Hi,

--8<---cut here---start->8---
(use-modules (ice-9 match))

(match-lambda (_ #f))
--8<---cut here---end--->8---

This source code leads to a warning when compiled:

--8<---cut here---start->8---
$ guix shell guile-next -- guild compile -W3 -o /tmp/xx.go /tmp/xx.scm
/tmp/xx.scm:3:0: warning: unused variable `failure'
wrote `/tmp/xx.go'
--8<---cut here---end--->8---

Looking at the expansion

--8<---cut here---start->8---
(lambda (expr)
  (let* ((v expr)
         (failure
           (lambda ()
             ((@@ (ice-9 match) throw)
              'match-error
              "match"
              "no matching pattern"
              v)
             #f)))
    #f))
--8<---cut here---end--->8---

the `failure' variable is indeed unused.  I took a look at the source code, but
it is a bit beyond my current abilities, so I am not sure how to fix it.

Tomas

-- 
There are only two hard things in Computer Science:
cache invalidation, naming things and off-by-one errors.

