Great, in that case I'll use this patched version of htmlprag with guile-lib now.
After a little bit of testing it looks like this patch did the trick - here's the shtml of the HTML file in my original email:

  (*TOP* (*DECL* DOCTYPE html)
   (html
    (head (title Example))
    (body
     (header (@ (class exampleHeader))
      (img (@ (id bannerImage)
              (src https://www.gnu.org/software/guile/static/base/img/branding.png)))
      (div (p (@ (id labelName)) A label for the header.))
      (p (@ (id labelDescription)) Some description of the header.))
     (div (@ (id exampleDiv))
      (hr)
      (div (@ (id divMessage)) An example message.))
     (footer (@ (id footer))))))

This looks a lot better - the p tag is nested inside the div, and my sxpath expression

  '(// html body (header (@ (equal? (class "exampleHeader")))) div)

gives me the following (I've taken the escaped characters and whitespace strings out):

  ((div (p (@ (id "labelName")) "A label for the header.")))

Thanks a lot for your help with this, it's very much appreciated!

(Likewise, if I ever find myself with some time, I might review the code for html-parsing and see whether porting it to RnRS is something I could realistically work on.)

Cheers,
Kenan

On 7/9/19 6:55 am, Neil Van Dyke wrote:
> Kenan Toker wrote on 9/6/19 12:09 AM:
>
> With that in mind, if I were to choose one of the 'distributions' of
> htmlprag, is there one you yourself would pick?
>
> I suspect that the version in guile-lib (plus the patch I sent
> yesterday) is best.
>
> (Realistically, I probably can't work on anything better anytime soon,
> unless a deep-pocketed dotcom wants to get into the Scheming business.)