Mark H Weaver writes:
> Ricardo Wurmus writes:
>
>> I’m having a problem with http-post and I think it might be a bug. I’m
>> talking to a Debbugs SOAP service over HTTP by sending (via POST) an XML
>> request. The Debbugs SOAP service responds with a string of XML.
[...]
> The problem is simply that our Content-Type header parser is broken.
> It's very simplistic and merely splits the string wherever ';' is found,
> and then checks to make sure there's only one '=' in each parameter,
> without taking into account that quoted strings in the parameters might
> include those characters.
>
> I'll work on a proper parser for Content-Type headers.
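
To illustrate the failure mode described above, here is a minimal sketch of quote-aware splitting. It is not the code from the attached patches: 'split-outside-quotes' and the sample header value are hypothetical, chosen only to show why cutting at every ';' goes wrong once a quoted parameter value contains one.

```scheme
;; Hypothetical helper, not part of (web http): split STR at each DELIM
;; that lies outside a double-quoted string.  Inside quotes, a backslash
;; escapes the following character.
(define (split-outside-quotes str delim)
  (let loop ((chars (string->list str))
             (in-quotes? #f)
             (current '())     ; characters of the current piece, reversed
             (parts '()))      ; finished pieces, reversed
    (cond
     ((null? chars)
      (reverse (cons (list->string (reverse current)) parts)))
     ((and in-quotes? (char=? (car chars) #\\) (pair? (cdr chars)))
      ;; Keep the backslash and the escaped character verbatim.
      (loop (cddr chars) #t
            (cons* (cadr chars) (car chars) current)
            parts))
     ((char=? (car chars) #\")
      (loop (cdr chars) (not in-quotes?) (cons (car chars) current) parts))
     ((and (not in-quotes?) (char=? (car chars) delim))
      (loop (cdr chars) #f '()
            (cons (list->string (reverse current)) parts)))
     (else
      (loop (cdr chars) in-quotes? (cons (car chars) current) parts)))))

;; Made-up header value with a ';' and several '=' inside a quoted string.
(define sample-content-type
  "multipart/related; type=\"text/xml\"; boundary=\"=-=;=-=\"")

;; Naive approach: cuts inside the quoted boundary value.
(define naive-pieces (string-split sample-content-type #\;))

;; Quote-aware approach: the quoted value stays in one piece.
(define quoted-pieces
  (map string-trim-both
       (split-outside-quotes sample-content-type #\;)))
```

With this sample value, 'string-split' produces four pieces, while the quote-aware version produces the expected three: the media type and the two parameters.
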

I've attached preliminary patches to fix the Content-Type header parser,
and also to fix the parsing of response header lines to support
continuation lines.

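The continuation-line handling can be sketched roughly as follows. This is only an illustration of the idea, not the patch itself: 'continuation-start?' and 'read-unfolded-line' are hypothetical names, the sample response text is made up, and the sketch simply concatenates the continuation text with its leading whitespace left in place.

```scheme
(use-modules (ice-9 rdelim))

;; True if C is a space or tab, i.e. the next physical line continues
;; the current header.  Returns #f for the EOF object.
(define (continuation-start? c)
  (and (char? c)
       (or (char=? c #\space) (char=? c #\tab))))

;; Hypothetical helper: read one logical header line from PORT, joining
;; any continuation lines that follow it.  Trailing CRs are dropped.
(define (read-unfolded-line port)
  (let ((line (read-line port)))
    (if (eof-object? line)
        line
        (let loop ((parts (list (string-trim-right line #\return))))
          (if (continuation-start? (peek-char port))
              (loop (cons (string-trim-right (read-line port) #\return)
                          parts))
              (string-concatenate-reverse parts))))))

;; Made-up header block in which Content-Type is folded over two lines.
(define sample-headers
  "Content-Type: multipart/related;\r\n\ttype=\"text/xml\"\r\nContent-Length: 42\r\n")

(define unfolded-lines
  (call-with-input-string sample-headers
    (lambda (port)
      (let* ((first-line (read-unfolded-line port))
             (second-line (read-unfolded-line port)))
        (list first-line second-line)))))
```

Peeking at the next character before committing to another 'read-line' is what lets the reader stop cleanly at the start of the following header.
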
With these patches applied, I'm able to fetch and decode the SOAP
response that you fetched with your 'wget' example, as follows:
--8<---cut here---start->8---
mhw@jojen ~/guile-stable-2.2 [env]$ meta/guile
GNU Guile 2.2.4.10-4c91d
Copyright (C) 1995-2017 Free Software Foundation, Inc.
Guile comes with ABSOLUTELY NO WARRANTY; for details type `,show w'.
This program is free software, and you are welcome to redistribute it
under certain conditions; type `,show c' for details.
Enter `,help' for help.
scheme@(guile-user)> (use-modules (web http) (web uri) (web client) (sxml simple) (ice-9 receive))
scheme@(guile-user)> ,pp (let ((req-xml "http://schemas.xmlsoap.org/soap/envelope/\"
xmlns:xsi=\"http://www.w3.org/1999/XMLSchema-instance\"
xmlns:xsd=\"http://www.w3.org/1999/XMLSchema\"
xmlns:soapenc=\"http://schemas.xmlsoap.org/soap/encoding/\"
soapenc:encodingStyle=\"http://schemas.xmlsoap.org/soap/encoding/\">http://schemas.xmlsoap.org/soap/encoding/\">32514"))
             (receive (response body-port)
                 (http-post "https://debbugs.gnu.org/cgi/soap.cgi"
                            #:streaming? #t
                            #:body req-xml
                            #:headers
                            `((content-type . (text/xml))
                              (content-length . ,(string-length
                                                  req-xml))))
               (set-port-encoding! body-port "UTF-8")
               (xml->sxml body-port #:trim-whitespace? #t)))
$1 = (*TOP* (*PI* xml "version=\"1.0\" encoding=\"UTF-8\"")
       (http://schemas.xmlsoap.org/soap/envelope/:Envelope
         (@ (http://schemas.xmlsoap.org/soap/envelope/:encodingStyle
              "http://schemas.xmlsoap.org/soap/encoding/"))
         (http://schemas.xmlsoap.org/soap/envelope/:Body
           (urn:Debbugs/SOAP:get_bug_logResponse
             (http://schemas.xmlsoap.org/soap/encoding/:Array
               (@ (http://www.w3.org/1999/XMLSchema-instance:type
                    "soapenc:Array")
                  (http://schemas.xmlsoap.org/soap/encoding/:arrayType
                    "xsd:ur-type[4]"))
               (urn:Debbugs/SOAP:item
                 (urn:Debbugs/SOAP:header
                   (@ (http://www.w3.org/1999/XMLSchema-instance:type
                        "xsd:string"))
"Received: (at submit) by debbugs.gnu.org; 23 Aug 2018
20:17:46 +\nFrom debbugs-submit-boun...@debbugs.gnu.org [...]
[...]
--8<---cut here---end--->8---

Note that I needed to make two other changes to your preliminary code,
namely:

* I passed "#:streaming? #t" to 'http-post', to ask for a port to read
  the response body instead of reading it eagerly.

* I explicitly set the port encoding to "UTF-8" on that port before
  using 'xml->sxml' to read it.

Otherwise, the entire 'body' response will be returned as a bytevector,
because the response Content-Type is not recognized as a textual type.
The HTTP Content-Type is "multipart/related", with a parameter:
type="text/xml".  I'm not sure if we should be automatically
interpreting that as a textual type or not.

There's no 'charset' parameter in the Content-Type header, but the XML
internally specifies: encoding="UTF-8".
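
For comparison, when the body does come back as a bytevector, it can be decoded by hand before parsing. A minimal sketch, assuming the body is UTF-8 encoded XML with nothing else around it; 'xml-bytes->sxml' is a hypothetical helper and the sample bytes are made up rather than taken from the Debbugs response:

```scheme
(use-modules (rnrs bytevectors) (sxml simple))

;; Hypothetical helper: decode BV as UTF-8 and parse the result as XML.
;; Assumes the whole bytevector is a single well-formed XML document.
(define (xml-bytes->sxml bv)
  (xml->sxml (utf8->string bv) #:trim-whitespace? #t))

;; Made-up stand-in for a response body returned as a bytevector.
(define sample-body
  (string->utf8 "<reply><item>ok</item></reply>"))

(define sample-sxml (xml-bytes->sxml sample-body))
```

The streaming variant shown earlier avoids holding the whole body in memory; this one is simpler when the response is small.
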

Anyway, here are the preliminary patches.

Mark

From 41764d60dba80126b3c97f883d0225510b55f3fa Mon Sep 17 00:00:00 2001
From: Mark H Weaver
Date: Tue, 28 Aug 2018 18:39:34 -0400
Subject: [PATCH 1/2] web: Add support for HTTP header continuation lines.
* module/web/http.scm (spaces-and-tabs, space-or-tab?): New variables.
(read-header-line): After reading a header, if a space or tab follows,
then read the continuation lines and append them all together.
---
module/web/http.scm | 31 ++++++++++++++++++++++++-------
1 file changed, 24 insertions(+), 7 deletions(-)
diff --git a/module/web/http.scm b/module/web/http.scm
index de61c9495..15f173173 100644
--- a/module/web/http.scm
+++ b/module/web/http.scm
@@ -1,6 +1,6 @@
;;; HTTP messages
-;; Copyright (C) 2010-2017