Op 25-09-2023 om 18:48 schreef Vivien Kraus:
* module/web/uri.scm (remove-dot-segments): Implement algorithm 5.2.4. (merge-paths): Implement algorithm 5.2.3. (resolve-relative-reference): Implement algorithm 5.2.2. (module): Export resolve-relative-reference. * NEWS: Reference it here. --- Dear Guile developers, When you request https://example.com/resource an URI and get redirected to "here", you end up with 2 URI references: - https://example.com/resource - here What should you request next? The answer is, "https://example.com/here". It seems evident how we go from one to the other. However, there are more subtle cases. What if you get redirected to "../here", for instance? RFC 3986 has you covered, in section 5.2. It explains how we go from a base URI and a URI reference to the new URI. What do you think? Best regards,
Sounds very useful. However, there are also some dangers on doing this thing -- the ‘external’ page https://example.com/data.json could redirect to http://localhost/unsecured-secret-but-its-localhost-only-so-it-is-safe.
I forgot the name of this attack, but there is probably a page somewhere that documents the danger and how to mitigate it (I think I read some Firefox documentation somewhere?). Perhaps there exists an informative RFC about it? I think it put a warning about this somewhere in the documentation.
(Another related problem is that example.com could have IP address ::1, but that's a different problem.)
Vivien NEWS | 7 ++ module/web/uri.scm | 152 +++++++++++++++++++++++++++++++++- test-suite/tests/web-uri.test | 68 +++++++++++++++
You forgot modifying the non-docstring documentation to properly document the new procedure.
3 files changed, 226 insertions(+), 1 deletion(-)
Given the existence of resolve-relative-reference, it is easy to expect http-request to do redirection. I think it would be best to adjust to the documentation of http-request / http-get / ... to mention whether automatic redirection is done or not.
+(define (resolve-relative-reference base relative) + "Resolve @var{relative} on top of @var{base}, as RFC3986, section 5.2."
I don't like the mutation, but it's a completely deterministic procedure (ignoring memory allocation) so it can't cause problems and hence I suppose it's not worth rewriting (unless you or someone else really wants to rewrite it, I suppose).
I suppose it also avoids the risk of accidentally deviating from the RFC it is supposed to implement, which is major advantage of sticking with the mutation.
I like that you say _which_ resolution method you are using instead of saying or implying that this is the always the _right_ way of resolving relative references, because some URI schemes are rather quirky. (I don't know any quirkiness w.r.t. relative references, but wouldn't be surprised if it exists.)
Also I think it's worth stating that both base and relative are URIs -- with the current docstring, I find (resolve-... uri "./whatever") a reasonable thing to do.
IIUC, there currently is nothing preventing (resolve-... (uri object for "https://example.com/a") (uri object for "https://example.com/b")).IIUC, this is supposed to return (uri object for "https://example.com/b"), but that could be more explicit with a change of variable name.
(define (resolve-... base maybe-relative) [...])
+(with-test-prefix "resolve relative reference" + ;; Test suite in RFC3986, section 5.4. + (let ((base (string->uri "http://a/b/c/d;p?q")) + (equal/encoded? + ;; The test suite checks for ';' characters, but Guile escapes + ;; them in URIs. Same for '='.
IIUC, that's a bug! 6.2.2.2. Percent-Encoding Normalization The percent-encoding mechanism (Section 2.1) is a frequent source of variance among otherwise identical URIs. In addition to the case normalization issue noted above, some URI producers percent-encode octets that do not require percent-encoding, resulting in URIs that are equivalent to their non-encoded counterparts. __These URIs __should be normalized by decoding any percent-encoded octet that corresponds to an unreserved character, as described in Section 2.3.__ (Emphasis added.)I am assuming here that ; is an unreserved character, if it isn't, there isn't a bug here.
However, I sense a lack of normalisation in resolve-relative-reference, so unless Guile already does normalisation elsewhere (perhaps it does during the URI object construction?), there might be a bug here -- ok, technically perhaps not a bug because the docstring only mentions ‘implements section 5.2 and 5.2. doesn't seem to mention section 6 (and section 6 says ‘should’, not ‘shall/must’, but some people use ‘should’ as ‘shall/must’, so dunno), but in that case that's still a potential pitfall that could be mentioned in the docstring. Could be as simple as "No normalisation is performed.".
I guess it shouldn't do normalisation, but guesswork seems better to be avoided/confirmed or disconfirmed when possible.
Best regards, Maxime Devos.
OpenPGP_0x49E3EE22191725EE.asc
Description: OpenPGP public key
OpenPGP_signature
Description: OpenPGP digital signature