On Mon, 13 Jul 2020, Stephan Mühlstrasser via curl-library wrote:

> when using curl's URL functions, is it possible to validate the URL?

Let's start with what it means to "validate" a URL. By which criteria is the URL correct? URLs are unfortunately not defined very strictly (these days - if they ever were).

> What exactly does "RFC 3986+" mean?

It means that libcurl parses URLs the way RFC 3986 dictates, with a few extensions that we've deemed necessary to make curl slightly more "browser and real world"-compatible.

For example, you can specify the URL with one, two or three slashes after the scheme...

But also, as Jakub already said, libcurl focuses on extracting the right parts from the URL. It will accept a little more than what a strictly RFC 3986-adhering parser would (if there ever was one).

> And what happens if the URL string is not a correct "RFC 3986+" URL?

The libcurl URL parser returns an error *when it detects a problem*, which of course isn't the same thing as "validating" a URL.

>   validate_url(url_handle, "https://curl.haxx.se/<invalid>");
>   validate_url(url_handle, "https://curl.haxx.se/%XY");
>
> But curl_url_set() returns CURLUE_OK for them. Is this expected?

Yes, more or less expected anyway.

The first one is probably pointless to refuse, since the angle brackets have no other meaning in URLs and people already use those characters in URLs with browsers.
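
An application that does want to refuse such characters could run its own check before handing the string to libcurl. The following is only a minimal sketch of that idea: url_chars_are_rfc3986() is a hypothetical helper name, not a libcurl function, and it only checks that each byte belongs to the RFC 3986 reserved or unreserved sets (or is '%'). It does not check that the characters appear in a sensible place.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper, NOT part of libcurl.
   Returns true when every byte of `url` is an RFC 3986 unreserved
   character, a reserved character (gen-delims or sub-delims), or '%'.
   It says nothing about whether the URL is otherwise well-formed. */
bool url_chars_are_rfc3986(const char *url)
{
  /* gen-delims followed by sub-delims, as listed in RFC 3986 section 2.2 */
  static const char reserved[] = ":/?#[]@!$&'()*+,;=";
  /* the non-alphanumeric unreserved characters from section 2.3 */
  static const char unreserved_marks[] = "-._~";
  const unsigned char *p;

  for(p = (const unsigned char *)url; *p; p++) {
    unsigned char c = *p;
    if((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
       (c >= '0' && c <= '9'))
      continue; /* ALPHA / DIGIT */
    if(c == '%')
      continue; /* start of a percent-encoded octet, not verified here */
    if(strchr(reserved, c) || strchr(unreserved_marks, c))
      continue;
    return false; /* '<', '>', space, control bytes, non-ASCII, ... */
  }
  return true;
}
```

With this sketch, "https://curl.haxx.se/<invalid>" is rejected because of the '<' and '>', while "https://curl.haxx.se/%XY" still passes since only the character set is inspected.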

The second one is accepted by the parser since it doesn't verify percent-encoded octets. Maybe it should.
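
Until the parser does something like that itself, a caller can verify the percent-encoded octets on its own. This is a rough sketch under the assumption that "valid" simply means every '%' is followed by exactly two hexadecimal digits. The name percent_escapes_are_valid() is made up for illustration and is not a libcurl API.

```c
#include <stdbool.h>
#include <ctype.h>

/* Hypothetical helper, NOT part of libcurl.
   Returns true when every '%' in `url` is followed by two hex digits. */
bool percent_escapes_are_valid(const char *url)
{
  const char *p;

  for(p = url; *p; p++) {
    if(*p != '%')
      continue;
    /* If the string ends right after '%', p[1] is the terminating zero,
       isxdigit() fails and p[2] is never read. */
    if(!isxdigit((unsigned char)p[1]) || !isxdigit((unsigned char)p[2]))
      return false;
    p += 2; /* skip the two hex digits just checked */
  }
  return true;
}
```

A caller could run this on the string first and only pass it to curl_url_set() when it returns true. For "https://curl.haxx.se/%XY" the check fails because 'X' is not a hex digit.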

--

 / daniel.haxx.se | Commercial curl support up to 24x7 is available!
                  | Private help, bug fixes, support, ports, new features
                  | https://www.wolfssl.com/contact/