Re: [FFmpeg-devel] [PATCH] Accept a colon in the path of a URI, instead of stripping preceding characters.

2025-05-21 Thread Timothy Allen via ffmpeg-devel
On Wed, 2025-05-21 at 19:25 +, softworkz . wrote:
> What I mean and what the comment in the ticket is probably
> suggesting, 
> is that the HLS demuxer should URL-encode the URL after combining the
> base url with the segment file name before making a request for the 
> segment.

Ah, sure. That was my original plan, but then I realised that the URL
canonicalisation (or decomposition) step was already there, and I was
concerned about adding an extra step that changed the dynamics of URLs
for this specific case.

Let me take another look and see if I can figure out a reasonable way
of percent-encoding the URL in the specific case of playlists (or
possibly just HLS).

Thanks!

Tim
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".


Re: [FFmpeg-devel] [PATCH] Accept a colon in the path of a URI, instead of stripping preceding characters.

2025-05-21 Thread Timothy Allen via ffmpeg-devel
On Wed, 2025-05-21 at 18:56 +, softworkz . wrote:
> Why do you think it would require control over the server side?

The original ticket is referring to HLS, and specifically the manifest
of HLS, which means a remotely-hosted M3U playlist.

In principle, the user could download the playlist, convert any
relative URLs to absolute URLs, and percent-encode the URLs.

In practice, for most users, the URL will simply not play (or will skip
any segments containing colons in the URL) without any indication of
why the URL is failing.

Thanks,

Tim
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".


[FFmpeg-devel] [PATCH] Percent-encode URL paths in HLS playlists

2025-05-23 Thread Timothy Allen via ffmpeg-devel
This commit closes trac ticket 10679.

Signed-off-by: Timothy Allen 
---
 libavformat/hls.c |  9 +
 libavformat/url.c | 35 +++
 libavformat/url.h |  9 +
 3 files changed, 53 insertions(+)

diff --git a/libavformat/hls.c b/libavformat/hls.c
index c7b655c83c..49436d8184 100644
--- a/libavformat/hls.c
+++ b/libavformat/hls.c
@@ -1021,6 +1021,15 @@ static int parse_playlist(HLSContext *c, const char *url,
 seg->key = NULL;
 }
 
+ff_percent_encode_url(tmp_str, sizeof(tmp_str), line);
+if (!tmp_str[0]) {
+ret = AVERROR_INVALIDDATA;
+if (seg->key)
+av_free(seg->key);
+av_free(seg);
+goto fail;
+}
+strcpy(line, av_strdup(tmp_str));
 ff_make_absolute_url(tmp_str, sizeof(tmp_str), url, line);
 if (!tmp_str[0]) {
 ret = AVERROR_INVALIDDATA;
diff --git a/libavformat/url.c b/libavformat/url.c
index d5dd6a4666..69d68d4248 100644
--- a/libavformat/url.c
+++ b/libavformat/url.c
@@ -324,6 +324,41 @@ int ff_make_absolute_url(char *buf, int size, const char 
*base,
 return ff_make_absolute_url2(buf, size, base, rel, HAVE_DOS_PATHS);
 }
 
+int ff_percent_encode_url(char *buf, int size, const char *url)
+{
+const char *hex = "0123456789abcdef";
+
+av_assert0(url);
+
+int p = 0;
+int len = strlen(url);
+for (int i = 0; i < len; i++) {
+if (i + 1 < len && ':' == url[i] && '/' == url[i+1])
+buf[p++] = url[i];
+else if (('a' <= url[i] && url[i] <= 'z')
+  || ('A' <= url[i] && url[i] <= 'Z')
+  || ('0' <= url[i] && url[i] <= '9')
+  || '/' == url[i] || '.' == url[i] || '~' == url[i]
+  || '-' == url[i] || '_' == url[i] || '+' == url[i]
+  || '?' == url[i] || '=' == url[i] || '&' == url[i]
+  || '#' == url[i])
+{
+buf[p++] = url[i];
+} else if (' ' == url[i])
+{
+buf[p++] = '+';
+} else
+{
+buf[p++] = '%';
+buf[p++] = hex[url[i] >> 4];
+buf[p++] = hex[url[i] & 15];
+}
+}
+buf[p] = '\0';
+
+return 0;
+}
+
 AVIODirEntry *ff_alloc_dir_entry(void)
 {
 AVIODirEntry *entry = av_mallocz(sizeof(AVIODirEntry));
diff --git a/libavformat/url.h b/libavformat/url.h
index 0784d77b64..85f94f06ce 100644
--- a/libavformat/url.h
+++ b/libavformat/url.h
@@ -331,6 +331,15 @@ int ff_make_absolute_url2(char *buf, int size, const char 
*base,
 int ff_make_absolute_url(char *buf, int size, const char *base,
  const char *rel);
 
+/**
+ * Percent-encode a URL, to avoid it being incorrectly parsed.
+ *
+ * @param buf the buffer where output absolute url is written
+ * @param size the size of buf
+ * @param url the url to encode
+ */
+int ff_percent_encode_url(char *buf, int size, const char *url);
+
 /**
  * Allocate directory entry with default values.
  *
-- 
2.43.0

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".


Re: [FFmpeg-devel] [PATCH] Accept a colon in the path of a URI, instead of stripping preceding characters.

2025-05-20 Thread Timothy Allen via ffmpeg-devel
Good day

I wanted to offer a discussion of the referenced patch.

I have found that, when a link in an extended M3U file (as used by HLS)
includes a colon, FFmpeg will fail to load the file.

The bug has already been reported: https://trac.ffmpeg.org/ticket/10679

The error reads:
[hls @ 0x78dea4000c80] Failed to open segment 0 of playlist 0 [hls @
0x78dea4000c80] Segment 0 of playlist 0 failed too many times, skipping

The referenced patch fixes the issue. However, it is worth noting that
the patch changes the behavior of one of the unit tests:

- http://a/b/c/d;p?q g:h  => g:h
+ http://a:b/c/d;p?q e=> http://a:b/c/e

The original unit test derives from the following two trac tickets:
https://trac.ffmpeg.org/ticket/8813
https://trac.ffmpeg.org/ticket/8814

This is a breaking change, and, in particular, violates one specific
example given in the rfc at
https://tools.ietf.org/html/rfc3986#section-5.4 (the first example). In
particular, it will affect cases where a URL consists only of
host:port, with no scheme or path, and the base URL links to an
unrelated host. Where before, the new link would take the form of
"host:port", the new link will now use "host:port" as the last element
of the path.

I believe that this behaviour is more intuitive given modern use of
URIs (and is replicated in many browsers), but I recognise this is a
matter of taste.

Thank you,

Tim
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".


[FFmpeg-devel] [PATCH] Accept a colon in the path of a URI, instead of stripping preceding characters.

2025-05-20 Thread Timothy Allen via ffmpeg-devel
This commit closes trac ticket 10679.

Signed-off-by: Timothy Allen 
---
 libavformat/tests/url.c | 8 +++-
 libavformat/url.c   | 8 +++-
 tests/ref/fate/url  | 7 ++-
 3 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/libavformat/tests/url.c b/libavformat/tests/url.c
index 8644a3e826..8bc3aab6be 100644
--- a/libavformat/tests/url.c
+++ b/libavformat/tests/url.c
@@ -148,9 +148,15 @@ int main(void)
 test("http://a/b";, "//srv/shr/file");
 test("http://a/b";, "d:\\file");
 test("http://a/b";, "C:/file");
+test("http://a:b/c/d;p?q";, "e"); // http://a:b/c/e
+test("http://a:b/c/d;p?q";, "e:f/g"); // http://a:d/c/e:f/g
+test("http://u:w@a:p/b/c/d";, "e");   // http://u:w@a/b/c/e
+test("http://u:w@a:p/b/c/d";, "/e");  // http://u:w@a/e
+test("", "g:h"); // g:h
+test("http://a/b/c/d;p?q";, "g:h");   // http://a/b/c/g:h
 
 /* From https://tools.ietf.org/html/rfc3986#section-5.4 */
-test("http://a/b/c/d;p?q";, "g:h");   // g:h
+//test("http://a/b/c/d;p?q";, "g:h");   // g:h
 test("http://a/b/c/d;p?q";, "g"); // http://a/b/c/g
 test("http://a/b/c/d;p?q";, "./g");   // http://a/b/c/g
 test("http://a/b/c/d;p?q";, "g/");// http://a/b/c/g/
diff --git a/libavformat/url.c b/libavformat/url.c
index d5dd6a4666..abcd3d2366 100644
--- a/libavformat/url.c
+++ b/libavformat/url.c
@@ -100,7 +100,13 @@ int ff_url_decompose(URLComponents *uc, const char *url, 
const char *end)
 /* scheme */
 uc->scheme = cur;
 p = find_delim(":/?#", cur, end); /* lavf "schemes" can contain options 
but not some RFC 3986 delimiters */
-if (*p == ':')
+/* A colon can indicate a separator for the scheme
+   (https://host/), or the userinfo (user:pass@host/), or
+   the host/port (host:port). It can also be valid within the
+   path. To distinguish the scheme, require it to be followed
+   by a slash to indicate the scheme.
+ */
+if (*p == ':' && (p[1] == '/' || p[1] == '\\' || p[1] == '.'))
 cur = p + 1;
 
 /* authority */
diff --git a/tests/ref/fate/url b/tests/ref/fate/url
index 8489d10968..229f9815f5 100644
--- a/tests/ref/fate/url
+++ b/tests/ref/fate/url
@@ -99,7 +99,12 @@ Testing ff_make_absolute_url:
 http://a/b //srv/shr/file   => 
http://srv/shr/file
 http://a/b d:\file  => 
d:\file
 http://a/b C:/file  => 
C:/file
-http://a/b/c/d;p?q g:h  => g:h
+http://a:b/c/d;p?q e=> 
http://a:b/c/e
+http://a:b/c/d;p?q e:f/g=> 
http://a:b/c/e:f/g
+  http://u:w@a:p/b/c/d e=> 
http://u:w@a:p/b/c/e
+  http://u:w@a:p/b/c/d /e   => 
http://u:w@a:p/e
+   g:h  => g:h
+http://a/b/c/d;p?q g:h  => 
http://a/b/c/g:h
 http://a/b/c/d;p?q g=> 
http://a/b/c/g
 http://a/b/c/d;p?q ./g  => 
http://a/b/c/g
 http://a/b/c/d;p?q g/   => 
http://a/b/c/g/
-- 
2.43.0
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".


Re: [FFmpeg-devel] [PATCH] Accept a colon in the path of a URI, instead of stripping preceding characters.

2025-05-21 Thread Timothy Allen via ffmpeg-devel
On Tue, 2025-05-20 at 20:03 +, softworkz . wrote:
> I was just about to reply and suggest to replace those colons with
> %3A
> (url-encoded) when I read the ticket, which already suggests that.
> 
> Have you tried it? It sounds like a much better way to me.

I think this would be a common-sense solution as long as one controls
the server/content.

The reason I submitted the patch anyway was because not everyone will
control the content they're consuming, and because, although I
acknowledge it technically breaks RFCs, there is an argument that the
RFC's behaviour is surprising; we can certainly see that other
applications (Safari, in the linked ticket) break the RFC as well.

I can certainly understand if the patch is rejected -- even if it
didn't break the RFC, Postel's Law is not in vogue any longer. However,
I think the workflows that it might break ("host:port", with no scheme
or path) are much less obvious and intuitive than the workflows that it
rescues.

Regardless of the decision, thanks for reviewing the patch!

Best,

Tim
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".