On Friday, 23 May 2025 at 17:53:02 Eastern European Summer Time, Timothy 
Allen via ffmpeg-devel wrote:
> This commit closes trac ticket 10679.
> 
> Signed-off-by: Timothy Allen <t...@treehouse.org.za>
> ---
>  libavformat/hls.c |  9 +++++++++
>  libavformat/url.c | 35 +++++++++++++++++++++++++++++++++++
>  libavformat/url.h |  9 +++++++++
>  3 files changed, 53 insertions(+)
> 
> diff --git a/libavformat/hls.c b/libavformat/hls.c
> index c7b655c83c..49436d8184 100644
> --- a/libavformat/hls.c
> +++ b/libavformat/hls.c
> @@ -1021,6 +1021,15 @@ static int parse_playlist(HLSContext *c, const char *url,
>                      seg->key = NULL;
>                  }
> 
> +                ff_percent_encode_url(tmp_str, sizeof(tmp_str), line);
> +                if (!tmp_str[0]) {
> +                    ret = AVERROR_INVALIDDATA;
> +                    if (seg->key)
> +                        av_free(seg->key);
> +                    av_free(seg);
> +                    goto fail;
> +                }
> +                strcpy(line, av_strdup(tmp_str));
>                  ff_make_absolute_url(tmp_str, sizeof(tmp_str), url, line);
>                  if (!tmp_str[0]) {
>                      ret = AVERROR_INVALIDDATA;
> diff --git a/libavformat/url.c b/libavformat/url.c
> index d5dd6a4666..69d68d4248 100644
> --- a/libavformat/url.c
> +++ b/libavformat/url.c
> @@ -324,6 +324,41 @@ int ff_make_absolute_url(char *buf, int size, const char *base,
>      return ff_make_absolute_url2(buf, size, base, rel, HAVE_DOS_PATHS);
>  }
> 
> +int ff_percent_encode_url(char *buf, int size, const char *url)
> +{
> +    const char *hex = "0123456789abcdef";
> +
> +    av_assert0(url);
> +
> +    int p = 0;
> +    int len = strlen(url);
> +    for (int i = 0; i < len; i++) {
> +        if (i + 1 < len && ':' == url[i] && '/' == url[i+1])
> +            buf[p++] = url[i];
> +        else if (('a' <= url[i] && url[i] <= 'z')
> +          || ('A' <= url[i] && url[i] <= 'Z')
> +          || ('0' <= url[i] && url[i] <= '9')
> +          || '/' == url[i] || '.' == url[i] || '~' == url[i]
> +          || '-' == url[i] || '_' == url[i] || '+' == url[i]
> +          || '?' == url[i] || '=' == url[i] || '&' == url[i]
> +          || '#' == url[i])
> +        {
> +            buf[p++] = url[i];
> +        } else if (' ' == url[i])
> +        {
> +            buf[p++] = '+';
> +        } else
> +        {
> +            buf[p++] = '%';
> +            buf[p++] = hex[url[i] >> 4];
> +            buf[p++] = hex[url[i] & 15];
> +            }
> +    }
> +    buf[p] = '\0';
> +
> +    return 0;
> +}
> +

Percent-encoding is an algorithm that applies to individual URL fragments, and 
even then it only makes sense for some types of fragments (path fragments and 
request parameters). You cannot apply it to a whole URL. A whole URL is by 
*definition* already encoded. A URL simply cannot exist in an unencoded form, 
be it relative or absolute.

Applying an encoding algorithm meant for fragments to a whole URL is bound to 
corrupt some legal URLs. An obvious example is a URL that already contains 
percent signs.
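
To make the percent-sign example concrete, here is a minimal sketch. It is not
the patch itself: `encode_whole_url` is a hypothetical, simplified stand-in for
the whole-URL approach (a similar safe set, with ':' always kept, no
space-to-'+' mapping, and bounds checking added as an assumption).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a "percent-encode the whole URL" routine.
 * Simplified from the approach under review: bytes in the safe set are
 * copied, everything else becomes %xx. Bounds checking is added here.
 * Returns 0 on success, -1 if buf is too small. */
static int encode_whole_url(char *buf, size_t size, const char *url)
{
    static const char hex[]  = "0123456789abcdef";
    static const char safe[] = "/.~-_+?=&#:";
    size_t p = 0;

    assert(buf && url);
    if (size == 0)
        return -1;

    for (size_t i = 0; url[i] != '\0'; i++) {
        unsigned char c = (unsigned char)url[i];
        int keep = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
                   (c >= '0' && c <= '9') || strchr(safe, c) != NULL;

        /* Reserve room for the output bytes plus the terminating NUL. */
        if (size - p < (keep ? 2u : 4u)) {
            buf[p] = '\0';
            return -1;
        }
        if (keep) {
            buf[p++] = (char)c;
        } else {
            buf[p++] = '%';
            buf[p++] = hex[c >> 4];
            buf[p++] = hex[c & 15];
        }
    }
    buf[p] = '\0';
    return 0;
}
```

Given the valid URL `http://example.com/a%20b.ts`, this writes
`http://example.com/a%2520b.ts`. The existing escape `%20` (a space) has been
turned into the literal text "%20", so the result names a different resource.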

If you need to deal with corrupt URLs or with file paths mistaken for URLs, 
then you first need to ensure that the input is not a valid URL. Only then can 
you try to apply some heuristics to "fix" it, and even then systematic 
percent-encoding only makes sense for host-relative and path-relative URLs, 
not for absolute and scheme-relative URLs.
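
The four kinds of reference named above can be told apart before any repair
heuristic runs. The following is only a sketch under assumptions: the enum and
`classify_url_ref` are hypothetical names, and the scheme test follows the
RFC 3986 grammar (a letter, then letters, digits, '+', '-' or '.', then ':').
It does not validate the input, and a DOS path such as `C:\x` would be
classified as absolute.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical classification of a URL reference by its prefix. */
enum url_ref_kind {
    URL_ABSOLUTE,         /* scheme:...          */
    URL_SCHEME_RELATIVE,  /* //host/path         */
    URL_HOST_RELATIVE,    /* /path               */
    URL_PATH_RELATIVE     /* path, ./path, ...   */
};

static enum url_ref_kind classify_url_ref(const char *s)
{
    assert(s);

    if (s[0] == '/')
        return s[1] == '/' ? URL_SCHEME_RELATIVE : URL_HOST_RELATIVE;

    /* scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." ) ":" */
    if (isalpha((unsigned char)s[0])) {
        size_t i = 1;
        while (isalnum((unsigned char)s[i]) ||
               s[i] == '+' || s[i] == '-' || s[i] == '.')
            i++;
        if (s[i] == ':')
            return URL_ABSOLUTE;
    }
    return URL_PATH_RELATIVE;
}

/* Only host-relative and path-relative input is a candidate for repair. */
static int may_try_percent_encoding(const char *s)
{
    enum url_ref_kind k = classify_url_ref(s);
    return k == URL_HOST_RELATIVE || k == URL_PATH_RELATIVE;
}
```

With this, `seg 1.ts` and `/video/seg 1.ts` would be candidates for a repair
attempt, while `https://example.com/a b.ts` and `//cdn.example.com/seg.ts`
would be left alone.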

So this systematic algorithm is just plain wrong. And for that matter, 
strictly speaking, percent-encoding also encodes colons, brackets, slashes, 
hashes, etc. (which goes back to the point about URL fragments vs. URLs).
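
For comparison, a strict encoder for a single fragment might look like the
sketch below. `encode_component` is a hypothetical name, not an existing
libavformat function; it keeps only the RFC 3986 "unreserved" characters
(letters, digits, '-', '.', '_', '~') and escapes every other byte, including
':', '/', '#' and '['.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* RFC 3986 "unreserved" set: ALPHA / DIGIT / "-" / "." / "_" / "~". */
static int is_unreserved(unsigned char c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
           (c >= '0' && c <= '9') ||
           (c != '\0' && strchr("-._~", c) != NULL);
}

/* Hypothetical strict encoder for ONE fragment (for example a single path
 * segment), never for a whole URL. Returns 0 on success, -1 if buf is too
 * small. */
static int encode_component(char *buf, size_t size, const char *s)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t p = 0;

    assert(buf && s);
    if (size == 0)
        return -1;

    for (size_t i = 0; s[i] != '\0'; i++) {
        /* unsigned char avoids negative indices for bytes >= 0x80 */
        unsigned char c = (unsigned char)s[i];
        int keep = is_unreserved(c);

        /* Reserve room for the output bytes plus the terminating NUL. */
        if (size - p < (keep ? 2u : 4u)) {
            buf[p] = '\0';
            return -1;
        }
        if (keep) {
            buf[p++] = (char)c;
        } else {
            buf[p++] = '%';
            buf[p++] = hex[c >> 4];
            buf[p++] = hex[c & 15];
        }
    }
    buf[p] = '\0';
    return 0;
}
```

Encoding the fragment `a/b:c#d` this way yields `a%2Fb%3Ac%23d`, which is why
the same routine cannot be pointed at a complete URL: the separators that give
the URL its structure would be escaped as well.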

-- 
ヅニ-クーモン・レミ
Tapio's place new town, former Finnish Republic of Uusimaa



