On 2 Jun 2025, at 22:29, Rémi Denis-Courmont wrote:
> On Thursday, 22 May 2025, 21:38:32 Eastern European Summer Time, Marvin Scholz
> wrote:
>> When using a literal IPv6 address as hostname, it can contain a Zone ID
>> especially in the case of link-local addresses. Sending this to the
>> server in the Host header is not useful to the server and in some cases
>> servers refuse such requests.
>>
>> To prevent any such issues, strip the Zone ID from the address if it's
>> an IPv6 address. This also removes it for the Cookies lookup.
>>
>> Based on a patch by: Daniel N Pettersson <danie...@axis.com>
>> ---
>>  libavformat/http.c | 60 +++++++++++++++++++++++++++++++++++++++++++++-
>>  1 file changed, 59 insertions(+), 1 deletion(-)
>>
>> diff --git a/libavformat/http.c b/libavformat/http.c
>> index f7b2a8a029..3bde616b43 100644
>> --- a/libavformat/http.c
>> +++ b/libavformat/http.c
>> @@ -24,6 +24,7 @@
>>  #include "config.h"
>>  #include "config_components.h"
>>
>> +#include <string.h>
>>  #include <time.h>
>>  #if CONFIG_ZLIB
>>  #include <zlib.h>
>> @@ -209,6 +210,63 @@ void ff_http_init_auth_state(URLContext *dest, const URLContext *src)
>>             sizeof(HTTPAuthState));
>>  }
>>
>> +static bool host_is_numeric_ipv6(const char *host)
>> +{
>> +    bool res = false;
>> +#if defined(AF_INET6)
>> +    struct addrinfo hints = { .ai_flags = AI_NUMERICHOST }, *ai;
>> +    if (getaddrinfo(host, NULL, &hints, &ai) == 0) {
>> +        if (ai->ai_family == AF_INET6)
>> +            res = true;
>> +        freeaddrinfo(ai);
>> +    }
>> +#else
>> +    // Just guess based on if the host contains a ':'
>> +    if (strchr(host, ':') != NULL)
>> +        res = true;
>> +#endif
>> +    return res;
>> +}

Hi, thanks for the review.

>
> At least in a URL, the distinction is done by the presence of surrounding
> brackets, not actually parsing the address. And on the flip side, to my
> knowledge, there are no guarantees that getaddrinfo() even copes well with
> scope IDs. That's platform-dependent.

Given that the first thing done to the URL is to split it into components,
which removes the [], I can't just check for those, sadly.

Anyway, if this does not work, it is a pre-existing issue, as this is
essentially the same thing ff_url_join did internally before. Not saying we
should not fix it, but apparently no one has run into this until now.

I guess I can just check for the presence of ':' though, as I can't think of
any valid case where a non-IPv6 host would contain a ':'?

>
>> +
>> +/**
>> + * Copy the normalized host to the given buffer
>> + *
>> + * If the host is a normal hostname, this just returns
>> + * host:port. However in case of an IPv6 address, it
>> + * ensures proper escaping with [] and removes the
>> + * zone identifier, if any, making the return suitable
>> + * for example for use in the HTTP Host header.
>> + */
>> +static unsigned copy_normalized_host(char *out, unsigned size,
>> +                                     const char *host, const int port)
>> +{
>> +    AVBPrint bp;
>> +    av_bprint_init_for_buffer(&bp, out, size);
>> +
>> +    if (host_is_numeric_ipv6(host)) {
>> +        // This is an IPv6 address, so we need to strip the Zone ID,
>> +        // if any.
>> +        // While technically we could have percent encoding even in
>> +        // the Zone ID, this doesn't seem to be a relevant case in
>> +        // the real world on any platform.
>> +        char *percent = strrchr(host, '%');
>
> Uh, doesn't Linux actually use % in interface names sometimes?

I have never encountered that, but I just realized I can simply use strchr
here anyway, as percent encoding is not used anywhere else in the hostname.

>
>> +        if (percent) {
>> +            int len = (percent - host);
>> +            av_bprintf(&bp, "[%.*s]", len, host);
>> +        } else {
>> +            av_bprintf(&bp, "[%s]", host);
>> +        }
>> +    } else {
>> +        // Host is not an IPv6 address, so just use as-is
>> +        av_bprintf(&bp, "%s", host);
>> +    }
>
> This looks like reverse abstraction and kinda sketchy. How do you end up with
> a scope ID in the input?

The simplest example is a user providing a URL with an IPv6 literal that has
a scope ID.

>
> While it's true that it shouldn't be sent to the server, it's also so that it
> shouldn't appear in the URL.

RFC 6874 focuses solely on how to represent a scope ID in a URI, so why would
that be invalid? And if it were, how would the user tell ffmpeg which
interface to use for link-local IPv6?

> In other words, it should have been stripped
> earlier than the HTTP input module. The Host field should be the same as the
> host in the absolute URL.

I don't think there is an earlier point where it makes sense to strip it.
I am stripping it in the internal HTTP context open function. I cannot strip
it before that, as the underlying connection still needs to know the proper
zone ID to establish the connection using the right interface.

Unless I am missing something or misunderstand what you meant, I am not sure
how I can change this.

>
> --
> Rémi Denis-Courmont
> Villeneuve de Tapiola, ex-République finlandaise d´Uusimaa
>
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
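
For reference, the simplified approach discussed in this reply (checking for
':' instead of calling getaddrinfo(), and cutting at the first '%' with
strchr) could look roughly like the sketch below. This is not the code from
the patch: the name normalize_host_sketch is hypothetical, it uses plain
snprintf() instead of AVBPrint so that it compiles on its own, and it leaves
out the port handling. It also assumes the host has already been split out of
the URL, so the surrounding [] and the port are gone.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical standalone helper, not part of the patch.
 * Writes host into out, wrapping IPv6 literals in [] and dropping any
 * zone ID. Returns the snprintf() result, i.e. the length the full
 * string needs (the output is truncated if that is >= size).
 */
static int normalize_host_sketch(char *out, size_t size, const char *host)
{
    assert(out && host && size > 0);

    /* Assumption: after URL splitting, only an IPv6 literal contains ':'. */
    if (strchr(host, ':')) {
        /* The zone ID, if any, starts at the first '%'. */
        const char *percent = strchr(host, '%');
        int len = percent ? (int)(percent - host) : (int)strlen(host);
        return snprintf(out, size, "[%.*s]", len, host);
    }

    /* Not an IPv6 literal, so use it as-is. */
    return snprintf(out, size, "%s", host);
}
```

With this sketch, "fe80::1%eth0" becomes "[fe80::1]", while "example.com" is
copied unchanged. The zone ID is only dropped from the string that goes into
the Host header; the original host would still be handed to the underlying
connection.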