On Mon, 6 May 2024 18:36:24 GMT, Sergey Chernyshev <schernys...@openjdk.org> wrote:
>> There are two distinct approaches to parsing IPv4 literal addresses. One is >> the Java baseline "strict" syntax (all-decimal d.d.d.d form family), another >> one is the "loose" syntax of RFC 6943 section 3.1.1 [1] (POSIX `inet_addr` >> allowing octal and hexadecimal forms [2]). The goal of this PR is to provide >> interface to construct InetAddress objects from literal addresses in POSIX >> form, to applications that need to mimic the behavior of `inet_addr` used by >> standard network utilities such as netcat/curl/wget and the majority of web >> browsers. At present time, there's no way to construct `InetAddress` object >> from such literal addresses because the existing APIs such as >> `InetAddress.getByName()`, `InetAddress#ofLiteral()` and >> `Inet4Address#ofLiteral()` will consume an address and successfully parse it >> as decimal, regardless of the octal prefix. Hence, the resulting object will >> point to a different IP address. >> >> Historically `InetAddress.getByName()/.getAllByName()` were the only way to >> convert a literal address into an InetAddress object. `getAllByName()` API >> relies on POSIX `getaddrinfo` / `inet_addr` which parses IP address segments >> with `strtoul` (accepts octal and hexadecimal bases). >> >> The fallback to `getaddrinfo` is undesirable as it may end up with network >> queries (blocking mode), if `inet_addr` rejects the input literal address. >> The Java standard explicitly says that >> >> "If a literal IP address is supplied, only the validity of the address >> format is checked." >> >> @AlekseiEfimov contributed JDK-8272215 [3] that adds new factory methods >> `.ofLiteral()` to `InetAddress` classes. Although the new API is not >> affected by the `getaddrinfo` fallback issue, it is not sufficient for an >> application that needs to interoperate with external tooling that follows >> POSIX standard. In the current state, `InetAddress#ofLiteral()` and >> `Inet4Address#ofLiteral()` will consume the input literal address and >> (regardless of the octal prefix) parse it as decimal numbers. Hence, it's >> not possible to reliably construct an `InetAddress` object from a literal >> address in POSIX form that would point to the desired host. >> >> It is proposed to extend the factory methods with >> `Inet4Address#ofPosixLiteral()` that allows parsing literal IP(v4) addresses >> in "loose" syntax, compatible with `inet_addr` POSIX api. The implementation >> is based on `.isBsdParsableV4()` method added along with JDK-8277608 [4]. >> The changes in the original algorithm are as follows: >> >> - `IPAddressUtil#parseB... > > Sergey Chernyshev has updated the pull request incrementally with one > additional commit since the last revision: > > changed class level docs > Ok - so now we seem to be missing the fact that the _only_ form supported by > the existing methods is the decimal form. > Can we find a wording that makes this clear? Would you think this should be mentioned in the class level doc of Inet4Address class? Alternatively, the method description of `ofPosixLiteral` could say that "this is the only method accepting non-decimal forms of address literals". There's no such thing as "existing methods", there are "all methods consuming address literals with an exception of `ofPosixLiteral`" - this could be a phrasing in the class level doc. Please let me know what you think. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18493#issuecomment-2107658936