When Java 9 was released, our use of RFC3986 meant that modules with
jrt:/ URLs were supported out of the box, and the same configuration
files also ran on Java 8, even though we weren't loading these files.
Had we used URL without a jrt provider, it would have caused runtime
failures.
Reading the OpenJDK code, I see that CodeSource has been replaced with
CodeSourceKey or similar in various places, replacing URLs with a
semi-normalized or unmodified string form.
When the class below was written, SecureClassLoader still used
CodeSource as a key. It now uses a string; however, the URL string has
been RFC3986 normalized, so String comparison is still valid.
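
To illustrate why a normalized string works as a key, here is a minimal
sketch. It is not the JGDMS or OpenJDK code; the class and method names
are hypothetical, and it only applies two of the RFC3986 normalizations
(lower-casing the scheme and host, upper-casing the hex digits of
percent-encoded triplets). It assumes the input is an absolute URL.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, for illustration only.
final class NormalizedUrlKey {

    /** Lower-cases scheme and host, upper-cases hex digits in %XX triplets. */
    static String normalize(String url) {
        int colon = url.indexOf(':');
        if (colon < 0) {
            return upperCaseTriplets(url); // no scheme found, only fix triplets
        }
        String scheme = url.substring(0, colon).toLowerCase(Locale.ROOT);
        String rest = url.substring(colon + 1);
        if (rest.startsWith("//")) {
            // The authority ends at the first '/', '?' or '#', or at end of input.
            int end = 2;
            while (end < rest.length() && "/?#".indexOf(rest.charAt(end)) < 0) {
                end++;
            }
            String authority = rest.substring(2, end);
            int at = authority.lastIndexOf('@');
            String userInfo = authority.substring(0, at + 1); // case-sensitive, kept as-is
            String hostPort = authority.substring(at + 1).toLowerCase(Locale.ROOT);
            rest = "//" + userInfo + hostPort + rest.substring(end);
        }
        return upperCaseTriplets(scheme + ":" + rest);
    }

    private static boolean isHex(char c) {
        return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F');
    }

    private static String upperCaseTriplets(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '%' && i + 2 < s.length()
                    && isHex(s.charAt(i + 1)) && isHex(s.charAt(i + 2))) {
                sb.append('%')
                  .append(Character.toUpperCase(s.charAt(i + 1)))
                  .append(Character.toUpperCase(s.charAt(i + 2)));
                i += 2;
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> domains = new HashMap<>();
        domains.put(normalize("HTTP://Example.COM/lib/foo%2fbar.jar"), "domain-1");
        // A different spelling of the same location finds the same entry.
        System.out.println(domains.get(normalize("http://example.com/lib/foo%2Fbar.jar")));
    }
}
```

Both spellings normalize to the same string, so a plain String key
behaves the same as comparing the locations themselves, without any
DNS lookup.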
JGDMS/JGDMS/jgdms-platform/src/main/java/org/apache/river/api/net/RFC3986URLClassLoader.java
at trunk · pfirmstone/JGDMS
<https://github.com/pfirmstone/JGDMS/blob/trunk/JGDMS/jgdms-platform/src/main/java/org/apache/river/api/net/RFC3986URLClassLoader.java>
JGDMS loads classes from many places, e.g. the local file system, the
local network, and the global IPv6 network. JGDMS requires end-to-end
connectivity, so it only supports IPv4 on local networks; however, we
need a consistent normalized form for IPv6.
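
As a rough sketch of what a consistent form for IPv6 means (this is not
how JGDMS does it, and the class and method names are hypothetical),
the JDK's own InetAddress can bring different spellings of the same
bracketed literal to one textual form. Note the JDK prints the
uncompressed lower-case form, which is consistent but is not the
compressed form some other tools emit.

```java
import java.net.Inet6Address;
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper, for illustration only.
final class Ipv6HostForm {

    /** Returns one consistent spelling of a bracketed IPv6 literal such as "[2001:DB8::1]". */
    static String canonicalLiteral(String bracketed) {
        if (bracketed.length() < 2 || !bracketed.startsWith("[")
                || !bracketed.endsWith("]") || bracketed.indexOf(':') < 0) {
            throw new IllegalArgumentException("not a bracketed IPv6 literal: " + bracketed);
        }
        String literal = bracketed.substring(1, bracketed.length() - 1);
        try {
            // For a literal address only the format is checked; no DNS lookup is made.
            InetAddress address = InetAddress.getByName(literal);
            if (!(address instanceof Inet6Address)) {
                // e.g. IPv4-mapped addresses come back as Inet4Address
                throw new IllegalArgumentException("not an IPv6 address: " + bracketed);
            }
            return "[" + address.getHostAddress() + "]";
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("invalid IPv6 literal: " + bracketed, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(canonicalLiteral("[2001:DB8::1]"));
        System.out.println(canonicalLiteral("[2001:db8:0:0:0:0:0:1]"));
    }
}
```

Both calls in main print the same string, so the host part can then be
compared as text.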
Our RFC3986 Uri implementation has a few methods to fix common issues
with malformed URLs, such as escaped characters that shouldn't be
escaped, or unescaped characters that need to be. We use it to clean up
file paths or URLs prior to opening a file or making a URL connection.
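
The kind of clean-up meant here can be sketched as follows. This is a
simplified illustration, not the actual methods of our Uri class; the
class and method names are hypothetical. It assumes that reserved
characters already in the input are intentional delimiters and leaves
them alone, un-escapes triplets that encode unreserved characters, and
escapes anything else that isn't legal (as UTF-8).

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
final class EscapeFixer {
    private static final String UNRESERVED_PUNCT = "-._~";
    private static final String RESERVED = ":/?#[]@!$&'()*+,;=";
    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    private static boolean isUnreserved(int c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
                || (c >= '0' && c <= '9') || UNRESERVED_PUNCT.indexOf(c) >= 0;
    }

    /** Value of a hex digit, or -1 if the character is not a hex digit. */
    private static int hex(char c) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    }

    private static void appendEscaped(StringBuilder sb, int octet) {
        sb.append('%').append(HEX[octet >> 4]).append(HEX[octet & 0x0F]);
    }

    static String fixEscapes(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 16);
        int i = 0;
        while (i < s.length()) {
            int cp = s.codePointAt(i);
            if (cp == '%' && i + 2 < s.length()
                    && hex(s.charAt(i + 1)) >= 0 && hex(s.charAt(i + 2)) >= 0) {
                int octet = hex(s.charAt(i + 1)) * 16 + hex(s.charAt(i + 2));
                if (isUnreserved(octet)) {
                    sb.append((char) octet);   // escaped, but shouldn't be
                } else {
                    appendEscaped(sb, octet);  // keep the escape, upper-case the hex
                }
                i += 3;
            } else if (isUnreserved(cp) || RESERVED.indexOf(cp) >= 0) {
                sb.appendCodePoint(cp);
                i += 1;
            } else {
                // Not legal as-is (space, stray '%', non-ASCII): escape its UTF-8 octets.
                byte[] utf8 = new String(Character.toChars(cp)).getBytes(StandardCharsets.UTF_8);
                for (byte b : utf8) {
                    appendEscaped(sb, b & 0xFF);
                }
                i += Character.charCount(cp);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints /opt/my%20app/~user/lib%2Fx.jar
        System.out.println(fixEscapes("/opt/my app/%7euser/lib%2fx.jar"));
    }
}
```

Running the result through fixEscapes again leaves it unchanged, which
is what makes the cleaned-up string safe to compare.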
--
Regards,
Peter
On 13/11/2024 7:13 am, Peter Firmstone wrote:
They are incompatible.
The existing URI implementation is backward compatible, but its use should be
discouraged in new applications, so use diminishes over time. It's unique to
Java.
RFC3986 is good for unique identity and high performance, and is best for
computer-processed data. We use it for identity and for checking URL strings
prior to establishing URL connections. It's also the current standard.
RFC3987 IRI is good for human readability, e.g. manual typing of an IRI, but
not for performance.
Thinking out loud:
Would a provider mechanism be appropriate, as the existing api is suitable for
all implementations? Serialized form is a simple string, parsed during
deserialization, but how to distinguish, or does the provider order choose?
Regards,
Peter.
On 12 Nov 2024, at 12:59 AM, Alan Bateman <alan.bate...@oracle.com> wrote:
On 10/11/2024 12:04, Peter Firmstone wrote:
:
Java doesn't implement RFC2396 strictly, as it has an expanded character set
that doesn't require escaping and can result in more than one normalized form.
My understanding is that it's these types of corner cases regarding character
escaping that prevented Java's URI implementation from being upgraded to
RFC3986.
java.net.URI (as opposed to legacy and JDK 1.0 era java.net.URL) rigorously
specifies the deviations from RFC 2396, and the reasons for the deviations.
A big part of the difference between RFC 2396 and 3986 is how the authority
component is treated. With RFC 2396 it gets parsed as either a registry-based
or server-based authority so very different to the newer RFC. Relative
Resolution (in the new RFC) is another significant difference, if URI were
upgraded then its resolve method would produce very different results.
-Alan.