On 27/10/2022 4:26 pm, Alan Bateman wrote:
On 26/10/2022 23:53, Peter Firmstone wrote:
The change will have some performance impact, by requiring redundant
parsing.
I just thought I'd mention it in case it hasn't been considered. An
internet search will also turn up other Java implementations of
RFC 3986.
https://github.com/pfirmstone/JGDMS/blob/e4a5012e71fd9a61b6e1e505f07e6c5358a4ccbc/JGDMS/jgdms-platform/src/main/java/org/apache/river/api/net/Uri.java#L1966
We have a strict RFC 3986 URI implementation, from which we create all
URL instances.
If your parser is using the one-arg URL constructor to create the URL
then it will be parsed again, so you may already have duplicate
parsing. That said, there may be an argument that libraries should be
able to do their own parsing and continue to construct a URL from its
components with the non-validating constructors.
You are right. I wrote it over 10 years ago and I'm a little rusty on
why I chose the single-argument constructor; I might do it differently
if implementing it today. I may have been worried about what might
break at the time.
We did have a developer in a downstream project who used a URN and a
custom URLHandler, made possible by the change to RFC3986.
I think today we could safely change it to a non-parsing constructor.
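
To illustrate the point about duplicate parsing, here is a minimal sketch of parsing once and then building the URL from its components. It is not taken from JGDMS: java.net.URI stands in for a library's own parser, and the class and method names (UrlFromComponents, toUrl) are made up for the example. It assumes a hierarchical URI with a host.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

final class UrlFromComponents {

    // java.net.URI is only a stand-in here for a library's own strict parser.
    @SuppressWarnings("deprecation") // the URL constructors are deprecated as of JDK 20
    static URL toUrl(String spec) {
        try {
            URI uri = new URI(spec); // the only full parse of the string
            String file = uri.getRawPath();
            if (uri.getRawQuery() != null) {
                file += "?" + uri.getRawQuery();
            }
            if (uri.getRawFragment() != null) {
                file += "#" + uri.getRawFragment();
            }
            // Component constructor: its Javadoc states that it performs
            // no validation of its inputs, unlike new URL(String).
            return new URL(uri.getScheme(), uri.getHost(), uri.getPort(), file);
        } catch (URISyntaxException | MalformedURLException e) {
            throw new IllegalArgumentException("Not a usable URL: " + spec, e);
        }
    }

    public static void main(String[] args) {
        URL url = toUrl("http://example.org:8080/a/b?x=1#frag");
        System.out.println(url.getHost() + " " + url.getPort() + " " + url.getPath());
    }
}
```

Passing the same string to new URL(String) after the library has already parsed it would repeat the parsing work, which is the duplication discussed above.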
As I'm sure you know, changing URI to strictly implement RFC 3986 is
not a compatible move. It was attempted in JDK 6 but had to be backed
out quickly as it caused widespread breakage. Hierarchical URIs using
registry-based authority components was one of the significant issues.
There has been exploration and prototypes since then to try to find a
direction but there isn't a proposal right now.
The performance cost of retaining compatibility was a problem for us:
because java.net.URI wasn't suitable, we had to rely on DNS and file
system access to determine equality. We wanted a normalized form with
guaranteed equality and no network or file system access, so we decided
to break compatibility; the tradeoff was significantly in performance's
favour.
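
As a rough illustration of equality on a normalized form, the sketch below reduces a URI to a canonical string so that comparison is a plain String.equals with no DNS or file system access. It is not the JGDMS Uri class: the name NormalizedKey is hypothetical, java.net.URI does the parsing, and only a subset of RFC 3986 normalization is covered (lower-cased scheme and host, removal of "." and ".." segments). Percent-encoding case and default ports are left untouched.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;

final class NormalizedKey {

    // Returns a canonical string form for scheme://host/... URIs.
    static String of(String spec) {
        final URI u;
        try {
            u = new URI(spec).normalize(); // removes "." and ".." path segments
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Not a valid URI: " + spec, e);
        }
        if (u.getScheme() == null || u.getHost() == null) {
            throw new IllegalArgumentException(
                    "This sketch handles only scheme://host/... URIs: " + spec);
        }
        StringBuilder sb = new StringBuilder();
        // Scheme and host are case-insensitive, so fold them to lower case.
        sb.append(u.getScheme().toLowerCase(Locale.ROOT)).append("://");
        if (u.getRawUserInfo() != null) {
            sb.append(u.getRawUserInfo()).append('@');
        }
        sb.append(u.getHost().toLowerCase(Locale.ROOT));
        if (u.getPort() != -1) {
            sb.append(':').append(u.getPort());
        }
        sb.append(u.getRawPath());
        if (u.getRawQuery() != null) {
            sb.append('?').append(u.getRawQuery());
        }
        if (u.getRawFragment() != null) {
            sb.append('#').append(u.getRawFragment());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String a = of("HTTP://Example.ORG/a/./b/../c?q=1");
        String b = of("http://example.org/a/c?q=1");
        System.out.println(a.equals(b));
    }
}
```

URL#equals, by contrast, may resolve host names to compare them, which is where the network cost mentioned above comes from.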
It's used throughout our software: there's an RFC3986URLClassLoader,
which extends SecureClassLoader, and it's also used in unicast discovery
of services (over the network) and for Policy decisions:
https://github.com/pfirmstone/JGDMS/blob/trunk/JGDMS/jgdms-platform/src/main/java/org/apache/river/api/net/RFC3986URLClassLoader.java
https://github.com/pfirmstone/JGDMS/blob/trunk/JGDMS/jgdms-platform/src/main/java/net/jini/core/discovery/LookupLocator.java
In Policy to replace CodeSource#implies functionality:
https://github.com/pfirmstone/JGDMS/blob/e4a5012e71fd9a61b6e1e505f07e6c5358a4ccbc/JGDMS/jgdms-platform/src/main/java/org/apache/river/api/security/URIGrant.java#L132
Historically, our software was badly impacted by URL#equals and
URL#hashCode, and we went to a great deal of trouble to eliminate all
such calls and replace them with Uri, which is why Uri is immutable,
normalized and not Serializable.
We utilised bitshift operations for case conversion, after String
methods became hotspots.
https://github.com/pfirmstone/JGDMS/blob/trunk/JGDMS/jgdms-platform/src/main/java/org/apache/river/api/net/URIEncoderDecoder.java
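
For readers unfamiliar with the trick, the following is a minimal sketch of ASCII case conversion using bit operations rather than String methods. It is an assumption about the general technique, not a copy of the code in URIEncoderDecoder; the class name AsciiCase and its methods are hypothetical. It relies on the fact that in ASCII the upper- and lower-case letters differ only in bit 0x20.

```java
final class AsciiCase {

    // 'A'..'Z' and 'a'..'z' differ only in bit 0x20; setting it lower-cases.
    static char toLower(char c) {
        return (c >= 'A' && c <= 'Z') ? (char) (c | 0x20) : c;
    }

    // Clearing bit 0x20 upper-cases an ASCII lower-case letter.
    static char toUpper(char c) {
        return (c >= 'a' && c <= 'z') ? (char) (c & ~0x20) : c;
    }

    // Converts a whole string; non-ASCII and non-letter characters are unchanged.
    static String toLowerAscii(String s) {
        char[] chars = s.toCharArray();
        for (int i = 0; i < chars.length; i++) {
            chars[i] = toLower(chars[i]);
        }
        return new String(chars);
    }

    public static void main(String[] args) {
        System.out.println(toLowerAscii("HTTP://Example.ORG/Path"));
    }
}
```

This is only appropriate for components that are defined to be ASCII and case-insensitive, such as the scheme, the host and the hex digits of percent-encoded octets.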
I suspect the best solution might be to leave java.net.URI alone and
write a new implementation that isn't compatible, then eventually
deprecate and remove java.net.URI. Tip: don't make it Serializable;
let people serialize the string form instead.
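
A minimal sketch of what serializing the string form could look like follows. Everything in it is hypothetical: StrictUri stands in for a non-Serializable URI type (its parse method does no real RFC 3986 parsing here), and ServiceRecord is an invented holder class. The holder writes only the String to the stream and re-parses it on first use after deserialization, so the URI type itself never appears in the serial form.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical immutable URI type; deliberately NOT Serializable.
final class StrictUri {
    private final String text;

    private StrictUri(String text) {
        this.text = text;
    }

    static StrictUri parse(String text) {
        if (text == null || text.isEmpty()) {
            throw new IllegalArgumentException("empty URI");
        }
        return new StrictUri(text); // a real implementation would parse and validate here
    }

    @Override
    public String toString() {
        return text;
    }
}

// Hypothetical holder: only the string form is written to the stream.
final class ServiceRecord implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String location;      // serialized
    private transient StrictUri parsed; // never serialized, rebuilt on demand

    ServiceRecord(StrictUri uri) {
        this.location = uri.toString();
        this.parsed = uri;
    }

    StrictUri uri() {
        StrictUri u = parsed;
        if (u == null) {
            u = StrictUri.parse(location); // re-parsing also re-validates the input
            parsed = u;
        }
        return u;
    }

    // Serializes and deserializes in memory, for demonstration only.
    static ServiceRecord roundTrip(ServiceRecord record) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(record);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (ServiceRecord) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Because the value is rebuilt through parse, a deserialized instance passes through the same validation as any other, and the URI class keeps no serial form that would need to stay compatible.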
Regards,
Peter.