RFR: JDK-8051713 - URL constructor/equals/hashCode/sameFile/openConnection synchronization issues

2014-07-23 Thread Peter Levart

I created an issue for this:

https://bugs.openjdk.java.net/browse/JDK-8051713

The proposed patch is still the following:

http://cr.openjdk.java.net/~plevart/jdk9-dev/URL.synchronization/webrev.01/

Regards, Peter


On 07/11/2014 05:11 PM, Peter Levart wrote:

Hi,

java.net.URL is supposed to behave as an immutable object, so URL 
instances can be shared among threads and among parts of code without 
fear that they will be modified. URL class has an unusual way to 
achieve this (or at least it tries to). Partly because of the design 
which uses:


- URL constructor(s) that take a 'spec' String to be parsed into URL 
object
- parsing is delegated to various URLStreamHandler(s) which are chosen 
in the URL constructor (depending on the protocol used in URL string 
to be parsed)


An unitialized URL instance (this) is passed from constructor to the 
chosen URLStreamHandler which has the responsibility to parse the 
string and set back the fields that hold various parts of URL object 
being constructed. Consequently, these fields can not be declared as 
final, as definite assignment analysis doesn't cross method borders. 
It is therefore illegal to unsafely publish URL instances (via data 
races) to non-constructing threads because they can appear not fully 
initialized. Nevertheless URL, with the help of various 
URLStreamHandler implementations, tries hard to make URL appear stable 
at least where it is required to be stable. For example: 
URL.hashCode() is (almost) stable even if URL instance is unsafely 
published. This is achieved by making hashCode() synchronized and 
cache the result. At least one way of constructing URLs - constructors 
that take 'spec' String to be parsed - is also making sure that 
hashCode is computed from fully initialized fields, as parsing is 
delegated to URLStreamHandler which uses package-private URL.set() 
method to set back the parsed fields and the set() method is also 
synchronized. But making URL appear stable even though it is published 
unsafely doesn't seem to be the primary concern of URL 
synchronization. Other public URL constructors that take individual 
URL parts and don't delegate parsing to URLStreamHandler but set 
fields directly (not via set() method), are not synchronized.


Primary concern of synchronization in URL appears to be driven from 
the fact that some URL operations like hasCode(), equals(), sameFile() 
and openConnection() read multiple URL fields and URL.set() which can 
be called from custom URLStreamHandler at any time (although this is 
not it's purpose - it should only call-back while parsing/constructing 
the URL) can set those fields. And those multi-field operations would 
like to see a "snapshot" of field values that is consistent. But 
synchronization that is performed to achieve that is questionable. 
Might be that in Java 1.0 times the JVM implementation assumptions 
were different and synchronization was correct, but nowadays Java 
memory model makes them invalid.


URL.hasCode() apears to be the only method properly synchronized which 
makes it almost stable (doesn't return different results over time) 
but even hashCode() has a subtle bug or two. The initializer for 
hashCode field sets it to value -1 which represents "not yet computed" 
state. If URL is published unsafely, hashCode() method could see the 
"default" value of 0, which would be returned. A later call to 
hashCode() would see value -1 which would trigger computation and a 
different value would be returned. The other subtle bug is a 
relatively improbable event that hashCode computation results in value 
-1 which means "not yet computed". This can be seen as performance 
glitch (as hashCode will never be cached for such URL instance) or an 
issue which makes hashCode unstable for one of the reasons why 
equals() is unstable too (see below).


If URL.hasCode() method is almost stable (doesn't return different 
results over time) and at least one way of URL construction makes sure 
that hashCode is also calculated from fully initialized parts, then 
URL.equals() and other methods that delegate to URLStreamHandler are a 
special story. URL.equals() can't be synchronized on URL instance, 
because it would have to be synchronized on both URL instances that 
are being compared and this is prone do dead-locks. Imagine:


thread1: url1.equals(url2)
thread2: url2.equals(url1)

So equals() chooses to not be synchronized and therefore risks not 
being stable if URL instances are published unsafely. But it 
nevertheless uses synchronization. equals() delegates it's work to the 
1st URL's URLStreamHandler which synchronizes on itself when 
calculating and caching the InetAddress of each individual URL's host 
name. InetAddress (if resolvable) is used in preference to host name 
for comparison (and also in hashCode()). URL.equals() risks not being 
stable for the following reasons:


- URL instances published unsafely can appear not fully initialized to 
equals() even though 

Re: Do not JDK-6967684 backport to JDK6?

2014-07-23 Thread KUBOTA Yuji

(2014/07/17 1:20 JST), Omair Majid  wrote:
> Bug and patch look straight-forward and correct. No objections from me.

Thank you for your help! and sorry for delayed reply.

I do not have any role of OpenJDK projects.
So, can someone please backport of JDK-6967684 to JDK6?

Best regards,
KUBOTA Yuji.