Hi Peter,

Thanks for the review and suggestion.

This appears to be a better approach. I was wondering if I should go this way to cache
those lookup tables as well, but ...

Webrev has been updated as suggested.

Thanks!
Sherman

On 2/3/16 3:26 AM, Peter Levart wrote:
Hi Again,

I also have a comment on the implementation of the hash table and it's GC-ness. You keep the string pool under a SoftReference because it is the biggest object so it makes most sense to clear it when heap becomes full. Other arrays are kept strongly reachable, but you re-generate them nevertheless when string pool is cleared by GC. Would it make sense to:

- define all int[] arrays (including strPool) as final instance variables of CharacterName
- have one static field:

private static SoftReference<CharacterName> refInstance;

- convert initNamePool() into a private constructor.

- convert public static getName/getCodePoint into public instance methods

- introduce public static method:

public static CharacterName getInstance() {
    SoftReference<CharacterName> ref = refInstance;
    CharacterName instance;
    if (ref != null && ((instance = ref.get) != null) {
        return instance;
    }
    instance = new CharacterName();
    refInstance = new SoftReference<>(instance);
    return instance;
}

...in this arrangement, you don't need volatile fields or explicit fences, as all instance fields can be final and JMM guarantees safe publication in this case.


Regards, Peter

On 02/03/2016 12:07 PM, Peter Levart wrote:
Hi Sherman,

I don't currently have any idea how to squeeze the hashtable any more further. It is already very compact in my opinion. But I noticed a data race that could result in navigating the half-initialized data structure. It is a classical unsafe publication bug. It has been present before in get(int cp) and it is present now in both getName(int cp) and getCodePoint(String name). For example:

 151     static int getCodePoint(String name) {
 152         byte[] strPool = null;
153 if (refStrPool == null || (strPool = refStrPool.get()) == null) {
 154             strPool = initNamePool();
 155         }

vs.

111             refStrPool = new SoftReference<>(strPool);

...the static refStrPool field is not marked volatile.

One way to fix this is to mark field volatile and then rearrange code in getName/getCodePoint to only read from it once by introducing a local var. The other would be to change the line 111 into something like:

SoftReference<byte[]> rsp = new SoftReference<>(strPool);
unsafe.storeFence();
refStrPool = rsp;

...*and* also rearrange code in getName/getCodePoint to only read from field once by introducing a local var.


Regards, Peter

On 02/02/2016 10:25 PM, Xueming Shen wrote:
Hi,

Have not heard any feedback on this one so far. I'm adding
a little more to make it attractive for reviewers :-)

On top of the \N now the webrev includes the proposal to add
two more matchers, \X for unicode extended grapheme cluster
and \b{g} for the corresponding boundary.

Issue: https://bugs.openjdk.java.net/browse/JDK-7071819
Issue: https://bugs.openjdk.java.net/browse/JDK-8147531
webrev: http://cr.openjdk.java.net/~sherman/8147531_7071819/webrev/

Thanks!
Sherman

On 01/18/2016 11:52 PM, Xueming Shen wrote:
Hi,

Please help review the change to add \N support in regex.

Issue: https://bugs.openjdk.java.net/browse/JDK-8147531
webrev: http://cr.openjdk.java.net/~sherman/8147531/webrev

This is one of the items we were planning to address via JEP111
http://openjdk.java.net/jeps/111
https://bugs.openjdk.java.net/browse/JDK-8046101

Some of the constructs had been added already in early release. I'm
planning to address the rest as individual rfe separately.

Thanks,
Sherman




Reply via email to