On 10/18/22 8:04 AM, fo...@univ-mlv.fr wrote:
Introduction of new types always poses a dilemma for library maintainers. They can move forward aggressively and embrace new features, or they can remain on older releases in order to reach a broader audience. That's not a reason to avoid introducing new types.

I agree in the absolute, but in this specific case you want to introduce interfaces 
deep into the existing hierarchy of interfaces, not at the top.
The introduction of CharSequence has been painful and very slow: that interface 
was added in 1.4, but we had to wait until Java 9, more than 10 years later, to 
have a parseInt() that takes a CharSequence.
As an application developer, a decade is the kind of time I will have to wait 
before I can use a sequenced collection.
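For reference, the CharSequence-accepting overload that finally arrived in Java 9 is Integer.parseInt(CharSequence, int, int, int), which parses a subrange without allocating a substring:

```java
public class ParseDemo {
    public static void main(String[] args) {
        // Integer.parseInt(CharSequence, beginIndex, endIndex, radix)
        // only appeared in Java 9, even though CharSequence dates to 1.4.
        CharSequence cs = "value=1234";
        int n = Integer.parseInt(cs, 6, 10, 10);
        System.out.println(n);  // 1234
    }
}
```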

Adding default methods is far faster in terms of adoption; computeIfAbsent and 
removeIf were available at the time of the release of 1.8.
Default methods have issues too: they only work if a good default implementation 
exists.

You seem to be talking about introduction of new types in opposition to introduction of default methods. This proposal does both. Clearly if you're using an old library, you can't use new types with it. But if the old library consumes or produces say, a List, new applications can definitely use all the default methods added to List by this proposal.

We've previously discussed adding new default methods (getFirst, reversed) to some existing type (like Collection) instead of on a new type. The problem is that they appear on all collections, even those that have no encounter order. Their specifications would thus be so weak as to be meaningless. That's the justification for adding a new type: to give these methods meaning.
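To make that concrete, here is a minimal sketch (the interface and its name are illustrative, not JEP 431's actual API) of why a getFirst() default method is only meaningful on a type that promises an encounter order:

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Illustrative sketch only; OrderedView is a hypothetical name, not the
// JEP's SequencedCollection. The default getFirst() is well-defined here
// precisely because the type promises a stable encounter order.
interface OrderedView<E> extends Iterable<E> {
    default E getFirst() {
        Iterator<E> it = iterator();
        if (!it.hasNext()) {
            throw new NoSuchElementException();
        }
        return it.next();
    }
}

public class DefaultMethodSketch {
    // Adapter over a List, which already has a well-defined encounter order.
    static <E> OrderedView<E> viewOf(List<E> list) {
        return list::iterator;
    }

    public static void main(String[] args) {
        System.out.println(viewOf(List.of("a", "b", "c")).getFirst());  // a
    }
}
```

On a plain Collection the same default would have to return "some element", which is exactly the weak specification being objected to.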

- LinkedHashMap can be tweaked in two ways, by passing an access-order flag as the
3rd parameter of the constructor or by overriding removeEldestEntry(); in both
cases the resulting LinkedHashMap is at odds with the contract of SequencedMap,
but the compiler will happily allow those LinkedHashMaps to be seen as
SequencedMaps.

SequencedMap specifies that entries have an encounter order, but it doesn't impose
any requirements on how that order is established. LinkedHashMap's access-order
mode is unusual in that it specifies that specific methods additionally have the
side effect of reordering the relevant entry. The removeEldestEntry() method
modifies the behavior of mapping-insertion methods to optionally remove the first
("eldest") entry. Neither of these behaviors conflicts with anything in
SequencedMap.
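The access-order side effect can be seen directly; this small demo (names are mine) shows get() reordering an entry in an access-ordered LinkedHashMap:

```java
import java.util.LinkedHashMap;

public class AccessOrderDemo {
    static String keysAfterGet() {
        // Third constructor argument true selects access order.
        var map = new LinkedHashMap<String, Integer>(16, 0.75f, true);
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        map.get("a");  // side effect: "a" moves to the end of the order
        return map.keySet().toString();
    }

    public static void main(String[] args) {
        System.out.println(keysAfterGet());  // [b, c, a]
    }
}
```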

It conflicts with the idea of the first element (or the last element):
   SequencedSet<String> set = ...
   var first = set.getFirst();
   assertNotNull(first);
   set.add("foo");
   var first2 = set.getFirst();
   assertEquals(first, first2);   // should always be true, right?

(LinkedHashSet doesn't have a removeEldestEntry() method; that's only on LinkedHashMap. But let's consider the example to be rewritten as if it were about LinkedHashMap, or a SequencedSet built from the keySet of a LinkedHashMap.)

See draft specification here:

https://github.com/openjdk/jdk/pull/7387/files#diff-be823ac1fb09a9858a40bdd29359911e534f5d9d3079e69ed31a34e37f72c7a0

There is a definition of encounter order, but there is no specification of how the encounter order is established or how it's modified. That's left to implementations and subtypes.

This example 1) gets the first element, 2) adds another element, and 3) gets the first element again. Step 2 is a modification to the collection, so there's no basis for asserting that the element obtained in step 1 always equals the element obtained in step 3.

In particular, LinkedHashMap's "removeEldestEntry" feature says that add() potentially has additional side effects beyond adding the argument element, including of course removing the first element.
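That side effect is easy to demonstrate with the classic bounded-cache idiom (class name is mine):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class EldestDemo {
    static String keysAfterThirdPut() {
        // Bounded "cache": put() gains the side effect of evicting the
        // first ("eldest") entry once the size exceeds 2.
        var cache = new LinkedHashMap<String, Integer>() {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Integer> eldest) {
                return size() > 2;
            }
        };
        cache.put("a", 1);
        cache.put("b", 2);
        cache.put("c", 3);  // evicts "a"
        return cache.keySet().toString();
    }

    public static void main(String[] args) {
        System.out.println(keysAfterThirdPut());  // [b, c]
    }
}
```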

Also, in the example, if SequencedSet is a TreeSet<String> containing "x", then adding "foo" quite sensibly changes the first element between steps 1 and 3.
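The TreeSet case is directly runnable; "foo" sorts before "x", so the first element changes between the two calls:

```java
import java.util.TreeSet;

public class TreeSetFirstDemo {
    public static void main(String[] args) {
        var set = new TreeSet<String>();
        set.add("x");
        String first = set.first();   // "x"
        set.add("foo");               // sorts before "x"
        String first2 = set.first();  // "foo"
        System.out.println(first + " -> " + first2);  // x -> foo
    }
}
```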

- LinkedHashMap/LinkedHashSet are dinosaurs; there are more efficient
implementations of the concepts of ordered set / ordered map, both in terms of
memory footprint and runtime execution, so adding new interfaces without
exploring new implementations using Valhalla value classes and in relation to
the Collection Literal JEP seems premature to me.

LinkedHashMap and LinkedHashSet are in fact used fairly frequently. They're not
used as much as HashMap, but they're used more than ArrayDeque or TreeMap.
Presumably this is because people find LinkedHashMap/Set useful and because the
overhead compared to their non-linked counterparts is acceptable for the
additional services they provide.

The performance and footprint of LinkedHashMap/Set are not impacted either way by
the JEP, nor are any potential future performance improvements (including those
that might arise because of Valhalla) precluded by the JEP.

I don't disagree with what you say above; I'm saying that with Valhalla, we need 
to explore implementations of HashMap based on open addressing to get better 
cache performance.
So adding addFirst() to LinkedHashMap now is a risk.

As the LinkedHashMap implementation stands today (prior to JEP 431) it needs to perform a variety of operations that affect the order of the elements. The addFirst() method is a new operation, but it's only a small amount of additional work, given that there are already other operations that operate on both ends.

In a hypothetical future open-addressed Map, there would need to be some additional data structure (likely an array?) that maintains the encounter order. This structure would need to support various element shifting and copying operations to maintain the ordering; consider what would need to happen with a get() operation on an access-ordered map. I don't see any problem with implementing an addFirst() operation on this structure.
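As a rough picture of that idea (entirely hypothetical, not a real or proposed JDK class; a plain HashMap stands in for the open-addressed table and a deque stands in for the order-maintaining side structure, and repositioning of existing keys is ignored for brevity):

```java
import java.util.ArrayDeque;
import java.util.HashMap;

// Hypothetical sketch: the table holds the mappings, and a separate
// structure records the encounter order. The point is only that a
// putFirst-style insertion is cheap on the order structure.
public class OrderedMapSketch<K, V> {
    private final HashMap<K, V> table = new HashMap<>();
    private final ArrayDeque<K> order = new ArrayDeque<>();

    public void putLast(K key, V value) {
        if (table.put(key, value) == null) {
            order.addLast(key);    // new key goes to the back of the order
        }
    }

    public void putFirst(K key, V value) {
        if (table.put(key, value) == null) {
            order.addFirst(key);   // new key goes to the front of the order
        }
    }

    public K firstKey() {
        return order.peekFirst();
    }

    public static void main(String[] args) {
        var m = new OrderedMapSketch<String, Integer>();
        m.putLast("a", 1);
        m.putFirst("b", 2);
        System.out.println(m.firstKey());  // b
    }
}
```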

Also, addFirst() on a sequenced collection is a dangerous method, exactly like 
List.get() is: depending on the implementation, the worst-case complexity can be 
either O(1) or O(n).
I believe we will all live in a better place if we do not add new methods with 
such a "complexity range".

Not the same at all. The problem with LinkedList.get(i) is that people put it in a for-loop over int indexes, which seems like it should be O(n), but as we know it's O(n^2). That occurs a lot more frequently than indexed add() or remove().
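The indexed-loop trap looks like this (method names are mine):

```java
import java.util.LinkedList;
import java.util.List;

public class IndexedLoopDemo {
    // On a LinkedList, each get(i) walks the links from an end, so this
    // loop is O(n^2) even though it looks like a plain O(n) scan.
    static long sumByIndex(List<Integer> list) {
        long sum = 0;
        for (int i = 0; i < list.size(); i++) {
            sum += list.get(i);
        }
        return sum;
    }

    // The enhanced for loop uses the iterator and stays O(n).
    static long sumByIterator(List<Integer> list) {
        long sum = 0;
        for (int v : list) {
            sum += v;
        }
        return sum;
    }

    public static void main(String[] args) {
        var list = new LinkedList<Integer>();
        for (int i = 1; i <= 1000; i++) {
            list.add(i);
        }
        System.out.println(sumByIndex(list));    // 500500
        System.out.println(sumByIterator(list)); // 500500
    }
}
```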

s'marks
