I understand that null is prohibited by default, but can we also provide a set of factory methods that accept null? They can be named like List.ofNullable/copyOfNullable.
On Tue, Jan 31, 2023 at 10:13 AM Stuart Marks <stuart.ma...@oracle.com> wrote: > In this reply I'll focus on the null handling issues in collections. > > As you've noted, there are really (at least) two distinct issues here: > whether a > collection permits nulls, and the behavior of contains(null) queries. > There have > been continual complaints about both, and the issues are somewhat distinct. > > The collection implementations in the JDK are all over the place with > regard to > nulls. I believe this is because the original collections and the > concurrent > collections up through JDK 1.6 or so were mostly done by Josh Bloch and > Doug Lea, > who disagreed about null handling. They had this exchange in 2006: > > Doug Lea wrote: [1] > > > The main reason that nulls aren't allowed in ConcurrentMaps > > (ConcurrentHashMaps, ConcurrentSkipListMaps) is that > > ambiguities that may be just barely tolerable in non-concurrent > > maps can't be accommodated. The main one is that if > > map.get(key) returns null, you can't detect whether the > > key explicitly maps to null vs the key isn't mapped. > > In a non-concurrent map, you can check this via map.contains(key), > > but in a concurrent one, the map might have changed between calls. > > > > Further digressing: I personally think that allowing > > nulls in Maps (also Sets) is an open invitation for programs > > to contain errors that remain undetected until > > they break at just the wrong time. (Whether to allow nulls even > > in non-concurrent Maps/Sets is one of the few design issues surrounding > > Collections that Josh Bloch and I have long disagreed about.) > > > Josh Bloch wrote: [2] > > > I have moved towards your position over the years. It was probably a > > mistake to allow null keys in Maps and null elements in Sets. I'm > > still not sure about Map values and List elements. > > > > In other words, Doug hates null more than I do, but over the years > > I've come to see it as quite troublesome. > > > (I haven't talked to Josh or Doug about this particular issue recently, so > I suppose > it's possible their opinions have changed in the intervening time.) > > I had to decide what to do about nulls when I added the unmodifiable > collections in > JDK 9. Given that Doug "hates" nulls, and Josh finds them at least quite > troublesome, I started from a position of disallowing null members in all > the new > collections. Most of the unmodifiable collections are still this way (but > see below). > > What about contains(null)? Surely it should be ok to query for a null, and > return > false, even if the collection itself can't contain null. However, a bunch > of > collections and collection views in the JDK throw NPE on contains(null), > so the > unmodifiable collections don't set a precedent here. > > If you're dealing with all non-null values, and a null creeps in somehow, > then > calling contains(null) might be an error, and it would be good for the > library to > inform you of that instead of just returning false and letting the program > continue > until it fails later (fail-fast). A similar issue arises with the > non-generic > contains() method. If you have a Collection<Integer>, then shouldn't > contains("abc") > be an error? It's not, and it simply returns false, because the signature > is > contains(Object). But there have been complaints that this should have > been an error > because a Collection of Integer cannot possibly contain a String. Similar > reasoning > applies to contains(null) on a collection that can't contain null. (Yes, > the error > occurs at runtime instead of compile time, but better late than never.) > > A meta-issue is: should a new thing correct a perceived design flaw in the > older > thing, or should it be consistent with the old thing? Either way, you lose. > > Another factor in my original decision was that it's much easier to start > with a > restriction and relax it later than it is to go the other way. We have > some > flexibility to allow nulls and contains(null) if we want; but it's > essentially a > non-starter to disallow nulls once they've been allowed somewhere. > > That's why my starting position for the design prohibited nulls everywhere. > > Over time we've relaxed some restrictions around null handling in the > unmodifiable > collections. Streams permit nulls, and the List returned from > Stream.toList() also > permits nulls, and it also allows contains(null). So there's a different > unmodifiable List implementation under there, accessible only via streams. > (Note > that the null-permitting unmodifiable lists lose the optimizations for > sizes 0, 1, > and 2, which are in part based on nulls being disallowed.) > > I'm fairly sympathetic to the case for changing the unmodifiable > collections to > allow contains(null), given that the current NPE-throwing behavior has > generated a > fair amount of complaints over the years. And the situation with having > two flavors > of unmodifiable list is quite odd and is an uncomfortable point in the > design space. > > I should point out however that even if the unmodifiable collections were > changed to > allow contains(null), it's still kind of unclear as to whether this code > is safe: > > public void addShoppingItems(Collection<String> shoppingItems) { > if (shoppingItems.contains(null)) { // this looks quite reasonable > and safe... > throw new IllegalArgumentException("shoppingItems should not > contain nulls"); > } > ... > } > > If you're only ever passing ArrayList, Arrays.asList(), and List.of() > lists here, > then sure, this would work great. If the source collection is of unknown > provenance, > you can still run into trouble: > > 1. contains(null) calls on the keySet and values collections of > ConcurrentHashMap > and Hashtable all throw NPE. So does contains(null) on a set from > Collections.newSetFromMap(), which is built from those maps' keySets. (I > suppose > Doug Lea might be convinced to relax the CHM behavior.) > > 2. Speaking of concurrency, if the source collection is being modified > concurrently, > then a null could be introduced after the check (TOCTOU) so you'd need to > make a > defensive copy before checking it. > > 3. NavigableSet and friends delegate contains() to the comparison method, > which > might or might not accept nulls. There's no way the collection > implementation can > tell in advance. > > You might consider List.copyOf() as an alternative, as that makes a > null-free > defensive copy in one shot, and avoids copying if the source is an > unmodifiable > list. If you don't care about concurrency, then > > shoppingItems.forEach(Objects::requireNonNull) > > might be suitable. > > If unmodifiable collections were to change to allow contains(null), this > probably > would make them somewhat less annoying, but it wouldn't necessarily solve > yours or > everyone's problems. There's simply no way at this point to make the > collections' > treatment of nulls uniform. > > s'marks > > > [1] > > https://web.archive.org/web/20210510190135/https://cs.oswego.edu/pipermail/concurrency-interest/2006-May/002485.html > > [2] > > https://web.archive.org/web/20200930213909/http://cs.oswego.edu/pipermail/concurrency-interest/2006-May/002486.html > > > > > On 1/29/23 7:28 AM, John Hendrikx wrote: > > TLDR; why does contains(null) not just return false for collections that > don't allow > > nulls. Just because the interface allows it, does not mean it should do > it as it > > devalues the usefulness of the abstraction provided by the interface. > > > > Background: > > > > I'm someone that likes to check correctness of any constructor or method > parameter > > before allowing an object to be constructed or to be modified, in order > to maintain > > invariants that are provided by the class to its users. > > > > This ranges from simple null checks, to range checks on numeric values, > to complete > > checks like "is a collection sorted" or is a list of nodes acyclic. > Anything I can > > check in the constructor that may avoid problems further down the line > or that may > > affect what I can guarantee on its own API methods. For example, if I > check in the > > constructor that something is not null, then the associated getter will > guarantee > > that the returned value is not null. If I check that a List doesn't > contain nulls, > > then the associated getter will reflect that as well (assuming it is > immutable or > > defensivily copied). > > > > For collections, this is currently becoming harder and harder because > more and more > > new collections are written to be null hostile. It is fine if a > collection doesn't > > accept nulls, but throwing NPE when I ask such a collection if it > contains null > > seems to be counter productive, and reduces the usefulness of the > collection > > interfaces. > > > > This strict interpretation makes the collection interfaces harder to use > than > > necessary. Interfaces are only useful when their contract is well > defined. The more > > things an interface allows or leaves unspecified, the less useful it is > for its users. > > > > I know that the collection interfaces allow this, but you have to ask > yourself this > > important question: how useful is an interface that makes almost no > guarantees about > > what its methods will do? Interfaces like `List`, `Map` and `Set` are > passed as > > method parameters a lot, and to make these useful, implementations of > these > > interfaces should do their best to provide as consistent an experience > as is > > reasonably possible. The alternative is that these interfaces will > slowly decline in > > usefulness as methods will start asking for `ArrayList` instead of > `List` to avoid > > having to deal with a too lenient specification. > > > > With the collection interfaces I get the impression that recently there > has been too > > much focus on what would be easy for the collection implementation > instead of what > > would be easy for the users of said interfaces. In my opinion, the > concerns of the > > user of interfaces almost always far outweigh the concerns of the > implementors of > > said interfaces. > > > > In the past, immutable collections were rare, but they get are getting > more and more > > common all the time. For example, in unit testing. Unfortunately, these > immutable > > collections differ quite a bit from their mutable counterparts. Some > parts are only > > midly annoying (not accepting `null` as the **value** in a `Map` for > example), but > > other parts require code to be rewritten for it to be able to work as a > drop-in > > replacement for the mutable variant. A simple example is this: > > > > public void addShoppingItems(Collection<String> shoppingItems) { > > if (shoppingItems.contains(null)) { // this looks quite > reasonable and > > safe... > > throw new IllegalArgumentException("shoppingItems should > not contain > > nulls"); > > } > > > > this.shoppingItems.addAll(shoppingItems); > > } > > > > Testing this code is annoying: > > > > x.addShoppingItems(List.of("a", null")); // can't construct > immutable > > collection with null > > > > x.addShoppingItems(Arrays.asList("a", null")); // fine, go back > to what we > > did before then... > > > > The above problems, I suppose we can live with it; immutable collections > don't want > > `null` although I don't see any reason to not allow it as I can write a > similar > > `List.of` that returns immutable collections that do allow `null`. For > JDK code this > > is a bit disconcerting, as it is supposed to be as flexible as is > reasonable without > > having too much of an opinion about what is good or bad. > > > > This next one however: > > > > assertNoExceptionThrown(() -> x.addShoppingItems(List.of("a", > "b"))); // not > > allowed, contains(null) in application code throws NPE > > > > This is much more insidious; the `contains(null)` check has been a very > practical > > way to check if collections contain null, and this works for almost all > collections > > in common use, so there is no problem. But now, more and more > collections are > > starting to just throw NPE immediately even just to **check** if a null > element is > > present. This only serves to distinguish themselves loudly from other > similar > > collections that will simply return `false`. > > > > I think this behavior is detrimental to the java collection interfaces > in general, > > especially since there is a clear answer to the question if such a > collection > > contains null or not. In fact, why `contains` was ever allowed to throw > an exception > > aside from "UnsupportedOperationException" is a mystery to me, and > throwing one when > > one could just as easily return the correct and expected answer seems > very opiniated > > and almost malicious -- not behavior I'd expect from JDK core libraries. > > > > Also note that there is no way to know in advance if `contains(null)` is > going to be > > throwing the NPE. If interfaces had a method `allowsNulls()` that would > already be > > an improvement. > > > > --John > > > > >