On 07/08/2024 21:38, Andy Goryachev wrote:
John:
Looks like the issue in PR#1123 was caused by application code
accessing FX from a non-FX thread.
That does seem likely, but it is incredibly hard to be sure.
I know the discussion is about trying to detect thread misuse early
on, as a way to help the application developers. I vaguely recall the
discussion we had at some point about adding the check to property
invalidation code or some such, and the general consensus was that it
might pose a compatibility risk or something along these lines.
Perhaps there is a place where such a check might be safely made (I
don't know of any such place though)? We did, after all, step on the
path of simplification (?) for the sake of students (see
https://openjdk.org/jeps/445 ). Perhaps such a check could be hidden
behind a system property so as not to break existing, if poorly
written, applications.
I agree that we should be careful not to break anything, it is a check,
but I also think that most developers will want this check on (but
probably not by default initially, but definitely **after** they fixed
any mistakes it detected to avoid recurrence). There is nothing worse
than code that only rarely and inexplicably fails on other machines.
For me, I wouldn't hesitate for a moment to turn checks on that alert me
of incorrect use of FX -- I value my time, and bugs resulting from these
threading problems can be very costly to track down. I already wrote my
own check that scans for dead "nodes" (ie. nodes that were once part of
a scene graph, but are no longer, but are still not being GC'd).
Personally, I feel that it should be the first thing taught - always
call FX from the FX application thread.
Even Java collections will help you avoid threading problems by throwing
ConcurrentModificationException. Imagine a moment a world where this
check did not exist, as it does cost some performance (a mod counter is
increased on every collection modification):
- Concurrent access to things like HashMap and ArrayList would seem to
work fine
- Sometimes randomly, an item goes missing, is overwritten or
duplicated, or the code gets stuck in an infinite loop (these are all
possible outcomes of concurrent collection access, as the check isn't
perfect, so when you do see it, you must investigate thoroughly)
- The bug is reported (collection framework is buggy!)
- After analysis, culprit is that the user somewhere in their 100.000+
line code base is accessing a collection on different threads... sorry,
we can't help you further.
- Except, they could help us further: ConcurrentModificationException
was included at a slight cost of performance. It can't even be disabled
(I'm not implying this was a later addition, the designers were just
aware that this would lead to hard to track bugs and added this
immediately).
The parallels with the FX threading model, and also with weak references
are huge:
- Both "seem" to work fine, until they don't
- Failures are very random, may not occur at all on some setups (like
the developer's), and failures can be subtle, deceiving or complete
- The bug is reported (FX doesn't even check for nulls in Scene code!)
- After analysis, culprit is (likely) that the user somewhere in their
100.000+ line code base is accessing the scene graph on different
threads / forgot to maintain a hard reference... sorry, we can't help
you further.
These kinds of mistakes are not just happening to beginners. This can
happen when code gets rearranged, its original operating parameters long
forgotten by you, or some future maintainer. All it takes is a moment
of inattention and you have a bug that you may never personally
encounter, and when it does occur is impossible to reproduce with any
certainty and does NOT give any indication of the cause (you see only
symptoms). Such bugs are the nastiest kind, and it is (IMHO) a failure
of a framework to not help you detect these as early as possible.
--John
I am curious what other members of community think of this issue.
-andy
*From: *John Hendrikx <john.hendr...@gmail.com>
*Date: *Wednesday, August 7, 2024 at 12:01
*To: *Andy Goryachev <andy.goryac...@oracle.com>,
openjfx-dev@openjdk.org <openjfx-dev@openjdk.org>
*Subject: *[External] : Re: Detecting threading problems faster
Hi Andy,
Problems that occur due to modifying the scene graph (like
adding/removing children, or changing certain properties) can be
incredibly subtle, and almost never point out the code that is the
actual culprit. I suspect that the PR below is one of those for
example, but there are likely many more.
https://github.com/openjdk/jfx/pull/1123
<https://urldefense.com/v3/__https:/github.com/openjdk/jfx/pull/1123__;!!ACWV5N9M2RV99hQ!ILqOOB9ua4tDxfAV9pGPJJiQ3oXewiSQjH2BCsaQkhpgkHEQzT8hRQt2fb9mKjYbF24ZRpQwfEiELFvn4B28lBLwxrTE$>
Issues that complain about some kind of odd bug in the FX code (like a
NPE, ConcurrentModificationException, ArrayIndexOutOfBoundsException,
etc) could all be caused by user code doing things on the wrong
thread, but its really hard to diagnose as the stack traces provided
will never point out the user code that may be the culprit. I've
investigated PR 1123 extensively, and I can't see any bugs in the FX
code at all **assuming** it is used from a single thread...
--John
On 05/08/2024 17:01, Andy Goryachev wrote:
John:
Can you cite a bug or give an example of such a problem?
During all my time of actively using javafx (since about 2015) I
have never encountered an issue with threading you describe. It
is possible that I use the system trivially and not to the full
extent. Both swing and javafx are single threaded by design. Yes,
there might be occasions when one can use multiple threads and it
is sort of allowed, but doing so may lead to unpleasant issues
down the road, so the question is why would you want to do that?
-andy
*From: *openjfx-dev <openjfx-dev-r...@openjdk.org>
<mailto:openjfx-dev-r...@openjdk.org> on behalf of John Hendrikx
<john.hendr...@gmail.com> <mailto:john.hendr...@gmail.com>
*Date: *Sunday, August 4, 2024 at 16:30
*To: *openjfx-dev@openjdk.org <openjfx-dev@openjdk.org>
<mailto:openjfx-dev@openjdk.org>
*Subject: *Detecting threading problems faster
Hi list,
I know of quite some bugs and users that have been bitten by the
threading model used by JavaFX. Basically, anything directly or
indirectly linked to an active Scene must be accessed on the FX
thread.
However, as FX also allows manipulating nodes and properties before
they're displayed, there can be no "hard" check everywhere to
ensure we
are on the FX thread (specifically, in properties).
Now, I think this situation is annoying, as a simple mistake where a
Platform.runLater wrapper was forgotten usually results in programs
operating mostly flawlessly, but then fail in mysterious and
random and
hard to reproduce ways. The blame is often put on FX as the
resulting
exceptions will almost never show the user code which was the actual
culprit. It can result in FX being perceived as unstable or buggy.
So I've been thinking if there isn't something we can do to detect
these
bugs originating from user code much earlier, similar to the
`ConcurrentModificationException` the collection classes do when
accessed in nested or concurrent contexts.
I think it may be possible to have properties check whether
they're part
of an active scene without too much of an performance impact,
possibly
even behind a switch. It would work like this:
Properties involved with Nodes will have an associated bean instance
(`getBean`). This is an object, but we could check here if this
instance implements an interface:
if (getBean() instanceof MayBePartOfSceneGraph x) {
if (x.isPartOfActiveScene() && !isOnFxThread()) {
throw new IllegalStateException("Property must
only be
used from the FX Application Thread");
}
}
This check could be done on every set of the property, and
potentially
on every get as well. It should be relatively cheap, but will expose
problematic code patterns at a much earlier stage. There's a chance
that this will "break" some programs that seemed to be behaving
correctly as well, so we may want to put it behind a switch until
such
programs (or libraries) can be fixed.
What do you all think?
--John
(*) Names of methods/interfaces are only used for illustration
purposes,
we can think of good names if this moves forward.