RE: Null pointer check elimination

2005-11-15 Thread Boehm, Hans
> From: David Daney
> Sent: Tuesday, November 15, 2005 11:44 AM
> To: Mike Stump
> Cc: gcc@gcc.gnu.org; [EMAIL PROTECTED]
> Subject: Re: Null pointer check elimination
> 
> 
> Mike Stump wrote:
> > On Nov 14, 2005, at 11:36 PM, David Daney wrote:
> > 
> >> Perhaps not in general, but one unstated premise of this whole thread
> >> is that for some GCC targets (most Unix-like operating systems) you
> >> *can* count on a SIGSEGV when you dereference a null pointer.
> > 
> > 
> > Unless that null pointer points to an object that is of the wrong size
> > (too large), such as an array or a structure.
> 
> The java front end ignores this case.  I mean, what are the chances that
> someone would try to access something near the end of such an object
> without first trying to access something near the beginning of it?
If the code is malicious, probably about 100%.

It seems to me that we probably do want a solid guarantee here
eventually.  As David wrote later, we probably already have one on most
platforms.

The libjava GC code also currently makes some weaker assumptions along
these lines.  It believes that none of the GC heap resides at addresses
below 16K (see _Jv_AllocArray in boehm.cc).  But that's more of a
performance than correctness issue.

Hans


RE: Memory corruption due to word sharing

2012-02-01 Thread Boehm, Hans
I'm clearly with Torvald here.

The original set of problems here is very clearly addressed by the C++11/C11 
memory model.  It clearly implies, among other things:

- Updating a bit-field may interfere with (create a data race with) other 
updates to that same contiguous sequence of bit-fields, but not with other 
adjacent fields.  Updating a non-bit-field must not interfere with accesses to 
any other field.  A zero-length bit-field breaks a sequence of bit-fields.  The 
initial example here is a clear compiler bug by C11 semantics.
- Bit-fields may be updated, and indeed must be updated, in some cases, with 
multiple smaller stores.  Doing anything else would violate the preceding rule.
- Spurious writes to a variable may never be introduced.
- "Volatile" has nothing to do with thread interactions.  Atomic variables were 
introduced to allow unprotected accesses to shared objects.  (It doesn't make 
sense to use volatile for this for a number of good, mostly historically 
motivated reasons.  See 
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n2016.html if you 
really care.)

(Technically, this applies to C11 threads, because that's all it can talk 
about.  But an implementation would have to be insane to restrict it to that.  
For implementations of atomic operations, I would expect most compilers to 
assume memory mappings and consistency as normally seen by user processes.)

C11 is a published standard.  Last I checked, gcc did not follow many of the 
above rules.  It looks like major changes were recently merged into the gcc 
trunk, and I haven't had a chance to test those, so it may well be fixed.  But 
more importantly, so far I haven't read anything here to dissuade me from the 
view that they are the right target.

Hans

> -Original Message-
> From: linux-ia64-ow...@vger.kernel.org
> [mailto:linux-ia64-ow...@vger.kernel.org] On Behalf Of Torvald Riegel
> Sent: Wednesday, February 01, 2012 9:43 AM
> To: Linus Torvalds
> Cc: Jan Kara; LKML; linux-i...@vger.kernel.org; dste...@suse.cz;
> ptesa...@suse.cz; rguent...@suse.de; gcc@gcc.gnu.org
> Subject: Re: Memory corruption due to word sharing
> 
> On Wed, 2012-02-01 at 08:41 -0800, Linus Torvalds wrote:
> > If the gcc people aren't willing to agree that this is actually a
> > flaw in the standard (one that is being addressed, no less)
> 
> It has been addressed in the standards.
> 
> > and try to fix
> > it,
> 
> Again, this is being worked on, see
> http://gcc.gnu.org/wiki/Atomic/GCCMM
> 
> > we just have to extend our assumptions to something like "a
> > compiler would be stupid to ever access anything bigger than the
> > aligned register-size area". It's still just an assumption, and
> > compiler people could be crazy, but we could just extend the current
> > alpha rules to cover not just "int", but "long" too.
> 
> So, let's ignore everyone's craziness (the kernel is not the only GCC
> client...) and think about how we can improve the situation.
> 
> We need a proper memory model.  No vague assumptions with lots of
> hand-waving.  If you think that this is simple stuff that can be
> sufficiently described by "don't do anything stupid", then please have a
> look at the issues that the Java memory model faced, and all the
> iterations of the C++11/C11 model and discussions about it.
> 
> The only candidate that I see is the C++11/C11 model.  Any other
> suggestions?
> 
> Now, if we assume this model, what does the kernel community think about
> it?  It might be good to split this discussion into the following two
> points, to avoid syntax flame wars:
> 1) The model itself (ie, the memory orders, ordering guarantees, (lack
> of) progress guarantees, etc.).
> 2) The syntax/APIs to specify atomics.
> 
> If something else or more is needed, the compiler needs to have a formal
> specification for that.  This will help users too, because it avoids all
> the ambiguities.
> 
> > Sure, the compiler people could use "load/store multiple" or something
> > like that, but it would be obviously crazy code, so if it happens past
> > a register size, at least you could argue that it's a performance
> > issue and maybe the gcc people would care.
> >
> > HOWEVER. The problem with the alpha rules (which, btw, were huge, and
> > led to the CPU designers literally changing the CPU instruction set
> > because they admitted they made a huge mistake) was never so much the
> > occasional memory corruption, as the fact that the places where it
> > could happen were basically almost impossible to find.
> >
> > So we probably have tons of random places in the kernel that violate
> > even the alpha rules - because the corruption is so rare, and the
> > architecture was so rare as to making the corruption even harder to
> > find.
> 
> I assume you still would want a weak memory model, or not?  (That is,
> rely on data-race-freedom, only atomics do not contribute to data races,
> and you need to mark data used for synchronization (or which is just
> accessed concurrently).)

RE: Memory corruption due to word sharing

2012-02-01 Thread Boehm, Hans
> From: Linus Torvalds
> >
> > We need a proper memory model.
> 
> Not really.
> 
> The fact is, the kernel will happily take the memory model of the
> underlying hardware. Trying to impose some compiler description of the
> memory model is actually horribly bad, because it automatically also
> involves compiler synchronization points - which will try to be
> portable and easy to understand, and quite frankly, that will
> automatically result in what is technically known as a "shitload of
> crap".
The C11 memory model potentially adds overhead in only two cases:

1. When current code involves touching a field that wouldn't otherwise be 
touched.  There are odd cases in which this measurably slows down code, but I 
think all agree that we need it.  In addition to bitfields, it can affect 
speculatively promoting a value to a register in a loop, which at least older 
versions of gcc also do.

2. When you use atomic operations for racing accesses.  And in that case, if 
you really want it, you get a lot of control over memory ordering.  Atomic 
operations that specify "memory_order_relaxed" should only have three kinds of 
impact:
- They ensure the access is indivisible.
- They tell the compiler that the location may be concurrently modified 
and it should refrain from optimizations that will break if it is.
- They enforce cache coherence, i.e. single-variable ordering.  This is 
implicit on most architectures, but requires ld.acq on Itanium.  IIRC, Paul 
McKenney argued convincingly that this was essential for preserving programmer 
sanity.

> 
> Now, a strict memory model is fine for portability, and to make it
> simple for programmers. I actually largely approve of the Java memory
> model approach, even if it has been horribly problematic and has some
> serious problems. But it's not something that is acceptable for the
> kernel - we absolutely do *not* want the compiler to introduce some
> memory model, because we are very much working together with whatever
> the hardware memory model is, and we do our own serialization
> primitives.
I think you are somewhat misinterpreting the C11 memory model.  Aside from the 
minor cache coherence issues on one or two architectures, you can still do 
that, if you like.  But I suspect you can also do better in places.

> 
> > No vague assumptions with lots of hand-waving.
> 
> So here's basically what the kernel needs:
> 
>  - if we don't touch a field, the compiler doesn't touch it.
> 
>This is the rule that gcc now violates with bitfields.
And in other more subtle cases.  At least until very recently.

> 
>This is a gcc bug. End of story. The "volatile" example proves it -
> anybody who argues otherwise is simply wrong, and is just trying to
> make excuses.
Agreed.

> 
>  - if we need to touch something only once, or care about something
> that is touched only conditionally, and we actually need to make sure
> that it is touched *only* under that condition, we already go to quite
> extreme lengths to tell the compiler so.
> 
>We usually use an access with a "volatile" cast on it or tricks
> like that. Or we do the whole "magic asm that says it modified memory
> to let the compiler know not to try anything clever"
> 
>  - The kernel IS NOT GOING TO MARK DATA STRUCTURES.
> 
> Marking data structures is seriously stupid and wrong. It's not the
> *data* that has access rules, it's the *code* that has rules of
> access.
> 
> The traditional C "volatile" is misdesigned and wrong. We don't
> generally mark data volatile, we really mark *code* volatile - which
> is why our "volatiles" are in the casts, not on the data structures.
> 
> Stuff that is "volatile" in one context is not volatile in another. If
> you hold a RCU write lock, it may well be entirely stable, and marking
> it volatile is *wrong*, and generating code as if it was volatile is
> pure and utter shit.
> 
> On the other hand, if you are touching *the*very*same* field while you
> are only read-locked for RCU, it may well be one of those "this has to
> be read by accessing it exactly once".
> 
> And we do all this correctly in the kernel.  Modulo bugs, of course,
> but the fundamental rule really is: "atomicity or volatility is about
> CODE, not DATA".
The preferred C11 model is clearly to mark data that potentially participates 
in unprotected concurrent accesses, and then to annotate accesses that require 
less care, e.g. because we know they can't occur concurrently with other 
accesses.  Maybe this is the wrong approach for the kernel, but to me that 
seems like a safer approach.  We did consider the alternative model, and tried 
to ensure that casts to atomic could also work on as many platforms as 
possible, though we can't guarantee that.  (There is currently no specification 
for atomic accesses that says "not concurrently accessed".  We've thought about 
it.  I would argue for adding it if there were important cases in which 
memory_order_relaxed is demonstrably not 

RE: Memory corruption due to word sharing

2012-02-01 Thread Boehm, Hans
> From: Linus Torvalds
> Don't try to make it anything more complicated. This has *nothing* to
> do with threads or functions or anything else.
> 
> If you do massive inlining, and you don't see any barriers or
> conditionals or other reasons not to write to it, just write to it.
> 
> Don't try to appear smart and make this into something it isn't.
> 
> Look at the damn five-line example of the bug. FIX THE BUG. Don't try
> to make it anything bigger than a stupid compiler bug. Don't try to
> make this into a problem it simply isn't.
> 
My impression is that all of us not on the hook to fix this are in violent 
agreement on this example.

Here are some more interesting ones that illustrate the issues (all 
declarations are non-local, unless stated otherwise):

struct { char a; int b:9; int c:7; char d; } x;

Is x.b = 1 allowed to overwrite x.a?  C11 says no, essentially requiring two 
byte stores.  Gcc currently does overwrite it.  I'm not sure I understand 
Linus' position here.


int count;
/* p and q are local */

for (q = p; q != 0; q = q -> next) if (q -> data > 0) ++count;

Can count be promoted to a register, and thus written even if there are no 
positive elements?  C11 says no.  Gcc at least used to do this.


for (q = p; q != 0; q = q -> next) { ++count; if (rare_cond) f(); }

Same question, with count saved and restored around the call to f() (which might 
include a fence).  C11 says no.  I think Linus is also arguing for no.


for (i = 0; i < 1000; ++i) { if (i%2) a[i] = i; }

Can I vectorize the loop by writing back the original even-indexed values, and 
thus writing all entries of the array?  C11 and Linus both say no.


My impression is that we are generally in agreement.

Hans


RE: Memory corruption due to word sharing

2012-02-01 Thread Boehm, Hans
> From: Torvald Riegel
> > Oh, one of my favorite (NOT!) pieces of code in the kernel is the
> > implementation of the
> >
> >smp_read_barrier_depends()
> >
> > macro, which on every single architecture except for one (alpha) is a
> no-op.
> >
> > We have basically 30 or so empty definitions for it, and I think we
> > have something like five uses of it. One of them, I think, is
> > performance crticial, and the reason for that macro existing.
> >
> > What does it do? The semantics is that it's a read barrier between
> two
> > different reads that we want to happen in order wrt two writes on the
> > writing side (the writing side also has to have a "smp_wmb()" to
> order
> > those writes). But the reason it isn't a simple read barrier is that
> > the reads are actually causally *dependent*, ie we have code like
> >
> >first_read = read_pointer;
> >smp_read_barrier_depends();
> >second_read = *first_read;
> >
> > and it turns out that on pretty much all architectures (except for
> > alpha), the *data*dependency* will already guarantee that the CPU
> > reads the thing in order. And because a read barrier can actually be
> > quite expensive, we don't want to have a read barrier for this case.
> 
> I don't have time to look at this in detail right now, but it looks
> roughly close to C++11's memory_order_consume to me, which is somewhat
> like an acquire, but just for subsequent data-dependent loads.  Added
> for performance reasons on some architectures AFAIR.
> 
It's intended to address the same problem, though somewhat differently.  (I 
suspect there was overlap in the people involved?) One reason that C11 took a 
slightly different path is that compilers can, and sometimes do, remove 
dependencies, making smp_read_barrier_depends brittle unless it also imposes 
compiler constraints.

Hans



RE: Implementing C++1x and C1x atomics (really an aside on SFENCE)

2009-08-20 Thread Boehm, Hans
 

> -Original Message-
> From: Lawrence Crowl [mailto:cr...@google.com] 
> The problem is that gcc does support 80386.  It also supports 
> other processors that have less-than-complete support for 
> concurrency.  Just in the x86 line, we get some additional 
> capability in many new layers.
> 
>   8086        LOCK XCHG
>   80486       CMPXCHG XADD
>   Pentium     CMPXCHG8B
>   SSE         SFENCE
Aside to an interesting discussion:

I believe the current conclusion is that SFENCE should be ignored, except for 
library or compiler-generated code that uses non-temporal/coalescing stores, 
which I believe are also a recent addition.  Normal stores are ordered anyway, 
so it's not needed.  Thus you are faced with a choice of either (a) 
implementing fences on the assumption that ordinary code may contain 
non-temporal stores, or (b) making sure that non-temporal stores are always 
surrounded by the appropriate fences.  This is really an important ABI issue, 
but it's something that I believe no ABI currently specifies.  Our conclusion 
in earlier discussions among a different group of people was that (b) made more 
sense, since non-temporal stores of various kinds seemed to be largely confined 
to a few library routines.

It would be really nice if everyone somehow managed to agree on this.  
Inconsistency here, probably even between Windows and Linux, seems likely to 
result in really subtle bugs.

Note that this also affects correctness of spinlock implementations, not just 
atomics.  A simple store to release a lock doesn't work if the critical section 
may contain unfenced non-temporal stores.

Hans

>   SSE2        MFENCE
>   late AMD64  CMPXCHG16B
> 
> So, we do not get to ignore the problem as a relic of 80386.
> 


RE: Implementing C++1x and C1x atomics (really an aside on SFENCE)

2009-09-09 Thread Boehm, Hans
> From: Lawrence Crowl [mailto:cr...@google.com] 
> 
> On 8/20/09, Boehm, Hans  wrote:
> > > -Original Message-
> > > From: Lawrence Crowl [mailto:cr...@google.com]
> > > The problem is that gcc does support 80386.  It also supports other
> > > processors that have less-than-complete support for concurrency.
> > > Just in the x86 line, we get some additional capability in many new
> > > layers.
> > >
> > >   8086        LOCK XCHG
> > >   80486       CMPXCHG XADD
> > >   Pentium     CMPXCHG8B
> > >   SSE         SFENCE
> >
> > Aside to an interesting discussion:
> >
> > I believe the current conclusion is that SFENCE should be ignored,
> > except for library or compiler-generated code that uses
> > non-temporal/coalescing stores, which I believe are also a recent
> > addition.  Normal stores are ordered anyway, so it's not needed.
> > Thus you are faced with a choice of either (a) implementing fences on
> > the assumption that ordinary code may contain non-temporal stores, or
> > (b) making sure that non-temporal stores are always surrounded by the
> > appropriate fences.  This is really an important ABI issue, but it's
> > something that I believe no ABI currently specifies.  Our conclusion
> > in earlier discussions among a different group of people was that (b)
> > made more sense, since non-temporal stores of various kinds seemed to
> > be largely confined to a few library routines.
> 
> Hm.  I would expect that given the C++0x memory model, 
> compilers could be much more aggressive about using 
> non-temporal stores, potentially improving performance 
> substantially.  That is, it may be better to accept a 
> slightly less efficient ABI for today's compilers to gain a 
> more efficient ABI for tomorrow's compilers.
> 
> > It would be really nice if everyone somehow managed to agree on this.
> > Inconsistency here, probably even between Windows and Linux, seems
> > likely to result in really subtle bugs.
> >
> > Note that this also affects correctness of spinlock implementations,
> > not just atomics.  A simple store to release a lock doesn't work if
> > the critical section may contain unfenced non-temporal stores.
> 
> Yes, but the spinning acquire doesn't require the fence, only the
> release.  So, is this additional instruction a performance problem?
> 
I haven't looked at this terribly systematically.  I do know that in Pentium 4 
days, sfence was tremendously expensive (basically equivalent to mfence or 
cmpxchg, i.e. 100+ cycles), even in contexts in which it was a no-op.  Thus ABI 
convention (a) roughly doubles the (already very high) cost of an uncontended 
spin-lock on a Pentium 4.  I suspect that got better on later implementations, 
but I'm not sure by how much.

I think the only nontemporal stores on X86 are vector instructions.  I would 
guess that for many applications neither these nor spin-lock times matter a 
lot, and for most of the rest, these vector instructions won't make up for the 
cost of doubling spin-lock execution times.  If you do manage to automatically 
generate non-temporal stores at all, you will usually generate a bunch of them 
between potential synchronization operations, so that you can amortize the 
sfence.  As I recall, we did look briefly during earlier discussions, and 
didn't find them used much even in hand-crafted libc code.

But this is all hand-waving and guessing.  Certainly real measurements would be 
much better.

The most important issue of course is that we need to stick to one convention 
or the other.  Currently a lot of code seems to assume that an X86 spin lock 
can be released with a simple store, so invalidating that would be tricky, 
especially since sfence was a fairly recent introduction.

Hans


-fexceptions introduces ABI incompatibility of sorts (was: RE: Re[2]: [Gc] On cancellation, GC_thread_exit_proc is never called)

2010-04-13 Thread Boehm, Hans
I would still love to get a reaction to this from the gcc folks (now included):

1) Is it intended that inconsistent use of -fexceptions can cause pthread 
cleanup handlers to be skipped (since exception based cleanup is not invoked if 
thread exit is triggered from a context without exceptions),  thus usually 
breaking code relying on pthread_cleanup handlers?

2) If so, would it be appropriate to document that behavior?  In particular, 
several other options in the gcc manual currently indicate that they introduce 
binary compatibility issues, while this one does not.  This does seem to be 
basically a binary incompatibility issue.

3) If not, any chance of fixing this somehow?

4) Is the use of the __EXCEPTIONS macro in the library considered a stable part 
of the library interface?  Is it reasonable to work around this problem by 
explicitly undefining it in user code? 

The original problem description that triggered this discussion can be found at

http://permalink.gmane.org/gmane.comp.programming.garbage-collection.boehmgc/3820

Thanks.

Hans

> -Original Message-
> From: gc-boun...@napali.hpl.hp.com 
> [mailto:gc-boun...@napali.hpl.hp.com] On Behalf Of Ivan Maidanski
> Sent: Saturday, April 10, 2010 6:20 AM
> To: NIIBE Yutaka
> Cc: g...@napali.hpl.hp.com
> Subject: Re[2]: [Gc] On cancellation, GC_thread_exit_proc is never called
> 
> 
> Fri, 09 Apr 2010 17:45:47 +0900 NIIBE Yutaka :
> 
> > Ivan Maidanski wrote:
> > > This seems to be equivalent to your first patch.
> > 
> > It is equivalent for libgc build, but it doesn't touch any header 
> > files.
> > 
> > My intention is to minimize impact by the change.
> > 
> > For libgc users who include gc.h, the first/second patch of mine would
> > not be good.  Even if a user uses -fexceptions for her application, the
> > change of undefining __EXCEPTIONS forces the use of
> > __pthread_register_cancel.  With the third patch, it is up to users.
> > 
> > > Did you read
> > > http://permalink.gmane.org/gmane.comp.programming.garbage-collection.boehmgc/3824
> > > ?  Why not pull GC_inner_start_routine out into a separate file that
> > > isn't compiled with -fexceptions?
> > 
> > Thanks for the pointer.  I had not read it.  I read it now.
> > 
> > My opinion is that:
> > 
> > (1) I would agree that it would be better to pull out
> >  GC_inner_start_routine into a separate file.
> > 
> >  Just FYI, I confirmed that it is not needed to separate it out,
> >  at least for current implementation.
> > 
> >  For the current implementation, compilation of pthread_support.c with
> >  no __EXCEPTIONS only affects compilation of GC_inner_start_routine
> >  (specifically, pthread_cancel_push), with no effect on other routines.
> > 
> > (2) Only undefining __EXCEPTIONS is better, I think.
> > 
> >  I think that compiling GC_inner_start_routine without -fexceptions
> >  is somewhat overkill.  Yes, it means it goes with no __EXCEPTIONS,
> >  but it also generates no .eh_frame section for
> >  GC_inner_start_routine.
> > 
> >  No .eh_frame section for GC_inner_start_routine would be OK,
> >  because it is almost the starting point of a thread.
> > 
> >  One minor benefit: If we have a .eh_frame section for
> >  GC_inner_start_routine, it is possible for a debugger to trace
> >  back to the beginning.  This is perhaps only useful for those who
> >  chase down into the details of the pthread_create implementation,
> >  though.
> > --
> 
> I could agree with you.  But, again, it's up to Hans whether
> to commit this patch or not.
> 
> Bye.
> ___
> Gc mailing list
> g...@linux.hpl.hp.com
> http://www.hpl.hp.com/hosted/linux/mail-archives/gc/
> 


RE: RFH: optabs code in the java front end

2010-09-12 Thread Boehm, Hans
I'm not up on the details here.  But I want to point out that C++0x and C1x 
require atomic operations on all platforms, with the expectation that they will 
be emulated (most likely with an address-indexed table of locks) on platforms 
that don't provide them.  If I understand this correctly, it sounds to me like 
some generalization of the Java mechanism will also be needed for C++ (and 
eventually C).

Hans

> -Original Message-
> From: java-ow...@gcc.gnu.org [mailto:java-ow...@gcc.gnu.org] On Behalf
> Of Andrew Haley
> Sent: Sunday, September 12, 2010 10:08 AM
> To: Joseph S. Myers
> Cc: gcc@gcc.gnu.org; j...@gcc.gnu.org
> Subject: Re: RFH: optabs code in the java front end
> 
> On 11/09/10 21:21, Joseph S. Myers wrote:
> > On Sat, 11 Sep 2010, Andrew Haley wrote:
> >
> >> The test tells us whether the back-end has atomic builtins.  If it
> >> doesn't then we generate calls to the libgcj back end.  I really
> >> don't want gcj to generate calls to nonexistent __compare_and_swap_4
> >> or somesuch.
> >
> > Maybe not to nonexistent functions, but if the functions exist - say
> > the kernel-assisted libgcc functions used on Linux on SH, PA and older
> > ARM processors - then certainly they should be used.  So optabs are
> > hardly the right thing to check; if you need to know whether this
> > functionality is supported, you need a hook that will say whether
> > there is a library fallback when code for __sync_* isn't generated
> > inline.
> 
> Sure, that would be nice.  However, my first goal is correctness, so if
> we end up using slower workarounds in libgcj I can live with that, but
> not failure to link because of missing library routines.
> 
> In the case of ARM GNU/Linux we work around the problem with a
> compile-time option, -fuse-atomic-builtins, which overrides the optabs
> check.
> 
> Andrew.
> 



RE: Reconsidering gcjx

2006-01-27 Thread Boehm, Hans

> From:  Laurent GUERBY
> Wether C++, Java or Ada, a new language requirement looks the same to
> me: having a good enough base compiler and runtime installed 
> for the language, I do not see anything special to Java or 
> Ada over C++ here. The base compiler I use for building GCC 
> has only c,ada (4.0) because that's what is needed, if c++ is 
> needed I'll add the recommanded c++ compiler, if java and 
> some JVM is needed, I'll add java and the recommanded JVM, no 
> big difference.

As others have pointed out, there's potentially a small difference in
the case of Java, in that I believe the .class -> .o part of the
compiler would still be buildable without an existing JVM, and perhaps
even somewhat tested without one.  And that's the part that's likely to
break if other parts of the compiler are changed.  I don't think Ada has
an analog to that.

Hans