Hi Luke,
thank you very much for your answer!
I don't know anything about GIMPLE, but I can address this issue. It
sounds like you are proposing to checkpoint (via copying) the entire
stack eagerly at the time you enter a transaction, in order to avoid
transactional instrumentation of stack accesses.
That is exactly what I had in mind. May I assume that you are part of
the group working on RochesterSTM?
This is probably a bad idea, as the overhead of the instrumentation is
probably much less than the overhead of copying and restoring the
stack. You're also relying on your ability to detect stack accesses
statically (in order not to instrument them). If you can do this, then
you can probably just implement a lightweight lazy checkpoint scheme
rather than an eager, full copy.
We were not sure about the overhead and do not have any comparisons so
far, but we thought that there are some cases where it is faster to
copy the stack (for example, rarely aborting transactions with frequent
accesses to stack variables). Do you know of any papers comparing the
different approaches?
Right now, I am not sure that we are able to detect stack accesses
statically, but I figured it should be possible inside GCC. Assuming
it is possible, what would a lightweight lazy checkpoint scheme look
like? Would it be like Intel's approach described in the paper
mentioned below?
I think, based on your first post
(http://gcc.gnu.org/ml/gcc/2008-06/msg00193.html), that you are
attempting to interface with TinySTM, which does lazy versioning
(write-buffering) anyway if I recall correctly, so writes are going to
be buffered and aren't going to be written back on an abort anyway,
which is what you want.
TinySTM supports write-buffering (called write-back) and undo-log
based versioning (called write-through). The idea of copying the stack
comes from Tanger. This is an LLVM-based compiler environment that
supports code generation for TinySTM. Saving the stack can be switched
off by specifying a command-line option.
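
To make the difference between the two versioning modes concrete, here
is a minimal single-threaded sketch in C. All names (log_t, wt_store,
wb_store, and so on) are hypothetical and are not the TinySTM API; a
real STM additionally needs locking and read-set validation, which are
left out here.

```c
#include <assert.h>
#include <stddef.h>

#define LOG_CAP 16  /* arbitrary capacity, enough for this sketch */

/* One logged word: its address and a saved value. */
typedef struct { long *addr; long val; } entry_t;
typedef struct { entry_t e[LOG_CAP]; size_t n; } log_t;

/* Write-through (undo log): store in place, remember the OLD value. */
void wt_store(log_t *undo, long *addr, long val) {
    assert(undo->n < LOG_CAP);
    undo->e[undo->n].addr = addr;
    undo->e[undo->n].val = *addr;
    undo->n++;
    *addr = val;
}

/* Abort: restore the old values, newest entry first. */
void wt_abort(log_t *undo) {
    while (undo->n > 0) {
        undo->n--;
        *undo->e[undo->n].addr = undo->e[undo->n].val;
    }
}

/* Commit: memory is already up to date, so just drop the log. */
void wt_commit(log_t *undo) { undo->n = 0; }

/* Write-back (write buffer): keep the NEW value aside, leave memory alone. */
void wb_store(log_t *buf, long *addr, long val) {
    for (size_t i = 0; i < buf->n; i++)
        if (buf->e[i].addr == addr) { buf->e[i].val = val; return; }
    assert(buf->n < LOG_CAP);
    buf->e[buf->n].addr = addr;
    buf->e[buf->n].val = val;
    buf->n++;
}

/* Loads must consult the buffer first to see the transaction's own writes. */
long wb_load(const log_t *buf, const long *addr) {
    for (size_t i = 0; i < buf->n; i++)
        if (buf->e[i].addr == addr) return buf->e[i].val;
    return *addr;
}

/* Commit: publish the buffered values to memory. */
void wb_commit(log_t *buf) {
    for (size_t i = 0; i < buf->n; i++) *buf->e[i].addr = buf->e[i].val;
    buf->n = 0;
}

/* Abort: memory was never touched, so discarding the buffer is enough. */
void wb_abort(log_t *buf) { buf->n = 0; }
```

In the write-back variant an abort never touches memory, so instrumented
stack writes need no extra rollback. In the write-through variant the
old stack values must be restored from the log (or from a stack copy).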
At first sight, stack copying seemed a reasonable solution and looked
straightforward to implement. Having run into difficulties accessing
the necessary data while restricting ourselves to the tree
representation, maybe we should simply go for the approach of
instrumenting the stack variables as well.
If you are thinking about an eager versioning system, then you should
take a look at how stack access is dealt with in "Code Generation and
Optimization for Transactional Memory Constructs in an Unmanaged
Language" (http://portal.acm.org/citation.cfm?id=1251974.1252529), as
the real issue is trying to "undo" accesses into stack space between
the setjmp and the longjmp.
We are familiar with the paper, but this approach would require going
down to the RTL level or maybe even further. From my current point of
view, this does not seem tempting.
Regards,
Martin