At 4:59 PM +0200 7/20/04, Leopold Toetsch wrote:
Dan Sugalski <[EMAIL PROTECTED]> wrote:
At 10:35 AM +0200 7/20/04, Leopold Toetsch wrote:

 And yes, this will, with sufficient call depth, result in an
 all-bits-set dirty mask, which is also why we allow bytecode to
 *unset* bits in the dirty frame marker, but only if those bits are
 set in the sub's mask of frames it uses.

How is the dirty mask usable when bits are reset?

Since you'd only be allowed to reset bits that were set in your used mask (that is, for frames that *must* have been saved on entry) then it's usable just fine.
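
The rule can be sketched in a few lines. This is only an illustration, assuming the dirty marker and the sub's used mask are plain integer bitmasks; `unset_dirty` and its parameter names are hypothetical and are not Parrot ops.

```python
def unset_dirty(dirty: int, used: int, request: int) -> int:
    """Clear the bits of `dirty` named in `request`, but only those also set in `used`.

    Hypothetical helper for illustration; not part of Parrot.
    """
    allowed = request & used   # bits this sub is entitled to clear
    return dirty & ~allowed    # every other bit stays dirty


# A sub whose used mask is 0b00001111 asks to clear 0b00111100:
# only the overlap, 0b00001100, is actually cleared.
result = unset_dirty(0b11111111, 0b00001111, 0b00111100)
```

Because a bit can only be cleared if the sub's used mask covers it, the corresponding frame must already have been saved on entry, so clearing it loses nothing.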


For example, if somewhere in your code you *only* use P registers, you could do:

unset_used 11111100b  # assuming registers in INSP order, two bits each

at the start, and turn off saving of the I, N, and S registers. Any function call (or at least any call made through a vtable or other mechanism where the caller can't know it's making a call) from then on wouldn't save them.
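
To make the bit layout concrete, here is a small sketch that decodes such a mask. It assumes, as the comment on the example says, an 8-bit mask with two bits per register type in I, N, S, P order, taken here to start from the most significant end. Which bit of each pair means the low or high frame is not stated in this thread, so the sketch only reports the two-bit field per type. `frames_in_mask` is a hypothetical name.

```python
REGISTER_ORDER = "INSP"  # assumed order, most significant pair first


def frames_in_mask(mask: int) -> dict:
    """Return the two-bit field for each register type in an 8-bit mask."""
    fields = {}
    for index, reg_type in enumerate(REGISTER_ORDER):
        shift = 2 * (len(REGISTER_ORDER) - 1 - index)
        fields[reg_type] = (mask >> shift) & 0b11
    return fields


# The mask from the example: both bits set for I, N and S, none for P.
fields = frames_in_mask(0b11111100)
```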

If we go with this unconditionally and drop the requirement for the caller to save the frames it's interested in (counting, instead, on this mechanism to do it universally) then only the bits that were set in the current function's 'used bits' flag (and, thus, saved when we entered this function) could be unset.

 Anyway, any code
making use of Parrot native register types will have to preserve all
most of the time.

Nonsense. The low frame of I, S, and N registers will rarely need saving because they're rarely dirtied in the normal course of affairs, and when they *are* dirty then they'll need to be saved regardless.


Once again, this is *only* an issue when making calls when the caller can't know that a call's being made. This should, in general, be unusual. When we *do* know we're making a call then we save those registers that we care about, which, at the point of the call, should generally not be all of them. (And yes, this may require some code analysis to see what's used at this point, or used from this point to the next call.)

whenever a subroutine or
method will use some native S or N registers, we end up saving 640
bytes. In *each* function call. And restore 640 bytes on each return.

My proposal did show a way to copy ~640 bytes *once* per subroutine
creation. You didn't even comment on that.

Well, it seemed obviously wrong,

Why?

 ... and inefficient in most cases, so I
 didn't.

Copying 640 bytes once, or 640 bytes * 2 * number of calls? Which is inefficient?
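
The arithmetic behind the question, as a rough sketch. It uses only the figures quoted in this thread (640 bytes per save, one save plus one restore per call); the call count of 1000 is an arbitrary illustrative number and the function names are hypothetical.

```python
FRAME_BYTES = 640  # figure quoted in the thread


def bytes_copied_per_call_scheme(calls: int) -> int:
    """Save on each call and restore on each return."""
    return FRAME_BYTES * 2 * calls


def bytes_copied_once_scheme() -> int:
    """Copy a single time, at subroutine creation."""
    return FRAME_BYTES


per_call = bytes_copied_per_call_scheme(1000)
once = bytes_copied_once_scheme()
```

This compares raw bytes copied only; it says nothing about which calls actually need the save.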

This *only* makes a difference for vtable functions written in bytecode. For normal code we're already copying the frames in and out when we make a call and there's no way around that.


 ... If you don't like the scheme I outlined above I can go into
more detail.

I'm all for a better scheme. Moving P0-P2, S0, I0-I4 somewhere else is fine.

The only downside is it makes the continuations larger, since they need to preserve this information. OTOH if everyone's saving it anyway we might as well.


BTW five registers (I0-I4) for information that fits into one is
overkill anyway.

Erm... no. You need at least 5 bytes, so that's two registers (or two words somewhere), and using full words rather than bytes is faster since you skip the mask and shift on most processors.
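
The trade-off can be illustrated as follows. This sketch assumes 32-bit registers and five one-byte values, which is what "two registers" for 5 bytes implies; the function names and the sample values are hypothetical.

```python
def pack_bytes(values):
    """Pack five byte-sized values into two 32-bit words (4 bytes + 1 byte)."""
    assert len(values) == 5 and all(0 <= v <= 0xFF for v in values)
    word0 = 0
    for position, value in enumerate(values[:4]):
        word0 |= value << (8 * position)
    word1 = values[4]
    return word0, word1


def unpack_byte(words, index):
    """Packed layout: every read needs a shift and a mask."""
    word = words[index // 4]
    return (word >> (8 * (index % 4))) & 0xFF


counts = [3, 0, 1, 2, 7]   # illustrative values only
packed = pack_bytes(counts)

# Full-word layout: one value per register, read directly, no shift or mask.
full_words = list(counts)
```

The packed form saves three registers but pays the shift and mask on every access; the full-word form reads each value as-is.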


Hell with it, let's just do it. Gimme a bit and I'll spec the ops to get/set the metainformation for the call things. It'll also make doing stack back traces a damn sight easier, which'll be a win.

So much for not changing the calling conventions. :( (Unless we want to have invoke and its friends automagically move this info over for now to give people time to transition)
--
Dan


--------------------------------------it's like this-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                         have teddy bears and even
                                      teddy bears get drunk
