Once this code works I will start examining the performance impact of the
design. I suspect that, for complex cases, the code will be way slower when
not compiled than the compiled part, and this really is the hot path. For
one thing, we create 2 closures that allocate memory at each iteration if
they are not inlined. The best solution is that wingo designs a very fast
compiler targeting this case from the beginning, meaning that guile just
handles it perfectly even with potentially thousands of cases. A second
possibility is that guile had a fast compiler that would optimise a lambda
when one is fed to it. We could simulate this by simply passing lists
representing code and using compile to compile them, but my experience is
that this is very time consuming. I can experiment a little here to see the
actual timing.
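
A minimal sketch of such a timing experiment, assuming a small stand-in
lambda rather than the real generated code. The names code, time-thunk,
compiled, compile-cost and call-cost are made up for illustration; compile
comes from (system base compile).

```scheme
(use-modules (system base compile))

;; Hypothetical stand-in for generated code: a lambda given as a plain list.
(define code '(lambda (x) (+ (* x x) 1)))

;; Run THUNK N times and return the elapsed wall-clock time in seconds.
(define (time-thunk n thunk)
  (let ((start (get-internal-real-time)))
    (let lp ((i 0))
      (when (< i n)
        (thunk)
        (lp (+ i 1))))
    (/ (- (get-internal-real-time) start)
       (exact->inexact internal-time-units-per-second))))

;; Compile once to get a callable procedure.
(define compiled (compile code #:env (current-module)))

;; Cost of compiling the list 20 times versus calling the result 20 times.
(define compile-cost
  (time-thunk 20 (lambda () (compile code #:env (current-module)))))
(define call-cost
  (time-thunk 20 (lambda () (compiled 3))))

(format #t "compile: ~a s, call: ~a s~%" compile-cost call-cost)
```

The numbers depend heavily on the Guile version and on how large the
generated code is, so this only gives a rough feel for the compile overhead.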

Anyhow, the idea with a fast compiler is that the first compiler could
prepare the setup so that it is really fast to compile compared to starting
from scratch. Here the advice from wingo would be appreciated.

A final possibility, which is not too bad speedwise, is to do the following
inside the loop and create one big dispatch, like this, that is executed
each iteration.

(let ((val (if (eq? endian 'little)
               (if float?
                   (if (= m 4)
                       (get-f32 v1 k1 'little)
                       (get-d64 v1 k1 'little))
                   ...))

(if (eq? endian 'little)
    (if float?
        (if (= m 4)
            (set-f32 v1 k1 val 'little)
            (set-d64 v1 k1 val 'little))
        ...))

Ideally this is what the code should compile to if it can't create all
possible loops.
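
A runnable sketch of this per-iteration dispatch, assuming the standard
(rnrs bytevectors) accessors in place of get-f32, set-f32 and friends. The
names dispatch-ref, dispatch-set! and copy-elements! are made up for
illustration, and only floats and unsigned integers are covered.

```scheme
(use-modules (rnrs bytevectors))

;; Read one element of width M bytes at offset K, dispatching on ENDIAN
;; and FLOAT? each time it is called.
(define (dispatch-ref v k m endian float?)
  (if (eq? endian 'little)
      (if float?
          (if (= m 4)
              (bytevector-ieee-single-ref v k 'little)
              (bytevector-ieee-double-ref v k 'little))
          (bytevector-uint-ref v k 'little m))
      (if float?
          (if (= m 4)
              (bytevector-ieee-single-ref v k 'big)
              (bytevector-ieee-double-ref v k 'big))
          (bytevector-uint-ref v k 'big m))))

;; Write VAL with the same dispatch.
(define (dispatch-set! v k val m endian float?)
  (if (eq? endian 'little)
      (if float?
          (if (= m 4)
              (bytevector-ieee-single-set! v k val 'little)
              (bytevector-ieee-double-set! v k val 'little))
          (bytevector-uint-set! v k val 'little m))
      (if float?
          (if (= m 4)
              (bytevector-ieee-single-set! v k val 'big)
              (bytevector-ieee-double-set! v k val 'big))
          (bytevector-uint-set! v k val 'big m))))

;; Copy N elements from V1 to V2; the whole dispatch runs every iteration.
(define (copy-elements! v1 v2 n m endian float?)
  (let lp ((i 0))
    (when (< i n)
      (let* ((k (* i m))
             (val (dispatch-ref v1 k m endian float?)))
        (dispatch-set! v2 k val m endian float?))
      (lp (+ i 1)))))

;; Usage: copy two little-endian 32-bit floats.
(define src (make-bytevector 8 0))
(bytevector-ieee-single-set! src 0 1.5 'little)
(bytevector-ieee-single-set! src 4 -2.0 'little)
(define dst (make-bytevector 8 0))
(copy-elements! src dst 2 4 'little #t)
```

All tests here are on loop-invariant values, so the branches are cheap but
they are still paid on every element.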

Now, I do not like adjusting my code to output this, as it makes the
framework less powerful and useful, since every case will be a special
case. But what if you could mark a piece of code as less important? What we
want is a dispatch like this:

(if (eq? endian 'little) #:level-2 ...)

In the first pass, the if will be resolved if endian is known (which
reduces complexity); otherwise the first pass freezes this form and
continues with the whole shebang. Level 2 will be the basic compiler, in
which the #:level-2 tag is ignored. Maybe this is a non-issue and the
compiler handles this gracefully.

Also, the compiler could note that endian, nbits, single?, float? etc. are
really created outside the loop and prepare the code for handling all
cases: essentially, make sure to compile all nodes and reserve an area in
the code to modify. Then, before the loop, the code can decide which
version to use (here we can use padding, or a goto in case the padded area
is so large that a goto saves time). This means that the compiler has 33
cases for the ref and 33 cases for the set! part in my most general
version, which is OK as each of them is typically small. So what I would do
if I were the compiler is the following layout, in pseudocode:

(if ... (copy RefStub1 to StubX ...)
(if ... (copy SetStub1 to StubY ...)
goto loop
RefStub1
...
RefStubN

SetStub1
...
SetStubN
loop:
(let lp (...)
  (let ((val StubX))
    StubY)
  (when ... (lp ...)))

This can be quite fast.
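
Guile does not expose self-modifying code to Scheme programs, but the
"choose the stub before the loop" part of this layout can be approximated
by selecting a procedure once, outside the loop. The following is only a
rough analogue under that assumption. The names select-ref, select-set and
copy-with-stubs! are made up for illustration, and each iteration still
pays for two indirect calls instead of running patched-in code.

```scheme
(use-modules (rnrs bytevectors))

;; Pick the ref "stub" once, based on the loop-invariant parameters.
(define (select-ref m endian float?)
  (cond ((and float? (= m 4))
         (lambda (v k) (bytevector-ieee-single-ref v k endian)))
        (float?
         (lambda (v k) (bytevector-ieee-double-ref v k endian)))
        (else
         (lambda (v k) (bytevector-uint-ref v k endian m)))))

;; Pick the set "stub" once, in the same way.
(define (select-set m endian float?)
  (cond ((and float? (= m 4))
         (lambda (v k val) (bytevector-ieee-single-set! v k val endian)))
        (float?
         (lambda (v k val) (bytevector-ieee-double-set! v k val endian)))
        (else
         (lambda (v k val) (bytevector-uint-set! v k val endian m)))))

;; The closures are allocated once here, not on every iteration.
(define (copy-with-stubs! v1 v2 n m endian float?)
  (let ((stub-x (select-ref m endian float?))
        (stub-y (select-set m endian float?)))
    (let lp ((i 0))
      (when (< i n)
        (let ((k (* i m)))
          (stub-y v2 k (stub-x v1 k)))
        (lp (+ i 1))))))

;; Usage: copy two big-endian 64-bit floats.
(define a (make-bytevector 16 0))
(bytevector-ieee-double-set! a 0 3.25 'big)
(bytevector-ieee-double-set! a 8 -7.5 'big)
(define b (make-bytevector 16 0))
(copy-with-stubs! a b 2 8 'big #t)
```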

Self modifying code rocks!!!
