On Fri, Jan 30, 2015 at 09:39:41AM +0000, Markus Stockhausen wrote:
> > From: Gabriel Paubert [paub...@iram.es]
> > Sent: Friday, 30 January 2015 09:49
> > To: Markus Stockhausen
> > Cc: Scott Wood; linuxppc-dev@lists.ozlabs.org; Herbert Xu
> > Subject: Re: AW: SPE & Interrupt context (was how to make use of SPE instructions)
> >
> > > ...
> > > - I must already save several non-volatile registers. Putting the 64 bit values
> > >   into them would require me to save their contents with evstdd instead of
> > >   stw. Of course stack alignment to 8 bytes required. So only a few alignment
> > >   instructions needed additionally during initialization.
> >
> > On most PPC ABI the stack is guaranteed to be aligned to a 16 byte
> > boundary. In some it may be only 8, but I can't remember any 4 byte
> > only alignment.
> >
> > I checked my 32 bit kernel images with:
> >
> > objdump -d vmlinux |awk '/stwu.*r1,/{print $6,$7}'|sort -u
> >
> > and the stack seems to always be 16 byte aligned.
> > For 64 bit, use stdu instead of stwu.
> >
> > I've also found a few stwux/stdux which are hopefully known
> > to be harmless.
> >
> >     Gabriel
>
> A helpful annotation. But now I'm unsure about function usage. SPE seems to be
> 32bit only and I would use their evxxx instructions. Do you think the following
> sequence will be the right way?
>
> _GLOBAL(ppc_spe_sha256_transform)
>       stwu    r1,-128(r1);    /* create stack frame             */
>       stw     r24,8(r1);      /* save normal registers          */
>       stw     r25,12(r1);
>       evstdw  r14,16(r1);     /* We must save non volatile      */
>       evstdw  r15,24(r1);     /* registers. Take the chance     */
>       evstdw  r16,32(r1);     /* and save the SPE part too      */
>       ...
>       lwz     r24,8(r1);      /* restore normal registers       */
>       lwz     r25,12(r1);
>       evldw   r14,16(r1);     /* restore non-v. + SPE registers */
>       evldw   r15,24(r1);
>       evldw   r16,32(r1);
>       addi    r1,r1,128;      /* cleanup stack frame            */
>

Yes. But there is probably also a status/control register somewhere that
you might need to save and restore, unless it is never used or affected by
the instructions you use.

> Or must I use the kernel provided defines with PPC_STLU r1,-INT_FRAME_SIZE(r1)
> plus SAVE_GPR/SAVE_EVR/REST_GPR/REST_EVR?
>

From what I understand, INT_FRAME_SIZE is for interrupt entry code. That is
not the case for your code, which is a standard function except that it
clobbers the upper 32 bits of some registers by using SPE instructions.
Therefore INT_FRAME_SIZE is overkill.

I also believe that you can save the registers as you suggest; there is no
need to split them into high and low parts.

By the way, I wonder where the SAVE_EVR/REST_EVR macros are used. I only see
the definitions, no use in a 3.18 source tree.

    Gabriel
_______________________________________________
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
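
The frame layout discussed in this message can be sanity-checked with a little
arithmetic. What follows is only a rough Python sketch, not kernel code. The
offsets and the 128 byte frame size are copied from the quoted sequence.
Everything else is an assumption made for illustration: the helper name
check_frame, a 16 byte aligned r1 on entry, the first 8 bytes of the frame
being reserved for the back chain and LR save word, and each 64 bit slot
needing 8 byte alignment (the stricter, evstdd-style requirement).

```python
# Frame layout from the quoted sequence: (register, offset from r1, size in bytes).
FRAME_SIZE = 128
SLOTS = [
    ("r24", 8, 4),    # stw: low 32 bits only
    ("r25", 12, 4),   # stw: low 32 bits only
    ("r14", 16, 8),   # SPE store: full 64 bit register
    ("r15", 24, 8),
    ("r16", 32, 8),
]

# Assumption: bytes 0..7 of the frame hold the back chain and the LR save word.
RESERVED = 8


def check_frame(frame_size, slots, stack_align=16, reserved=RESERVED):
    """Return a list of layout problems; an empty list means none were found.

    Hypothetical helper. Offsets are relative to r1, so the alignment checks
    only mean something if r1 itself is aligned to stack_align.
    """
    problems = []
    if frame_size % stack_align != 0:
        problems.append(
            f"frame size {frame_size} is not a multiple of {stack_align}")
    prev_end = reserved
    for name, offset, size in sorted(slots, key=lambda s: s[1]):
        if offset % size != 0:
            problems.append(
                f"{name}: offset {offset} is not {size}-byte aligned")
        if offset < prev_end:
            problems.append(
                f"{name}: offset {offset} overlaps the previous slot")
        if offset + size > frame_size:
            problems.append(f"{name}: slot ends past the frame")
        prev_end = max(prev_end, offset + size)
    return problems


if __name__ == "__main__":
    for line in check_frame(FRAME_SIZE, SLOTS) or ["layout looks consistent"]:
        print(line)
```

Under those assumptions the quoted layout passes: the 64 bit slots at 16, 24
and 32 are all 8 byte aligned and no slot overlaps another, so no extra
alignment instructions should be needed as long as r1 is 16 byte aligned on
entry.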