On May 25 2002, Andrew Patrikalakis wrote:
> Hello all,

Hi, Andrew.

> With all the recent talk of use of assembly on the PowerPC, I came
> up with a patch to use assembly versions of memcpy. It's about 35%
> faster. Here is a sample of the memcpy speed test (which also now
> works):

Thanks for helping make xine better on PowerPC. And thanks also for
looking into my earlier message and discovering which part of the code
was giving the relocation error (I just read your patch).

> Benchmarking memcpy methods (smaller is better):
>         glibc memcpy() : 136
>         ppcasm_memcpy() : 137
>         ppcasm_cacheable_memcpy() : 88
> xine: using ppcasm_cacheable_memcpy()
> (The lower time resolution is because I'm using times(NULL) in rdtsc())

We can also look into adding one more version of memcpy that does the
transfers through the floating point registers, as some people
suggested on the Debian PowerPC mailing list. People there said that
using the floating point registers (which are 64 bits wide) instead of
the general purpose registers (32 bits each) may improve things.

> I'd like to know how much it helps PPC users, so keep this list up
> to date with the results. (Also, if my patch breaks other
> platforms...) It gave my Mac laptop the little boost it needed to
> play some media I have.

Well, with the faster memcpy and with XFree86 4.2.0 (with DMA
enabled), I can watch a DVD here with linearblend deinterlacing (coded
in C) enabled, and about 15% of the frames are skipped. That is still
not perfect, but it is quite an improvement over the situation some
weeks ago.

BTW, I am using gcc-3.0 to compile xine-libs and I added some extra
options to the configure script (-mfused-madd, -mcpu=750, -mtune=750,
-O9).

The next points of improvement (which may not be as immediate as the
memcpy under discussion) may be coding the idct, motion compensation
and deinterlacing in assembly as well. I guess that I'll have to learn
a bit more before I can get to these, but with the help of other
people, things could go faster.
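
To make the benchmark-and-pick idea above concrete, here is a rough
sketch in plain C of how such a probe could look. This is not the code
from the patch or from xine. The names copy_64bit_chunks, time_method
and pick_fastest are made up for illustration, clock() stands in for
the times(NULL) timing, and the 64-bit copy uses integer chunks as a
portable stand-in, since a real floating point register version would
need assembly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical candidate: move 8 bytes per step, then the leftover tail.
 * Integer chunks are used here; this is NOT a floating point register copy. */
void *copy_64bit_chunks(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (n >= sizeof(uint64_t)) {
        uint64_t chunk;
        memcpy(&chunk, s, sizeof chunk);   /* alignment-safe 64-bit load  */
        memcpy(d, &chunk, sizeof chunk);   /* alignment-safe 64-bit store */
        d += sizeof chunk;
        s += sizeof chunk;
        n -= sizeof chunk;
    }
    while (n--)                            /* remaining 0..7 bytes */
        *d++ = *s++;
    return dst;
}

typedef void *(*copy_fn)(void *, const void *, size_t);

struct copy_method {
    const char *name;
    copy_fn fn;
};

/* Table of candidates to probe (illustrative, not xine's actual table). */
static const struct copy_method methods[] = {
    { "libc memcpy()",       memcpy },
    { "copy_64bit_chunks()", copy_64bit_chunks },
};

#define NUM_METHODS (sizeof methods / sizeof methods[0])

/* Time `rounds` copies of `n` bytes with `fn`.
 * Returns elapsed clock() ticks, or -1 on allocation failure or a bad copy. */
long time_method(copy_fn fn, size_t n, int rounds)
{
    assert(fn != NULL && rounds > 0);

    unsigned char *src = malloc(n);
    unsigned char *dst = malloc(n);
    long elapsed = -1;

    if (src != NULL && dst != NULL) {
        memset(src, 0x5a, n);
        clock_t start = clock();
        for (int i = 0; i < rounds; i++)
            fn(dst, src, n);
        elapsed = (long)(clock() - start);
        if (memcmp(dst, src, n) != 0)      /* reject a method that miscopies */
            elapsed = -1;
    }
    free(src);
    free(dst);
    return elapsed;
}

/* Return the index into `methods` of the fastest working candidate
 * (smaller time is better); falls back to index 0 if none could be timed. */
size_t pick_fastest(size_t n, int rounds)
{
    size_t best = 0;
    long best_time = -1;

    for (size_t i = 0; i < NUM_METHODS; i++) {
        long t = time_method(methods[i].fn, n, rounds);
        if (t >= 0 && (best_time < 0 || t < best_time)) {
            best = i;
            best_time = t;
        }
    }
    return best;
}
```

Calling pick_fastest(1 << 20, 50) and printing the name of the chosen
entry would give something like the "xine: using ..." line quoted
above. The timings are coarse and an optimizing compiler may change
what the loops really measure, so treat the numbers only as a rough
guide.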

> Just so you know, the methods I used are from the linux kernel
> version 2.4.18 (arch/ppc/lib/string.S)

Yes, that's what I tried in my earlier message, but I wasn't as
successful as you were.

Your patch had a problem, though, and I had to apply part of it by
hand. You might want to remake it and send it to the xine developers
so that it can be included in xine release 0.9.10, which should be
near.

> Andrew Patrikalakis

Thanks for your help, Roger...

--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Rogério Brito - [EMAIL PROTECTED] - http://www.ime.usp.br/~rbrito/
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=