On 19.05.2017 16:39, "Karoly Balogh (Charlie/SGR)" <char...@scenergy.dfmk.hu> wrote:
>
> Hi,
>
> On Fri, 19 May 2017, Reimar Grabowski wrote:
>
> > Final: The render function takes about 90%, the cast-to-int about 5%. No
> > other interesting functions shown. So the missing time must be spent
> > doing floating point math and branching (ifs), as that's all the render
> > function does.
>
> Well, if I comment out the three additions where the ray is actually
> traced and the tex := line, it's actually 60fps on my macbook. But
> the real difference is made by the additions. If I comment out
> everything but leave those 3 (4 in fact) additions in, it's
> still slow.
>
> Which got me thinking. I think you can vectorize that quite easily and
> use some packed SIMD instruction; maybe that will make a difference. C/C++
> has some compiler intrinsics for that. I can't remember off the top of my
> head if it's doable with FPC. Someone who feels like fiddling with this
> might want to try some assembly magic there, if it's possible somehow...
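
To make the "packed SIMD" idea in the quoted text concrete, here is a minimal sketch in C using SSE intrinsics. It is not the renderer's actual code, which is not shown in this thread: the `vec4` type and the names `step_scalar`, `step_packed`, `pos` and `delta` are hypothetical, and it assumes the "3 (4 in fact)" additions advance four independent float values per step.

```c
#include <stddef.h>

#if defined(__SSE__) || defined(_M_X64)
#include <xmmintrin.h>  /* SSE intrinsics: __m128, _mm_add_ps, ... */
#define HAVE_SSE 1
#endif

/* Hypothetical ray state: x, y, z plus a fourth lane. */
typedef struct {
    float v[4];
} vec4;

/* Scalar version: four separate additions per step. */
static void step_scalar(vec4 *pos, const vec4 *delta)
{
    for (size_t i = 0; i < 4; ++i)
        pos->v[i] += delta->v[i];
}

/* Packed version: a single ADDPS covers all four lanes when SSE is available. */
static void step_packed(vec4 *pos, const vec4 *delta)
{
#ifdef HAVE_SSE
    __m128 p = _mm_loadu_ps(pos->v);    /* unaligned load of 4 floats */
    __m128 d = _mm_loadu_ps(delta->v);
    _mm_storeu_ps(pos->v, _mm_add_ps(p, d));
#else
    step_scalar(pos, delta);            /* portable fallback without SSE */
#endif
}
```

Whether this helps in practice depends on the surrounding loop: the values would need to stay in vector registers between steps, and an optimizing C compiler may already vectorize the scalar loop on its own.
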

I think Jeppe wanted to add vector support. The question here, though, is whether one wants to detect and optimize this at the AST level, converting it to implicit vectors, or at the CSE level.

By the way: I think my commit today of an SSE Frac() implementation sped up the framerate by a third on Win64 compared to the version without it :D

Regards,
Sven
_______________________________________________
fpc-pascal maillist - fpc-pascal@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal