I'm seeing a curious issue where surface memory seems to be slower than freshly malloc'ed memory. The platform I am working on is based on an ARM1176 CPU (without L2 cache) and has no graphics acceleration - everything is software based. I am currently testing DirectFB 1.1.1.
I created a simple straight blitter with no blending (i.e. a memcpy blitter) to see if I could increase blitting performance using some special ARM instructions. What I found was that if I created two surfaces with preallocated memory and then ran my blitter to copy between the two surfaces, the performance was roughly 38 megapixels/sec, roughly 4 times lower than the available memory bandwidth. If I ran the exact same code without first creating a surface out of the preallocated buffers, the performance jumped to around 128 megapixels/sec - much closer to the memory bandwidth max.

My question is, what is DirectFB doing to the preallocated memory buffers that is causing memcpy's between them to slow down so drastically? Are the buffers being mmap'ed internally? Is there any way to improve this situation?

Thanks!
-Robert Hildinger
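
For anyone who wants to reproduce the baseline number, here is a minimal sketch of the kind of measurement described above. It is not the original blitter: it uses plain memcpy rather than special ARM instructions, and the names blit_copy and measure_blit_mpps are hypothetical helpers, not part of DirectFB. It also assumes a POSIX system with clock_gettime.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: straight blit with no blending. Copies `height`
 * rows of `width` pixels, honouring separate source and destination
 * pitches (bytes per row), the way a surface-to-surface copy would. */
static void blit_copy(uint8_t *dst, int dst_pitch,
                      const uint8_t *src, int src_pitch,
                      int width, int height, int bytes_per_pixel)
{
    size_t row_bytes = (size_t)width * (size_t)bytes_per_pixel;

    assert(width > 0 && height > 0 && bytes_per_pixel > 0);
    assert((size_t)dst_pitch >= row_bytes && (size_t)src_pitch >= row_bytes);

    for (int y = 0; y < height; y++)
        memcpy(dst + (size_t)y * (size_t)dst_pitch,
               src + (size_t)y * (size_t)src_pitch,
               row_bytes);
}

/* Hypothetical helper: times `iterations` full copies between two
 * tightly packed buffers and returns the throughput in megapixels/sec.
 * Returns 0.0 if the elapsed time is too short to measure. */
static double measure_blit_mpps(uint8_t *dst, const uint8_t *src,
                                int width, int height,
                                int bytes_per_pixel, int iterations)
{
    struct timespec t0, t1;
    int pitch = width * bytes_per_pixel;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < iterations; i++)
        blit_copy(dst, pitch, src, pitch, width, height, bytes_per_pixel);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double secs = (double)(t1.tv_sec - t0.tv_sec)
                + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
    if (secs <= 0.0)
        return 0.0;

    double pixels = (double)width * (double)height * (double)iterations;
    return pixels / secs / 1e6;
}
```

Calling measure_blit_mpps on two malloc'ed buffers (filled once beforehand so the pages are actually mapped) gives the "fresh memory" figure. Running the same call on the same two buffers after they have been handed to DirectFB as preallocated surface memory should show whether the slowdown follows the buffers themselves or only the surface code path.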