I'm seeing a curious issue where surface memory seems to be slower than
freshly malloc'ed memory. The platform I am working on is based on an
ARM1176 CPU (without L2 cache) and has no graphics acceleration, so
everything is software based. I am currently testing DirectFB 1.1.1.

I created a simple straight blitter with no blending (i.e. a memcpy blitter)
to see if I could increase blitting performance using some special ARM
instructions. What I found was that if I created two surfaces with
preallocated memory and then ran my blitter to copy between the two
surfaces, the performance was about 38 megapixels/sec, roughly 4 times
lower than the available memory bandwidth. If I ran the exact same code
without first creating a surface out of the preallocated buffers, the
performance jumped to around 128 megapixels/sec, much closer to the
maximum memory bandwidth.
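For anyone who wants to reproduce the malloc'ed baseline, here is a minimal
sketch of the kind of measurement I mean. It is not my actual blitter (that
one uses ARM-specific instructions); it is a plain memcpy row copy, and the
names blit_copy, now_seconds and bench_blit are made up for this example.
It assumes a POSIX system with clock_gettime and tightly packed rows.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Straight copy blit with no blending: copies `height` rows of `width`
 * pixels. Pitches are given in bytes and may exceed the row size. */
static void blit_copy(uint8_t *dst, size_t dst_pitch,
                      const uint8_t *src, size_t src_pitch,
                      size_t width, size_t height, size_t bytes_per_pixel)
{
    size_t row_bytes = width * bytes_per_pixel;
    for (size_t y = 0; y < height; y++)
        memcpy(dst + y * dst_pitch, src + y * src_pitch, row_bytes);
}

/* Monotonic wall-clock time in seconds. */
static double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Copies between two malloc'ed buffers `iterations` times and returns the
 * throughput in megapixels/sec, or -1.0 if the run could not be measured. */
static double bench_blit(size_t width, size_t height,
                         size_t bytes_per_pixel, int iterations)
{
    assert(iterations > 0);

    size_t pitch = width * bytes_per_pixel;   /* assumption: packed rows */
    size_t size = pitch * height;
    uint8_t *src = malloc(size);
    uint8_t *dst = malloc(size);
    if (!src || !dst) {
        free(src);
        free(dst);
        return -1.0;
    }

    /* Touch both buffers first so page faults are not part of the timing. */
    memset(src, 0xA5, size);
    memset(dst, 0x00, size);

    double t0 = now_seconds();
    for (int i = 0; i < iterations; i++)
        blit_copy(dst, pitch, src, pitch, width, height, bytes_per_pixel);
    double elapsed = now_seconds() - t0;

    /* Verify the copy so the result is known to be correct. */
    int ok = memcmp(src, dst, size) == 0;
    free(src);
    free(dst);

    if (!ok || elapsed <= 0.0)
        return -1.0;
    return ((double)width * (double)height * (double)iterations)
           / elapsed / 1e6;
}
```

Calling something like bench_blit(640, 480, 4, 100) gives the malloc'ed
figure; the surface case would run the same blit_copy loop on the two
buffers after surfaces have been created from them.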

My question is: what is DirectFB doing to the preallocated memory buffers
that causes memcpy's between them to slow down so drastically? Are
the buffers being mmap'ed internally? Is there any way to improve this
situation?

Thanks!
-Robert Hildinger

_______________________________________________
directfb-dev mailing list
directfb-dev@directfb.org
http://mail.directfb.org/cgi-bin/mailman/listinfo/directfb-dev
