Re: pixman_blt on aarch64

Akihiko Odaki Mon, 06 Feb 2023 18:17:56 -0800

On 2023/02/06 4:16, Richard Henderson wrote:

On 2/5/23 08:44, BALATON Zoltan wrote:
On Sun, 5 Feb 2023, Richard Henderson wrote:
On 2/4/23 06:57, BALATON Zoltan wrote:
This has just bounced. I hoped to still be able to post after moderation, but now I'm resending it after subscribing to the pixman list. Meanwhile I've found this ticket as well: https://gitlab.freedesktop.org/pixman/pixman/-/merge_requests/71 See the rest of the message below. Looks like this is being worked on but I'm not sure how far it is from getting resolved. Any info on that?
Please try this:

https://gitlab.freedesktop.org/rth7680/pixman/-/tree/general

It provides a pure C version for ultimate fallback.
Unfortunately, there are no test cases for this, nor documentation.

It can share the implementation with fast_composite_src_memcpy(). fast_composite_src_memcpy() should be well-tested with the tests for pixman_image_composite(). arm-neon does something similar, so we can trust that fast_composite_src_memcpy() functions as blt.
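To illustrate the idea of a blt built on a plain row-by-row memcpy, here is a minimal sketch. It is not pixman's code: sketch_blt() is a hypothetical name, its parameter list is only loosely modeled on pixman_blt(), and it assumes strides counted in uint32_t units, matching source and destination depths, and non-overlapping buffers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of pixman. Strides are assumed to be
 * counted in uint32_t units. Returns 1 on success, 0 if unsupported. */
static int
sketch_blt (const uint32_t *src_bits, uint32_t *dst_bits,
            int src_stride, int dst_stride,
            int src_bpp, int dst_bpp,
            int src_x, int src_y, int dest_x, int dest_y,
            int width, int height)
{
    assert (src_bits != NULL && dst_bits != NULL);

    /* A memcpy-based copy only works when both sides use the same depth. */
    if (src_bpp != dst_bpp)
        return 0;
    if (src_bpp != 8 && src_bpp != 16 && src_bpp != 32)
        return 0;
    if (width <= 0 || height <= 0)
        return 0;

    size_t bytes_per_pixel = (size_t) src_bpp / 8;
    size_t row_bytes = (size_t) width * bytes_per_pixel;
    size_t src_pitch = (size_t) src_stride * sizeof (uint32_t);
    size_t dst_pitch = (size_t) dst_stride * sizeof (uint32_t);

    const uint8_t *src = (const uint8_t *) src_bits
        + (size_t) src_y * src_pitch + (size_t) src_x * bytes_per_pixel;
    uint8_t *dst = (uint8_t *) dst_bits
        + (size_t) dest_y * dst_pitch + (size_t) dest_x * bytes_per_pixel;

    /* Copy one scanline at a time; source and destination must not overlap. */
    for (int y = 0; y < height; y++)
    {
        memcpy (dst, src, row_bytes);
        src += src_pitch;
        dst += dst_pitch;
    }
    return 1;
}
```

Returning 0 for mismatched depths leaves room for a caller to fall back to a full composite operation instead.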

Thanks, I don't have hardware to test this but maybe Akihiko or somebody else here can try. Do you think pixman_fill won't have the same problem? It seems to have at least a fast_path implementation but I'm not sure how pixman selects these.
For fill, I think the fast_path implementation should work, so long as it isn't disabled via environment variable. I'm not sure why that is, and why _fast_path isn't part of _general.

The implementation of fill should be moved to pixman-general.c, but the other parts of pixman-fast-path.c shouldn't be.

By isolating the non-essential fast-path code in pixman-fast-path.c, you can disable it with the environment variable when you are not confident in the implementation, and that may help debugging. However, if pixman-fast-path.c has some essential code like the implementation of fill, the utility of the environment variable will be impaired, as setting the environment variable may break things.
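A toy model of that concern, as a sketch only: the types and names below (impl_t, choose_chain(), chain_fill()) are hypothetical and are not pixman's API, and the "disabled" string merely stands in for the value of an environment variable. It assumes the only fill lives in the optional "fast" layer, so disabling that layer leaves the chain with no fill at all.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of an implementation chain; not pixman's types. */
typedef struct impl impl_t;
struct impl
{
    const char *name;
    impl_t     *fallback;   /* next layer to try, or NULL */
    int       (*fill) (uint32_t *bits, int n, uint32_t filler); /* NULL if absent */
};

static int
fast_fill (uint32_t *bits, int n, uint32_t filler)
{
    for (int i = 0; i < n; i++)
        bits[i] = filler;
    return 1;
}

/* In this model the general layer has no fill; only the fast layer does. */
static impl_t general_layer = { "general", NULL, NULL };
static impl_t fast_layer    = { "fast", &general_layer, fast_fill };

/* "disabled" stands in for an environment variable's value. */
static impl_t *
choose_chain (const char *disabled)
{
    if (disabled != NULL && strstr (disabled, "fast") != NULL)
        return &general_layer;
    return &fast_layer;
}

/* Walk the chain until some layer provides fill; 0 means nobody did. */
static int
chain_fill (impl_t *top, uint32_t *bits, int n, uint32_t filler)
{
    assert (bits != NULL);
    for (impl_t *i = top; i != NULL; i = i->fallback)
        if (i->fill != NULL)
            return i->fill (bits, n, filler);
    return 0;
}
```

With choose_chain(NULL) the fill succeeds; with choose_chain("fast") it returns 0 and the buffer is untouched, which is the breakage that moving fill into the general layer would avoid.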

Indeed, the fast_path implementation of fill should be easily vectorized by the compiler. I would expect it to be competitive with an assembly implementation. I would expect the implementation chain design to only be useful when multiple vector implementations are supported and selected at runtime -- e.g. the x86 SSE2 vs SSSE3 stuff.
r~
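For reference, a plain C fill of the kind a compiler can usually auto-vectorize might look like the sketch below. This is not the pixman fast_path code: sketch_fill32() is a hypothetical name, it handles 32 bpp only, and it assumes the stride is counted in uint32_t units.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 32-bpp rectangle fill, not part of pixman. The stride is
 * assumed to be counted in uint32_t units. Returns 1 on success. */
static int
sketch_fill32 (uint32_t *bits, int stride, int x, int y,
               int width, int height, uint32_t filler)
{
    assert (bits != NULL);

    if (width <= 0 || height <= 0)
        return 0;

    uint32_t *row = bits + (size_t) y * (size_t) stride + (size_t) x;

    for (int j = 0; j < height; j++)
    {
        /* Simple counted loop over contiguous memory, a common
         * candidate for compiler auto-vectorization. */
        for (int i = 0; i < width; i++)
            row[i] = filler;
        row += stride;
    }
    return 1;
}
```

Whether the inner loop is actually vectorized depends on the compiler and optimization flags, so any performance comparison with an assembly version would need to be measured.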
