Hi,
On Thu, Jun 6, 2024, 12:51 Sean McGovern <gsean...@gmail.com> wrote:

> On Thu, Jun 6, 2024, 05:53 Rémi Denis-Courmont <r...@remlab.net> wrote:
>
>> On 6 June 2024 10:43:05 GMT+03:00, Sean McGovern <gsean...@gmail.com>
>> wrote:
>> >Hi,
>> >
>> >Attached inline is a _non-working_ implementation of flac_wasted32 for
>> >VSX developed on a POWER9 in little-endian mode but probably just as
>> >usable on POWER{8,10}.
>> >
>> >I'm not sure why probably one of the simplest DSP functions in lavc
>> >does not work for me, I imagine this is probably something endian
>> >related even though IBM's documentation for vec_sl()[1] does not
>> >suggest any.
>>
>> Mixing up bytes and elements in the iterator. But you should be able to
>> track this down with gdb or good ol' printf().
>>
>> >Here's my code:
>> >
>> >#define VSX_STRIDE 16
>> >
>> >void ff_flac_wasted32_vsx(int32_t *decoded, int wasted, int len)
>> >{
>> >    register vec_s32 vec1;
>> >    register vec_u32 vec2 = { wasted, wasted, wasted, wasted };
>>
>> There should be an instruction to splat a scalar to a vector. Better yet
>> use vector-scalar shift, if VSX has it.
>>
>
> In the POWER ISA, vec_splat() only accepts an immediate, so I think this
> is the only way to do it in flac_wasted32.
>
>> >    register vec_s32 shifted;
>> >
>> >    for (int i = 0; i < len; i += VSX_STRIDE) {
>> >        vec1 = vec_vsx_ld(i, decoded);
>> >        shifted = vec_sl(vec1, vec2);
>> >        vec_vsx_st(shifted, i, decoded);
>> >    }
>> >}
>> >
>> >Anyone with experience with AltiVec or VSX see something obvious I am
>> >missing?
>> >
>> >-- Sean McGovern
>> >
>> >[1] https://www.ibm.com/docs/en/xl-c-and-cpp-linux/16.1.1?topic=functions-vec-sl
>> >_______________________________________________
>> >ffmpeg-devel mailing list
>> >ffmpeg-devel@ffmpeg.org
>> >https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>> >
>> >To unsubscribe, visit link above, or email
>> >ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

I feel the need to correct myself here: it turns out there is a way.
vec_splat() only accepts an immediate, but vec_splats()[1] is what I need
instead.

Thanks for the tips, I have a working version of wasted32 for VSX now. I'll
tackle wasted33 next and then submit them.

-- Sean McGovern

[1] https://www.ibm.com/docs/en/xl-c-and-cpp-linux/16.1.1?topic=functions-vec-splats
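
For anyone reading the thread later, here is a rough sketch of how the two points raised above might fit together: the offset passed to vec_vsx_ld()/vec_vsx_st() is in bytes while len counts int32_t elements, and vec_splats() can broadcast the run-time shift count. This is not the working version mentioned above, which was not posted in this message. The function name flac_wasted32_sketch, the __VSX__ guard, the scalar tail loop and the assert on the shift range are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__VSX__)
#include <altivec.h>
#endif

/* Hypothetical name; not the function submitted to FFmpeg. */
void flac_wasted32_sketch(int32_t *decoded, int wasted, int len)
{
    int i = 0;

    /* Assumption: the caller passes a shift count valid for 32-bit lanes. */
    assert(wasted >= 0 && wasted < 32);

#if defined(__VSX__)
    /* vec_splats() broadcasts a run-time scalar to every lane. */
    const __vector unsigned int shift = vec_splats((unsigned int)wasted);

    /* i counts elements; the load/store offset is in bytes, hence i * 4. */
    for (; i + 4 <= len; i += 4) {
        __vector signed int v = vec_vsx_ld(i * 4, decoded);
        v = vec_sl(v, shift);
        vec_vsx_st(v, i * 4, decoded);
    }
#endif

    /* Scalar tail; also the whole loop when VSX is not available. */
    for (; i < len; i++)
        decoded[i] = (int32_t)((uint32_t)decoded[i] << wasted);
}
```

The scalar tail handles a len that is not a multiple of four, so the vector loop never touches memory past the end of the buffer.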