2018-11-27 0:17 GMT+01:00, Carl Eugen Hoyos <ceffm...@gmail.com>:
> 2018-11-17 9:12 GMT+01:00, Lauri Kasanen <c...@gmx.com>:
>> ./ffmpeg_g -f rawvideo -pix_fmt rgb24 -s hd1080 -i /dev/zero -pix_fmt yuv420p \
>> -f null -vframes 100 -v error -nostats -
>>
>> 1158 UNITS in planar1, 65528 runs, 8 skips
>>
>> -cpuflags 0
>>
>> 19082 UNITS in planar1, 65533 runs, 3 skips
>>
>> 16.48 speedup ratio. On x86, SSE2 is ~7. Curiously, the Power C version
>> takes as many cycles as the x86 SSE2 version, yikes it's fast.
>>
>> Note that this function uses VSX instructions, but is not marked so.
>> This is because several existing functions also make that mistake.
>> I'll submit a patch moving them once this is reviewed.
>>
>> v2: Remove !BE check
>> Signed-off-by: Lauri Kasanen <c...@gmx.com>
>> ---
>>  libswscale/ppc/swscale_altivec.c | 53 ++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 53 insertions(+)
>>
>> diff --git a/libswscale/ppc/swscale_altivec.c b/libswscale/ppc/swscale_altivec.c
>> index 2fb2337..8c6056d 100644
>> --- a/libswscale/ppc/swscale_altivec.c
>> +++ b/libswscale/ppc/swscale_altivec.c
>> @@ -324,6 +324,53 @@ static void hScale_altivec_real(SwsContext *c, int16_t *dst, int dstW,
>>              }
>>          }
>>  }
>> +
>> +static void yuv2plane1_8_u(const int16_t *src, uint8_t *dest, int dstW,
>> +                           const uint8_t *dither, int offset, int start)
>> +{
>> +    int i;
>> +    for (i = start; i < dstW; i++) {
>> +        int val = (src[i] + dither[(i + offset) & 7]) >> 7;
>> +        dest[i] = av_clip_uint8(val);
>> +    }
>> +}
>> +
>> +static void yuv2plane1_8_altivec(const int16_t *src, uint8_t *dest, int dstW,
>> +                                 const uint8_t *dither, int offset)
>> +{
>> +    const int dst_u = -(uintptr_t)dest & 15;
>> +    int i, j;
>> +    LOCAL_ALIGNED(16, int16_t, val, [16]);
>
>> +    const vector uint16_t shifts = (vector uint16_t) {7, 7, 7, 7, 7, 7, 7, 7};
>
> The patch breaks compilation with xlc, sorry for not testing earlier:
> libswscale/ppc/swscale_altivec.c:344:11: error: unknown type name 'vector'
>     const vector uint16_t shifts = (vector uint16_t) {7, 7, 7, 7, 7, 7, 7, 7};

In case this error does not make much sense to you, don't worry too
much, the following change was necessary to make xlc pass rv20-1239:

diff --git a/fftools/ffmpeg_filter.c b/fftools/ffmpeg_filter.c
index 6518d50..fb749c5 100644
--- a/fftools/ffmpeg_filter.c
+++ b/fftools/ffmpeg_filter.c
@@ -744,6 +744,7 @@ static int configure_input_video_filter
     InputFile *f = input_files[ist->file_index];
     AVRational tb = ist->framerate.num ? av_inv_q(ist->framerate) :
                                          ist->st->time_base;
+if(!ist->framerate.num)tb = ist->st->time_base;
     AVRational fr = ist->framerate;
     AVRational sar;
     AVBPrint args;

;-)
(As expected, other tests also fail.)

Carl Eugen
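
For readers who want to check the arithmetic of the quoted patch without a PowerPC toolchain, the scalar path can be tried on its own. The following is only a minimal sketch: `clip_uint8` is a hypothetical stand-in for FFmpeg's `av_clip_uint8()`, and `yuv2plane1_8_ref` is a made-up name for a copy of the quoted `yuv2plane1_8_u()` loop. It does not touch the `vector` code that xlc rejects.

```c
#include <stdint.h>

/* Hypothetical stand-in for FFmpeg's av_clip_uint8(), so that the
 * sketch compiles without the FFmpeg headers. */
static uint8_t clip_uint8(int v)
{
    if (v < 0)
        return 0;
    if (v > 255)
        return 255;
    return (uint8_t)v;
}

/* Copy of the scalar loop from the quoted patch (yuv2plane1_8_u):
 * add the 8-entry dither pattern, drop 7 fractional bits, clip to 8 bits.
 * Assumption: >> on a negative int is an arithmetic shift, which is
 * implementation-defined in C but holds on common compilers. */
static void yuv2plane1_8_ref(const int16_t *src, uint8_t *dest, int dstW,
                             const uint8_t *dither, int offset, int start)
{
    for (int i = start; i < dstW; i++) {
        int val = (src[i] + dither[(i + offset) & 7]) >> 7;
        dest[i] = clip_uint8(val);
    }
}
```

With a constant dither value of 64, an input sample of 16384 maps to 128, 32767 saturates to 255, and a negative sample such as -200 clips to 0. The `start` argument lets the same loop handle only the samples from a given index onward, leaving earlier output bytes untouched.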