IndecisiveTurtle:
> From: IndecisiveTurtle <geoste...@gmail.com>
>
> Performance wise, encoding a 1080p 1-minute video is performed in about 2.5
> minutes with the cpu encoder running on my Ryzen 5 4600H, while it takes
> about 30 seconds on my NVIDIA GTX 1650
>
> Haar shader has a subgroup optimized variant that applies when configured
> wavelet depth allows it
> ---
> +
> +void put_vc2_ue_uint(inout PutBitContext pb, uint val)
> +{
> +    int pbits = 0, topbit = 1, maxval = 1, bits = 0;
> +    if (val == 0)
> +    {
> +        put_bits(pb, 1, 1);
> +        return;
> +    }
> +    val++;
> +
> +    while (val > maxval)
> +    {
> +        topbit <<= 1;
> +        bits++;
> +        maxval <<= 1;
> +        maxval |= 1;
> +    }
> +
> +    for (int i = 0; i < bits; i++)
> +    {
> +        topbit >>= 1;
> +        pbits <<= 2;
> +        if ((val & topbit) != 0)
> +            pbits |= 1;
> +    }
> +
> +    put_bits(pb, bits * 2 + 1, (pbits << 1) | 1);
> +}
> +

You are still using the old and inefficient way to write VC-2 exponential
coded integers. Improving this gave a nice speed boost to the software encoder.

- Andreas
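
For illustration only, here is a minimal C sketch of one way to build the same codeword without the two per-bit loops. It is not necessarily the change made in the software encoder. It computes the number of payload bits with a log2 and spreads them onto every other bit position with a fixed sequence of shifts and masks. The names Vc2Code, floor_log2, spread_bits, vc2_ue_code and vc2_ue_code_loop are hypothetical; the loop variant mirrors the quoted shader and is kept only for comparison.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container: codeword value and its length in bits. */
typedef struct Vc2Code {
    uint64_t code;
    int      len;
} Vc2Code;

/* floor(log2(v)) for v != 0, using a fixed number of steps. */
static int floor_log2(uint32_t v)
{
    int n = 0;
    if (v & 0xFFFF0000u) { v >>= 16; n += 16; }
    if (v & 0x0000FF00u) { v >>= 8;  n += 8;  }
    if (v & 0x000000F0u) { v >>= 4;  n += 4;  }
    if (v & 0x0000000Cu) { v >>= 2;  n += 2;  }
    if (v & 0x00000002u) {           n += 1;  }
    return n;
}

/* Move bit i of x to bit 2*i, leaving zeros in the odd positions. */
static uint64_t spread_bits(uint32_t x)
{
    uint64_t v = x;
    v = (v | (v << 16)) & 0x0000FFFF0000FFFFULL;
    v = (v | (v << 8))  & 0x00FF00FF00FF00FFULL;
    v = (v | (v << 4))  & 0x0F0F0F0F0F0F0F0FULL;
    v = (v | (v << 2))  & 0x3333333333333333ULL;
    v = (v | (v << 1))  & 0x5555555555555555ULL;
    return v;
}

/* Loop-free construction of the interleaved exp-Golomb codeword. */
static Vc2Code vc2_ue_code(uint32_t val)
{
    Vc2Code out;
    assert(val != UINT32_MAX);              /* val + 1 must not wrap */
    uint32_t v   = val + 1;
    int      n   = floor_log2(v);           /* bits below the top bit */
    uint32_t low = v & ((1u << n) - 1u);    /* drop the implicit top bit */
    out.code = (spread_bits(low) << 1) | 1; /* "0b" pairs, then a final 1 */
    out.len  = 2 * n + 1;
    return out;
}

/* Reference: same structure as the quoted shader, with 64-bit pbits. */
static Vc2Code vc2_ue_code_loop(uint32_t val)
{
    Vc2Code  out;
    uint64_t pbits = 0;
    uint32_t topbit = 1, maxval = 1, v = val + 1;
    int      bits = 0;

    while (v > maxval) {
        topbit <<= 1;
        bits++;
        maxval = (maxval << 1) | 1;
    }
    for (int i = 0; i < bits; i++) {
        topbit >>= 1;
        pbits <<= 2;
        if (v & topbit)
            pbits |= 1;
    }
    out.code = (pbits << 1) | 1;
    out.len  = bits * 2 + 1;
    return out;
}
```

With this sketch, val = 5 gives the 5-bit codeword 01001 (code 9, len 5), the same as the loop version. In the shader, the GLSL built-in findMSB() could take the place of floor_log2; since pbits there is a 32-bit int, a 32-bit spread would only cover codewords up to 31 bits.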