[FFmpeg-devel] QuickTime Animation (qtrle) decoder needs heavy rewriting for 1-bit video
I've looked at the XAnim source regarding the decoding of QuickTime Animation (qtrle in FFmpeg) (XAnim's author calls the RLE codec "Graphics", but that's wrong). Anyway, it uses a lot of opcodes and other things that aren't implemented at all in libavcodec/qtrle.c. So far the decoder has managed to work in spite of that, except for the 1-bit mode, which is rather broken. I have tried to mess with it a bit, borrowing code from XAnim, but I have yet to get it to work properly.

I've noticed that the venerable rotating globe file below is "jaggy" at the right edges, and that's just one of the problems. That jagginess shouldn't be there at all.

It would be nice if someone with more knowledge than me could fix the 1-bit mode. The interest is probably lukewarm, but I wanted to mention the issue anyway.

Sample: https://drive.google.com/open?id=0B3_pEBoLs0faTThSek1EeXQ0ZHM

Mats

-- 
Mats Peterson
http://matsp888.no-ip.org/~mats/
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] QuickTime Animation (qtrle) decoder needs heavy rewriting for 1-bit video
Mats Peterson writes:

> I have tried to mess with it a bit, borrowing code from XAnim,

Please stop this now and please be more careful!

Carl Eugen
Re: [FFmpeg-devel] QuickTime Animation (qtrle) decoder needs heavy rewriting for 1-bit video
On 12/31/2015 11:19 AM, Carl Eugen Hoyos wrote:
> Mats Peterson writes:
>
>> I have tried to mess with it a bit, borrowing code from XAnim,
>
> Please stop this now and please be more careful!
>
> Carl Eugen

Pardon me?

-- 
Mats Peterson
http://matsp888.no-ip.org/~mats/
Re: [FFmpeg-devel] QuickTime Animation (qtrle) decoder needs heavy rewriting for 1-bit video
On 12/31/2015 11:19 AM, Carl Eugen Hoyos wrote:
>> I have tried to mess with it a bit, borrowing code from XAnim,
>
> Please stop this now and please be more careful!

Just to let you know, I can do what I want with the code while experimenting. Thank you for your supportive input, and a Happy New Year.

Mats
Re: [FFmpeg-devel] QuickTime Animation (qtrle) decoder needs heavy rewriting for 1-bit video
On primidi 11 Nivôse, year CCXXIV, Mats Peterson wrote:
> Just to let you know, I can do what I want with the code while
> experimenting.

Not exactly, no. If you look at another project, even while experimenting, then the final code can be said to be inspired by it, and for some people that is enough to claim copyright, even if the legal grounds are flimsy.

Since XAnim is not Libre Software, code derived from it cannot be added to FFmpeg. Where the limit between "inspired by" and "derived from" lies is subjective, so you need to be very careful about how you do it to stay well on the safe side.

Regards,

-- 
Nicolas George
Re: [FFmpeg-devel] QuickTime Animation (qtrle) decoder needs heavy rewriting for 1-bit video
On 12/31/2015 12:28 PM, Nicolas George wrote:
> On primidi 11 Nivôse, year CCXXIV, Mats Peterson wrote:
>> Just to let you know, I can do what I want with the code while
>> experimenting.
>
> Not exactly, no. If you look at another project, even while experimenting,
> then the final code can be said to be inspired by it, and for some people
> that is enough to claim copyright, even if the legal grounds are flimsy.
>
> Since XAnim is not Libre Software, code derived from it can not be added
> to FFmpeg. Where the limit between "inspired by" and "derived from" lies
> is subjective, you need to be very careful on how you do it to be well on
> the safe side.

Yes, point taken once again. I would like to see someone solve the 1-bit qtrle issue one way or another, though. I'm currently not capable of doing it. But I'm afraid it's not possible, since there is no official documentation of the QuickTime Animation codec as far as I know. Pity.

Mats

-- 
Mats Peterson
http://matsp888.no-ip.org/~mats/
Re: [FFmpeg-devel] [libav-devel] [RFC] Cineform HD questions
On 2015-12-31 07:02, Kieran Kunhya wrote:
>> Apart from that, again from a quick glance, there are a ton of
>> mallocs/frees. Can these somehow get consolidated?
>
> Yes, that's what I don't know how to solve easily. They should of
> course be a single allocated buffer that's reused.

Forgive me if I missed something, but aren't they just mallocs of a static size? Space for 8M int16_t each? Just sum that up and do one malloc. It could even move into the init function rather than being done per frame. Put one variable or each variable into the private context struct. If it aids performance, perhaps assign a local variable to each one in the decode function.

I haven't read the code in detail, but I will point out that you have many style issues. A space should follow the if and while keywords before the opening parenthesis, and the opening brace should be on the same line too, i.e.

    if (...) {

not

    if( ... )
    {
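A minimal sketch of the "sum it up and do one malloc in init" suggestion above. The context struct, the function names and the plane count are hypothetical and not taken from the Cineform patch under review; plain calloc/free stand in for whatever allocator the decoder would really use.

```c
#include <stdint.h>
#include <stdlib.h>

#define NUM_PLANES 4  /* hypothetical: number of work buffers the decoder needs */

/* Hypothetical private context; field names are illustrative only. */
typedef struct DecoderContext {
    int16_t *pool;               /* the single allocation */
    int16_t *plane[NUM_PLANES];  /* sub-buffers carved out of pool */
    size_t   plane_len;          /* elements per sub-buffer */
} DecoderContext;

/* Called once from init instead of once per frame.
 * Returns 0 on success, -1 on failure. */
static int decoder_alloc_buffers(DecoderContext *ctx, size_t plane_len)
{
    /* Reject sizes whose total byte count would overflow size_t. */
    if (plane_len == 0 || plane_len > SIZE_MAX / sizeof(int16_t) / NUM_PLANES)
        return -1;

    ctx->pool = calloc(NUM_PLANES * plane_len, sizeof(*ctx->pool));
    if (!ctx->pool)
        return -1;

    ctx->plane_len = plane_len;
    for (int i = 0; i < NUM_PLANES; i++)
        ctx->plane[i] = ctx->pool + (size_t)i * plane_len;
    return 0;
}

/* Called once from close: one free releases every sub-buffer. */
static void decoder_free_buffers(DecoderContext *ctx)
{
    free(ctx->pool);
    ctx->pool = NULL;
    for (int i = 0; i < NUM_PLANES; i++)
        ctx->plane[i] = NULL;
}
```

The decode function can then copy ctx->plane[i] into local pointers if that helps performance, as suggested above, without touching the allocator per frame.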
Re: [FFmpeg-devel] [PATCH] lavd/avfoundation: overhaul
Hi, > this patchset fixes most if not all bugs reported to trac for me. > [...] sorry I forgot to mention that this patchset has already become obsolete for the joint approach to push a common base for avfoundation into FFmpeg & Libav. I'm very short of time right now but I will update this & the avfoundation device as soon as I can! -Thilo ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] support for reading / writing encrypted MP4 files
On Wed, Dec 30, 2015 at 09:53:35PM, Eran Kornblau wrote:
> > > Please let me know if you think that is ok, and I will resubmit the patch
> > > with all fixes.
> >
> > should be ok
> >
> Updated patch attached, diff from previous patch is:
>
> --- a/libavformat/mov.c
> +++ b/libavformat/mov.c
> @@ -4026,6 +4026,14 @@ static int mov_read_frma(MOVContext *c, AVIOContext *pb, MOVAtom atom)
>      case MKTAG('e','n','c','v'):    // encrypted video
>      case MKTAG('e','n','c','a'):    // encrypted audio
>          id = mov_codec_id(st, format);
> +        if (st->codec->codec_id != AV_CODEC_ID_NONE &&
> +            st->codec->codec_id != id) {
> +            av_log(c->fc, AV_LOG_WARNING,
> +                   "ignoring 'frma' atom of '%.4s', stream has codec id %d\n",
> +                   (char*)&format, st->codec->codec_id);
> +            break;
> +        }
> +
>          st->codec->codec_id = id;
>          sc->format = format;
>          break;
> @@ -4045,7 +4053,6 @@ static int mov_read_senc(MOVContext *c, AVIOContext *pb, MOVAtom atom)
>      AVStream *st;
>      MOVStreamContext *sc;
>      size_t auxiliary_info_size;
> -    int ret;
>
>      if (c->decryption_key_len == 0 || c->fc->nb_streams < 1)
>          return 0;
> @@ -4091,12 +4098,7 @@ static int mov_read_senc(MOVContext *c, AVIOContext *pb, MOVAtom atom)
>          return AVERROR(ENOMEM);
>      }
>
> -    ret = av_aes_ctr_init(sc->cenc.aes_ctr, c->decryption_key);
> -    if (ret) {
> -        return ret;
> -    }
> -
> -    return 0;
> +    return av_aes_ctr_init(sc->cenc.aes_ctr, c->decryption_key);
>  }
>
>  static int cenc_filter(MOVContext *c, MOVStreamContext *sc, uint8_t *input, int size)
> @@ -4107,7 +4109,7 @@ static int cenc_filter(MOVContext *c, MOVStreamContext *sc, uint8_t *input, int
>      uint8_t* input_end = input + size;
>
>      /* read the iv */
> -    if (sc->cenc.auxiliary_info_pos + AES_CTR_IV_SIZE > sc->cenc.auxiliary_info_end) {
> +    if (AES_CTR_IV_SIZE > sc->cenc.auxiliary_info_end - sc->cenc.auxiliary_info_pos) {
>          av_log(c->fc, AV_LOG_ERROR, "failed to read iv from the auxiliary info\n");
>          return AVERROR_INVALIDDATA;
>      }
> @@ -4123,7 +4125,7 @@ static int cenc_filter(MOVContext *c, MOVStreamContext *sc, uint8_t *input, int
>      }
>
>      /* read the subsample count */
> -    if (sc->cenc.auxiliary_info_pos + sizeof(uint16_t) > sc->cenc.auxiliary_info_end) {
> +    if (sizeof(uint16_t) > sc->cenc.auxiliary_info_end - sc->cenc.auxiliary_info_pos) {
>          av_log(c->fc, AV_LOG_ERROR, "failed to read subsample count from the auxiliary info\n");
>          return AVERROR_INVALIDDATA;
>      }
> @@ -4133,7 +4135,7 @@ static int cenc_filter(MOVContext *c, MOVStreamContext *sc, uint8_t *input, int
>
>      for (; subsample_count > 0; subsample_count--)
>      {
> -        if (sc->cenc.auxiliary_info_pos + 6 > sc->cenc.auxiliary_info_end) {
> +        if (6 > sc->cenc.auxiliary_info_end - sc->cenc.auxiliary_info_pos) {
>              av_log(c->fc, AV_LOG_ERROR, "failed to read subsample from the auxiliary info\n");
>              return AVERROR_INVALIDDATA;
>          }
> @@ -4144,7 +4146,7 @@ static int cenc_filter(MOVContext *c, MOVStreamContext *sc, uint8_t *input, int
>          encrypted_bytes = AV_RB32(sc->cenc.auxiliary_info_pos);
>          sc->cenc.auxiliary_info_pos += sizeof(uint32_t);
>
> -        if (input + clear_bytes + encrypted_bytes > input_end) {
> +        if ((uint64_t)clear_bytes + encrypted_bytes > input_end - input) {
>              av_log(c->fc, AV_LOG_ERROR, "subsample size exceeds the packet size left\n");
>              return AVERROR_INVALIDDATA;
>          }
>
> > [...]
> >
> > --
> > Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
> >
>
> Thanks, Michael !
>
> Eran

>  Changelog          |    1
>  libavformat/isom.h |   13 +++
>  libavformat/mov.c  |  181 +
>  3 files changed, 195 insertions(+)
> 5974fab38debc4fae0595bcdfec63d500932495a  0001-mov-support-cenc-common-encryption.patch
> From 2021b91bd195a20ae346b877810661dddfa73144 Mon Sep 17 00:00:00 2001
> From: erankor
> Date: Mon, 7 Dec 2015 12:30:50 +0200
> Subject: [PATCH 1/2] mov: support cenc (common encryption)
>
> support reading encrypted mp4 using aes-ctr, conforming to ISO/IEC
> 23001-7.
>
> a new parameter was added:
> - decryption_key - 128 bit decryption key (hex)
> ---
>  Changelog          |   1 +
>  libavformat/isom.h |  13
>  libavformat/mov.c  | 181 +
>  3 files changed, 195 insertions(+)

patch applied

thanks

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

If a bugfix only changes things apparently unrelated to the bug with no
further explanation, that is a good sign that the bugfix is wrong.
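The bounds checks in the diff above were rewritten from the form "pos + n > end" to "n > end - pos". A minimal sketch of why that form is safer follows; the helper names has_bytes and subsample_fits are hypothetical and do not exist in mov.c, and the sketch assumes pos <= end with both pointers inside (or one past) the same buffer.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of mov.c): returns 1 if at least `need`
 * bytes remain between pos and end, 0 otherwise.
 *
 * The naive form `pos + need > end` first forms a pointer that may lie far
 * beyond the buffer, which is undefined behaviour in C and may wrap around.
 * Subtracting first keeps every intermediate value inside a valid range.
 */
static int has_bytes(const uint8_t *pos, const uint8_t *end, size_t need)
{
    return need <= (size_t)(end - pos);
}

/*
 * Same idea for two 32-bit lengths read from the file: widen to 64 bits
 * before adding so the sum cannot wrap, then compare with the bytes left.
 */
static int subsample_fits(const uint8_t *input, const uint8_t *input_end,
                          uint32_t clear_bytes, uint32_t encrypted_bytes)
{
    return (uint64_t)clear_bytes + encrypted_bytes
           <= (uint64_t)(input_end - input);
}
```

With 32-bit arithmetic, clear_bytes = 0xFFFFFFF0 and encrypted_bytes = 0x20 would sum to 0x10 and pass a naive check against a 16-byte packet; the widened comparison rejects it.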
Re: [FFmpeg-devel] [PATCH] lavf/qtpalette: Ignore greyscale bit in 1-bit video
On 12/30/2015 07:56 PM, Mats Peterson wrote:
> On 12/30/2015 07:52 PM, Mats Peterson wrote:
>> Michael, can you apply this one in the meantime, just to get rid of it,
>> and if it seems sensible, until someone discovers how to solve the 1-bit
>> palettized qtrle issue?
>
> In the decoder, that is.
>
> Mats

Michael? Do you see this one?

Mats

-- 
Mats Peterson
http://matsp888.no-ip.org/~mats/
Re: [FFmpeg-devel] [PATCH] lavf/qtpalette: Ignore greyscale bit in 1-bit video
On Thu, Dec 31, 2015 at 04:47:20PM +0100, Mats Peterson wrote:
> On 12/30/2015 07:56 PM, Mats Peterson wrote:
> >On 12/30/2015 07:52 PM, Mats Peterson wrote:
> >>Michael, can you apply this one in the meantime, just to get rid of it,
> >>and if it seems sensible, until someone discovers how to solve the 1-bit
> >>palettized qtrle issue?
> >>
> >
> >In the decoder, that is.
> >
> >Mats
> >
>
> Michael? Do you see this one?

Yes, I am waiting, as people previously complained about not having had enough time to review.

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

You can kill me, but you cannot change the truth.
Re: [FFmpeg-devel] [PATCH] lavf/qtpalette: Ignore greyscale bit in 1-bit video
On 12/31/2015 04:54 PM, Michael Niedermayer wrote:
> On Thu, Dec 31, 2015 at 04:47:20PM +0100, Mats Peterson wrote:
>> On 12/30/2015 07:56 PM, Mats Peterson wrote:
>>> On 12/30/2015 07:52 PM, Mats Peterson wrote:
>>>> Michael, can you apply this one in the meantime, just to get rid of it,
>>>> and if it seems sensible, until someone discovers how to solve the 1-bit
>>>> palettized qtrle issue?
>>>
>>> In the decoder, that is.
>>>
>>> Mats
>>
>> Michael? Do you see this one?
>
> yes, iam waiting as previously people complained about having had not
> enough time to review
>
> [...]

There is nothing to review really. 1-bit mode should be palettized, that's it. And this patch doesn't affect the current state of the qtrle decoder, since it will use monobw anyway. But it's nice to have this part "done".

Mats

-- 
Mats Peterson
http://matsp888.no-ip.org/~mats/
Re: [FFmpeg-devel] [PATCH] lavf/qtpalette: Ignore greyscale bit in 1-bit video
Mats Peterson writes:

> And this patch doesn't affect the current state of
> the qtrle decoder

Then there is no need for this patch to be committed yet.

Carl Eugen
Re: [FFmpeg-devel] [PATCH] lavf/qtpalette: Ignore greyscale bit in 1-bit video
On 12/31/2015 04:59 PM, Carl Eugen Hoyos wrote:
> Mats Peterson writes:
>
>> And this patch doesn't affect the current state of the qtrle decoder
>
> Then there is no need for this patch to be committed yet.
>
> Carl Eugen

I would gladly get rid of this one-liner in the meantime. It's "my" file at that.

Mats

-- 
Mats Peterson
http://matsp888.no-ip.org/~mats/
Re: [FFmpeg-devel] [PATCH] jpegls: allocate large enough zero buffer
On 30.12.2015 21:12, Andreas Cadhalpun wrote:
> It is read up to length s->width * stride, which can be larger than the
> linesize. (stride = (s->nb_components > 1) ? 3 : 1)
>
> This fixes an out of bounds read.
>
> Signed-off-by: Andreas Cadhalpun
> ---
>  libavcodec/jpeglsdec.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/libavcodec/jpeglsdec.c b/libavcodec/jpeglsdec.c
> index 68151cb..11ffe93 100644
> --- a/libavcodec/jpeglsdec.c
> +++ b/libavcodec/jpeglsdec.c
> @@ -348,7 +348,7 @@ int ff_jpegls_decode_picture(MJpegDecodeContext *s, int near,
>      JLSState *state;
>      int off = 0, stride = 1, width, shift, ret = 0;
>
> -    zero = av_mallocz(s->picture_ptr->linesize[0]);
> +    zero = av_mallocz(FFMAX(s->picture_ptr->linesize[0], s->width * ((s->nb_components > 1) ? 3 : 1)));
>      if (!zero)
>          return AVERROR(ENOMEM);
>      last = zero;
>

A better fix is to error out before this happens.
Patch doing that attached.

Best regards,
Andreas

From 637a849f80bff4acaa42afe8cb4d2dd60fc4248a Mon Sep 17 00:00:00 2001
From: Andreas Cadhalpun
Date: Thu, 31 Dec 2015 16:55:43 +0100
Subject: [PATCH] mjpegdec: extend check for incompatible values of s->rgb and s->ls

This can happen if s->ls changes from 0 to 1, but picture allocation is
skipped due to s->interlaced.

In that case ff_jpegls_decode_picture could be called even though the
s->picture_ptr frame has the wrong pixel format and thus a wrong
linesize, which results in a too small zero buffer being allocated.

This fixes an out-of-bounds read in ls_decode_line.

Signed-off-by: Andreas Cadhalpun
---
 libavcodec/mjpegdec.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/libavcodec/mjpegdec.c b/libavcodec/mjpegdec.c
index c812b86..c730e05 100644
--- a/libavcodec/mjpegdec.c
+++ b/libavcodec/mjpegdec.c
@@ -632,7 +632,8 @@ unk_pixfmt:
         av_log(s->avctx, AV_LOG_DEBUG, "decode_sof0: error, len(%d) mismatch\n", len);
     }
 
-    if (s->rgb && !s->lossless && !s->ls) {
+    if ((s->rgb && !s->lossless && !s->ls) ||
+        (!s->rgb && s->ls && s->nb_components > 1)) {
         av_log(s->avctx, AV_LOG_ERROR, "Unsupported coding and pixel format combination\n");
         return AVERROR_PATCHWELCOME;
     }
-- 
2.6.4
Re: [FFmpeg-devel] [PATCH] lavf/qtpalette: Ignore greyscale bit in 1-bit video
On 12/31/2015 05:00 PM, Mats Peterson wrote:
> I would gladly get rid of this one-liner in the meantime. It's "my" file at that.
>
> Mats

Alright, Michael, do whatever you want, but it's awkward to have this patch lying around for no reason, when it's such a small change.

Mats
Re: [FFmpeg-devel] [PATCH] jpegls: allocate large enough zero buffer
On Thu, Dec 31, 2015 at 05:02:14PM +0100, Andreas Cadhalpun wrote: > On 30.12.2015 21:12, Andreas Cadhalpun wrote: > > It is read up to length s->width * stride, which can be larger than the > > linesize. (stride = (s->nb_components > 1) ? 3 : 1) > > > > This fixes an out of bounds read. > > > > Signed-off-by: Andreas Cadhalpun > > --- > > libavcodec/jpeglsdec.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/libavcodec/jpeglsdec.c b/libavcodec/jpeglsdec.c > > index 68151cb..11ffe93 100644 > > --- a/libavcodec/jpeglsdec.c > > +++ b/libavcodec/jpeglsdec.c > > @@ -348,7 +348,7 @@ int ff_jpegls_decode_picture(MJpegDecodeContext *s, int > > near, > > JLSState *state; > > int off = 0, stride = 1, width, shift, ret = 0; > > > > -zero = av_mallocz(s->picture_ptr->linesize[0]); > > +zero = av_mallocz(FFMAX(s->picture_ptr->linesize[0], s->width * > > ((s->nb_components > 1) ? 3 : 1))); > > if (!zero) > > return AVERROR(ENOMEM); > > last = zero; > > > > A better fix is to error out before this happens. > Patch doing that attached. > > Best regards, > Andreas > mjpegdec.c |3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > e4b9f65abd49be0714b6367f8530d1829102e6d8 > 0001-mjpegdec-extend-check-for-incompatible-values-of-s-r.patch > From 637a849f80bff4acaa42afe8cb4d2dd60fc4248a Mon Sep 17 00:00:00 2001 > From: Andreas Cadhalpun > Date: Thu, 31 Dec 2015 16:55:43 +0100 > Subject: [PATCH] mjpegdec: extend check for incompatible values of s->rgb and > s->ls > > This can happen if s->ls changes from 0 to 1, but picture allocation is > skipped due to s->interlaced. > > In that case ff_jpegls_decode_picture could be called even though the > s->picture_ptr frame has the wrong pixel format and thus a wrong > linesize, which results in a too small zero buffer being allocated. > > This fixes an out-of-bounds read in ls_decode_line. 
> > Signed-off-by: Andreas Cadhalpun > --- > libavcodec/mjpegdec.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/libavcodec/mjpegdec.c b/libavcodec/mjpegdec.c > index c812b86..c730e05 100644 > --- a/libavcodec/mjpegdec.c > +++ b/libavcodec/mjpegdec.c > @@ -632,7 +632,8 @@ unk_pixfmt: > av_log(s->avctx, AV_LOG_DEBUG, "decode_sof0: error, len(%d) > mismatch\n", len); > } > > -if (s->rgb && !s->lossless && !s->ls) { > +if ((s->rgb && !s->lossless && !s->ls) || > +(!s->rgb && s->ls && s->nb_components > 1)) { > av_log(s->avctx, AV_LOG_ERROR, "Unsupported coding and pixel format > combination\n"); > return AVERROR_PATCHWELCOME; LGTM thx [...] -- Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB The real ebay dictionary, page 1 "Used only once"- "Some unspecified defect prevented a second use" "In good condition" - "Can be repaird by experienced expert" "As is" - "You wouldnt want it even if you were payed for it, if you knew ..." signature.asc Description: Digital signature ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH] jpegls: allocate large enough zero buffer
On 31.12.2015 17:24, Michael Niedermayer wrote: > On Thu, Dec 31, 2015 at 05:02:14PM +0100, Andreas Cadhalpun wrote: >> On 30.12.2015 21:12, Andreas Cadhalpun wrote: >>> It is read up to length s->width * stride, which can be larger than the >>> linesize. (stride = (s->nb_components > 1) ? 3 : 1) >>> >>> This fixes an out of bounds read. >>> >>> Signed-off-by: Andreas Cadhalpun >>> --- >>> libavcodec/jpeglsdec.c | 2 +- >>> 1 file changed, 1 insertion(+), 1 deletion(-) >>> >>> diff --git a/libavcodec/jpeglsdec.c b/libavcodec/jpeglsdec.c >>> index 68151cb..11ffe93 100644 >>> --- a/libavcodec/jpeglsdec.c >>> +++ b/libavcodec/jpeglsdec.c >>> @@ -348,7 +348,7 @@ int ff_jpegls_decode_picture(MJpegDecodeContext *s, int >>> near, >>> JLSState *state; >>> int off = 0, stride = 1, width, shift, ret = 0; >>> >>> -zero = av_mallocz(s->picture_ptr->linesize[0]); >>> +zero = av_mallocz(FFMAX(s->picture_ptr->linesize[0], s->width * >>> ((s->nb_components > 1) ? 3 : 1))); >>> if (!zero) >>> return AVERROR(ENOMEM); >>> last = zero; >>> >> >> A better fix is to error out before this happens. >> Patch doing that attached. >> >> Best regards, >> Andreas > >> mjpegdec.c |3 ++- >> 1 file changed, 2 insertions(+), 1 deletion(-) >> e4b9f65abd49be0714b6367f8530d1829102e6d8 >> 0001-mjpegdec-extend-check-for-incompatible-values-of-s-r.patch >> From 637a849f80bff4acaa42afe8cb4d2dd60fc4248a Mon Sep 17 00:00:00 2001 >> From: Andreas Cadhalpun >> Date: Thu, 31 Dec 2015 16:55:43 +0100 >> Subject: [PATCH] mjpegdec: extend check for incompatible values of s->rgb and >> s->ls >> >> This can happen if s->ls changes from 0 to 1, but picture allocation is >> skipped due to s->interlaced. >> >> In that case ff_jpegls_decode_picture could be called even though the >> s->picture_ptr frame has the wrong pixel format and thus a wrong >> linesize, which results in a too small zero buffer being allocated. >> >> This fixes an out-of-bounds read in ls_decode_line. 
>> >> Signed-off-by: Andreas Cadhalpun >> --- >> libavcodec/mjpegdec.c | 3 ++- >> 1 file changed, 2 insertions(+), 1 deletion(-) >> >> diff --git a/libavcodec/mjpegdec.c b/libavcodec/mjpegdec.c >> index c812b86..c730e05 100644 >> --- a/libavcodec/mjpegdec.c >> +++ b/libavcodec/mjpegdec.c >> @@ -632,7 +632,8 @@ unk_pixfmt: >> av_log(s->avctx, AV_LOG_DEBUG, "decode_sof0: error, len(%d) >> mismatch\n", len); >> } >> >> -if (s->rgb && !s->lossless && !s->ls) { >> +if ((s->rgb && !s->lossless && !s->ls) || >> +(!s->rgb && s->ls && s->nb_components > 1)) { >> av_log(s->avctx, AV_LOG_ERROR, "Unsupported coding and pixel format >> combination\n"); >> return AVERROR_PATCHWELCOME; > > LGTM Pushed. Best regards, Andreas ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
[FFmpeg-devel] [PATCH 3/3] lavc/cbrt_tablegen: unroll table generation loop
This patch does not seem to have measurable impact, at least on x86-64, though
there could be benefits for less than stellar branch predictors. As such, the
least useful of the series.

Tested with FATE.

Signed-off-by: Ganesh Ajjanagadde
---
 libavcodec/cbrt_tablegen.h | 14 +-
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/libavcodec/cbrt_tablegen.h b/libavcodec/cbrt_tablegen.h
index d3614d8..78a6b1f 100644
--- a/libavcodec/cbrt_tablegen.h
+++ b/libavcodec/cbrt_tablegen.h
@@ -45,11 +45,15 @@ static av_cold void AAC_RENAME(cbrt_tableinit)(void)
     if (!cbrt_tab[(1<<13) - 1].i) {
         cbrt_tab[0].f = 0;
         int i;
-        for (i = 0; i < 1<<13; i++) {
-            if (!(i & 7))
-                cbrt_tab[i].f = 16 * cbrt_tab[i>>3].f;
-            else
-                cbrt_tab[i].f = i * cbrt(i);
+        for (i = 0; i < 1<<13; i+=8) {
+            cbrt_tab[i].f = 16 * cbrt_tab[i>>3].f;
+            cbrt_tab[i+1].f = (i+1) * cbrt(i+1);
+            cbrt_tab[i+2].f = (i+2) * cbrt(i+2);
+            cbrt_tab[i+3].f = (i+3) * cbrt(i+3);
+            cbrt_tab[i+4].f = (i+4) * cbrt(i+4);
+            cbrt_tab[i+5].f = (i+5) * cbrt(i+5);
+            cbrt_tab[i+6].f = (i+6) * cbrt(i+6);
+            cbrt_tab[i+7].f = (i+7) * cbrt(i+7);
         }
 #if USE_FIXED
         for (i = 0; i < 1<<13; i++) {
-- 
2.6.4
[FFmpeg-devel] [PATCH 2/3] lavc/cbrt_tablegen: speed up tablegen slightly
This exploits a very simple property of the cbrt function, obtaining a
non-negligible speed-up. Tables turn out to be identical on GNU/Linux+gcc.

Sample benchmark (Haswell, GNU/Linux+gcc):
new:
6632898 decicycles in cbrt_tableinit, 256 runs, 0 skips
6623909 decicycles in cbrt_tableinit, 512 runs, 0 skips

prev:
7582339 decicycles in cbrt_tableinit, 256 runs, 0 skips
7563556 decicycles in cbrt_tableinit, 512 runs, 0 skips

i.e. very close to the estimated 12.5% speedup.

Tested with FATE.

Signed-off-by: Ganesh Ajjanagadde
---
 libavcodec/cbrt_tablegen.h | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/libavcodec/cbrt_tablegen.h b/libavcodec/cbrt_tablegen.h
index ef4c099..d3614d8 100644
--- a/libavcodec/cbrt_tablegen.h
+++ b/libavcodec/cbrt_tablegen.h
@@ -43,9 +43,13 @@ static union av_intfloat32 cbrt_tab[1 << 13];
 static av_cold void AAC_RENAME(cbrt_tableinit)(void)
 {
     if (!cbrt_tab[(1<<13) - 1].i) {
+        cbrt_tab[0].f = 0;
         int i;
         for (i = 0; i < 1<<13; i++) {
-            cbrt_tab[i].f = i * cbrt(i);
+            if (!(i & 7))
+                cbrt_tab[i].f = 16 * cbrt_tab[i>>3].f;
+            else
+                cbrt_tab[i].f = i * cbrt(i);
         }
 #if USE_FIXED
         for (i = 0; i < 1<<13; i++) {
-- 
2.6.4
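The commit message does not spell out the property it relies on. Reading the diff, it appears to be the scaling behaviour of the cube root at multiples of 8; the derivation below is a reconstruction from the code, with T(i) introduced here as shorthand for the table entry i * cbrt(i).

```latex
% T(i) = i * cbrt(i) is shorthand introduced for this note, not a name from the patch.
\[
  T(i) = i\,\sqrt[3]{i}
\]
% For i = 8k the cube root factors exactly, since \sqrt[3]{8} = 2:
\[
  T(8k) = 8k\,\sqrt[3]{8k} = 8k \cdot 2\,\sqrt[3]{k} = 16\,k\,\sqrt[3]{k} = 16\,T(k)
\]
% Entries with i \equiv 0 \pmod 8 therefore need no cbrt call:
\[
  \frac{\#\{\,i < 2^{13} : i \bmod 8 = 0\,\}}{2^{13}}
  = \frac{2^{10}}{2^{13}} = \frac{1}{8} = 12.5\%
\]
```

This matches the condition !(i & 7) and the lookup cbrt_tab[i>>3] in the diff, and the fraction 1/8 is the estimated 12.5% quoted above. Multiplying a float by 16 is exact barring overflow, which is consistent with the observation that the tables come out identical.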
[FFmpeg-devel] clarification on General Description for FFV1 Draft Specification
Hi all,

I'm reviewing the FFV1 Draft Specification [1] and commenting here specifically on the General Description section; I have some questions to clarify the meaning of this section. I'm cross-posting to ffmpeg-devel; responses are welcome on either list, though encouraged on the IETF Cellar working group listserv [2].

The General Description section [3] contains this introductory paragraph:

"Each frame is split in 1 to 4 planes (Y, Cb, Cr, Alpha). In the case of the normal YCbCr colorspace the Y plane is coded first followed by the Cb and Cr planes, if an Alpha/transparency plane exists, it is coded last. In the case of the JPEG2000-RCT colorspace the lines are interleaved to improve caching efficiency since it is most likely that the RCT will immediately be converted to RGB during decoding; the interleaved coding order is also Y, Cb, Cr, Alpha."

Two colorspaces are referenced, YCbCr (with optional Alpha) and JPEG2000-RCT (Reversible Color Transform), but the RCT sentence doesn't reference planar storage. Does an RCT encoding store data in planes, and if so, how many planes are used and what are their names (3 planes or one packed plane)? Also, is storage of an alpha plane with RCT planes allowed?

What is the meaning of "line" in the paragraph above? What does "In the case of the JPEG2000-RCT colorspace the lines are interleaved" mean?

Re: "Each frame is split in 1 to 4 planes". I'd like to be more specific. From this reading it seems like 2 planes is possible; however, are 2-plane encodings possible (grayscale with alpha, or Y with only Cb)?

Re: "since it is most likely that the RCT will immediately be converted to RGB during decoding". Is any other conversion possible?

Best Regards,
Dave Rice

[1] https://github.com/FFmpeg/FFV1/blob/master/ffv1.md
[2] Cellar listserv subscription info: https://www.ietf.org/mailman/listinfo/cellar
[3] https://github.com/FFmpeg/FFV1/blob/10fa55b6f70e9bd5c8f2347b455059524b1163a3/ffv1.md#general-description
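For readers unfamiliar with the transform the quoted paragraph refers to, here is a minimal sketch of the reversible color transform as it is usually given for JPEG 2000. Whether FFV1 uses exactly these formulas, sample ranges and component names is an assumption here and should be checked against the draft; the function names are hypothetical.

```c
/* Floor division by 4 that does not rely on right-shifting negative ints. */
static int floor_div4(int x)
{
    return x >= 0 ? x / 4 : -((-x + 3) / 4);
}

/* Forward RCT: integer RGB -> (Y, Cb, Cr). Cb and Cr may be negative. */
static void rct_forward(int r, int g, int b, int *y, int *cb, int *cr)
{
    *y  = floor_div4(r + 2 * g + b);
    *cb = b - g;
    *cr = r - g;
}

/*
 * Inverse RCT. Since r + 2g + b = (cb + cr) + 4g, the forward step gives
 * y = g + floor((cb + cr) / 4), so g is recovered exactly and r, b follow.
 */
static void rct_inverse(int y, int cb, int cr, int *r, int *g, int *b)
{
    *g = y - floor_div4(cb + cr);
    *r = cr + *g;
    *b = cb + *g;
}
```

Because the inverse is exact in integer arithmetic, a decoder can return the original RGB samples losslessly, which is presumably why the draft expects the RCT output to be converted to RGB straight away.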
[FFmpeg-devel] [PATCH 1/3] lavc/cbrt_tablegen: convert cbrt_tab to a union
This is used to prepare for optimizations of the table generation, which is inherently done as floating point computation of i * cbrt[i]. Tested with FATE. Signed-off-by: Ganesh Ajjanagadde --- libavcodec/aacdec_fixed.c| 4 ++-- libavcodec/aacdec_template.c | 2 +- libavcodec/cbrt_tablegen.h | 23 +-- 3 files changed, 12 insertions(+), 17 deletions(-) diff --git a/libavcodec/aacdec_fixed.c b/libavcodec/aacdec_fixed.c index 923fbe0..ebc585e 100644 --- a/libavcodec/aacdec_fixed.c +++ b/libavcodec/aacdec_fixed.c @@ -154,9 +154,9 @@ static void vector_pow43(int *coefs, int len) for (i=0; i #include #include "libavutil/attributes.h" +#include "libavutil/intfloat.h" #include "libavcodec/aac_defines.h" -#if USE_FIXED -#define CBRT(x) lrint((x).f * 8192) -#else -#define CBRT(x) x.i -#endif - #if CONFIG_HARDCODED_TABLES #if USE_FIXED #define cbrt_tableinit_fixed() @@ -43,20 +38,20 @@ #include "libavcodec/cbrt_tables.h" #endif #else -static uint32_t cbrt_tab[1 << 13]; +static union av_intfloat32 cbrt_tab[1 << 13]; static av_cold void AAC_RENAME(cbrt_tableinit)(void) { -if (!cbrt_tab[(1<<13) - 1]) { +if (!cbrt_tab[(1<<13) - 1].i) { int i; for (i = 0; i < 1<<13; i++) { -union { -float f; -uint32_t i; -} f; -f.f = cbrt(i) * i; -cbrt_tab[i] = CBRT(f); +cbrt_tab[i].f = i * cbrt(i); } +#if USE_FIXED +for (i = 0; i < 1<<13; i++) { +cbrt_tab[i].i = lrint(cbrt_tab[i].f * 8192); +} +#endif } } #endif /* CONFIG_HARDCODED_TABLES */ -- 2.6.4 ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH 3/3] lavc/cbrt_tablegen: unroll table generation loop
Hi, On Thu, Dec 31, 2015 at 11:39 AM, Ganesh Ajjanagadde wrote: > This patch does not seem to have measurable impact, at least on x86-64, > though > there could be benefits for less than stellar branch predictors. > [..] > -for (i = 0; i < 1<<13; i++) { > -if (!(i & 7)) > -cbrt_tab[i].f = 16 * cbrt_tab[i>>3].f; > -else > -cbrt_tab[i].f = i * cbrt(i); > +for (i = 0; i < 1<<13; i+=8) { > +cbrt_tab[i].f = 16 * cbrt_tab[i>>3].f; > +cbrt_tab[i+1].f = (i+1) * cbrt(i+1); > +cbrt_tab[i+2].f = (i+2) * cbrt(i+2); > +cbrt_tab[i+3].f = (i+3) * cbrt(i+3); > +cbrt_tab[i+4].f = (i+4) * cbrt(i+4); > +cbrt_tab[i+5].f = (i+5) * cbrt(i+5); > +cbrt_tab[i+6].f = (i+6) * cbrt(i+6); > +cbrt_tab[i+7].f = (i+7) * cbrt(i+7); gcc (and most other compilers) will unroll the loop automatically, I suspect. Check disassembly to confirm? (That doesn't mean the patch shouldn't go in, I'm just trying to help you explain the result. I have no comment on the patch itself.) Ronald ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] clarification on General Description for FFV1 Draft Specification
Dave Rice writes:

> Re: "Each frame is split in 1 to 4 planes". I'd like to be
> more specific. From this reading it seems like 2 planes is
> possible; however, are 2 plane encodings possible
> (grayscale with alpha or Y with only Cb)?

GRAY8A and GRAY16A are possible; both have two planes.

> Re: "since it is most likely that the RCT will immediately
> be converted to RGB during decoding". Is there anyone
> other conversion possible?

I may misunderstand, but I think it is possible to watch RCT data as if it were YUV (for example for debugging purposes).

Carl Eugen
Re: [FFmpeg-devel] [PATCH] lavf/qtpalette: Ignore greyscale bit in 1-bit video
On 12/31/2015 05:04 PM, Mats Peterson wrote:
> Alright, Michael, do whatever you want, but it's awkward to have this patch lying around for no reason, when it's such a small change.
>
> Mats

Let me repeat a snippet from the QuickTime File Format Specification:

"Depth: A 16-bit integer that indicates the pixel depth of the compressed image. Values of 1, 2, 4, 8, 16, 24, and 32 indicate the depth *of color images*. The value 32 should be used only if the image contains an alpha channel. Values of 34, 36, and 40 indicate 2-, 4-, and 8-bit grayscale, respectively, for grayscale images."

Notice that 1-bit depth is in the category "color images". There is nothing to hesitate about here. Also, once again, there is no mention of value 33 (1-bit video with the greyscale bit set) in the grayscale sentence. Hence, the greyscale bit should be ignored in 1-bit video.

Mats

-- 
Mats Peterson
http://matsp888.no-ip.org/~mats/
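The rule argued for above can be sketched as a small lookup on the depth field. This is an illustration built only from the values listed in the quoted specification text; the struct and function names are hypothetical and this is not the actual lavf/qtpalette code.

```c
#include <stdint.h>

/* The documented greyscale depths are 32 + 2, 32 + 4 and 32 + 8. */
#define QT_GREYSCALE_FLAG 32

/* Hypothetical result type for this sketch. */
typedef struct QtDepth {
    int bits;       /* bits per pixel */
    int greyscale;  /* 1 for the documented greyscale depths, else 0 */
} QtDepth;

/* Hypothetical helper: interpret the 16-bit Depth field. */
static QtDepth qt_parse_depth(uint16_t depth_field)
{
    QtDepth d = { depth_field, 0 };

    switch (depth_field) {
    case 34: case 36: case 40:  /* 2-, 4- and 8-bit greyscale */
        d.bits      = depth_field - QT_GREYSCALE_FLAG;
        d.greyscale = 1;
        break;
    case 33:                    /* not a documented greyscale depth: */
        d.bits = 1;             /* treat as plain 1-bit color, ignore the bit */
        break;
    default:                    /* 1, 2, 4, 8, 16, 24, 32: color depths */
        break;
    }
    return d;
}
```

Matching on whole values rather than masking avoids misreading a depth of 32, which is numerically equal to the greyscale flag itself.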
Re: [FFmpeg-devel] [PATCH 3/3] lavc/cbrt_tablegen: unroll table generation loop
On Thu, Dec 31, 2015 at 8:46 AM, Ronald S. Bultje wrote:
> Hi,
>
> On Thu, Dec 31, 2015 at 11:39 AM, Ganesh Ajjanagadde wrote:
>>
>> This patch does not seem to have measurable impact, at least on x86-64,
>> though there could be benefits for less than stellar branch predictors.
>
> [..]
>>
>> -        for (i = 0; i < 1<<13; i++) {
>> -            if (!(i & 7))
>> -                cbrt_tab[i].f = 16 * cbrt_tab[i>>3].f;
>> -            else
>> -                cbrt_tab[i].f = i * cbrt(i);
>> +        for (i = 0; i < 1<<13; i+=8) {
>> +            cbrt_tab[i].f = 16 * cbrt_tab[i>>3].f;
>> +            cbrt_tab[i+1].f = (i+1) * cbrt(i+1);
>> +            cbrt_tab[i+2].f = (i+2) * cbrt(i+2);
>> +            cbrt_tab[i+3].f = (i+3) * cbrt(i+3);
>> +            cbrt_tab[i+4].f = (i+4) * cbrt(i+4);
>> +            cbrt_tab[i+5].f = (i+5) * cbrt(i+5);
>> +            cbrt_tab[i+6].f = (i+6) * cbrt(i+6);
>> +            cbrt_tab[i+7].f = (i+7) * cbrt(i+7);
>
> gcc (and most other compilers) will unroll the loop automatically, I
> suspect. Check disassembly to confirm?

Checked; it does not, on gcc at least. I would in fact suspect the opposite: this increases the binary size non-negligibly (unlike an unrolling by two, for instance), and it is slightly nontrivial for a compiler due to the i&7 business.

> (That doesn't mean the patch shouldn't go in, I'm just trying to help you
> explain the result. I have no comment on the patch itself.)

I think the branch predictor explanation is reasonable: these branches are very predictable due to the relatively long length of the loop, and they are simple and periodic. On the other hand, this varies considerably even across generations of Intel CPUs; hence I posted the patch. Thanks for the idea though.

> Ronald
Re: [FFmpeg-devel] [PATCH] lavf/qtpalette: Ignore greyscale bit in 1-bit video
On Thu, Dec 31, 2015 at 03:59:53PM +, Carl Eugen Hoyos wrote:
> Mats Peterson writes:
>
> > And this patch doesn't affect the current state of
> > the qtrle decoder
>
> Then there is no need for this patch to be
> committed yet.

if the patch is fixing incorrect code then it should be committed
if it doesn't then it should not
if someone needs more time to test/review then we should wait
is that the case?

do you prefer that the decoder patch is pushed first?
or do i misunderstand?
but i can surely push that first if that's preferred

[...]

--
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Democracy is the form of government in which you can choose your dictator
Re: [FFmpeg-devel] [PATCH] lavf/qtpalette: Ignore greyscale bit in 1-bit video
On 12/31/2015 06:24 PM, Michael Niedermayer wrote:
> do you prefer that the decoder patch is pushed first?
> or do i misunderstand?
> but i can surely push that first if thats preferred

It's a small one-liner, as you can see, that ignores the greyscale bit for 1-bit video. This should be done regardless of the current (broken) state of the 1-bit mode of the qtrle decoder. It won't affect the decoder whatsoever, since it insists on using monow at the moment, but it should be there, and I hate having that patch lying around.

Mats
--
Mats Peterson
http://matsp888.no-ip.org/~mats/
Re: [FFmpeg-devel] [PATCH] lavf/qtpalette: Ignore greyscale bit in 1-bit video
On 12/31/2015 06:53 PM, Mats Peterson wrote:
> On 12/31/2015 06:24 PM, Michael Niedermayer wrote:
>> do you prefer that the decoder patch is pushed first?
>> or do i misunderstand?
>> but i can surely push that first if thats preferred
>
> It's a small one-liner, as you can see, that ignores the greyscale bit for 1-bit video. This should be done regardless of the current (broken) state of the 1-bit mode of the qtrle decoder. It won't affect the decoder whatsoever, since it insists on using monow at the moment, but it should be there, and I hate having that patch lying around.
>
> Mats

Don't bother about the decoder patch. It causes that globe to be displayed with blue colors, but the whole 1-bit mode in qtrle.c is broken and needs to be rewritten. By whom is another question, since there is no official documentation available. It would take some more reverse engineering, I guess.

Mats
--
Mats Peterson
http://matsp888.no-ip.org/~mats/
Re: [FFmpeg-devel] [PATCH] lavf/qtpalette: Ignore greyscale bit in 1-bit video
On 12/31/2015 06:24 PM, Michael Niedermayer wrote:
> if the patch is fixing incorrect code then it should be commited
> if it doesnt then it should not

And to answer that, yes, it fixes incorrect code.
Re: [FFmpeg-devel] [PATCH] lavf/qtpalette: Ignore greyscale bit in 1-bit video
>> if the patch is fixing incorrect code then it should be commited
>> if it doesnt then it should not
>
> And to answer that, yes, it fixes incorrect code.

If it was OK to apply the previous lavf/qtpalette patch for 1-bit palettized video, it should be OK to apply this bug fix without any superfluous discussion.

Mats
--
Mats Peterson
http://matsp888.no-ip.org/~mats/
Re: [FFmpeg-devel] clarification on General Description for FFV1 Draft Specification
On Thu, Dec 31, 2015 at 11:41:22AM -0500, Dave Rice wrote: > Hi all, > > I’m reviewing the FFV1 Draft Specification [1] and commenting here > specifically upon the General Description section and have some questions to > clarify the meaning of this section. I’m cross-posting to ffmpeg-devel though > responses are welcomed on either list though encouraged on the IETF Cellar > working group listserv [2]. > > The General Description section [3] contains this introductory paragraph: > > "Each frame is split in 1 to 4 planes (Y, Cb, Cr, Alpha). In the case of the > normal YCbCr colorspace the Y plane is coded first followed by the Cb and Cr > planes, if an Alpha/transparency plane exists, it is coded last. In the case > of the JPEG2000-RCT colorspace the lines are interleaved to improve caching > efficiency since it is most likely that the RCT will immediately be converted > to RGB during decoding; the interleaved coding order is also Y, Cb, Cr, > Alpha." > > Two colorspaces are referenced, YCbCr(with optional Alpha) and JPEG2000-RCT > (Reversible Color Transform), but the RCT sentence doesn’t reference planar > storage. Does an RCT encoding store with planes and if so how many places are > used and what are their names (3 planes or one packed plane)? the RCT case uses planes interleaved at line granularity > Also is storage of an alpha plane with RCT planes allowed? I think theres nothing disallowing that combination, so it is allowed > > What is the meaning of “line" in the paragraph above? What does "In the case > of the JPEG2000-RCT colorspace the lines are interleaved” mean? 
a plane is a 2 dimensional array of integer samples a line in this context is meant as a horizontal line in that array that is a set where the second coordinate is always the same I think at least the word "horizontal" should be added somewhere line interleaved is meant so that a 4x3 slice with 3 planes would be stored as (left to right, top to bottom) or as in for (all horizontal lines) for (all planes) for (all samples in a line of a plane) store sample > > Re: "Each frame is split in 1 to 4 planes". I'd like to be more specific. > From this reading it seems like 2 planes is possible; however, are 2 plane > encodings possible (grayscale with alpha or Y with only Cb)? Y with alpha is possible Cb without Cr is not possible > > Re: "since it is most likely that the RCT will immediately be converted to > RGB during decoding”. Is there anyone other conversion possible? In theory the raw RCT values could be returned, thats purely a API/implementation question and would have no effect on the ffv1 format thats similar to using YCbCr values from NTSC/PAL instead of RGB NTSC/PAL isnt affected by what a device turns it into [...] -- Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB In fact, the RIAA has been known to suggest that students drop out of college or go to community college in order to be able to afford settlements. -- The RIAA signature.asc Description: Digital signature ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
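The loop nest described in the reply above (all horizontal lines, then all planes, then all samples in a line of a plane) can be sketched in C as follows. This is only an illustration of the sample ordering, assuming planes of equal size stored as row-major `int` arrays; it does not model FFV1's actual entropy coding, and `store_line_interleaved` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Copy `nplanes` planes of `width` x `height` samples into `out`,
 * interleaving the planes at horizontal-line granularity.
 * Returns the number of samples written.
 */
static size_t store_line_interleaved(const int *const *planes, int nplanes,
                                     int width, int height, int *out)
{
    size_t n = 0;

    assert(planes != NULL && out != NULL);
    assert(nplanes > 0 && width > 0 && height > 0);

    for (int y = 0; y < height; y++)          /* all horizontal lines */
        for (int p = 0; p < nplanes; p++)     /* all planes */
            for (int x = 0; x < width; x++)   /* all samples in a line of a plane */
                out[n++] = planes[p][y * width + x];
    return n;
}
```

For a slice 4 samples wide and 3 lines high with 3 planes, the output starts with the 4 samples of line 0 of the first plane, then line 0 of the second and third planes, and only then moves on to line 1.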
[FFmpeg-devel] [PATCH] libi264: Add Hardware Accelerated H.264 Encoder based on libVA
From: Bryan Christ

This commit adds a hardware accelerated H.264 encoder which utilizes
libVA (open source implementation of VA-API). Information about libva
is available at: https://en.wikipedia.org/wiki/Video_Acceleration_API
This encoder is only available on Linux and supported hardware which
can be viewed at:
https://en.wikipedia.org/wiki/Video_Acceleration_API#Supported_hardware_and_drivers

The short name for the encoder is "libi264". The encoder must be enabled at
configure time using the --enable-libi264 switch. By default it is
turned off.
---
 Changelog                           |    1 +
 MAINTAINERS                         |    1 +
 configure                           |    8 +-
 doc/general.texi                    |   11 +
 libavcodec/Makefile                 |    1 +
 libavcodec/allcodecs.c              |    1 +
 libavcodec/libi264.c                | 1476 +++
 libavcodec/libi264.h                |  107 +++
 libavcodec/libi264_param_set.c      |  425 ++
 libavcodec/libi264_param_set.h      |   81 ++
 libavcodec/libi264_va_display.c     |  104 +++
 libavcodec/libi264_va_display.h     |   77 ++
 libavcodec/libi264_va_display_drm.c |   96 +++
 libavcodec/libi264_va_display_x11.c |  171
 libavcodec/version.h                |    2 +-
 15 files changed, 2560 insertions(+), 2 deletions(-)
 create mode 100644 libavcodec/libi264.c
 create mode 100644 libavcodec/libi264.h
 create mode 100644 libavcodec/libi264_param_set.c
 create mode 100644 libavcodec/libi264_param_set.h
 create mode 100644 libavcodec/libi264_va_display.c
 create mode 100644 libavcodec/libi264_va_display.h
 create mode 100644 libavcodec/libi264_va_display_drm.c
 create mode 100644 libavcodec/libi264_va_display_x11.c

diff --git a/Changelog b/Changelog
index d9c2ea8..99acb56 100644
--- a/Changelog
+++ b/Changelog
@@ -49,6 +49,7 @@ version <next>:
 - VAAPI VP9 hwaccel
 - audio high-order multiband parametric equalizer
 - automatic bitstream filtering
+- H.264 hwaccelerated encoding through libVA
 
 
 version 2.8:
diff --git a/MAINTAINERS b/MAINTAINERS
index 9add13d..e37cb6f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -203,6 +203,7 @@ Codecs:
   libcelt_dec.c                         Nicolas George
   libdirac*                             David Conrad
   libgsm.c                              Michel Bardiaux
+  libi264*                              Bryan Christ
   libkvazaar.c                          Arttu Ylä-Outinen
   libopenjpeg.c                         Jaikrishnan Menon
   libopenjpegenc.c                      Michael Bradshaw
diff --git a/configure b/configure
index da74ccd..335c172 100755
--- a/configure
+++ b/configure
@@ -265,6 +265,7 @@ External library support:
   --enable-libwavpack      enable wavpack encoding via libwavpack [no]
   --enable-libwebp         enable WebP encoding via libwebp [no]
   --enable-libx264         enable H.264 encoding via x264 [no]
+  --enable-libi264         enable H.264 encoding via Intel's libva [no]
   --enable-libx265         enable HEVC encoding via x265 [no]
   --enable-libxavs         enable AVS encoding via xavs [no]
   --enable-libxcb          enable X11 grabbing using XCB [autodetect]
@@ -1484,6 +1485,9 @@ EXTERNAL_LIBRARY_LIST="
     libtwolame
     libutvideo
     libv4l2
+    libva
+    libva-drm
+    libva-x11
     libvidstab
     libvo_aacenc
     libvo_amrwbenc
@@ -1491,6 +1495,7 @@ EXTERNAL_LIBRARY_LIST="
     libvpx
     libwavpack
     libwebp
+    libX11
     libx264
     libx265
     libxavs
@@ -2658,7 +2663,7 @@ libwebp_anim_encoder_deps="libwebp"
 libx262_encoder_deps="libx262"
 libx264_encoder_deps="libx264"
 libx264rgb_encoder_deps="libx264"
-libx264rgb_encoder_select="libx264_encoder"
+libi264_encoder_deps="libi264"
 libx265_encoder_deps="libx265"
 libxavs_encoder_deps="libxavs"
 libxvid_encoder_deps="libxvid"
@@ -5528,6 +5533,7 @@ enabled libx264 && { use_pkg_config x264 "stdint.h x264.h" x264_encode
                                die "ERROR: libx264 must be installed and version must be >= 0.118."; } &&
                              { check_cpp_condition x264.h "X264_MPEG2" && enable libx262; }
+enabled libi264 && require libva va/va.h vaInitialize -lva -lX11 -lva-x11 -lva-drm
 enabled libx265 && require_pkg_config x265 x265.h x265_api_get &&
                              { check_cpp_condition x265.h "X265_BUILD >= 57" ||
                                die "ERROR: libx265 version must be >= 57."; }
diff --git a/doc/general.texi b/doc/general.texi
index 06933ab..bca7ca0 100644
--- a/doc/general.texi
+++ b/doc/general.texi
@@ -131,6 +131,17 @@ x264 is under the GNU Public License Version 2 or later
 details), you must upgrade FFmpeg's license to GPL in order to use it.
 @end float
 
+@section libva
+
+FFmpeg can make use of the libva library for H.264 encoding. libva is an
+implementation of VA-API for Linux. libva can only be used for H.264 encoding
+on unix based syste
Re: [FFmpeg-devel] [PATCH] libi264: Add Hardware Accelerated H.264 Encoder based on libVA
On Thu, Dec 31, 2015 at 10:35:47PM +0500, ha...@mayartech.com wrote: > From: Bryan Christ > > This commit adds a hardware accelerated H.264 encoder which utilizes > libVA (open source implementation of VA-API). Information about libva > is available at: https://en.wikipedia.org/wiki/Video_Acceleration_API > This encoder is only availbale on linux and supported hardware which > can be viewed at: > https://en.wikipedia.org/wiki/Video_Acceleration_API#Supported_hardware_and_drivers > > The short name for encoder is "libi264". The encoder must be enablde at > configure time using the --enable-libi264 switch. By default it is > turned off. > --- > Changelog |1 + > MAINTAINERS |1 + > configure |8 +- > doc/general.texi| 11 + > libavcodec/Makefile |1 + > libavcodec/allcodecs.c |1 + > libavcodec/libi264.c| 1476 > +++ > libavcodec/libi264.h| 107 +++ > libavcodec/libi264_param_set.c | 425 ++ > libavcodec/libi264_param_set.h | 81 ++ > libavcodec/libi264_va_display.c | 104 +++ > libavcodec/libi264_va_display.h | 77 ++ > libavcodec/libi264_va_display_drm.c | 96 +++ > libavcodec/libi264_va_display_x11.c | 171 > libavcodec/version.h|2 +- > 15 files changed, 2560 insertions(+), 2 deletions(-) > create mode 100644 libavcodec/libi264.c > create mode 100644 libavcodec/libi264.h > create mode 100644 libavcodec/libi264_param_set.c > create mode 100644 libavcodec/libi264_param_set.h > create mode 100644 libavcodec/libi264_va_display.c > create mode 100644 libavcodec/libi264_va_display.h > create mode 100644 libavcodec/libi264_va_display_drm.c > create mode 100644 libavcodec/libi264_va_display_x11.c > > diff --git a/Changelog b/Changelog > index d9c2ea8..99acb56 100644 > --- a/Changelog > +++ b/Changelog > @@ -49,6 +49,7 @@ version : > - VAAPI VP9 hwaccel > - audio high-order multiband parametric equalizer > - automatic bitstream filtering > +- H.264 hwaccelerated encoding through libVA > > > version 2.8: > diff --git a/MAINTAINERS b/MAINTAINERS > index 9add13d..e37cb6f 100644 > --- 
a/MAINTAINERS > +++ b/MAINTAINERS > @@ -203,6 +203,7 @@ Codecs: >libcelt_dec.c Nicolas George >libdirac* David Conrad >libgsm.c Michel Bardiaux > + libi264* Bryan Christ >libkvazaar.c Arttu Ylä-Outinen >libopenjpeg.c Jaikrishnan Menon >libopenjpegenc.c Michael Bradshaw > diff --git a/configure b/configure > index da74ccd..335c172 100755 > --- a/configure > +++ b/configure > @@ -265,6 +265,7 @@ External library support: >--enable-libwavpack enable wavpack encoding via libwavpack [no] >--enable-libwebp enable WebP encoding via libwebp [no] >--enable-libx264 enable H.264 encoding via x264 [no] > + --enable-libi264 enable H.264 encoding via Intel's libva [no] >--enable-libx265 enable HEVC encoding via x265 [no] >--enable-libxavs enable AVS encoding via xavs [no] >--enable-libxcb enable X11 grabbing using XCB [autodetect] > @@ -1484,6 +1485,9 @@ EXTERNAL_LIBRARY_LIST=" > libtwolame > libutvideo > libv4l2 > +libva > +libva-drm > +libva-x11 > libvidstab > libvo_aacenc > libvo_amrwbenc > @@ -1491,6 +1495,7 @@ EXTERNAL_LIBRARY_LIST=" > libvpx > libwavpack > libwebp > +libX11 > libx264 > libx265 > libxavs ? 
> @@ -2658,7 +2663,7 @@ libwebp_anim_encoder_deps="libwebp" > libx262_encoder_deps="libx262" > libx264_encoder_deps="libx264" > libx264rgb_encoder_deps="libx264" > -libx264rgb_encoder_select="libx264_encoder" this looks unintended > +libi264_encoder_deps="libi264" > libx265_encoder_deps="libx265" > libxavs_encoder_deps="libxavs" > libxvid_encoder_deps="libxvid" > @@ -5528,6 +5533,7 @@ enabled libx264 && { use_pkg_config x264 > "stdint.h x264.h" x264_encode > die "ERROR: libx264 must be installed and > version must be >= 0.118."; } && > { check_cpp_condition x264.h "X264_MPEG2" && > enable libx262; } > +enabled libi264 && require libva va/va.h vaInitialize -lva -lX11 > -lva-x11 -lva-drm > enabled libx265 && require_pkg_config x265 x265.h x265_api_get && > { check_cpp_condition x265.h "X265_BUILD >= 57" > || > die "ERROR: libx265 version must be >= 57."; } also the patch breaks configure ./configure ./configure: 1: eval: libva-drm_checking=yes: not found ./configure: 1: eval: -drm_deps_checking=yes: no
Re: [FFmpeg-devel] [PATCH 3/3] lavc/cbrt_tablegen: unroll table generation loop
On Thu, Dec 31, 2015 at 8:46 AM, Ronald S. Bultje wrote: > Hi, > > On Thu, Dec 31, 2015 at 11:39 AM, Ganesh Ajjanagadde > wrote: >> >> This patch does not seem to have measurable impact, at least on x86-64, >> though >> there could be benefits for less than stellar branch predictors. > > [..] >> >> -for (i = 0; i < 1<<13; i++) { >> -if (!(i & 7)) >> -cbrt_tab[i].f = 16 * cbrt_tab[i>>3].f; >> -else >> -cbrt_tab[i].f = i * cbrt(i); >> +for (i = 0; i < 1<<13; i+=8) { >> +cbrt_tab[i].f = 16 * cbrt_tab[i>>3].f; >> +cbrt_tab[i+1].f = (i+1) * cbrt(i+1); >> +cbrt_tab[i+2].f = (i+2) * cbrt(i+2); >> +cbrt_tab[i+3].f = (i+3) * cbrt(i+3); >> +cbrt_tab[i+4].f = (i+4) * cbrt(i+4); >> +cbrt_tab[i+5].f = (i+5) * cbrt(i+5); >> +cbrt_tab[i+6].f = (i+6) * cbrt(i+6); >> +cbrt_tab[i+7].f = (i+7) * cbrt(i+7); > > > gcc (and most other compilers) will unroll the loop automatically, I > suspect. Check disassembly to confirm? > > (That doesn't mean the patch shouldn't go in, I'm just trying to help you > explain the result. I have no comment on the patch itself.) Patch series dropped, I have superior approach that brings down to ~ 400k cycles (as opposed to original 750k, proposed 660k). Currently at work seeing if there is anything I can easily squeeze further. BTW, it would also help me if you or an AAC maintainer can come up with a number below which dynamic initialization can always be done. Thanks for your input. > > Ronald ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH 3/3] lavc/cbrt_tablegen: unroll table generation loop
On Thu, Dec 31, 2015 at 3:53 PM, Ganesh Ajjanagadde wrote: > On Thu, Dec 31, 2015 at 8:46 AM, Ronald S. Bultje wrote: >> Hi, >> >> On Thu, Dec 31, 2015 at 11:39 AM, Ganesh Ajjanagadde >> wrote: >>> >>> This patch does not seem to have measurable impact, at least on x86-64, >>> though >>> there could be benefits for less than stellar branch predictors. >> >> [..] >>> >>> -for (i = 0; i < 1<<13; i++) { >>> -if (!(i & 7)) >>> -cbrt_tab[i].f = 16 * cbrt_tab[i>>3].f; >>> -else >>> -cbrt_tab[i].f = i * cbrt(i); >>> +for (i = 0; i < 1<<13; i+=8) { >>> +cbrt_tab[i].f = 16 * cbrt_tab[i>>3].f; >>> +cbrt_tab[i+1].f = (i+1) * cbrt(i+1); >>> +cbrt_tab[i+2].f = (i+2) * cbrt(i+2); >>> +cbrt_tab[i+3].f = (i+3) * cbrt(i+3); >>> +cbrt_tab[i+4].f = (i+4) * cbrt(i+4); >>> +cbrt_tab[i+5].f = (i+5) * cbrt(i+5); >>> +cbrt_tab[i+6].f = (i+6) * cbrt(i+6); >>> +cbrt_tab[i+7].f = (i+7) * cbrt(i+7); >> >> >> gcc (and most other compilers) will unroll the loop automatically, I >> suspect. Check disassembly to confirm? >> >> (That doesn't mean the patch shouldn't go in, I'm just trying to help you >> explain the result. I have no comment on the patch itself.) > > Patch series dropped, I have superior approach that brings down to ~ > 400k cycles (as opposed to original 750k, proposed 660k). Currently at > work seeing if there is anything I can easily squeeze further. Sorry, actually 300k cycles. [...] ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH 3/3] lavc/cbrt_tablegen: unroll table generation loop
Hi,

On Thu, Dec 31, 2015 at 6:53 PM, Ganesh Ajjanagadde wrote:
> BTW, it would also help me if you or an AAC maintainer can come up
> with a number below which dynamic initialization can always be done.

I think the answer is "never", since 0 is always faster than any number. But that's not an absolute veto or anything.

Ronald
Re: [FFmpeg-devel] [PATCH] libi264: Add Hardware Accelerated H.264 Encoder based on libVA
On Thu, 31 Dec 2015 22:07:56 +0100 Michael Niedermayer wrote: > On Thu, Dec 31, 2015 at 10:35:47PM +0500, ha...@mayartech.com wrote: > > From: Bryan Christ > >--enable-libx264 enable H.264 encoding via x264 [no] > > + --enable-libi264 enable H.264 encoding via Intel's libva is there a difference between intel libva and other libva ? otherwise can it just be "via libva" ? > [...] > > +for (row = 0; row < frame->height/2; row++) { > > +unsigned char *U_row = U_start + row * U_pitch; > > +unsigned char *u_ptr = NULL, *v_ptr=NULL; > > +// int j; > > +int j, N, Nmod; > > +switch (surface_image.format.fourcc) { > > +case VA_FOURCC_NV12: > > +u_ptr = frame->data[1] + row * frame->linesize[1]; > > +v_ptr = frame->data[2] + row * frame->linesize[2]; > > + > > + > > +Nmod = (frame->width/2) & 7; // mod 8 > > +N= (frame->width/2) - Nmod; > > +__asm__( > > +"movq %0, %%rax \n\t" > > +"movq %1, %%rbx \n\t" > > +"movq %2, %%rcx \n\t" > > +"movq %3, %%rdx \n\t" > > +"asm_loop: \n\t" > > +"movq (%%rax), %%xmm0 \n\t" > > +"movq (%%rbx), %%xmm1 \n\t" > > +"punpcklbw %%xmm1, %%xmm0 \n\t" > > +"movdqu%%xmm0, (%%rcx)\n\t" > > +"addq $0x8,%%rax \n\t" > > +"addq $0x8,%%rbx \n\t" > > +"addq $0x10, %%rcx \n\t" > > +"cmp %%rcx, %%rdx \n\t" > > +"jnz asm_loop" > > +: > > +: "r"(u_ptr), "r"(v_ptr), > > "r"(U_row), > > + "r" (U_row+2*N) > > +: "rax", "rbx", "rcx", "rdx", > > "xmm0", "xmm1" > > +); > > x86* asm belongs in yasm files > colorspace convertion belongs to vf_scale / swscale, why is this > code here ? if i may guess, colorspace conversion within this lib is so that an external application can call lavc's i264 encoder without converting nv12 itself. > also FFmpeg supports many platforms, not just x86 based ones > asm should be behind appropriate ARCH_* & cpuflags checks yes but i think this patch is for specifically x86 intel libva although right now i'm guessing there is also x64 intel libva? rephrased, this is a feature just for x86 libva h264 encoding on intel chipsets. 
would that be ok to commit as-is? and then add support for other platforms later? i ask because sometimes it is monumental task to support all platforms at once, especially by new contributors. i could be wrong. -compn ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH 3/3] lavc/cbrt_tablegen: unroll table generation loop
On Thu, Dec 31, 2015 at 5:08 PM, Ronald S. Bultje wrote:
> Hi,
>
> On Thu, Dec 31, 2015 at 6:53 PM, Ganesh Ajjanagadde wrote:
>>
>> BTW, it would also help me if you or an AAC maintainer can come up
>> with a number below which dynamic initialization can always be done.
>
> I think the answer is "never", since 0 is always faster than any number. But
> that's not an absolute veto or anything.

It would be useful to know how long AAC decoding takes in general, e.g. for n seconds of audio, what is the cycle count. The cycle cost of initialization can easily get amortized in that, and that is really what should determine one's heuristics for when to statically or dynamically init.

I personally don't mind either way. All of this work grew out of a remark by wm4 some weeks back: https://ffmpeg.org/pipermail/ffmpeg-devel/2015-November/184018.html, something I mostly agree with.

The reason I ask is to prioritize effort. For instance, if 300k cycles is too much to always dynamically init, I will not bother shaving off 20k cycles more: I don't want to spend time on last-mile optimizations unless they help towards wm4's goal, i.e. I don't want an immediate nack to the removal of the hardcoded tables here.

> Ronald
Re: [FFmpeg-devel] [PATCH] doc/filters: add showwavespic colorize example
On Tue, 29 Dec 2015 10:45:25 -0900, Lou Logan wrote: > Signed-off-by: Lou Logan > --- > doc/filters.texi | 8 > 1 file changed, 8 insertions(+) Pushed. ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
[FFmpeg-devel] Thoughts on palettized 1-bit QuickTime Animation video
Even though the 1-bit mode in the QuickTime Animation decoder (qtrle) is rather broken, it should use pal8 and not monow, since it's a palettized mode. So I suggest that we apply the decoder patch (and the accompanying "bug fix" patch of qtpalette.c that ignores the greyscale bit for 1-bit video) that I posted after all. Not all files render incorrectly, and if they contain a palette, the colors should be rendered properly. Mats -- Mats Peterson http://matsp888.no-ip.org/~mats/ ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
[FFmpeg-devel] [PATCH] avcodec/put_bits: Always check buffer end before writing
From: Michael Niedermayer

This causes an overall slowdown of 0.1 % (tested with mpeg4 single thread encoding of matrixbench at QP=3)

Signed-off-by: Michael Niedermayer
---
 libavcodec/put_bits.h | 16 ++--
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/libavcodec/put_bits.h b/libavcodec/put_bits.h
index 5b1bc8b..69a3049 100644
--- a/libavcodec/put_bits.h
+++ b/libavcodec/put_bits.h
@@ -163,9 +163,11 @@ static inline void put_bits(PutBitContext *s, int n, unsigned int value)
 #ifdef BITSTREAM_WRITER_LE
     bit_buf |= value << (32 - bit_left);
     if (n >= bit_left) {
-        av_assert2(s->buf_ptr+3<s->buf_end);
-        AV_WL32(s->buf_ptr, bit_buf);
-        s->buf_ptr += 4;
+        if (3 < s->buf_end - s->buf_ptr) {
+            AV_WL32(s->buf_ptr, bit_buf);
+            s->buf_ptr += 4;
+        } else
+            av_assert0(0);
         bit_buf = value >> bit_left;
         bit_left += 32;
     }
@@ -177,9 +179,11 @@ static inline void put_bits(PutBitContext *s, int n, unsigned int value)
     } else {
         bit_buf <<= bit_left;
         bit_buf |= value >> (n - bit_left);
-        av_assert2(s->buf_ptr+3<s->buf_end);
-        AV_WB32(s->buf_ptr, bit_buf);
-        s->buf_ptr += 4;
+        if (3 < s->buf_end - s->buf_ptr) {
+            AV_WB32(s->buf_ptr, bit_buf);
+            s->buf_ptr += 4;
+        } else
+            av_assert0(0);
         bit_left += 32 - n;
         bit_buf = value;
     }
--
1.7.9.5
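The patch replaces an assertion on `s->buf_ptr+3` with a check on the remaining size `s->buf_end - s->buf_ptr`. A minimal standalone C sketch of that pattern is shown below. It is not the FFmpeg implementation: `checked_put_be32` is a hypothetical helper, and it returns an error code instead of calling `av_assert0(0)` as the patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Write a 32-bit big-endian word at *ptr if at least 4 bytes remain before
 * `end`. On success, advance *ptr by 4 and return 0. Otherwise leave the
 * buffer and *ptr untouched and return -1.
 */
static int checked_put_be32(uint8_t **ptr, const uint8_t *end, uint32_t v)
{
    assert(ptr != NULL && *ptr != NULL && end != NULL && *ptr <= end);

    /* Compare the remaining size rather than forming *ptr + 3, which could
     * point further than one past the end of the buffer. */
    if (!(3 < end - *ptr))
        return -1;

    (*ptr)[0] = (uint8_t)(v >> 24);
    (*ptr)[1] = (uint8_t)(v >> 16);
    (*ptr)[2] = (uint8_t)(v >> 8);
    (*ptr)[3] = (uint8_t)v;
    *ptr += 4;
    return 0;
}
```

The check runs on every word written, which is the cost that the quoted 0.1 % slowdown refers to; an `av_assert2` check, by contrast, is only compiled in at higher assert levels.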
Re: [FFmpeg-devel] Thoughts on palettized 1-bit QuickTime Animation video
On 1 January 2016 at 02:22, Mats Peterson wrote: > Even though the 1-bit mode in the QuickTime Animation decoder (qtrle) is > rather broken, it should use pal8 and not monow, since it's a palettized > mode. So I suggest that we apply the decoder patch (and the accompanying > "bug fix" patch of qtpalette.c that ignores the greyscale bit for 1-bit > video) that I posted after all. Not all files render incorrectly, and if > they contain a palette, the colors should be rendered properly. Would it be possible to reduce the number of new threads that you make about this topic? Kind Regards, Kieran Kunhya ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] Thoughts on palettized 1-bit QuickTime Animation video
On 01/01/2016 03:27 AM, Kieran Kunhya wrote: On 1 January 2016 at 02:22, Mats Peterson wrote: Even though the 1-bit mode in the QuickTime Animation decoder (qtrle) is rather broken, it should use pal8 and not monow, since it's a palettized mode. So I suggest that we apply the decoder patch (and the accompanying "bug fix" patch of qtpalette.c that ignores the greyscale bit for 1-bit video) that I posted after all. Not all files render incorrectly, and if they contain a palette, the colors should be rendered properly. Would it be possible to reduce the number of new threads that you make about this topic? Kind Regards, Kieran Kunhya ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel In this case, no, since it had to do with both of the patches, and it was a general reflection. Mats -- Mats Peterson http://matsp888.no-ip.org/~mats/ ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH] libi264: Add Hardware Accelerated H.264 Encoder based on libVA
On 12/31/2015 10:11 PM, compn wrote: > On Thu, 31 Dec 2015 22:07:56 +0100 > Michael Niedermayer wrote: > >> On Thu, Dec 31, 2015 at 10:35:47PM +0500, ha...@mayartech.com wrote: >>> From: Bryan Christ >>>--enable-libx264 enable H.264 encoding via x264 [no] >>> + --enable-libi264 enable H.264 encoding via Intel's libva > > is there a difference between intel libva and other libva ? otherwise > can it just be "via libva" ? > >> [...] >>> +for (row = 0; row < frame->height/2; row++) { >>> +unsigned char *U_row = U_start + row * U_pitch; >>> +unsigned char *u_ptr = NULL, *v_ptr=NULL; >>> +// int j; >>> +int j, N, Nmod; >>> +switch (surface_image.format.fourcc) { >>> +case VA_FOURCC_NV12: >>> +u_ptr = frame->data[1] + row * frame->linesize[1]; >>> +v_ptr = frame->data[2] + row * frame->linesize[2]; >>> + >>> + >>> +Nmod = (frame->width/2) & 7; // mod 8 >>> +N= (frame->width/2) - Nmod; >>> +__asm__( >>> +"movq %0, %%rax \n\t" >>> +"movq %1, %%rbx \n\t" >>> +"movq %2, %%rcx \n\t" >>> +"movq %3, %%rdx \n\t" >>> +"asm_loop: \n\t" >>> +"movq (%%rax), %%xmm0 \n\t" >>> +"movq (%%rbx), %%xmm1 \n\t" >>> +"punpcklbw %%xmm1, %%xmm0 \n\t" >>> +"movdqu%%xmm0, (%%rcx)\n\t" >>> +"addq $0x8,%%rax \n\t" >>> +"addq $0x8,%%rbx \n\t" >>> +"addq $0x10, %%rcx \n\t" >>> +"cmp %%rcx, %%rdx \n\t" >>> +"jnz asm_loop" >>> +: >>> +: "r"(u_ptr), "r"(v_ptr), >>> "r"(U_row), >>> + "r" (U_row+2*N) >>> +: "rax", "rbx", "rcx", "rdx", >>> "xmm0", "xmm1" >>> +); >> >> x86* asm belongs in yasm files >> colorspace convertion belongs to vf_scale / swscale, why is this >> code here ? > > if i may guess, colorspace conversion within this lib is so that an > external application can call lavc's i264 encoder without converting > nv12 itself. > > >> also FFmpeg supports many platforms, not just x86 based ones >> asm should be behind appropriate ARCH_* & cpuflags checks > > yes but i think this patch is for specifically x86 intel libva > > although right now i'm guessing there is also x64 intel libva? 
> > rephrased, this is a feature just for x86 libva h264 encoding on intel > chipsets. > > would that be ok to commit as-is? and then add support for > other platforms later? i ask because sometimes it is monumental task to No, it can't. For starters, arch specific code can't be outside the relevant arch specific folder, and we've been trying to get rid of x86 inline asm outside of inlined functions like those from lavu's intreadwrite.h for a while now. Others may also have comments about the rest of the patch. ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
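For context on the inline assembly under discussion: the quoted loop loads 8 bytes of U and 8 bytes of V, interleaves them with `punpcklbw`, and stores 16 bytes into the NV12 chroma row. A portable scalar equivalent might look like the sketch below. This is an illustration only, not code from the patch; `interleave_uv` is a hypothetical name, and no claim is made here about how FFmpeg itself would want this conversion done.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Interleave `n` U samples and `n` V samples into a packed UV row
 * (U0 V0 U1 V1 ...), as used by the chroma plane of NV12.
 * `uv` must have room for 2 * n bytes.
 */
static void interleave_uv(const uint8_t *u, const uint8_t *v,
                          uint8_t *uv, size_t n)
{
    assert(u != NULL && v != NULL && uv != NULL);

    for (size_t i = 0; i < n; i++) {
        uv[2 * i]     = u[i];
        uv[2 * i + 1] = v[i];
    }
}
```

Unlike the quoted assembly, this loop works for any width, including widths that are not a multiple of 8, and builds on every architecture.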
[FFmpeg-devel] [PATCH][WIP] lavc/cbrt_tablegen: speed up tablegen
This exploits an approach based on the sieve of Eratosthenes, a popular
method for generating prime numbers.

Tables are identical to previous ones.

Tested with FATE. Does not work yet with --enable-hardcoded-tables due to the
union and lack of proper WRITE_ARRAY for it. Want to get feedback on this; if
we always dynamically init it this won't need addressing.

Sample benchmark (Haswell, GNU/Linux+gcc):
prev:
7860100 decicycles in cbrt_tableinit, 1 runs, 0 skips
490 decicycles in cbrt_tableinit, 2 runs, 0 skips
[...]
7582339 decicycles in cbrt_tableinit, 256 runs, 0 skips
7563556 decicycles in cbrt_tableinit, 512 runs, 0 skips

new:
2099480 decicycles in cbrt_tableinit, 1 runs, 0 skips
2044470 decicycles in cbrt_tableinit, 2 runs, 0 skips
[...]
1796544 decicycles in cbrt_tableinit, 256 runs, 0 skips
1791631 decicycles in cbrt_tableinit, 512 runs, 0 skips

Both small and large run count given as this is called once so small run count
may give a better picture, small numbers are fairly consistent, and there is a
consistent downward trend from small to large runs, at which point it stabilizes
to a new value.

Signed-off-by: Ganesh Ajjanagadde
---
 libavcodec/aacdec_fixed.c    |  4 ++--
 libavcodec/aacdec_template.c |  2 +-
 libavcodec/cbrt_tablegen.h   | 53 +++-
 3 files changed, 40 insertions(+), 19 deletions(-)

diff --git a/libavcodec/aacdec_fixed.c b/libavcodec/aacdec_fixed.c
index 923fbe0..ebc585e 100644
--- a/libavcodec/aacdec_fixed.c
+++ b/libavcodec/aacdec_fixed.c
@@ -154,9 +154,9 @@ static void vector_pow43(int *coefs, int len)
 for (i=0; i #include #include "libavutil/attributes.h"
+#include "libavutil/intfloat.h"
 #include "libavcodec/aac_defines.h"
 
-#if USE_FIXED
-#define CBRT(x) lrint((x).f * 8192)
-#else
-#define CBRT(x) x.i
-#endif
-
 #if CONFIG_HARDCODED_TABLES
 #if USE_FIXED
 #define cbrt_tableinit_fixed()
@@ -43,20 +38,46 @@
 #include "libavcodec/cbrt_tables.h"
 #endif
 #else
-static uint32_t cbrt_tab[1 << 13];
+union ff_int32float64 {
+    uint32_t i;
+    double f;
+};
+static union ff_int32float64 cbrt_tab[1 << 13];
 
 static av_cold void AAC_RENAME(cbrt_tableinit)(void)
 {
-    if (!cbrt_tab[(1<<13) - 1]) {
-        int i;
-        for (i = 0; i < 1<<13; i++) {
-            union {
-                float f;
-                uint32_t i;
-            } f;
-            f.f = cbrt(i) * i;
-            cbrt_tab[i] = CBRT(f);
+    int i, j, k;
+    double cbrt_val;
+
+    if (!cbrt_tab[(1<<13) - 1].i) {
+        cbrt_tab[0].f = 0;
+        for (i = 1; i < 1<<13; i++)
+            cbrt_tab[i].f = 1;
+
+        /* have to worry about non-squarefree numbers */
+        for (i = 2; i < 90; i++) {
+            if (cbrt_tab[i].f == 1) {
+                cbrt_val = i * cbrt(i);
+                for (k = i; k < (1<<13); k*= i)
+                    for (j = k; j < (1<<13); j+=k)
+                        cbrt_tab[j].f *= cbrt_val;
+            }
         }
+
+        for (i = 91; i <= 8191; i+=2) {
+            if (cbrt_tab[i].f == 1) {
+                cbrt_val = i * cbrt(i);
+                for (j = i; j < (1<<13); j+=i)
+                    cbrt_tab[j].f *= cbrt_val;
+            }
+        }
+#if USE_FIXED
+        for (i = 0; i < 1<<13; i++)
+            cbrt_tab[i].i = lrint(cbrt_tab[i].f * 8192);
+#else
+        for (i = 0; i < 1<<13; i++)
+            cbrt_tab[i].i = av_float2int((float)cbrt_tab[i].f);
+#endif
     }
 }
 #endif /* CONFIG_HARDCODED_TABLES */
--
2.6.4