Hi Betty, On Mon, Apr 3, 2017 at 11:00 AM, Betty Wu <lumoso...@gmail.com> wrote:
> +typedef double number_t; > Why? +static int ansnr_filter(const uint8_t* src_image, number_t* dst_image, int > img_row, int img_col, const double *filter, int stride) > [..] > + number_t **imme_image; > + if(!(imme_image = (number_t **)av_malloc((size_t)imme_row * > sizeof(number_t *)))) > + return AVERROR(ENOMEM); > + > + for(int i = 0; i < imme_row; i++) { > + if(!(imme_image[i] = (number_t *)av_malloc((size_t)imme_col * > sizeof(number_t)))) > + return AVERROR(ENOMEM); > + } > In ffmpeg, we typically don't allocate lines and cols in separate arrays. You can just use a 1D array and use y*stride+x for indexing, which is what we do elsewhere. These arrays should also be initialized once in the init function and then reuse, instead of being re-allocated for each frame. + number_t *anc_after; > + if(!(anc_after = (number_t *)av_malloc((size_t)row * col * > sizeof(number_t)))) { > + av_free(anc_after); > + return AVERROR(ENOMEM); > + } > Why free if allocation failed? And this also should be allocated once during init, not re-allocated for each frame. > +static int query_formats(AVFilterContext *ctx) > +{ > + static const enum AVPixelFormat pix_fmts[] = { > + AV_PIX_FMT_GRAY8, AV_PIX_FMT_GRAY16, > +#define PF_NOALPHA(suf) AV_PIX_FMT_YUV420##suf, AV_PIX_FMT_YUV422##suf, > AV_PIX_FMT_YUV444##suf > +#define PF_ALPHA(suf) AV_PIX_FMT_YUVA420##suf, AV_PIX_FMT_YUVA422##suf, > AV_PIX_FMT_YUVA444##suf > +#define PF(suf) PF_NOALPHA(suf), PF_ALPHA(suf) > + PF(P), PF(P9), PF(P10), PF_NOALPHA(P12), PF_NOALPHA(P14), PF(P16), > + AV_PIX_FMT_YUV440P, AV_PIX_FMT_YUV411P, AV_PIX_FMT_YUV410P, > + AV_PIX_FMT_YUVJ411P, AV_PIX_FMT_YUVJ420P, AV_PIX_FMT_YUVJ422P, > + AV_PIX_FMT_YUVJ440P, AV_PIX_FMT_YUVJ444P, > + AV_PIX_FMT_GBRP, AV_PIX_FMT_GBRP9, AV_PIX_FMT_GBRP10, > + AV_PIX_FMT_GBRP12, AV_PIX_FMT_GBRP14, AV_PIX_FMT_GBRP16, > + AV_PIX_FMT_GBRAP, AV_PIX_FMT_GBRAP16, > + AV_PIX_FMT_NONE > + }; > Have all of these been tested? Since ANSNR is Y-only, it doesn't work on GBR. Likewise, I believe the code is 8-bit only so it shouldn't work on gray16 or any of the P9/10/12/14/16 formats. For SIMD purposes, I would probably encourage you to work in fixed-point integer. Maybe for the qualification task we can skip that (I hadn't really thought about it), but I think for the final implementation, fixed-point will be much faster, and speed is a significant goal here. An interesting idea for speed is to make the filter decomposable (i.e. split the horizontal/vertical pass) if that's possible. At least for the 3-tap filter, that should be trivial (it's just a 121 lowpass in both directions), and should give some speed gains because you go from n^2 to 2n + an extra load/store pair in terms of complexity (whether it's actually faster remains to be proven). The other filter should also be decomposable but I don't see from the top of my head what it decomposes into and I don't feel like writing a script to figure it out. :-). Ronald _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel