> -----Original Message-----
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of
> Carl Eugen Hoyos
> Sent: Tuesday, April 9, 2019 9:21 PM
> To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH] lavfi: add nlmeans_opencl filter
>
> 2019-04-09 4:54 GMT+02:00, Song, Ruiling <ruiling.s...@intel.com>:
>
> >> > +kernel void vert_sum(__global uint4 *ii,
> >> > +                     int width,
> >> > +                     int height)
> >> > +{
> >> > +    int x = get_global_id(0);
> >> > +    uint4 sum = 0;
> >> > +    for (int i = 0; i < height; i++) {
> >> > +        ii[i * width + x] += sum;
> >> > +        sum = ii[i * width + x];
> >>
> >> This looks like it might be able to overflow in extreme cases?
> >>
> >> 3840 * 2160 * (1 - 0)^2 * 255 * 255 = 539,343,360,000 which
> >> is a long way out of range for a 32-bit int. That requires
> >> impossible input (all pixels differing by the most extreme
> >> value), but something like a chequerboard might be of the
> >> same order?
> > Yes this is a dilemma for me. Generally the filter is with
> > high computation cost.
> > To fix the overflow, we have to use 64bit integer for the
> > integral image. Most GPUs are not good at 64bit integer
> > calculation I think. May be we can try later.
> > So I would prefer to stay with 32bit integer for a while.
>
> Can the overflow be detected at runtime?
Will add the check.

> Could the user choose between 32 and 64 bit calculation?
I may mark this as TODO.

> Carl Eugen
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
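
For reference, here is a minimal host-side sketch of one way the runtime overflow check discussed above could look. It only applies the worst-case bound from the quoted arithmetic (every pixel differing by the maximum value), so it is deliberately pessimistic and would flag many frames whose real content never gets near the limit. The helper name integral_fits_u32 and its parameters are hypothetical and are not taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the patch): returns true when the
 * worst-case sum of squared differences over a width x height frame
 * still fits in one 32-bit unsigned lane of the integral image.
 * max_diff is the largest possible per-pixel difference, e.g. 255
 * for 8-bit input.
 */
static bool integral_fits_u32(uint32_t width, uint32_t height,
                              uint32_t max_diff)
{
    uint64_t pixels = (uint64_t)width * height;     /* cannot overflow 64 bits */
    uint64_t sq     = (uint64_t)max_diff * max_diff;

    if (sq == 0)
        return true;                                /* all-zero differences */

    /* Compare by division so the full product is never formed. */
    return pixels <= UINT32_MAX / sq;
}
```

With 8-bit input, integral_fits_u32(3840, 2160, 255) returns false, which matches the 539,343,360,000 figure quoted above, while a 256x256 frame still passes. A caller could use the result to warn, or to select a 64-bit path if one is added later.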