On Mon, Nov 23, 2015 at 12:43:52PM -0500, Ganesh Ajjanagadde wrote:
> On Sun, Nov 22, 2015 at 3:56 PM, Ganesh Ajjanagadde <gajja...@mit.edu> wrote:
> > On Sun, Nov 22, 2015 at 3:07 PM, Michael Niedermayer <michae...@gmx.at> wrote:
> >> On Sun, Nov 22, 2015 at 12:05:49PM -0500, Ganesh Ajjanagadde wrote:
> >>> Signed-off-by: Ganesh Ajjanagadde <gajjanaga...@gmail.com>
> >>> ---
> >>>  libavfilter/vsrc_mandelbrot.c | 2 +-
> >>>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>>
> >>> diff --git a/libavfilter/vsrc_mandelbrot.c b/libavfilter/vsrc_mandelbrot.c
> >>> index 950c5c8..a0c101e 100644
> >>> --- a/libavfilter/vsrc_mandelbrot.c
> >>> +++ b/libavfilter/vsrc_mandelbrot.c
> >>> @@ -291,7 +291,7 @@ static void draw_mandelbrot(AVFilterContext *ctx, uint32_t *color, int linesize,
> >>>
> >>>              use_zyklus= (x==0 || s->inner!=BLACK ||color[x-1 + y*linesize] == 0xFF000000);
> >>>              if(use_zyklus)
> >>> -                epsilon= scale*1*sqrt(SQR(x-s->w/2) + SQR(y-s->h/2))/s->w;
> >>> +                epsilon= scale*hypot(x-s->w/2, y-s->h/2)/s->w;
> >>
> >> old:
> >> 704 decicycles in hypo, 1048570 runs, 6 skips
> >>
> >> new:
> >> 1075 decicycles in hypo, 1048566 runs, 10 skips
> >>
> >> that is from START/STOP_TIMER over hypot()
> >>
> >> the code is speed relevant as its executed per pixel
> >
> > Thanks for testing. Looking more closely, I see no reason for
> > expensive sqrt calls anyway: one can simply square both sides; it
> > should be cheaper. Will rework, post benchmark if it is indeed faster
> > and does not suffer from floating point overflow, else will simply
> > push a trivial removal of the "1".
>
> It seems like getting rid of the sqrt altogether has a very slight
> positive impact (if any at all). I can post the patch, but would like
> to know what to benchmark. There are numerous choices, e.g
> draw_mandelbrot as a whole, the outer loop, or the inner loop.
> I personally think the inner x loop (lines 268-388) is a good place to
> look at, since the difference is very small anyway, and further
> localization is impossible.

please post the patch

and mandelbrot is difficult to benchmark due to the variable number of
iterations per pixel, the skip code from START/STOP timer should
possibly be disabled for this

[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Old school: Use the lowest level language in which you can solve the problem
            conveniently.
New school: Use the highest level language in which the latest supercomputer
            can solve the problem without the user falling asleep waiting.
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel