Hi, On Wed, Nov 25, 2015 at 10:46 PM, Ganesh Ajjanagadde <gajja...@mit.edu> wrote:
> On Wed, Nov 25, 2015 at 10:13 PM, Ronald S. Bultje <rsbul...@gmail.com> > wrote: > > Hi, > > > > On Wed, Nov 25, 2015 at 8:48 PM, Ganesh Ajjanagadde <gajja...@mit.edu> > > wrote: > >> > >> On Wed, Nov 25, 2015 at 8:29 PM, Ganesh Ajjanagadde <gajja...@mit.edu> > >> wrote: > >> > On Wed, Nov 25, 2015 at 8:19 PM, Ronald S. Bultje <rsbul...@gmail.com > > > >> > wrote: > >> >> Hi, > >> >> > >> >> On Wed, Nov 25, 2015 at 7:36 PM, Ganesh Ajjanagadde < > gajja...@mit.edu> > >> >> wrote: > >> >> > >> >>> On Wed, Nov 25, 2015 at 6:49 PM, James Almer <jamr...@gmail.com> > >> >>> wrote: > >> >>> > On 11/25/2015 8:32 PM, Ganesh Ajjanagadde wrote: > >> >>> >> On Wed, Nov 25, 2015 at 6:19 PM, Ronald S. Bultje > >> >>> >> <rsbul...@gmail.com> > >> >>> wrote: > >> >>> >>> Hi, > >> >>> >>> > >> >>> >>> On Wed, Nov 25, 2015 at 5:17 PM, Ganesh Ajjanagadde < > >> >>> gajjanaga...@gmail.com> > >> >>> >>> wrote: > >> >>> >>>> > >> >>> >>>> On systems having cbrt, there is no reason to use the slow pow > >> >>> function. > >> >>> >>>> > >> >>> >>>> Sample benchmark (x86-64, Haswell, GNU/Linux): > >> >>> >>>> new: > >> >>> >>>> 5124920 decicycles in cbrt_tableinit, 1 runs, 0 > skips > >> >>> >>>> > >> >>> >>>> old: > >> >>> >>>> 12321680 decicycles in cbrt_tableinit, 1 runs, 0 > skips > >> >>> >>>> > >> >>> >>>> Signed-off-by: Ganesh Ajjanagadde <gajjanaga...@gmail.com> > >> >>> >>>> > >> >>> >>>> > >> >>> > >> >>> > ------------------------------------------------------------------------------- > >> >>> >>>> What I wonder about is why --enable-hardcoded-tables is not the > >> >>> default > >> >>> >>>> for > >> >>> >>>> FFmpeg. Unless I am missing something, static storage is anyway > >> >>> allocated > >> >>> >>>> even > >> >>> >>>> if hardcoded tables are not used, and the cost is deferred to > >> >>> >>>> runtime > >> >>> >>>> instead of > >> >>> >>>> build time. Thus binary size should not be affected, but users > >> >>> >>>> burn > >> >>> cycles > >> >>> >>>> unnecessarily for every codec having these kinds of tables. I > >> >>> >>>> have > >> >>> these > >> >>> >>>> patches, > >> >>> >>>> simply because at the moment users are paying a price for the > >> >>> >>>> typical > >> >>> >>>> default. > >> >>> >>>> --- > >> >>> >>>> libavcodec/cbrt_tablegen.h | 6 +++--- > >> >>> >>>> 1 file changed, 3 insertions(+), 3 deletions(-) > >> >>> >>> > >> >>> >>> > >> >>> >>> This has been discussed extensively in the past... > >> >>> >> > >> >>> >> Can you please give a link and/or timeframe to search for? > >> >>> >> > >> >>> >>> > >> >>> >>> As for the patch, don't forget that tablegen runs on the host > >> >>> >>> (build), > >> >>> not > >> >>> >>> target (runtime), whereas libm.h is for target (runtime) and may > >> >>> >>> not be > >> >>> >>> compatible. I believe that's why we don't use libm.h in tablegen > >> >>> >>> files. > >> >>> >> > >> >>> >> I don't understand this, it seems to me like any other code (at > >> >>> >> least > >> >>> >> in the default configure), it gets called, and like all other > such > >> >>> >> things, we use libavutil/libm for hackery. This host/target > >> >>> >> business > >> >>> >> affects other things as well. What is the issue? > >> >>> > > >> >>> > libavutil/libm.h uses defines from config.h, which are based on > the > >> >>> tests run > >> >>> > by configure for the target, and not the host where compilation > >> >>> > takes > >> >>> place. > >> >>> > The tablegen applications all run at compile time. What is > available > >> >>> > on > >> >>> the > >> >>> > target may not be on the host. > >> >>> > >> >>> Ok. So I would like an answer to two simple questions that are > outside > >> >>> my knowledge or interest. > >> >>> > >> >>> Is it possible with some hackery to get this change through, or not? > >> >>> If so, what is it? > >> >> > >> >> > >> >> You need to understand the issue before you can evaluate hacks. > >> >> > >> >> The issue is: > >> >> - I'm using a linux x86-64 machine using gcc as a compiler, with > >> >> libc=glibc > >> >> 2.18 (A); > >> >> - to build a binary that will run on a Windows Mobile ARMv7 machine, > >> >> with > >> >> libC=something-from-Microsoft (B). > >> >> > >> >> tablegen runs on A, but ffmpeg.exe runs on B. libavutil/libm.h only > >> >> works > >> >> for B. If you want a version of libm.h on A, you need to generate a > >> >> version > >> >> of libm.h that works on A. There is no relationship between A and B, > >> >> and > >> >> thus there can not possibly ever be any relationship between A's > libm.h > >> >> and > >> >> B's libavutil/libm.h. > >> >> > >> >> It's probably possible to generate a version of libm.h for A, but > >> >> that's > >> >> not so much a coding issue, as it is an issue of understanding the > >> >> build > >> >> system and including detection for stuff on machine A, as opposed to > >> >> machine B (which is what most of configure does). > >> > > >> > Thanks a lot for the detail. So how about using a local > >> > #ifndef cbrt > >> > #define cbrt(x) pow(x, 1 / 3.0) > >> > code... > >> > #undef cbrt // at the very end of the file > >> > #endif > >> > >> Not that simple, something more like > >> #ifndef cbrt > >> #define ff_cbrt(x) pow(x, 1/3.0) > >> #else > >> #define ff_cbrt(x) cbrt(x) > >> code... > >> #undef ff_cbrt // at the very end of the file > >> #endif > >> > >> - this will resolve a glitch with the above in not undef'ing an > >> important symbol (all this is of course without including > >> libavutil/libm.h). > > > > > > I don't think cbrt is a macro? But anyway, I don't think any of this is > > meaningful, you're basically creating a crappy per-file libm.h for > tablegen > > files which will be duplicated all over the place. How about we do this > > correctly, if we do it at all? > > I lack the knowledge for doing this in configure, all this avutil/libm > and compat business is black magic to me, and worse still I can hardly > test any of it since I run exclusively a GNU/Linux setup, as evidenced > by some recent patches. I can put in an effort if someone is willing > to review and if it is reasonably simple in configure. > > > > > (Don't forget these table inits run once per process, so the cost of > > increased code messiness and code complexity has a much higher weight > than > > it does in normal situations. Runtime isn't that important because it > only > > runs once.) > > I doubt the patch itself qualifies as complex/messy as is. It is all > this cross compiling business and associated hackery that is. > > An extra 0.1 ms of CPU burning across 1000's of clients across the > globe is significant, especially when it is trivially avoided, going > back to the above point. You're getting in this argument/defensive mode again, don't do that. Cross-compiling is not a hack. Your patch is a hack. Our code as it is right now works perfectly fine when cross-compiling, and lots of people use it like that - including me. I understand that the configure code is convoluted, and I don't fully understand it either. However, to correctly solve this problem, you need to do it right. Ronald _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel