On Saturday 20 September 2014 18:21:44 John Ralls wrote:
> On Aug 27, 2014, at 10:31 PM, John Ralls <jra...@ceridwen.us> wrote:
> > On Aug 27, 2014, at 8:32 AM, Geert Janssens
> > <janssens-ge...@telenet.be> wrote:
> >> On Saturday 23 August 2014 18:01:15 John Ralls wrote:
> >>> So, having gotten test-lots and all of the other tests working*
> >>> with libmpdecimal, I studied the Intel library for several days
> >>> and couldn't figure out how to make it work, so I decided to try
> >>> the GCC implementation, which offers a 128-bit IEEE 754 format
> >>> that's fixed size. Since it doesn't ever call malloc, I thought
> >>> it might prove faster, and indeed it is. I haven't finished
> >>> integrating it -- the library doesn't provide formatted printing
> >>> -- but it's far enough along that it passes all of the engine and
> >>> backend tests. Some results:
> >>>
> >>> test-numeric, with NREPS increased to 20000 to get a reasonable
> >>> execution time for profiling:
> >>>
> >>>   master     9645ms
> >>>   mpDecimal  21410ms
> >>>   decNumber  12985ms
> >>>
> >>> test-lots:
> >>>
> >>>   master     16300ms
> >>>   mpDecimal  20203ms
> >>>   decNumber  19044ms
> >>>
> >>> The first shows the relative speed in more or less pure
> >>> computation; the second shows the overall impact on one of the
> >>> longer-running tests that does a lot of other stuff.
> >>
> >> John,
> >>
> >> Thanks for implementing this and running the tests. The topic was
> >> last touched before my holidays so it took me a while to refresh
> >> my memory...
> >>
> >> decNumber clearly performs better, although both implementations
> >> lag behind our current gnc_numeric performance.
> >>
> >>> I haven't investigated Christian's other suggestion of aggressive
> >>> rounding to eliminate the overflow issue to make room for larger
> >>> denominators, nor my original idea of replacing gnc_numeric with
> >>> boost::rational atop a multi-precision class (either boost::mp or
> >>> gmp).
> >>
> >> Do you still have plans for either?
> >>
> >> I suppose aggressive rounding is orthogonal to the choice of data
> >> type. Christian's argument that we should round as is expected in
> >> the financial world makes sense to me, but that argument does not
> >> imply any underlying data type.
> >>
> >> How about the boost::rational option?
> >>
> >>> I have noticed that we're doing some dumb things with Scheme,
> >>> like using double as an intermediate when converting from Scheme
> >>> numbers to gnc_numeric (Scheme numbers are also rational, so the
> >>> conversion should be direct) and representing gnc_numerics as a
> >>> tuple (num, denom) instead of just using Scheme rationals.
> >>
> >> Does this mean you see potential performance gains in this as we
> >> clean up the C<->Scheme number conversions?
> >>
> >>> Neither will work for decimal floats, of course; the whole class
> >>> will have to be wrapped so that computation takes place in C++.
> >>
> >> Which means some performance drop again...
> >>
> >>> Storage in SQL is also an issue,
> >>
> >> From the previous conversation I recall sqlite doesn't have a
> >> decimal type, so we can't run calculating queries on it directly.
> >>
> >> But how about the other two, mysql and postgresql? Is the decimal
> >> type you're using in your tests directly compatible with the
> >> decimal data types in mysql and postgresql, or compatible enough
> >> to convert automatically between them?
> >>
> >>> as is maintaining backward file compatibility.
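
On the Scheme conversion point above: a direct conversion could be as
small as the sketch below. This is untested and only illustrative (the
function name gnc_scm_to_numeric_direct is made up); it assumes Guile's
C API (scm_numerator, scm_denominator, scm_to_int64) and our existing
gnc_numeric_create():

    /* Untested sketch: convert a Scheme number to gnc_numeric without
     * a double intermediate, via Guile's C API. */
    #include <libguile.h>
    #include "gnc-numeric.h"

    static gnc_numeric
    gnc_scm_to_numeric_direct (SCM num)
    {
        /* Make inexact (floating) input exact so that numerator and
         * denominator are well defined. */
        if (scm_is_false (scm_exact_p (num)))
            num = scm_inexact_to_exact (num);

        /* Scheme rationals expose numerator and denominator directly;
         * this only fails if either part exceeds 64 bits. */
        int64_t n = scm_to_int64 (scm_numerator (num));
        int64_t d = scm_to_int64 (scm_denominator (num));
        return gnc_numeric_create (n, d);
    }
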
> >>> Another issue is equality: In order to get tests to pass I've
> >>> had to implement a fuzzy comparison where both numbers are first
> >>> rounded to the smaller number of decimal places -- 2 fewer if
> >>> there are 12 or more -- and compared with two roundings, first
> >>> truncation and second "bankers", and declared unequal only if
> >>> they're unequal in both. I hate this, but it seems to be
> >>> necessary to obtain equality when dealing with large divisors (as
> >>> when computing prices or interest rates). I suspect that we'd
> >>> have to do something similar if we pursue aggressive rounding to
> >>> avoid overflows, but the only way to know for certain is to try.
> >>
> >> Ugh. :(
> >>
> >> So what's the current balance?
> >>
> >> I see the following pros and cons of your tests so far:
> >>
> >> Pro:
> >> - using a decimal type gives us more precision
> >>
> >> Con:
> >> - sqlite doesn't have a decimal data type, so as it currently
> >>   stands we can't run calculations in queries in that database
> >>   type
> >> - we lose backward/forward compatibility with earlier versions of
> >>   GnuCash
> >> - decNumber or mpDecimal are new dependencies
> >> - their performance is currently less than the original
> >>   gnc_numeric
> >> - guile doesn't know of a decimal data type so we may need some
> >>   conversion glue
> >> - equality is fuzzy
> >>
> >> Please add if I forgot arguments on either side.
> >>
> >> Arguably many of the con arguments can be solved. That will take
> >> effort, however. And I consider the first two more important than
> >> the others.
> >>
> >> So do you think the benefits (I assume there will be more than the
> >> one I mentioned) will outweigh the drawbacks? Does the work that
> >> will go into it bring GnuCash enough value to continue on this
> >> track?
> >>
> >> It's probably too early to tell for sure, but I wanted to get your
> >> ideas based on what we have so far.
> >
> > Testing boost::rational is next on the agenda. My original idea was
> > to use it with boost::multiprecision or gmp, but I'd prefer
> > something that doesn't depend on heap allocations, because heap
> > allocation is so much slower than stack allocation and the objects
> > must be passed by pointer, which is a major change in the API --
> > meaning a ton of cleanup work up front. I think I'll do a straight
> > substitution of the existing math128 with boost::rational<int64_t>
> > just to see what happens.
> >
> > I think that part of implementing immediate rounding must include
> > constraining denominators to powers of ten. The main reason is that
> > it makes my head hurt when I try to think about how to do rounding
> > with arbitrary denominators. If you consider that a big chunk of
> > the overflow problems arise from denominators and divisors that are
> > large primes, it becomes quickly apparent that avoiding large prime
> > denominators might well resolve much of the problem. It's also true
> > that for real-world numbers, as opposed to the randomly-generated
> > numbers from tests, all numbers have powers-of-ten denominators.
> > We'd still have many-digit-prime divisors to deal with, but
> > constraining denominators gives us something to round to. Does that
> > make sense, or does it seem the rambling of a lunatic? This really
> > does make my head hurt.
>
> Boost::Rational is a serious disappointment. Boost::rational<int64_t>
> didn't allow a significant increase in precision and is further
> hampered by not providing any overflow detection.
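
Coming back to the fuzzy comparison described near the top: if I read
it correctly, it amounts to something like the sketch below. Untested
and only illustrative -- double stands in for the decimal type, the
decimal-place counts are passed in, and fuzzy_equal, round_trunc and
round_bankers are made-up names, not the branch code:

    // Untested sketch of the fuzzy comparison described above.
    #include <algorithm>
    #include <cmath>

    // Round x to `places` decimal places by truncation toward zero.
    static double round_trunc (double x, int places)
    {
        double scale = std::pow (10.0, places);
        return std::trunc (x * scale) / scale;
    }

    // Round x to `places` decimal places, ties to the even digit.
    static double round_bankers (double x, int places)
    {
        double scale = std::pow (10.0, places);
        double scaled = x * scale;
        double lo = std::floor (scaled);
        double frac = scaled - lo;
        double r;
        if (frac > 0.5)       r = lo + 1.0;
        else if (frac < 0.5)  r = lo;
        else  // exactly halfway: pick the even neighbor
            r = (std::fmod (lo, 2.0) == 0.0) ? lo : lo + 1.0;
        return r / scale;
    }

    static bool fuzzy_equal (double a, double b,
                             int places_a, int places_b)
    {
        // Round both to the smaller number of decimal places,
        // 2 fewer if there are 12 or more.
        int places = std::min (places_a, places_b);
        if (places >= 12)
            places -= 2;

        // Unequal only if unequal under *both* roundings.
        return round_trunc (a, places) == round_trunc (b, places)
            || round_bankers (a, places) == round_bankers (b, places);
    }
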
> Benchmarks of test-numeric with NREPS set to 20000 (the numbers are
> a bit different from before because I'm using my Mac Pro instead of
> my MacBook Air, and because these are debug builds):
>
>   Branch                   Tests     Time
>   master                   1187558   5346ms
>   libmpdecimal             1180076   8718ms
>   boost-rational, cppint   1187558   20903ms
>   boost-rational, gmp      1187558   34232ms
>
> cppint means boost::multiprecision::checked_int128_t, a 16-byte,
> stack-allocated multi-precision integer; "checked" means that it
> throws std::overflow_error instead of wrapping. gmp means the GNU
> Multiple Precision library. It's supposed to be faster than cppint,
> but its performance is killed by having to malloc everything. The
> fact that our own C code is substantially faster than any library
> I've tried is a tribute to Linas.
>
> There's another wrinkle: Boost::Rational immediately reduces all
> numbers to what we called in my grade school "simplest form",
> meaning no common factors between the numerator and denominator.
> This actually helps prevent overflows, but it means that we have to
> be very careful to supply the SCU as the rounding denominator or
> we'll get unexpected rounding results. Boost::Rational provides no
> rounding function of its own, so I rewrote gnc_numeric_convert in
> C++ using the overloaded operators from boost::multiprecision. That
> at least taught me about rounding arbitrary denominators, so my head
> doesn't explode any more.
>
> The good news is that using 128-bit numbers for all internal
> representations, along with aggressive reduction, a tweak to
> get_random_gnc_numeric() so that the actual number doesn't exceed
> 1E13/1, and careful attention to rounding, prevents overflow errors
> during testing, at least up through test-lots.
>
> Looking a bit more at rounding: with only 14 out of 151 gnc_numeric
> operations in the code base using GNC_HOW_RND_NEVER, it doesn't
> appear to me that we're over-using it, and I'm not convinced that it
> would help much to eliminate those cases.
>
> It looks like the best solution is to work over our existing
> gnc-numeric/math128 implementation so that the internals are always
> 128-bit and we don't declare overflows prematurely.

Thanks for the update and the elaborate testing.
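
To check my own understanding of what such a convert routine has to
do, here is roughly the core of it as an untested sketch. int64_t
stands in for the 128-bit internals, so the first multiplication can
overflow -- exactly the case the 128-bit math is meant to cover -- and
round_to_denom is a made-up name:

    // Untested sketch: round num/den to a target denominator scu
    // (e.g. 100 for cents) with banker's (half-to-even) rounding.
    #include <cstdint>

    struct Rational { std::int64_t num; std::int64_t den; };

    static Rational round_to_denom (Rational r, std::int64_t scu)
    {
        std::int64_t n   = r.num * scu;   // want q/scu ~= num/den
        std::int64_t q   = n / r.den;     // quotient, truncated
        std::int64_t rem = n % r.den;     // remainder, sign follows n

        std::int64_t rem2 = 2 * (rem < 0 ? -rem : rem);
        std::int64_t dabs = r.den < 0 ? -r.den : r.den;

        // Past the halfway point, round away from zero; exactly on
        // it, round toward the even quotient.
        if (rem2 > dabs || (rem2 == dabs && q % 2 != 0))
            q += ((n < 0) == (r.den < 0)) ? 1 : -1;

        return { q, scu };
    }

With that, round_to_denom({1, 3}, 100) gives 33/100, and
round_to_denom({3, 8}, 100) gives 38/100, the half-even result.
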
So... math128 is what we use now, using the rational representation of
numbers, do I get that right? And the best option is to stick with it
and improve on it?

Would you still transform it into C++ so it becomes an object with
properties and members?

Geert

_______________________________________________
gnucash-devel mailing list
gnucash-devel@gnucash.org
https://lists.gnucash.org/mailman/listinfo/gnucash-devel