Eric Blake <ebl...@redhat.com> writes: > On 01/20/2016 02:02 AM, Markus Armbruster wrote: > >>> @@ -519,6 +519,8 @@ static QObject *parse_literal(JSONParserContext *ctxt) >>> } >>> case JSON_FLOAT: >>> /* FIXME dependent on locale */ >>> + /* FIXME our lexer matches RFC7159 in forbidding Inf or NaN, >> >> For what it's worth, the RFC spells this "RFC 7159". > > Looks like we use space more often than not, but that we're > inconsistent. For example: > > slirp/tcp.h: * Per RFC 793, September, 1981. > slirp/tcp.h: * Per RFC793, September, 1981. > > Will fix if I need to respin, otherwise I assume you can do it.
Okay. >>> + /* FIXME: snprintf() is locale dependent; but JSON requires >>> + * numbers to be formatted as if in the C locale. */ >> >> The new FIXME matches the existing one in parse_literal(), visible >> above. >> >> However, dependence on C locale is a pervasive issue in QEMU. These two >> comments could give readers the idea that it's an issue just here. >> Suggest to add something like "Dependence on C locale is a pervasive >> issue in QEMU." > > Good idea. > >> >>> + /* FIXME: This risks printing Inf or NaN, which are not valid >>> + * JSON values. */ >>> + /* FIXME: the default precision of %f may be insufficient to >>> + * tell this number apart from others. */ >> >> Yup. >> >> The easy way to avoid loss of precision is %a, but of course that's way >> too much sophistication for JSON. >> >> Avoiding loss of precision with a decimal format is non-trivial; see >> Steele, Jr., Guy L., and White, Jon L. How to print floating-point >> numbers accurately, SIGPLAN ’90, and later improvements collected here: >> http://stackoverflow.com/questions/7153979/algorithm-to-convert-an-ieee-754-double-to-a-string > > Ah, memories. I read and implemented that paper when working on the > jikes compiler for the Java language back in the late nineties, as it is > the Java language specification which had the very neat property of > requiring the shortest decimal string that will unambiguously round back > to the same floating point pattern. > > One alternative is to always output a guaranteed unambiguous decimal > string (although not necessarily the shortest), by using %.17f, using > <float.h> DBL_DECIMAL_DIG. (Note that DBL_DIG of 15 is NOT sufficient - > it is the lower limit that says that a decimal->float->decimal will not > change the decimal; but we want the converse where a > float->decimal->float will not change the float. There are stretches of > numbers where the pigeonhole principle applies; you can think of it this > way: there is no way to map all possible 2^10 (1024) binary values > inside 2^3 (1000) decimal digits without at least 24 of them needing one > more decimal digit. But by the same arguments, DBL_DECIMAL_DIG is an > upper limit and usually more than you need.) > > So, the question is whether we want to always output 17 digits, or > whether we want to do the poor-man's truncation scheme (easy to > understand, but not optimal use of the processor), or go all the way to > the algorithm of that paper (faster but lots harder to understand). For > reference, here's the poor-man's algorithm in pseudocode: I don't think we want to implement floating-point formatting ourselves. > if 0, inf, nan: > special case > else: > Obtain the DBL_DECIMAL_DIG string via sprintf %.17f > i = 17; > do { > truncate the original string to i-1 decimal characters > parse that with strtod() > if the bit pattern differs: > break; > } while (--i); > assert(i) > use i digits of the string That's a lot of strtod()... May not be noticable if we write the result to a slowish sink. Binary search could save a few. Naive idea: chop off trailing '0'? > As a separate patch, of course, but I have a pending patch that provides > a single place where we could drop in such an improvement: > https://lists.gnu.org/archive/html/qemu-devel/2015-12/msg03932.html Definitely separate.