[issue4114] struct returns incorrect 4 byte float

2011-03-16 Thread Robert Withrow

Robert Withrow  added the comment:

I have to disagree.  It seems entirely reasonable to expect that unpack should 
return the same value passed to pack.  That it doesn't (as of 2.6.5 at least) 
is completely unexpected and undocumented.  And yes I understand the 
limitations of floating point numbers.

I suggest that struct should be fixed so that 
struct.unpack(fmt,struct.pack(fmt,v)) == v and format is something like '!f'.

This can be done in C code (I do it) for IEEE 754 floats.

At the very least, this unexpected behavior should be documented in struct.
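
A minimal sketch reproducing the behavior described above, assuming a platform with IEEE 754 floats. The value 6.21 is only an illustrative choice:

```python
import struct

v = 6.21                                  # a Python float is a 64-bit double
packed = struct.pack('!f', v)             # rounded to a 32-bit float here
(result,) = struct.unpack('!f', packed)   # widened back to a 64-bit double

print(len(packed))               # size of the packed float in bytes
print(result == v)               # the round trip does not return v
print(abs(result - v) < 1e-6)    # but the result is close to v
```

The difference comes from the rounding done in pack; the widening done in unpack is exact.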

--
nosy: +Robert.Withrow

___
Python tracker 
<http://bugs.python.org/issue4114>
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue4114] struct returns incorrect 4 byte float

2011-03-17 Thread Robert Withrow

Robert Withrow  added the comment:

Martin: in C I have the luxury of using 32 bit floats; not an option in Python. 
 Simple code doing the moral equivalent of NTOHL(HTONL()) works in this case 
for C but wouldn't help for Python.

Mark: I understand about the precision truncation issue and how Python does 
floating point arithmetic.  This C code clearly demonstrates what is going on:

#include <stdio.h>

int main(int argc, char *argv[])
{
  double d1 = 6.21;
  float f = 6.21;
  double d2 = f;
  
  printf("double: %.15f\n", d1);
  printf("float: %.15f\n", f);
  printf("double converted from float: %15.15f\n", d2);
}

The point here is about the contract of struct, NOT how Python does floating 
point arithmetic.  The contract is: what pack packs, unpack will unpack 
resulting in the original value.  At least, that is what the documentation 
leads you to believe.

For the 'f' format character, this contract is broken because of a basic 
implementation detail of Python and there is nothing in the documentation for 
struct that *directly* lets you know this will happen.  After all, the mentions 
in the documentation about 32 bit versus 64 bit talk about C not Python!

Even worse, there is no straightforward way (that I'm aware of) to write 
portable tests for code using the 'f' format character.  In my case I'm writing 
a tool that creates message codecs in multiple languages and the most basic 
unit test goes something like this:

m1 = example.message()
m1.f1 = 6.21
b = m1.encode() # uses struct pack
m2 = example.message(b) # uses struct unpack
if m1 != m2:  # rich comparison
  print('fail')

This test will fail when you use the 'f' format code.

I suggest two things could be done to improve the situation:

1) Add a note to the documentation for struct that tells you that unpack(pack) 
using the 'f' format code will not generally give you the results you probably 
expect because .

2) Create a way in Python to write portable code related to 32 bit floats.  For 
example, if I had a way in Python to cause the precision truncation 
programmatically:

m1 = example.message()
m1.f1 = 6.21.as_32_bit_float() # Does the precision truncation upfront
b = m1.encode() # uses struct pack
m2 = example.message(b) # uses struct unpack
if m1 != m2:  # rich comparison
  print('fail')

I'd expect this test to pass.
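
A minimal sketch of that idea, assuming an IEEE 754 platform. Message and as_32_bit_float are hypothetical names standing in for example.message and the proposed float method. Neither exists in Python, and the truncation here is done with struct itself:

```python
import struct

def as_32_bit_float(value):
    """Hypothetical helper: round a Python float to the nearest 32-bit float."""
    return struct.unpack('!f', struct.pack('!f', value))[0]

class Message(object):
    """Hypothetical one-field message standing in for example.message."""

    def __init__(self, data=None):
        self.f1 = 0.0
        if data is not None:
            (self.f1,) = struct.unpack('!f', data)

    def encode(self):
        return struct.pack('!f', self.f1)

    def __eq__(self, other):
        return self.f1 == other.f1

    def __ne__(self, other):
        return not self == other

# Without truncation the decoded message differs from the original.
m1 = Message()
m1.f1 = 6.21
m2 = Message(m1.encode())
print(m1 != m2)

# With the precision truncation done upfront, the round trip is exact.
m3 = Message()
m3.f1 = as_32_bit_float(6.21)
m4 = Message(m3.encode())
print(m3 == m4)
```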

Hope this long-winded note helps.

--




[issue4114] struct returns incorrect 4 byte float

2011-03-17 Thread Robert Withrow

Robert Withrow  added the comment:

> If you agree that Python actually behaves correct, I fail to
> understand what it is that you disagree with in msg131195

I don't agree that Python is behaving correctly as far as the documented 
contract for struct is concerned.

I disagree with the statement in the preceding msg74708 which says:

> people should read general CS introductory material
> to learn how floating point numbers work.

Aside from being patronizing it doesn't solve the problem in any meaningful way.

> If you use numbers that are exactly representable as floats,
> the test should be portable to all platforms that use 32-bit
> IEEE-754 floats.

A reasonable suggestion, but it is a constrained definition of "portable".  
Since most (or nearly all?) modern platforms use '754 it is probably not a bad 
constraint, given that struct explicitly uses '754.

> If you then also use numbers without a fractional
> part, it should even port to non-IEEE platforms

I confess, the "CS introductory material" I read 30 years ago (predating '754) 
doesn't give me enough information to know if this is correct.
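
A rough way to check which values are exactly representable, sketched under the assumption of 32-bit IEEE 754 floats. The helper name survives_float32 is made up for illustration:

```python
import struct

def survives_float32(value):
    """Return True if value is unchanged by a 32-bit float round trip."""
    return struct.unpack('!f', struct.pack('!f', value))[0] == value

# 6.25, 0.5 and 3.0 are sums of a few powers of two; 6.24 and 0.1 are not.
# Whole numbers are exact only while they fit in the 24-bit significand:
# 2**24 = 16777216 fits, 2**24 + 1 does not.
for value in (6.25, 0.5, 3.0, 16777216.0, 6.24, 0.1, 16777217.0):
    print(value, survives_float32(value))
```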

Anyway:

> If all you want is a documentation change, can you please propose
> specific wording?

It isn't exactly "all I want", but it is a good start.  I note that msg74705 
suggests adding documentation to struct about the 'f' format code.

First of all, as far as I know, struct is the only place where this issue of 32 
bit versus 64 bit floating point numbers shows up in Python because the rest of 
Python uses only 64 bit numbers.  (Is there anywhere else in Python where a 32 
bit float is converted to a 64 bit float?) So the documentation probably 
belongs in struct.

I would add to note 4 of 7.3.2.2 (in the 2.7.1 documentation) something like:

"Note that 32 bit representations do not generally convert exactly to 64 bit 
representations (which Python uses internally) so that the results of 
unpack(fmt,pack(fmt,number)) may not equal number when using the 'f' format 
character."

It would be friendly to add an example at the bottom demonstrating the issue 
and incorporating your comments about fractions and non-fractional values.

>>> x = unpack('!f', pack('!f', 6.24))[0]
>>> x == 6.24
False
>>> x = unpack('!f', pack('!f', 6.25))[0]
>>> x == 6.25
True

--




[issue4114] struct returns incorrect 4 byte float

2011-03-17 Thread Robert Withrow

Robert Withrow  added the comment:

> it needs to be worded in a way that doesn't
> imply that the struct implementation is broken or misdesigned. 

Agree.

> A better note would focus on the basic (and obvious)
> fact that downgrading from double precision to single
> precision entails a loss of precision.

Sort of where I was going, but I'm sure my text could be vastly improved.

> The suggested examples are misleading because they 
> use 6.24 which is not exactly representable in binary
> floating point.

I'd quibble with this for two reasons:

1) to be precise, numbers which are not exactly representable in binary 
floating point would nonetheless pass the unpack(pack) test if you use the 'd' 
format character.  The key issue is, as you said, loss of precision.

2) I don't understand why the 6.24 example is "misleading" when it accurately 
demonstrates the issue.
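
Point 1 can be checked with a short sketch, assuming IEEE 754 doubles and floats. The helper name round_trips is hypothetical:

```python
import struct

def round_trips(fmt, value):
    """Return True if unpack(fmt, pack(fmt, value)) gives back value."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0] == value

# None of these decimals is exactly representable in binary floating point,
# yet each survives the 'd' round trip, because no precision is dropped.
for value in (6.24, 0.1, 1.0 / 3.0):
    print(value, round_trips('!d', value), round_trips('!f', value))
```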

One comment about portability I forgot to mention earlier:  I don't know how 
wed Python is to '754 or even binary floating point representations.  My 
personal belief is that it should be possible to write a test so that the 
unpack(fmt, pack(fmt, precision_truncate(number))) == 
precision_truncate(number) test works for any legal number on any platform.  I 
don't like the idea that one has to pick specific numbers based on knowledge of 
the platform's floating point format.

I acknowledge that this may not bother others as much as it bothers me though.  
I'm a portability nut.

--




[issue4114] struct returns incorrect 4 byte float

2011-03-18 Thread Robert Withrow

Robert Withrow  added the comment:

For completeness: msg131234 states that the issue of 64 bit -> 32 bit precision 
truncation is covered in the floating point tutorial.  I believe that is 
incorrect; at least I can't find it explicitly mentioned. Ref: 
http://docs.python.org/tutorial/floatingpoint.html.

If struct is the only place this (64->32 bit precision truncation) can happen 
in Python, the lack of discussion in the tutorial makes sense.  Otherwise, a 
sentence about it should be added to the tutorial.

As it is, there is no _explicit_ mention of this anywhere in Python 
documentation.  It is all well and good to state that it is "obvious", but it 
seems that explicit documentation is preferable to implicit documentation, 
given the rarity of the issue in Python and the meager cost of adding a 
sentence here or there.

Incidentally, it is simple to create the truncation routine I mentioned earlier:

>>> from struct import pack, unpack
>>> def fptrunc(value):
...   return unpack('!f', pack('!f', value))[0]
... 
>>> fptrunc(6.24)
6.2399997711181641
>>> fptrunc(6.25)
6.25

But this has the questionable smell of using pack/unpack in a test of 
pack/unpack.  It's sorta OK for _users_ of pack/unpack though.

A quick scan of the Python source code shows that only two things try to pack 4 
byte floats: struct and ctypes and both of these use the underlying Python 
float object routines.  So a better way of doing the truncation is to use 
ctypes:

>>> from ctypes import c_float
>>> def fptrunc(value):
...   return c_float(value).value
... 
>>> fptrunc(6.24)
6.2399997711181641
>>> fptrunc(6.25)
6.25

Doing this allows you to write tests that work for any number and don't require 
the use of magic numbers or knowledge of the underlying floating point 
implementation.
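
A sketch of such a test, assuming a platform where both struct and ctypes use IEEE 754 single precision. The list of sample values is arbitrary:

```python
import struct
from ctypes import c_float

def fptrunc(value):
    """Round a Python float to 32-bit precision using ctypes."""
    return c_float(value).value

def round_trip(value):
    """Pack and unpack value with the 'f' format code."""
    return struct.unpack('!f', struct.pack('!f', value))[0]

values = [6.24, 6.25, 0.1, -1.0 / 3.0, 1e-3, 12345.678]

# Once the value has been truncated upfront, pack/unpack returns it unchanged.
results = [round_trip(fptrunc(v)) == fptrunc(v) for v in values]
print(all(results))
```

The truncation is done by ctypes and the round trip by struct, so the test does not rely on pack/unpack to check pack/unpack.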

Even if nothing gets put into the documentation, people will probably find this 
discussion by Googling.

I can't imagine there is much more that can be said about this, so I'll leave 
you guys alone now...  ;-)

--
