On Sun, 17 Apr 2005 20:47:20 +0100, "Jonathan Brady" <[EMAIL PROTECTED]> wrote:
>
><[EMAIL PROTECTED]> wrote in message
>news:[EMAIL PROTECTED]
>> Hello,
>>
>> I was looking at this:
>> http://docs.python.org/lib/module-struct.html
>> and tried the following
>>
>>>>> import struct
>>>>> struct.calcsize('h')
>> 2
>>>>> struct.calcsize('b')
>> 1
>>>>> struct.calcsize('bh')
>> 4
>>
>> I would have expected
>>
>>>>> struct.calcsize('bh')
>> 3
>>
>> what am I missing ?

A note for the original poster: "unpack hex to decimal" (the subject line from your posting) is an interesting concept. Hex[adecimal] and decimal are ways of representing the *same* number.

Let's take an example of a two-byte piece of data. Suppose the first byte has all bits set (== 1) and the second byte has all bits clear (== 0). The first byte's value is hexadecimal FF or decimal 255, whether or not you unpack it, if you are interpreting it as an unsigned number ('B' format). Signed ('b' format) gives you hexadecimal -1 and decimal -1. The second byte's value is 0 hexadecimal and 0 decimal however you interpret it.

Suppose you want to interpret the two bytes as together representing a 16-bit signed number (the 'h' format). If the perp is little-endian, the result is hex FF and decimal 255; otherwise it's hex -100 and decimal -256.

>
>Not sure, however I also find the following confusing:
>>>> struct.calcsize('hb')
>3
>>>> struct.calcsize('hb') == struct.calcsize('bh')
>False
>
>I could understand aligning to multiples of 4,

Given we know nothing about the OP's platform or your platform, "4" is no more understandable than any other number.

> but why is 'hb' different
>from 'bh'?

Likely explanation: the C compiler aligns n-byte items on an n-byte boundary. Thus in 'hb', the h is at offset 0, and the b can start OK at offset 2, for a total size of 3. With 'bh', the b is at offset 0, but the h can't (under the compiler's rules) start at 1, it must start at 2, for a total size of 4.
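
Both points above (one pair of bytes read several ways, and the size difference between 'bh' and 'hb') can be checked at the interpreter. What follows is only a rough sketch and assumes Python 3 (bytes literals, print as a function). The name `data` and the other variable names are made up for illustration. The native sizes printed depend on your platform and compiler, so no particular output is promised for them; only the "<" and ">" results are fixed by the struct module itself.

```python
import struct

# The two-byte example: first byte all bits set, second byte all bits clear.
data = b"\xff\x00"

# First byte on its own: unsigned ('B') versus signed ('b').
unsigned_first = struct.unpack("B", data[:1])[0]
signed_first = struct.unpack("b", data[:1])[0]
print(unsigned_first, signed_first)

# Both bytes as one signed 16-bit number ('h'), with the byte order stated
# explicitly instead of relying on the platform's native order.
little = struct.unpack("<h", data)[0]
big = struct.unpack(">h", data)[0]
print(little, hex(little))
print(big, hex(big))

# Sizes: native mode (the default) may insert pad bytes for alignment,
# while a "<" prefix uses standard sizes with no alignment at all.
for fmt in ("bh", "hb", "<bh", "<hb"):
    print(fmt, struct.calcsize(fmt))

# Packing natively makes any pad byte visible in the result; where it
# appears (if at all) is platform-dependent.
print(struct.pack("bh", 1, 2))
print(struct.pack("hb", 1, 2))
```

If the native 'bh' size comes out larger than the "<bh" size on your machine, the extra byte is the padding described above.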

Typically, you would use "native" byte ordering and alignment (the default) only where you are accessing data in a C struct that is in code that is compiled on your platform [1]. When you are picking apart a file that has been written elsewhere, you will typically need to read the documentation for the file format and/or use trial & error to determine which prefix (@, <, >) you should use. If I had to guess for you, I'd go for "<".

[1] Care may need to be taken if the struct is defined in source compiled by a compiler *other* than the one used to compile your Python executable -- there's a slight chance you might need to fiddle with the "foreign" compiler's alignment options to make it suit.

HTH,
John