On 19/12/2019 11.52, lampahome wrote:
> I meet performance is low when I use struct.unpack to unpack binary data.
>
> So I tried to use numpy.ndarray
> But meet error when I want to unpack multiple dtypes
>
> Can anyone teach me~
>
> Code like below:
> # python3
> import struct
> import numpy as np
> s1 = struct.Struct("@QIQ")
> ss1 = s1.pack(1,11,111)
> np.ndarray((3,), [('Q','I','Q')], ss1)
> # ValueError: mismatch in size of old and new data-descriptor.

A numpy array always has ONE dtype for ALL array elements. If you read
an array of structs, you can define a structured type, where each
element of your struct must have a name.

The error you're seeing is (as you know) because you're not setting up
your dtype in the right way. Let's fix it:

> In [2]: np.dtype([('Q', 'I', 'Q')])
> ---------------------------------------------------------------------------
> ValueError                                Traceback (most recent call last)
> <ipython-input-2-cecc70c78408> in <module>
> ----> 1 np.dtype([('Q', 'I', 'Q')])
>
> ValueError: mismatch in size of old and new data-descriptor
>
> In [3]: np.dtype([('field1', 'Q'), ('field2', 'I'), ('field3', 'Q')])
> Out[3]: dtype([('field1', '<u8'), ('field2', '<u4'), ('field3', '<u8')])

... and now let's put it all together!

    s1 = struct.Struct("@QIQ")
    ss1 = s1.pack(1,11,111)
    struct_dtype = np.dtype([('field1', 'Q'), ('field2', 'I'), ('field3', 'Q')])
    a = np.frombuffer(ss1, dtype=struct_dtype)

I'm using the frombuffer() function deliberately so I don't have to
figure out the shape of the final array (which is (1,), not (3,), by
the way).

And hey presto: it raises an exception!

> ValueError: buffer size must be a multiple of element size

Your example shows a difference between the default behaviour of
numpy's structured dtype and the struct module: packing!

By default, numpy structured dtypes are closely packed, i.e. nothing is
aligned to useful memory boundaries.

    struct_dtype.itemsize == 20

The struct module, on the other hand, tries to guess where the C
compiler would put its padding.

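If you want to see the two layouts side by side, something like the
following sketch should do. The variable names (packed, packed_offsets,
no_padding_size) are mine, and the native size it prints assumes a
typical 64-bit platform where an unsigned long long is aligned to
8 bytes; other platforms may pad differently.

```python
import struct
import numpy as np

# numpy's default structured dtype: fields sit back to back, no padding.
packed = np.dtype([('field1', 'Q'), ('field2', 'I'), ('field3', 'Q')])
packed_offsets = [packed.fields[name][1] for name in packed.names]

# struct with "@" uses native sizes AND native alignment (may add padding).
s1 = struct.Struct("@QIQ")

# struct with "=" uses standard sizes and no alignment, i.e. no padding.
no_padding_size = struct.calcsize("=QIQ")

print("numpy packed offsets:", packed_offsets)   # byte offset of each field
print("numpy packed itemsize:", packed.itemsize)
print("struct '=QIQ' size:", no_padding_size)
print("struct '@QIQ' size:", s1.size)            # platform dependent
```

On such a platform the native struct size comes out 4 bytes larger than
the packed itemsize, which is the padding after the 4-byte field, and it
matches the length of the packed data:
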
    len(ss1) == 24

We can tell numpy to do the same:

    struct_dtype = np.dtype([('field1', 'Q'), ('field2', 'I'), ('field3', 'Q')], align=True)

and then

    a = np.frombuffer(ss1, dtype=struct_dtype)

works and produces

    array([(1, 11, 111)],
          dtype={'names':['field1','field2','field3'], 'formats':['<u8','<u4','<u8'], 'offsets':[0,8,16], 'itemsize':24, 'aligned':True})

with a .shape of (1,)

#####

It's worth noting that in your example, all three fields are aligned to
8 bytes, meaning that on a little-endian machine, you could quite simply
have interpreted the data as an array of uint64's instead:

    In [30]: np.frombuffer(ss1, dtype='u8')
    Out[30]: array([  1,  11, 111], dtype=uint64)

-- Thomas

-- 
https://mail.python.org/mailman/listinfo/python-list