Paul Moore <p.f.mo...@gmail.com> writes:

> I'm looking for a library that lets me parse binary data structures.
> The stdlib struct module is fine for simple structures, but when it
> gets to more complicated cases, you end up doing a lot of the work by
> hand (which isn't that hard, and is generally perfectly viable, but
> I'm feeling lazy ;-))
>
> I know of Construct, which is a nice declarative language, but it's
> either weak, or very badly documented, when it comes to recursive
> structures. (I really like Construct, and if I could only understand
> the docs better I may well not need to look any further, but as it is,
> I can't see anything showing how to do recursive structures...) I am
> specifically trying to parse a structure that looks something like the
> following:
>
> Multiple instances of:
>   - a type byte
>   - a chunk of data structured based on the type
>     types include primitives like byte, integer, etc, as well as
>     (type byte, count, data) - data is "count" occurrences of data of
>     the given type.

What you have is a generalized deserialization problem. It can be solved
with a set of deserializers, each of this shape:

    def deserialize(file):
        """Read the beginning of file and return the corresponding object."""

In the above case, you have a mapping "type byte --> deserializer",
called "TYPE", and (obviously) "(" is one such "type byte".

The deserializer corresponding to "(" is:

    def sequence_deserialize(file):
        # file is opened in binary mode, so read() returns bytes
        type_byte = file.read(1)
        if not type_byte:
            raise EOFError()
        item_type = TYPE[type_byte]
        # INT is the type byte of the integer deserializer
        count = TYPE[INT].deserialize(file)
        seq = [item_type.deserialize(file) for _ in range(count)]
        assert file.read(1) == b")"
        return seq

The top level "deserialize" could look like:

    def top_deserialize(file):
        """Generate all values found in *file*."""
        while True:
            type_byte = file.read(1)
            if not type_byte:
                return
            yield TYPE[type_byte].deserialize(file)
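
To make the idea concrete, here is a minimal runnable sketch of the same dispatch-table approach. The wire format is an assumption made up for illustration, not Paul's actual format: the type bytes b"B" and b"I", the 4-byte little-endian integer encoding, and the helper names read_exact and parse are all hypothetical. The closing b")" follows the sequence deserializer described above. Plain functions stand in for objects with a "deserialize" method.

```python
import io
import struct

# Hypothetical wire format, chosen only for this sketch:
#   b"B"  one unsigned byte
#   b"I"  4-byte little-endian unsigned integer
#   b"("  element type byte, count (encoded like b"I"),
#         "count" elements of that type, then a closing b")"


def read_exact(file, size):
    """Read exactly *size* bytes or raise EOFError on a short read."""
    data = file.read(size)
    if len(data) != size:
        raise EOFError(f"expected {size} bytes, got {len(data)}")
    return data


def deserialize_byte(file):
    return read_exact(file, 1)[0]


def deserialize_int(file):
    return struct.unpack("<I", read_exact(file, 4))[0]


def deserialize_sequence(file):
    # The element deserializer may itself be deserialize_sequence,
    # which is what makes nested (recursive) structures work.
    element = TYPE[read_exact(file, 1)]
    count = deserialize_int(file)
    seq = [element(file) for _ in range(count)]
    if read_exact(file, 1) != b")":
        raise ValueError("sequence is missing its closing b')'")
    return seq


# The "type byte --> deserializer" mapping.
TYPE = {
    b"B": deserialize_byte,
    b"I": deserialize_int,
    b"(": deserialize_sequence,
}


def top_deserialize(file):
    """Generate all values found in *file*."""
    while True:
        type_byte = file.read(1)
        if not type_byte:
            return
        yield TYPE[type_byte](file)


def parse(data):
    """Convenience wrapper: deserialize every value in a bytes object."""
    return list(top_deserialize(io.BytesIO(data)))
```

With this made-up format, parse(b"B\x07" + b"(B" + struct.pack("<I", 3) + b"\x01\x02\x03" + b")") yields a byte value followed by a list of three byte values. An unknown type byte surfaces as a KeyError from the TYPE lookup; a real parser would probably want a friendlier error there.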