Morning all,

Since PHP 7.1 the unpack() function has a (still undocumented) optional third argument that allows the caller to specify the offset in the input data where parsing should start. While this is a useful feature, it is currently impossible to know how many bytes of the input were consumed for some format specifiers, such as Z*, f, d and anything else that does not consume a universally constant amount of data.
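For anyone who has not used the offset argument yet, here is a minimal sketch of it. The buffer layout and the element names (first, second) are made up for illustration; only fixed-width codes are used, so the number of bytes consumed is known in advance.

```php
<?php
// Hypothetical buffer: a 2-byte header we want to skip, followed by two
// little-endian unsigned 16-bit integers.
$data = "\xFF\xFF" . pack('v2', 513, 1027);

// Third argument: start parsing at byte 2 instead of byte 0.
$fields = unpack('vfirst/vsecond', $data, 2);

// first = 513, second = 1027
var_dump($fields['first'], $fields['second']);
```

With codes like v the caller can simply add 4 to the offset afterwards; the difficulty described here only appears once the width depends on the data or on the platform.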
It is typically possible to determine this externally, but not without some clumsy measurements, either of the returned value or (in the case of system-dependent numeric types) of the length of the string returned by pack() for those specifiers. It can also get complicated when using codes like x and X, which adjust the offset without producing data in the returned value. Additionally, computing the new position in the input buffer separately from the format string risks the two diverging if one is modified and the other is either not updated or updated incorrectly.

Many binary data formats are sufficiently complex that unpacking a large structure requires multiple calls to unpack(), as there are often nuances that cannot be directly expressed with the current specifier format, such as strings prefixed with a length indicator.

Here is some code that demonstrates the problem:

    /* This is the only way to know for certain how big float is on
       the local system */
    define('FLOAT_WIDTH', strlen(pack('f', 0.0)));

    /* an exaggerated example using two variable width codes and a
       code that does not produce output but modifies the input buffer
       offset; the elements are named because unnamed elements would
       both be stored under index 1 */
    $pieces = unpack('fnum/X/Z*str', $data, $offset);

    /* we now have to modify the offset before we can continue to
       unpack data */
    $offset += FLOAT_WIDTH                    // f
             - 1                              // X
             + strlen($pieces['str']) + 1;    // Z* (string plus its NUL terminator)

I would like to look at adding a fourth optional argument, taken by-ref, which will be populated with the number of buffer bytes consumed by the unpack() operation. This would enable the above code to be rewritten like so:

    $pieces = unpack('fnum/X/Z*str', $data, $offset, $consumed);
    $offset += $consumed;

Not only is this code much simpler and less susceptible to breakage, it is (IMHO) clearer to read as well.

Does anyone have any objections to/thoughts about this? If not I will work up a patch in the coming week.

Thanks,
Chris
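P.S. To make the length-prefixed string case concrete, here is a rough sketch of how it has to be handled today. The layout (one length byte, the string bytes, then a 16-bit integer) and the names len, str and port are assumptions chosen purely for illustration.

```php
<?php
// Hypothetical layout: one length byte (C), that many bytes of string
// data, then a little-endian unsigned 16-bit integer (v).
$data = pack('C', 5) . 'hello' . pack('v', 2017);
$offset = 0;

// Call 1: read the length prefix; C is always one byte wide.
$len = unpack('Clen', $data, $offset)['len'];
$offset += 1;

// Call 2: the width of the string is only known at run time, so the
// format string has to be built from the value read above.
$str = unpack("a{$len}str", $data, $offset)['str'];
$offset += $len;

// Call 3: continue with the fixed-width field that follows.
$port = unpack('vport', $data, $offset)['port'];
$offset += 2;
```

Each call is followed by a hand-maintained offset adjustment that has to be kept in step with its format string, which is exactly the bookkeeping the proposed by-ref argument would take over.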