Hello internals! Thanks for PHP!

I'm writing to gauge interest in adding two new functions to the PHP `hash` extension, `hash_serialize` and `hash_unserialize`. These functions would serialize and unserialize the internal state of a HashContext object, allowing a partially computed hash to be saved, then restored and completed in a later run.
EXAMPLE: Multi-part upload. Say that a very large file is uploaded in pieces, `big.001` through `big.999`, and it is necessary to compute the SHA-256 of the final concatenated file. In current PHP, the hash must be computed in one go:

    $ctx = hash_init("sha256");
    for ($i = 1; $i <= 999; ++$i) {
        hash_update_file($ctx, sprintf("big.%03d", $i));
    }
    $hash = hash_final($ctx);

This in turn requires that all pieces be on the filesystem simultaneously.

With `hash_serialize` and `hash_unserialize`, the hash can be computed gradually, allowing pieces to be deleted as they are uploaded elsewhere:

    $ctx = hash_init("sha256");
    hash_update_file($ctx, "big.001");
    SAVE_TO_DATABASE(hash_serialize($ctx));

    ...

    $ctx = hash_unserialize(LOAD_FROM_DATABASE());
    hash_update_file($ctx, "big.002");
    SAVE_TO_DATABASE(hash_serialize($ctx));

    ... etc.

***

I am happy to write up an RFC for these functions. An initial implementation with tests is visible here:

https://github.com/kohler/php-src/commit/5a3a828f90b88cd7f660babec7db531cfc04b0a1

New functions `hash_serialize` and `hash_unserialize` appear to fit the existing API well and simplify the implementation, but it's possible that `__serialize/__unserialize` or the internal `serialize/unserialize` functions would be preferred.

I'd be grateful for any feedback.

Thanks!
Eddie Kohler