"movead...@highgo.ca" <movead...@highgo.ca> writes: > [ long_bytea_string_bug_fix_ver5.patch ]
I don't think this has really solved the overflow hazards. For example, in binary_encode we've got resultlen = enc->encode_len(VARDATA_ANY(data), datalen); result = palloc(VARHDRSZ + resultlen); and all you've done about that is changed resultlen from int to int64. On a 64-bit machine, sure, palloc will be able to detect if the result exceeds what can be allocated --- but on a 32-bit machine it'd be possible for the size_t argument to overflow altogether. (Or if you want to argue that it can't overflow because no encoder expands the data more than 4X, then we don't need to be making this change at all.) I don't think there's really any way to do that safely without an explicit check before we call palloc. I also still find the proposed signatures for the encoding-specific functions to be just plain weird: - unsigned (*encode_len) (const char *data, unsigned dlen); - unsigned (*decode_len) (const char *data, unsigned dlen); - unsigned (*encode) (const char *data, unsigned dlen, char *res); - unsigned (*decode) (const char *data, unsigned dlen, char *res); + int64 (*encode_len) (const char *data, unsigned dlen); + int64 (*decode_len) (const char *data, unsigned dlen); + int64 (*encode) (const char *data, unsigned dlen, char *res); + int64 (*decode) (const char *data, unsigned dlen, char *res); Why did you change the outputs from unsigned to signed? Why didn't you change the dlen inputs? I grant that we know that the input can't exceed 1GB in Postgres' usage, but these signatures are just randomly inconsistent, and you didn't even add a comment explaining why. Personally I think I'd make them like uint64 (*encode_len) (const char *data, size_t dlen); which makes it explicit that the dlen argument describes the length of a chunk of allocated memory, while the result might exceed that. Lastly, there is a component of this that can be back-patched and a component that can't --- we do not change system catalog contents in released branches. Cramming both parts into the same patch is forcing the committer to pull them apart, which is kind of unfriendly. regards, tom lane