On Sat, Jan 2, 2021 at 02:25:33PM +0900, Michael Paquier wrote: > > Let me get my patch building on the cfbot and then I will address each > > of these. I am trying to do one stage at a time since I am still > > learning the process. Thanks. > > No problem. On my end, this stuff has been itching me for a couple of > days and I could not recall why.. Until I remembered that the design > of the hex APIs in your patch is weak with overflow handling because > we don't pass down to the function the size of the destination buffer. > We have finished with a similar set of issues on the past with SCRAM > and base64, with has led to CVE-2019-10164 and the improvements done > in cfc40d3. So I think that we need to improve things in a safer way. > Mapping with the design for base64, I have finished with the attached > patch, and the following set: > +extern int64 pg_hex_decode(const char *src, int64 len, char *dst, int64 > dstlen); > +extern int64 pg_hex_encode(const char *src, int64 len, char *dst, int64 > dstlen); > +extern int64 pg_hex_enc_len(int64 srclen); > +extern int64 pg_hex_dec_len(int64 srclen);
I can see the value of passing the destination length to the hex functions, and I think you have to pass the src length to pg_hex_encode since the input can be binary. I assume the pg_hex_decode doesn't need the source length because it is a null-terminated C string, right? However, pg_base64_decode(const char *src, size_t len, char *dst) does pass in the src length, so I can see we should just make it the same. I also agree it is unclear if 'len' is the src or dst len, so your patch fixes that with the names. Also, is there a reason you use int64 instead of the size_t used by bytea? > This ensures that the result never overflows, which explains the > introduction of an error code for the encoding part, and does not > use elog()/pg_log() so as external libraries can use them. ECPG uses > long variables in a couple of places, explaining why it feels safer to > use int64. int should give enough room to any caller of those APIs, > but there is no drawback in using a 64-bit API either, and I don't > think it is worth the risk to break ECPG either for long handling, > even if I don't believe either that folks are going to work on strings > larger than 2GB. Might as well just have hex match what bytea and esc does. > Things get trickier for the bytea input/output because we want more > generic error messages depending for invalid values or an incorrect > number of digits, which is why I have left the copies in encode.c. > This design could be easily extended with more types of error codes, > though I am not sure if that's worth bothering. I don't think removing the two error type reporting from src/common, and then having to add it back into adt/encode.c makes sense. There are two, soon three, that want those two error reports, so I am thinking we are best to just leave ecpg alone since it is just a library that doesn't want more than one error reporting. I don't think we will have many more library call cases. > Even with that, this leads to much more sanity for hex buffer > manipulation in backup manifests (I don't think that using > PG_SHAXXX_DIGEST_STRING_LENGTH is a good idea either, I'd like to get > rid of it in the long-term) and ECPG, so that's clearly a gain. > > I don't have a Windows host at hand, though I think that it should > work there correctly. What do you think about the ideas in the > attached patch? I think your patch has a few mistakes. First, I think you removed the hex_decode files from the file system, rather than removing it via git, so the diff didn't apply in the cfbot. Second, I don't think ecpg ever used libpgcommon before. For non-Windows, libpgcommon got referenced anyway in the build, so non-Windows compiles worked, but in Windows, that was not referenced, meaning the cfbot failed on ecpglib. I have modified the attached patch with both of these fixed --- let's see how it likes this version. -- Bruce Momjian <br...@momjian.us> https://momjian.us EnterpriseDB https://enterprisedb.com The usefulness of a cup is in its emptiness, Bruce Lee
hex2.diff.gz
Description: application/gzip