> From: owner-openssl-us...@openssl.org On Behalf Of ml > Sent: Tuesday, 01 May, 2012 15:21
Aside: this question isn't really related to OpenSSL. > i work on small projet > https://github.com/fakessh/openprojectssl/blob/master/smtp.c > https://github.com/fakessh/openprojectssl/blob/master/smtp.h > > > i realise encode base 64 like that > > //realizing base64 > void base64(char *dbuf, char *buf128, int len) > { > struct data6 *ddd = NULL; > int i = 0; > char buf[256] = {0}; > char *tmp = NULL; > char cc = '\0'; > memset(buf, 0, 256); > strcpy(buf, buf128); This is misleading, unsafe, and wasteful. You name the input parameter buf128 but actually support up to 255 chars, only nonnulls even though much data that people want to b64 contains nulls, and without checking for overflow which if this data can be controlled (sometimes even partly) by an adversary causes security breaches. Even for 'good' data, a declaration T ary[N] = {0} initializes _all_ elements of ary to 0, the memset does so _again_, and _neither_ should be needed because you should not be accessing beyond the end of the input data (although because of the erroneous strcpy you may). Even if these are corrected, using a fixed-size temporary like this is a limitation on callers unnecessarily because you don't need a full copy. In particular if you're doing MIME (over SMTP) the data you need to b64 will often be quite large. If you're only doing SMTP-AUTH, or obfuscated msgids, or things like that, it will be smaller. > for(i = 1; i <= len/3; i++) > { > tmp = buf+(i-1)*3; Most C programs/programmers use zero-origin subscripting. One-origin works if you do it right, which you did, but it doesn't make you popular. > cc = tmp[2]; > tmp[2] = tmp[0]; > tmp[0] = cc; > ddd = (struct data6 *)tmp; > dbuf[(i-1)*4+0] = con628((unsigned int)ddd->d1); > dbuf[(i-1)*4+1] = con628((unsigned int)ddd->d2); > dbuf[(i-1)*4+2] = con628((unsigned int)ddd->d3); > dbuf[(i-1)*4+3] = con628((unsigned int)ddd->d4); > } You don't show the declaration of data6, but I assume it's 4 bitfields of size 6. The layout of bitfields in a struct is implementation dependent, and so is the alignment of any struct type (indeed any non-character type). This is likely to crash or give completely wrong results on most systems other than x86, and maybe other compilers or even options. You can avoid the alignment problem, and the fixed-size temp, by copying one triplet at a time to a single aligned temp, swapping if necessary in the process although the layout is still implementation dependent and unportable; that's why good implementations of b64 use bitshift and mask instead. con628 is defined below with a prototyped char parameter. If there is also a prototype declaration before the call the casts are useless because the prototype forces conversion; if such a declaration is not visible the casts are wrong because one major purpose of prototypes is that the ABI can be specialized and providing an unsigned int argument to a char parameter may not work. But for many common ABIs including usual x86 it does work 'accidentally' so you'll actually hit this bug only if your software gets wide use. > if(len%3 == 1) <snip> > char con628(char c6) > { > char rtn = '\0'; > if (c6 < 26) rtn = c6 + 65; > else if (c6 < 52) rtn = c6 + 71; > else if (c6 < 62) rtn = c6 - 4; > else if (c6 == 62) rtn = 43; > else rtn = 47; > return rtn; > } > On an ASCII-compatible system (which nearly all are today) it's much clearer to write 'A' ('a'-26) ('0'-52) '+' '/' . On the few remaining EBCDIC systems, you need to determine whether the b64 output should be net-ASCII (e.g. transmitted as-is without translation) or if it will be passed through or (especially) stored for other code that expects EBCDIC and translates when needed, in which case you can't handle A-Z or a-z as a single range. There it's usually easier to use an array, which also works in ASCII: return "ABC...Zabc...z01..9+/" [c6]; On modern machines with good caching but penalty for misprediction this may be faster as well, although you generally shouldn't design your code for that because there is so much variation across machines and time. > > one might think that incidentally changed all "char" by "unsigned > char ", it would avoid the casts for arithmetic (yes, it takes a blow > casts for "") but it is less boring than the errors > undetectable from negative char! > The input (your buf128) semantically should be a pointer to (sequence of) unsigned char, since b64 is defined on that. But, as I think you mean, passing C strings (defined as pointer to plain char) then requires a cast, which is a bit ugly. If you access that parameter by memcpy (as now) memcpy actually accesses its arguments as unsigned chars (ignoring the declared type in the caller, since it's passed through qual-void*). If you access those bytes directly you should use u-char*, but you can do that with a local pointer set from the parameter. (Accessing char* then casting to u-char, as some people do for cases like this, isn't quite guaranteed by the standard although in practice it works everywhere as far as I know.) For this approach I would make the parameter void* so it doesn't actually mislead the caller. Even better, const void* so the caller can pass a buffer knowing you won't modify it. > another way to make this challenge would be to consider a > code like this > but I do not know how > I have long bothered to look at my code > > and i think this code is a cool for use > > I ask your humble opinion with the advice and example that go > Good luck. ______________________________________________________________________ OpenSSL Project http://www.openssl.org User Support Mailing List openssl-users@openssl.org Automated List Manager majord...@openssl.org