RE: encode base 64 in simply smtp client

Dave Thompson Thu, 03 May 2012 14:52:29 -0700

> From: owner-openssl-us...@openssl.org On Behalf Of ml
> Sent: Tuesday, 01 May, 2012 15:21


Aside: this question isn't really related to OpenSSL.

> i work on small projet
> https://github.com/fakessh/openprojectssl/blob/master/smtp.c
> https://github.com/fakessh/openprojectssl/blob/master/smtp.h
> 
> 
> i realise encode base 64 like that
> 
> //realizing base64
> void base64(char *dbuf, char *buf128, int len)
> {
>       struct data6 *ddd = NULL;
>       int i = 0;
>       char buf[256] = {0};
>       char *tmp = NULL;
>       char cc = '\0';
>       memset(buf, 0, 256);
>       strcpy(buf, buf128);

This is misleading, unsafe, and wasteful. You name 
the input parameter buf128 but actually support up to 
255 chars, only nonnulls even though much data that 
people want to b64 contains nulls, and without 
checking for overflow which if this data can be 
controlled (sometimes even partly) by an adversary 
causes security breaches. Even for 'good' data, 
a declaration T ary[N] = {0} initializes _all_ 
elements of ary to 0, the memset does so _again_, 
and _neither_ should be needed because you should 
not be accessing beyond the end of the input data 
(although because of the erroneous strcpy you may).

Even if these are corrected, using a fixed-size 
temporary like this is a limitation on callers 
unnecessarily because you don't need a full copy.
In particular if you're doing MIME (over SMTP) 
the data you need to b64 will often be quite large.
If you're only doing SMTP-AUTH, or obfuscated msgids, 
or things like that, it will be smaller. 

>       for(i = 1; i <= len/3; i++)
>       {
>               tmp = buf+(i-1)*3;

Most C programs/programmers use zero-origin subscripting.
One-origin works if you do it right, which you did, but 
it doesn't make you popular.

>               cc = tmp[2];
>               tmp[2] = tmp[0];
>               tmp[0] = cc;
>               ddd = (struct data6 *)tmp;
>               dbuf[(i-1)*4+0] = con628((unsigned int)ddd->d1);
>               dbuf[(i-1)*4+1] = con628((unsigned int)ddd->d2);
>               dbuf[(i-1)*4+2] = con628((unsigned int)ddd->d3);
>               dbuf[(i-1)*4+3] = con628((unsigned int)ddd->d4);
>       }

You don't show the declaration of data6, but I assume it's 
4 bitfields of size 6. The layout of bitfields in a struct 
is implementation dependent, and so is the alignment of any 
struct type (indeed any non-character type). This is likely 
to crash or give completely wrong results on most systems 
other than x86, and maybe other compilers or even options.
You can avoid the alignment problem, and the fixed-size temp, 
by copying one triplet at a time to a single aligned temp,
swapping if necessary in the process although the layout is 
still implementation dependent and unportable; that's why 
good implementations of b64 use bitshift and mask instead.

con628 is defined below with a prototyped char parameter.
If there is also a prototype declaration before the call 
the casts are useless because the prototype forces conversion; 
if such a declaration is not visible the casts are wrong 
because one major purpose of prototypes is that the ABI 
can be specialized and providing an unsigned int argument 
to a char parameter may not work. But for many common ABIs 
including usual x86 it does work 'accidentally' so you'll 
actually hit this bug only if your software gets wide use.

>       if(len%3 == 1)
<snip>

> char con628(char c6)
> {
>       char rtn = '\0';
>       if (c6 < 26) rtn = c6 + 65;
>       else if (c6 < 52) rtn = c6 + 71;
>       else if (c6 < 62) rtn = c6 - 4;
>       else if (c6 == 62) rtn = 43;
>       else rtn = 47;
>       return rtn;
> }
> 
On an ASCII-compatible system (which nearly all are today) 
it's much clearer to write 'A' ('a'-26) ('0'-52) '+' '/' .
On the few remaining EBCDIC systems, you need to determine 
whether the b64 output should be net-ASCII (e.g. transmitted 
as-is without translation) or if it will be passed through 
or (especially) stored for other code that expects EBCDIC 
and translates when needed, in which case you can't handle 
A-Z or a-z as a single range. There it's usually easier to 
use an array, which also works in ASCII:
  return "ABC...Zabc...z01..9+/" [c6];
On modern machines with good caching but penalty for 
misprediction this may be faster as well, although you 
generally shouldn't design your code for that because 
there is so much variation across machines and time.

> 
> one might think that incidentally changed all "char" by "unsigned
> char ", it would avoid the casts for arithmetic (yes, it takes a blow
> casts for "") but it is less boring than the errors
> undetectable from negative char!
> 
The input (your buf128) semantically should be a pointer to 
(sequence of) unsigned char, since b64 is defined on that.
But, as I think you mean, passing C strings (defined as pointer 
to plain char) then requires a cast, which is a bit ugly.

If you access that parameter by memcpy (as now) memcpy actually 
accesses its arguments as unsigned chars (ignoring the declared 
type in the caller, since it's passed through qual-void*).
If you access those bytes directly you should use u-char*, but 
you can do that with a local pointer set from the parameter.
(Accessing char* then casting to u-char, as some people do 
for cases like this, isn't quite guaranteed by the standard 
although in practice it works everywhere as far as I know.)
For this approach I would make the parameter void* so it doesn't 
actually mislead the caller. Even better, const void* so the 
caller can pass a buffer knowing you won't modify it.

> another way to make this challenge would be to consider a 
> code like this
> but I do not know how
> I have long bothered to look at my code
> 
> and i think this code is a cool for use
> 
> I ask your humble opinion with the advice and example that go
> 
Good luck.


______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
User Support Mailing List                    openssl-users@openssl.org
Automated List Manager                           majord...@openssl.org

RE: encode base 64 in simply smtp client

Reply via email to