Re: Observations from an old timer playing with 64 bit numbers...

Bruce Evans Fri, 25 Jun 2010 10:30:33 -0700

On Wed, 23 Jun 2010, Randall Stewart wrote:

Bruce:


Comments (and questions in-line)...  (you too Luigi)


See Luigi's reply for most details..

On Jun 23, 2010, at 6:33 AM, Bruce Evans wrote:

On Wed, 23 Jun 2010, Luigi Rizzo wrote:

strong objection!
We should instead use names with exact sizes (16,32,64).

So please tell me why you object so strongly? We have the 16/32/64 bit nameswhich

are nice but are not expected so folks seem to not use them. I have
received a few private comments to the effect that "Gee, we would like
that, we rolled our own versions of ntohll() and if you did it we could
remove our version". This tells me that although a nicer name, folks
are rolling their own due to the name. Having a #define that maps
ntohll -> be64toh will make it so that people can actually find
the function.


Not really.  Such a define would be an obfuscation, and might make it
fairly hard to find the actual function.  See the corresponding #define
for ntohl().  It is far from what is documented and it takes searching
through 3-4 layers of include files and/or nested macros (with at least
the lowest layer differing across arches) to see what it is: on i386:
- man page:
    no mention of whether it is a function or a macro or both,
    except for the non-i386 case where it is "null" where it is
    misdocumented as being null (but it is never null).  I think it
    is actually both a (extern) function and a macro.  POSIX may require
    both, and doesn't prohibit the function, so we implement the
    function.
- sys/param.h:
    htonl() may be defined at a lower level.  It isn't on i386.
    When it isn't defined at a lower level, it is defined here as
    __htonl(x).  Before this, if it isn't prototyped at a lower level,
    it is prototyped here, so that #undefing it works right.
- i386/include/endian.h:
    - __htonl(x) is defined as __bswap32(x)
    - __bswap32(x) is a static inline function taking a uint32_t arg and
      returns a uint32_t after calling __byte_swap_int(x).  The typed
      arg causes a converion of the 'x' arg in ntohl(x) in cases where
      it is not already uint32_t.  I think this conversion is required
      by POSIX.  On big-endian arches where ntohl(x) is "null", this
      conversion should be done too, so the macro cannot be null (really
      the null filter).  It is done at the level of endian.h on at
      least sparc64.
    - __byte_swap_int(x) is a macro that expands in various ways:
      - if __OPTIMIZE__ is defined (i.e., for compiling with -O1 or higher),
        then it expands to the macro __byte_swap_int_var(x) if x is a variable
        and to the macro __byte_swap_int_const(x) if x is a constant.
        Otherwise, it always expands to the macro __byte_swap_int_const(x).
    - __byte_swap_int_var(x) expands to a Gnu C statement-expression with
      inline asm
    - __byte_swap_int_const(x) expands to a C expression with shifts and
      masks.  It is expected that this expression is evaluated at compile
      time iff it is used.

Some of this layering is bogus, but fixing it will gives slightly more
complicated layering to reduce duplication:
- __byte_swap_int_const(x) is MI so it should be defined in a MI file
  and not duplicated ad nauseum like it is now.  Especially since it is
  only to support a tiny optimization.
- __byte_swap_int_const(x) works for the non-constant case too, so its
  name is bogus.  Some arches use it for the non-constant case, and for
  swapping uint64_t's on i386 there is nothing better than a similar
  expression.  This expression should work on i386, but none of the versions
  versions of gcc used in FreeBSD optimize it well (to bswap).  Someday
  when they do, ntohl() can be simply this version in all cases.
- the ifdef's involving __OPTIMIZE__ can probably be improved.  However,
  just having them gets in the way of letting gcc do the optimization.
  You don't want even more convoluted __OPTIMIZE__ ifdefs to "fix" thus.

Name changes are a good thing for clarity but if no
one will use your changed name and roll their own.. thats a bad thing.


The hand rolled version is as unlikely to be as sophisticated as ntohl() :-).

The be/le family is intentionally unsophisticated sinc its speed is
believed to be unimportant.  Probably the speed of the ntohl() family
is similarly unimportant.

By having such a define you might encourage folks (over time) to transition
off to the more clear naming versions..


Nah, that will encourage them to keep using the macro that hides the newer
versions.

Yes, long long should not exist, and there are few places where exact
sizes should be used, but networking code is one place where they
should be used.

ntohs() will break semantically or actually on machines with shorts with
more than 16 bits.


So two questions here:

a) What machines that we do support currently have a short larger than16bits?


None.  Even if there were, then this would be moot because ntohs() was fixed
some time ago to not take a signed short arg.  It takes an unsigned 16-bit
arg.  (Signed 16-bit shorts would also break on 1's complement and maybe
on signed-magnitude machines: 0xFFFF in bits has value -0 on 1's. complement,
so ntohs(0xFFFF) might become ntohs(-0) = ntohs(0) = 0).

b) Does anyone that use networking code really expect ntohs() to do anything
  different than a 16 bit byte swap?


No.  It cannot, since it is required to work on 16-bit unsigned values.

  The man page is pretty clear on
  it i.e. uint16_t is the input and its a "network short" which
  if I remember right is defined to be 16 bits... At least most RFC's
  are pretty clear and the same is true for UNP Vol 1 ;-)


Since ntohs() is now (and maybe always has been) specified specifically
enough.  But this specification makes the function name bogus -- it never
takes a short arg.

A similar case that is still broken is fls().  POSIX only standardized
historical malpractice for it, so it is still specified to take an int
arg, without even any details of what happens when the value is negative.

ntohl() is already broken semantically on machines with longs with
more than 32 bits (all 64-bit arches in FreeBSD).  It doesn't actually
handle longs on such arches.  Networking code handles a few cases related
to this with n_long.  This is not quite as bad, but its name is nonsense --
n_long is very far from being a long -- it is unsigned 32 bits exactly,
unlike long which is signed >= 32 bits.


again same two questions only change them to say 32 bits?


Like ntohs(), it is specified correcly but doesn't match its name.
Renaming it to ntoh32() would break historical practice.

ntohll() would break similarly on machines with long longs with more
or less than 64 bits.  In practice this and other misuses of long long
may prevent the size of long long being changed, like it prevented the
size of long being changed on some systems with historical abuses of
long.

Again same first question... the second one I would ask too except thefunction

does NOT exist (except where folks go out and roll it themselves).


No need to break it from the start.

Basically I don't agree with your assessment since these functions aredesignedto operate on network code which I think defines a short = 16, a long = 32(at least
all RFC's I have read do that anyway).


shorts and longs are defined by the C standard and the C compiler.  Network
code cannot change this.

And it appears from feedback I havereceivedthat ntohll would be defined as a 64bit network quantity in most folks mindssince
that what a lot of folks have implemented...


Wouldn't they expect it to have a long long type if it were named ntohll()?

Another error (mentioned in a later reply) would be expecting to print a
64 bit network type using %lld.  That is as unportable as expecting to
print a 32-bit type using %ld.  On of the main type errors in practice
for 64-bit arches under FreeBSD was printing the result of ntohl() using
%ld.  When there is a standard for a 64-bit network type, it will
necessarily specify the type to be uint64_t (or perhaps a typedefed
alias of this, like in_addr_t for uint32_t).  Printing of this type
will have the usual problems for a typedefed type.

Bruce
_______________________________________________
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"

Re: Observations from an old timer playing with 64 bit numbers...

Reply via email to