include

Bruce Evans Tue, 28 Feb 2012 15:27:00 -0800

On Tue, 28 Feb 2012, Tijl Coosemans wrote:

Log:
 Copy amd64 _types.h to x86 and merge with i386 _types.h. Replace existing
 amd64/i386/pc98 _types.h with stubs.


I don't like this much.  It gives 2 layers of convolutions for readers
(humans and compilers) to slowly untangle.  There is another layer of
include files for compatibility (but both layers are still used by
default), and lots of ifdefs.  The whole point of 1 file per arch was
to avoid such ifdefs.  This might be OK if arches were actually identical
for the APIs in these files, but for types there are lots of differences
between 32-bit and 64-bit machines.
  The differences can be reduced by spelling 32/64-bit types as
  [unsigned] long and by always using the basic type instead of a
  derived type.  Old code (e.g., FreeBSD-3) did this, but now almost
  everything is declared using the derived fixed-width types
  int32_t/int64_t etc., so there is always a spelling difference for
  32/64-bit types.  The only exceptions are floating point types, and
  the broken __clock_t type.
    clock_t is unsigned long on i386, but is int32_t on amd64.  This
    is backwards at best.  I think this brokenness came from NetBSD.
    _BSD_CLOCK_T_ was correct (unsigned long) for all arches in 4.4BSD,
    and i386 just didn't break this.  But now __clock_t is broken for
    all arches except i386: it is __uint32_t on arm and powerpc, which
    is just a different spelling for the 32-bit case and is at least
    ABI-compatible for the 64-bit case, but for all other arches
    including all 64-bit ones, it broken to __int32_t.  Perhaps the
    difference is explained by "long" being bad for ABI compatibility.
    Old code like 4.4BSD-Lite1 used long excessively (since technically,
    int might be only 16 bits).  Even pid_t was long in 4.4BSD-Lite1.
    NetBSD changed many of these longs to ints or int32_t's for ABI
    compatibility and/or because 64 bit longs are just too wide, and
    FreeBSD eventually picked up these changes (mostly via 4.4BSD-Lite2
    for general typedefs and directly from NetBSD for 64-bit arches).
    So pid_t is now int32_t and clock_t is mostly broken.  clock_t
    really needs all 64 bits if they are readily available, but has
    been reduced to 31, especially when 64 are readily available.
    OTOH, if you just want ABI and API compatibility for clock_t,
    then it should have been changed to uint32_t for all arches and
    not defined in any MD types file.  There is now a minor API
    compatibility for printing clock_t's -- %lu format must be used
    on i386, %u on others, and %d on most.  Except for the gratuitous
    loss of unsignedness, this is just a special case of printing
    a typedefed types.  clock_t may also be a floating point type,
    so the only good way to print it is to convert it to [long] double
    and then worry about the correct floating point format (how much
    precision should it have?...).

Copied and modified: head/sys/x86/include/_types.h (from r232259, 
head/sys/amd64/include/_types.h)
==============================================================================
--- head/sys/amd64/include/_types.h     Tue Feb 28 15:52:01 2012        
(r232259, copy source)
+++ head/sys/x86/include/_types.h       Tue Feb 28 18:15:28 2012        
(r232261)
@@ -54,19 +54,41 @@ typedef     short                   __int16_t;
typedef unsigned short          __uint16_t;
typedef int                     __int32_t;
typedef unsigned int            __uint32_t;
+#ifdef _LP64
typedef long                    __int64_t;
typedef unsigned long           __uint64_t;


This is about the only ifdef that is really needed.

+#else
+#ifndef lint
+__extension__


An old bug -- work around broken lints.  Although messy, this is not
messy enough to be correct -- __extension__ is a hard-coded gccism.
Elsewhere, in much less important code than this, there are messy
ifdefs to avoid hard-coded gccisms.

+#endif
+/* LONGLONG */


long long has only been standard for 13 years now, so broken lints
still need this messy markup.

+typedef        long long               __int64_t;
+#ifndef lint
+__extension__
+#endif
+/* LONGLONG */
+typedef        unsigned long long      __uint64_t;
+#endif


I ifdefed all this correctly 15+ years ago so that it compiled (but
didn't run if *int64_t was used) for a non-gcc K&R compiler.  The
long long abomination was not used, and __attribute__(()) was used
to declare *int64_t, but only under a gcc ifdef.  This wasn't broken
until 8 Jan 2011 by, erm, us.  We also added the __extensions__.
The justification was that long long is now standard.  But there
are still the old messes for lint, and new breakage for non-gcc to
support C90.


/*
 * Standard type definitions.
 */
+#ifdef _LP64
typedef __int32_t       __clock_t;              /* clock()... */
typedef __int64_t       __critical_t;
typedef double          __double_t;
typedef float           __float_t;
typedef __int64_t       __intfptr_t;
-typedef        __int64_t       __intmax_t;
typedef __int64_t       __intptr_t;
+#else
+typedef        unsigned long   __clock_t;
+typedef        __int32_t       __critical_t;
+typedef        long double     __double_t;
+typedef        long double     __float_t;
+typedef        __int32_t       __intfptr_t;
+typedef        __int32_t       __intptr_t;
+#endif


[unsigned] long would work without ifdefs for everything except to
preserve the broken __clock_t, and the FP types.  Except for i386's
with correctly-sized longs (64 bits).  We may have discussed these
too.

@@ -75,6 +97,7 @@ typedef       __int8_t        __int_least8_t;
typedef __int16_t       __int_least16_t;
typedef __int32_t       __int_least32_t;
typedef __int64_t       __int_least64_t;
+#ifdef _LP64
typedef __int64_t       __ptrdiff_t;            /* ptr1 - ptr2 */
typedef __int64_t       __register_t;
typedef __int64_t       __segsz_t;              /* segment size (in pages) */
@@ -82,8 +105,18 @@ typedef     __uint64_t      __size_t;               /* 
sizeof(
typedef __int64_t       __ssize_t;              /* byte count or error */
typedef __int64_t       __time_t;               /* time()... */
typedef __uint64_t      __uintfptr_t;
-typedef        __uint64_t      __uintmax_t;
typedef __uint64_t      __uintptr_t;
+#else
+typedef        __int32_t       __ptrdiff_t;
+typedef        __int32_t       __register_t;
+typedef        __int32_t       __segsz_t;
+typedef        __uint32_t      __size_t;
+typedef        __int32_t       __ssize_t;
+typedef        __int32_t       __time_t;
+typedef        __uint32_t      __uintfptr_t;
+typedef        __uint32_t      __uintptr_t;
+#endif


[unsigned] long would work without ifdefs for all of these, since all
these expanded naturally to the register width.  Perhaps a better way,
which also works for i386's with correctly-sized longs, is to define
almost everything in terms of registers -- as __[u]register_t.

+typedef        __uint64_t      __uintmax_t;
typedef __uint32_t      __uint_fast8_t;
typedef __uint32_t      __uint_fast16_t;
typedef __uint32_t      __uint_fast32_t;
@@ -92,12 +125,23 @@ typedef    __uint8_t       __uint_least8_t;
typedef __uint16_t      __uint_least16_t;
typedef __uint32_t      __uint_least32_t;
typedef __uint64_t      __uint_least64_t;
+#ifdef _LP64
typedef __uint64_t      __u_register_t;
typedef __uint64_t      __vm_offset_t;
-typedef        __int64_t       __vm_ooffset_t;
typedef __uint64_t      __vm_paddr_t;
-typedef        __uint64_t      __vm_pindex_t;
typedef __uint64_t      __vm_size_t;
+#else
+typedef        __uint32_t      __u_register_t;
+typedef        __uint32_t      __vm_offset_t;
+#ifdef PAE
+typedef        __uint64_t      __vm_paddr_t;
+#else
+typedef        __uint32_t      __vm_paddr_t;
+#endif
+typedef        __uint32_t      __vm_size_t;
+#endif
+typedef        __int64_t       __vm_ooffset_t;
+typedef        __uint64_t      __vm_pindex_t;


Similarly.  The patch, and possibly the ifdefs, are hard to read here.
There's a nested ifdef for PAE.  PAE doesn't apply for amd64.  The
above assumes that the cases where it doesn't apply are classified
by !_LP64.

x86/include didn't have many files in it before this and similar
commits in the same batch, and the first file that I looked at in
it has various new and old convolutions and bugs:

x86/include/_align.h:
% /*
%  * Round p (pointer or byte index) up to a correctly-aligned value
%  * for all data types (int, long, ...).   The result is unsigned int
%  * and must be cast to any desired pointer type.
%  */

This comment was blindly copied from i386.  It doesn't match the
code below.

% #define       _ALIGNBYTES     (sizeof(register_t) - 1)
% #define       _ALIGN(p)       (((uintptr_t)(p) + _ALIGNBYTES) & ~_ALIGNBYTES)

This code is broken, since it uses register_t and uintptr_t which are not
necessarily defined here.  Old code was careful to use only basic types,
so that no undocumented prerequisites are required and so that this not
broken when the undocumented prerequisites are not accidentally supplied.

The type of the result is actually uintptr_t.  On i386, uintptr_t is
just a different spelling of unsigned int.  On amd64, it is different.

This file shouldn't exist.

{amd64,i386,pc98}/include/_align.h just include x86/include/_align.h.
They shouldn't exist either.

i386/include/param.h:
% #include <machine/_align.h>

This include is misplaced (outside of the idempotency ifdef for this
file).  The definitions used to be here for technical reasons in
their correct implementation.  There used to be an ifdef here so that
this file could be included to define only the alignment macros so
as to not get namespace pollution in unusual cases.  This was
"cleaned up" for the unusual cases to pessimize the usual cases.
The cleanups have rotted, so they now have not just 1, but 2 layers
of nested includes of _align.h to untangle to see what this file is
doing.  2 layers give about the same level of obfuscation as the
correct implementation, plus more inefficiencies than only 1 layer.

%% #ifndef _I386_INCLUDE_PARAM_H_

% #define       _I386_INCLUDE_PARAM_H_

%% /*

%  * Machine dependent constants for Intel 386.
%  */

%% /*

%  * Round p (pointer or byte index) up to a correctly-aligned value
%  * for all data types (int, long, ...).   The result is unsigned int
%  * and must be cast to any desired pointer type.
%  */

The comment is still correct for the definitions here.

% #ifndef _ALIGNBYTES
% #define _ALIGNBYTES   (sizeof(int) - 1)
% #endif
% #ifndef _ALIGN
% #define _ALIGN(p)     (((unsigned)(p) + _ALIGNBYTES) & ~_ALIGNBYTES)
% #endif

But these definitions are unreachable, since <machine/_align.h> is
included unconditionally, and it defines the alignment macros
unconditionally.

%%


Mistakes near here also added this style bug (extra blank line).

% #define __HAVE_ACPI

amd64/include/param.h:
%  * $FreeBSD: src/sys/amd64/include/param.h,v 1.37 2011/07/19 13:00:30 attilio 
Exp $
%  */

%%


Extra blank lines are scattered randomly and happen to be in different
places.

% #ifndef _AMD64_INCLUDE_PARAM_H_
% #define       _AMD64_INCLUDE_PARAM_H_

%% #include <machine/_align.h>


Unlike for i386, this is placed almost correctly.  It is inside the
idempotency ifdef...

%% /*

%  * Machine dependent constants for AMD64.
%  */

... but not inside the comment that should describe everything in this
file.

%%


Another random extra blank line.  Or not so random.

Unlike for i386, the old definitions are not kept here.  They used to
be placed up where the include is now on i386 (outside of the idempotency
ifdef).  They used to use u_long to avoid prerequisites and have a
correct comment for them.  For i386 there are now 2 correct comments
about them (but with confusing spelling in one), while for amd64 there
is only 1 incorrect comment.

% ...
% #define       ALIGNBYTES              _ALIGNBYTES
% #define       ALIGN(p)                _ALIGN(p)

These (and ALIGNED_POINTER()) are the primary APIs for these macros
(the APIs with underscores are just to keep them out of the namespace
in POSIX headers).  The above definitions of the primary APIs are
completely MI once the underscored versions are defined, so they
shouldn't be defined here or in any other MD header.

Just about all of the amd64 and i386 param.h are the same, although
not as MI as ALIGNBYTES etc.  I don't really want the definitions
moved to x86 though.  They are bad enough where they are.  Other
bugs in them include:
- amd64_btop() and amd64_ptob() are named gratuitously differently
  than i386_btop() and i386_ptob().  Names like md_btop() and
  md_btop() or just btop() and btop() would be better.  Not having
  these APIs might be better still.  There is considerable confusion
  between these APIs and others that give the same results, and since
  the results are the same and there is no type checking, it is
  unlikely that the logically correct API is always used:
  - the corresponding "MI" APIs are ctob() and btoc().  These are
    ancient.  'c' in them means 'clicks' (groups of pages).  Groups
    of more than one page have never been used in FreeBSD.  The
    implementation of these macros is sort of backwards, with the
    assumption that the group size is always 1 page hard-coded in
    their definitions via PAGE_MASK and PAGE_SHIFT.
  - atop() and ptoa() may also be "MI", but have MD implementations.
    They are confusingly similar to ctob() and btoc().  'p' (page)
    in them always means the same as 'c' (clicks) in practice.  'a'
    (address) in them always means the same as 'b' (address in bytes,
    or size in bytes) in practice.  The logical differences are
    subtle.  There isn't even a MI API like atop() for for sizes
    instead of addresses.
  - {amd64,i386}_btop() and {amd64,i386}_ptob() logically handle either
    sizes or addresses, and convert to and from pages.  They are the
    easiest to use provided you assume that the address space is flat
    and that the pages aren't grouped into clicks, but they are MD
    and have the worse spelling.  This is sort of backwards again.  It
    is MD layers that should use shorter spelling of their variables
    and APIs, and hard-code assumptions.  i386 mostly uses atop(), and
    most of these uses are doubly logically wrong, since they are
    mostly for sizes and not for addresses, amd are very MD:
    - ctob: 13 instances
    - btoc: 0
    - atop: 34 instances (counting its definition).  Most are for segment
      limits, this are for sizes, thus are abuses.
    - ptoa: 13 ... mostly for sizes.  Confusion between this and ctob is
      apparently perfectly divided.
    - i386_ptob: 3 instances.  Just 1 use in pmap.c duplicated for xen.
    - i386_btop: 9 instances.  Just 4+2 uses in pmap.c's, 2 in pmap.h
      (in a macros and an inline that expand to many more uses)

One of the reasons for putting all types declarations in the same file
although this gives inefficiencies by requiring the compiler to read
and parse very large files to get the few definitions that are actually,
is to avoid the convolutions and ifdefs for minimising the declarations.
Convolutions and ifdefs for other purposes defeat this to some extent.
Although I was responsible for some of the tiny _foo.h files like
<sys/_mutex.h> which are used to minimise namespace problems, I don't
like the profileration of such files to handle even tinier problems.
The worst that I have noticed are _null.h (this should be handled in
machine/_types.h, where it used to be possible to handle it without
ifdefs), _align.h, and *stdint.h (there are now 4 layers of convolutions
with internal branches for <stdint.h>:
- /usr/include/stdint.h -> sys/stdint.h (only a symlink)
  - sys/stdint.h includes sys/_stdint.h and machine/_stdint.h
    - sys/_stdint.h is a mistake to allow sys/types.h to include it
      instead of declaring historical pollution for itself, but only
      for some pollution that is now standard in <stdint.h> -- types.h
      still declares things like u_int8_t unconditionally for itself;
      the handling of these is a bit simpler because they don't need
      ifdefs to prevent re-declaration in <stdint.h>.  Now it is harder
      to see what types.h declares.
    - machine/_stdint.h was the correct implementation.  Now it includes
      x86/_stdint.h on amd64 and i386.
      - x86/_stdint.h.  Most of the details are now here, 4 layers deep,
        with ifdefs.

Ideally, everything would be in <stdint.h> directly.  I don't know how
to do it in a layer, but is it too much to ask for only 2 layers?
Hmm, I do know how to do it in 1 layer -- flatten it at install time.
Anyone who cared about efficiency and readability would do this :-).
But I usually read the uninstalled sources.

Bruce
_______________________________________________
svn-src-head@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to "svn-src-head-unsubscr...@freebsd.org"

Re: svn commit: r232261 - in head/sys: amd64/include i386/include pc98/include x86/include

Reply via email to