On Wed, 22 Jun 2011, Warner Losh wrote:
On Jun 22, 2011, at 3:26 AM, Attilio Rao wrote:
2011/6/22 Warner Losh <i...@bsdimp.com>:
On Jun 21, 2011, at 5:27 PM, Alan Cox wrote:
On 06/21/2011 16:09, Attilio Rao wrote:
2011/6/21 Bruce Evans<b...@optusnet.com.au>:
On Tue, 21 Jun 2011, Bjoern A. Zeeb wrote:
On Jun 19, 2011, at 7:13 PM, Alan Cox wrote:
Hi Alan,
Author: alc
Date: Sun Jun 19 19:13:24 2011
New Revision: 223307
URL: http://svn.freebsd.org/changeset/base/223307
Log:
Precisely document the synchronization rules for the page's dirty field.
(Saying that the lock on the object that the page belongs to must be held
only represents one aspect of the rules.)
Eliminate the use of the page queues lock for atomically performing
read-modify-write operations on the dirty field when the underlying
architecture supports atomic operations on char and short types.
Document the fact that 32KB pages aren't really supported.
Contrary to what the tinderbox reports, I'd like to point out that all mips
kernels built by universe are broken with an SVN HEAD from earlier today.
Could you please check and see if you can fix it? The errors I get are:
vm_page.o: In function `vm_page_clear_dirty':
/sys/vm/vm_page.c:(.text+0x18d0): undefined reference to `atomic_clear_8'
/sys/vm/vm_page.c:(.text+0x18d0): relocation truncated to fit: R_MIPS_26 against `atomic_clear_8'
vm_page.o: In function `vm_page_set_validclean':
/sys/vm/vm_page.c:(.text+0x38f0): undefined reference to `atomic_clear_8'
/sys/vm/vm_page.c:(.text+0x38f0): relocation truncated to fit: R_MIPS_26 against `atomic_clear_8'
Atomic types shorter than int cannot be used in MI code, since they might
not exist. Apparently they don't exist on mips. jake@ fixed all their
old uses for sparc64 in ~Y2K.
I'm sure they do exist. They live in support.S, though, and may not have the
_8 form (they may just have the _char version). I may look at the code
again to be sure.
It appears that while mips/include/atomic.h declares the existence of
atomic_clear_8, mips/mips/support.S doesn't implement it. In other words,
only support for ints and shorts is currently implemented, not chars:
# grep atomic_clear mips/mips/support.S
* atomic_clear_32(u_int32_t *a, u_int32_t b)
LEAF(atomic_clear_32)
END(atomic_clear_32)
* atomic_clear_16(u_int16_t *a, u_int16_t b)
LEAF(atomic_clear_16)
END(atomic_clear_16)
The current crop of atomic*16 and atomic*8 functions has the restriction
that the address must be 32-bit aligned (the code forces this by silently
aligning the address to 32 bits and then operating on the low 8 or 16 bits
of that word!). I'm guessing that this is likely just wrong. Comments?
Warner
That is wrong, of course. My personal opinion is that one should not
implement atomic operations that cannot be done efficiently (for example,
if you need to disable interrupts or perform a similarly expensive
operation just to ensure atomicity, just don't support it), as long as
not having _8/_char is perfectly fine.
I think it can be efficient, for some reasonable definition of efficient. The
masking and shifting operations aren't that onerous to write, and their
instruction cycles would be dwarfed by the ll/sc pair, which is required to
implement atomics.
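
To make the masking and shifting concrete, here is a minimal sketch of a byte-wide atomic clear built on a 32-bit read-modify-write. It is not the code in mips/mips/support.S: the name emulated_atomic_clear_8 is hypothetical, it relies on the GCC/Clang __atomic builtins and predefined byte-order macros, and it assumes the kernel's usual tolerance for type punning. The compare-and-swap retry loop stands in for the ll/sc pair. Unlike the behaviour criticized above, it picks the byte lane from the low address bits instead of always touching the low 8 bits of the word.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: clear `bits` in the byte at `p` by updating the
 * aligned 32-bit word that contains it.
 */
static void
emulated_atomic_clear_8(uint8_t *p, uint8_t bits)
{
    assert(p != NULL);

    uintptr_t addr = (uintptr_t)p;
    /* Round the address down to the containing 32-bit word. */
    uint32_t *word = (uint32_t *)(addr & ~(uintptr_t)3);
    /* Which of the four bytes in that word is ours. */
    unsigned int lane = (unsigned int)(addr & 3);
#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    unsigned int shift = (3 - lane) * 8;
#else
    unsigned int shift = lane * 8;
#endif
    /* Move the caller's mask into position; other lanes stay zero. */
    uint32_t clear = (uint32_t)bits << shift;
    uint32_t old = __atomic_load_n(word, __ATOMIC_RELAXED);

    /* On failure the builtin reloads `old`, so simply retry. */
    while (!__atomic_compare_exchange_n(word, &old, old & ~clear,
        1 /* weak */, __ATOMIC_SEQ_CST, __ATOMIC_RELAXED))
        ;
}
```

The neighbouring three bytes are read and written back unchanged on every attempt, so a concurrent store to any of them forces a retry. That is the extra contention discussed later in this thread.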
The issue is that the code today is clearly wrong and needs to be fixed.
I think it is still good to discourage use of non-bus-width atomic ops
by not supporting them at the MI level for any arch.
Suppose only bus-width accesses are atomic, as seems to be the case
on mips. Then you can still do atomic accesses on bytes by doing an
atomic op on the neighbouring 3 or 7 bytes. But this may increase
memory contention by a factor of 3 or 7. The contention can be
reduced either by putting only rarely-accessed bytes near the bytes
that need the atomic ops, or by putting unused padding there, or
by not using byte-sized objects to begin with (this gives the same
effect and even the same memory accesses as unused padding bytes --
there are unused bits in the objects instead of external unused bytes).
The former can be arranged easily enough in the vm_page struct. But
MI code shouldn't know about this.
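
The padding option can be sketched as a struct layout. The structs below are hypothetical and are not the real struct vm_page; WORD_SIZE is an assumed 4-byte atomic access width, and the code needs C11 for _Alignas and static_assert. The point is only that the byte updated atomically ends up alone in its aligned word.

```c
#include <assert.h>     /* static_assert (C11) */
#include <stddef.h>     /* offsetof */
#include <stdint.h>

#define WORD_SIZE 4     /* assumed width of the narrowest atomic access */

/*
 * Hypothetical layout: the byte shares a word with busy neighbours
 * (assuming the struct itself starts on a word boundary).
 */
struct page_shared {
    uint8_t flags;
    uint8_t busy;
    uint8_t valid;
    uint8_t dirty;      /* updated with atomic ops */
};

/* Same fields, but padding gives the byte a word of its own. */
struct page_padded {
    uint8_t flags;
    uint8_t busy;
    uint8_t valid;
    _Alignas(WORD_SIZE) uint8_t dirty;  /* starts a fresh word */
    uint8_t pad[WORD_SIZE - 1];         /* keeps neighbours out */
};

/* Index of the aligned word that holds a given member. */
#define WORD_OF(type, member) (offsetof(type, member) / WORD_SIZE)

static_assert(WORD_OF(struct page_shared, dirty) ==
    WORD_OF(struct page_shared, flags), "dirty shares a word with flags");
static_assert(WORD_OF(struct page_padded, dirty) !=
    WORD_OF(struct page_padded, valid), "dirty has a word to itself");
```

Declaring dirty as a 32-bit integer instead would produce the same memory accesses, with the unused space inside the object rather than beside it.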
Suppose byte accesses are atomic at the instruction level. Then the
hardware probably still needs to expand to the bus width. Depending
on how well it virtualizes this, there may be the same factor of 3, 7
or more in memory contention. The relevant bus width may be the cache
line size which on modern CPUs is much more than 4 bytes and probably
more like 32. So the CPU must already be doing good virtualization
to make 4-byte atomic accesses not too slow (non-atomic accesses are
already virtualized by the caches). For efficiency, we really should
pay more attention to packing related int variables into cache lines
(either the same line to increase cache hits or separate lines, to
reduce contention), but the best way to do this is very MD and unclear.
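
As a rough illustration of the two packing strategies, the sketch below places related variables either in one cache line or in separate lines. The struct names and ASSUMED_LINE_SIZE (64 bytes here) are assumptions for illustration only. The real line size is machine-dependent, which is why this is hard to do well in MI code.

```c
#include <assert.h>     /* static_assert (C11) */
#include <stddef.h>     /* offsetof */
#include <stdint.h>

#define ASSUMED_LINE_SIZE 64    /* assumption; the real value is MD */

/* Variables read together: one line, to increase cache hits. */
struct stats_together {
    _Alignas(ASSUMED_LINE_SIZE) uint32_t hits;
    uint32_t misses;
};

/* Variables written by different CPUs: one line each, less contention. */
struct counters_apart {
    _Alignas(ASSUMED_LINE_SIZE) uint32_t cpu0_count;
    _Alignas(ASSUMED_LINE_SIZE) uint32_t cpu1_count;
};

/* Index of the cache line that holds a given member. */
#define LINE_OF(type, member) (offsetof(type, member) / ASSUMED_LINE_SIZE)

static_assert(LINE_OF(struct stats_together, hits) ==
    LINE_OF(struct stats_together, misses), "same line");
static_assert(LINE_OF(struct counters_apart, cpu0_count) !=
    LINE_OF(struct counters_apart, cpu1_count), "separate lines");
```

Separation costs memory: the second struct occupies two full lines to hold eight bytes of data.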
Bruce