On Tue, 2007-04-24 at 16:22 +0200, Andi Kleen wrote:
> > Why would you need any kind of lock when just changing a single bit,
> > if it didn't affect other bits of the same word?  Just as you don't
> > need a lock when simply assigning a word, setting a bit to 0 or 1
> > is simple in itself (it doesn't matter if it was 0 or 1 before).
> >
> > > But, I think that concurrent bit operation on different bits
> > > is just like OR operation, so lock prefix is not needed.
> >
> > I firmly believe that it is; but I'm not a processor expert.
>
> Think of the CPU cache like the page cache. The VFS cannot change
> anything directly on disk - it always has to read a page (or block),
> change it there, even if it's only a single bit, and write it back.
>
> Now imagine multiple independent kernels accessing that disk and doing
> this in parallel without any locking. You will lose data on simple
> changes because of data races.
>
> The CPU is similar. The memory is the disk; the protocol talking to
> the memory only knows cache lines. There are multiple CPUs talking
> to that memory. The CPUs cannot change anything
> without reading a full cache line first and then later writing it back.
> There is no "just do OR operation" when talking to memory, just like
> disks don't have such an operation; just read and write.
>
> [there are some special cases with uncached mappings, but they are not
> relevant here]
>
> With lock the CPU ensures the read-modify-write cycle is atomic;
> without it, it doesn't.
>
> The CPU also guarantees you that multiple writes don't get lost
> even when they hit the same cache line (otherwise you would need lock
> for anything in the same cache line), but it doesn't guarantee that
> for a word. The details for that vary by architecture; on x86 it is
> any memory access, on others it can be only guaranteed for long stores.
>
> The exact rules for all this can be quite complicated (and also vary
> by architecture); this is a simplification, but should be enough as a
> rough guide.
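
To check my understanding of the read-modify-write problem described
above, here is a minimal sketch in C. It is only an illustration under
my own assumptions: the function names are made up, the "two CPUs" are
simulated by replaying one bad interleaving in a fixed order on a single
thread, and C11 atomic_fetch_or() stands in for a locked bit operation
(on x86, compilers typically emit a lock-prefixed instruction for it).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>

/* Replays, in a fixed order, the interleaving in which two CPUs each do
 * a plain (unlocked) read-modify-write on the same word.  cpu0 and cpu1
 * stand in for each CPU's private copy of the word (hypothetical names). */
uint32_t unlocked_interleaving(void)
{
    uint32_t mem = 0;

    uint32_t cpu0 = mem;   /* CPU 0 reads the word */
    uint32_t cpu1 = mem;   /* CPU 1 reads the same, still unmodified, word */
    cpu0 |= 1u << 0;       /* CPU 0 sets bit 0 in its copy */
    cpu1 |= 1u << 5;       /* CPU 1 sets bit 5 in its copy */
    mem = cpu0;            /* CPU 0 writes back: mem == 0x01 */
    mem = cpu1;            /* CPU 1 writes back: bit 0 is overwritten */

    return mem;
}

/* Same two updates, each done as one indivisible read-modify-write.
 * They run one after the other here only to keep the sketch
 * deterministic; the point is that no other update can land between
 * the read and the write of a single atomic_fetch_or(). */
uint32_t atomic_updates(void)
{
    _Atomic uint32_t mem = 0;

    atomic_fetch_or(&mem, 1u << 0);
    atomic_fetch_or(&mem, 1u << 5);

    return atomic_load(&mem);
}

/* Prints both results and checks that only the unlocked replay lost a bit. */
void run_demo(void)
{
    uint32_t lost = unlocked_interleaving();
    uint32_t kept = atomic_updates();

    assert(lost == 0x20u);   /* bit 0 was lost */
    assert(kept == 0x21u);   /* both bits survived */
    printf("unlocked: 0x%02x, atomic: 0x%02x\n",
           (unsigned)lost, (unsigned)kept);
}
```

Calling run_demo() from main() shows that the two updates touch
different bits and still interfere once the read and the write-back are
separate steps.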

The Intel 80386 Programmer's Reference Manual reads:

"
When accessing a bit in memory, the processor may access four bytes
starting from the memory address given by:

    Effective Address + (4 * (BitOffset DIV 32))

for a 32-bit operand size, or two bytes starting from the memory address
given by:

    Effective Address + (2 * (BitOffset DIV 16))

for a 16-bit operand size. It may do this even when only a single byte
needs to be accessed in order to get at the given bit.
"

This seems to imply that we need the lock prefix even when updating
different bits within the same long (not just the same byte). Is this
interpretation correct?

- Fernando
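
P.S. To make the address formula from the manual concrete, here is a
small sketch in C. The helper names are hypothetical, and it only
handles non-negative bit offsets (the manual's DIV on negative offsets
is not modelled). It computes the byte offset, relative to the effective
address, of the unit the processor may access, and tells whether two
bits fall into the same unit.

```c
#include <assert.h>
#include <stdint.h>

/* Byte offset (relative to the effective address) of the unit that may
 * be accessed for bit_offset, following the quoted formula:
 *   unit_bytes * (BitOffset DIV operand_bits)
 * operand_bits is expected to be 16 or 32. */
uint32_t unit_byte_offset(uint32_t bit_offset, uint32_t operand_bits)
{
    uint32_t unit_bytes = operand_bits / 8;   /* 2 or 4 bytes */

    return unit_bytes * (bit_offset / operand_bits);
}

/* Returns 1 if both bits live in the same accessed unit, 0 otherwise. */
int same_unit(uint32_t bit_a, uint32_t bit_b, uint32_t operand_bits)
{
    return unit_byte_offset(bit_a, operand_bits) ==
           unit_byte_offset(bit_b, operand_bits);
}

/* A few spot checks of the arithmetic. */
void check_examples(void)
{
    assert(unit_byte_offset(37, 32) == 4);  /* 4 * (37 DIV 32) */
    assert(unit_byte_offset(37, 16) == 4);  /* 2 * (37 DIV 16) */
    assert(same_unit(3, 9, 32));            /* different bytes, same long */
    assert(!same_unit(3, 40, 32));          /* different longs */
}
```

With a 32-bit operand size, bits 3 and 9 sit in different bytes but in
the same four-byte unit, which is the case my question is about.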