Re: "objtrm" problem probably found (was Re: Stuck in "objtrm")

1999-07-13 Thread Mike Smith
> On Mon, Jul 12, 1999 at 10:38:03PM -0700, Mike Smith wrote: > > I said: > > > than indirect function calls on some architectures: inline > > > branched code. So you still have a global variable selecting > > > locked/non-locked, but it's a boolean, rather than a pointer. > > > Your atomic macro

Re: "objtrm" problem probably found (was Re: Stuck in "objtrm")

1999-07-13 Thread Peter Jeremy
Matthew Dillon <[EMAIL PROTECTED]> wrote: >:I'm not sure there's any reason why you shouldn't. If you changed the >:semantics of a stack segment so that memory addresses below the stack >:pointer were irrelevant, you could implement a small, 0-cycle, on-chip >:stack (that overflowed into memory).

Re: "objtrm" problem probably found (was Re: Stuck in "objtrm")

1999-07-13 Thread Alan Cox
Before this thread on "cache coherence" and "memory consistency" goes any further, I'd like to suggest a time-out to read something like http://www-ece.rice.edu/~sarita/Publications/models_tutorial.ps. A lot of what I'm reading has a grain of truth but isn't quite right. This paper appeared as a

Re: "objtrm" problem probably found (was Re: Stuck in "objtrm")

1999-07-12 Thread Peter Jeremy
Matthew Dillon <[EMAIL PROTECTED]> wrote: >:[1] A locked instruction implies a synchronous RMW cycle. In order >:to meet write-ordering guarantees (without which, a locked RMW >:cycle would be useless as a semaphore primitive), it implies a >:complete write serialization, and probably

Re: "objtrm" problem probably found (was Re: Stuck in "objtrm")

1999-07-12 Thread Mike Haertel
>This is a fairly key statement in context, and an opinion here would >count for a lot; are function calls likely to become more or less >expensive in time? Ambiguous question. First answer: Assume we're hitting the cache, taking no branch mispredicts, and everything is generally going at "the

Re: "objtrm" problem probably found (was Re: Stuck in "objtrm")

1999-07-12 Thread Mike Haertel
>Second answer: in the real world, we're nearly always hitting the >cache on stack operations associated with calls and argument passing, >but not less often on operations in the procedure body. So, in ^^^ typo Urk. I meant to say "less often", delete the "not".

Re: "objtrm" problem probably found (was Re: Stuck in "objtrm")

1999-07-12 Thread Matthew Dillon
:... I would also like to add a few more notes in regards to write pipelines. Write pipelines are not used any more, at least not long ones. The reason is simply the cache coherency issue again. Until the data is actually written into the L1 cache, it is acoherent. Acoher

Re: "objtrm" problem probably found (was Re: Stuck in "objtrm")

1999-07-12 Thread Andrew Reilly
On Mon, Jul 12, 1999 at 10:38:03PM -0700, Mike Smith wrote: > I said: > > than indirect function calls on some architectures: inline > > branched code. So you still have a global variable selecting > > locked/non-locked, but it's a boolean, rather than a pointer. > > Your atomic macros are then {
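The "inline branched code" idea quoted above might look roughly like the following. This is only a sketch under stated assumptions, not the code from the thread: the flag name sketch_smp_active and the function name sketch_add_int_runtime are hypothetical, GCC/Clang inline assembler syntax is assumed, and the non-x86 branch falls back to a compiler builtin purely so the example builds elsewhere.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag: would be set once at boot if more than one CPU runs. */
static bool sketch_smp_active = false;

/* Add v to *p, taking the locked path only when the boolean says so. */
static inline void sketch_add_int_runtime(volatile uint32_t *p, uint32_t v)
{
#if defined(__i386__) || defined(__x86_64__)
    if (sketch_smp_active)
        /* LOCK prefix: the RMW is atomic with respect to other CPUs. */
        __asm__ __volatile__("lock; addl %1, %0" : "+m" (*p) : "ir" (v) : "cc");
    else
        /* Single RMW instruction, no LOCK: sufficient on a uniprocessor. */
        __asm__ __volatile__("addl %1, %0" : "+m" (*p) : "ir" (v) : "cc");
#else
    /* Assumption for non-x86 builds: use the GCC/Clang atomic builtin. */
    (void)sketch_smp_active;
    (void)__atomic_fetch_add(p, v, __ATOMIC_SEQ_CST);
#endif
}
```

The test of the boolean is a direct, well-predicted branch, which is the trade-off being argued for against an indirect call through a function pointer.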

Re: "objtrm" problem probably found (was Re: Stuck in "objtrm")

1999-07-12 Thread Mike Smith
> On Mon, Jul 12, 1999 at 07:09:58PM -0700, Mike Smith wrote: > > > Although function calls are more expensive than inline code, > > > they aren't necessarily a lot more so, and function calls to > > > non-locked RMW operations are certainly much cheaper than > > > inline locked RMW operations. >

Re: "objtrm" problem probably found (was Re: Stuck in "objtrm")

1999-07-12 Thread Andrew Reilly
On Mon, Jul 12, 1999 at 07:09:58PM -0700, Mike Smith wrote: > > Although function calls are more expensive than inline code, > > they aren't necessarily a lot more so, and function calls to > > non-locked RMW operations are certainly much cheaper than > > inline locked RMW operations. > > This is

Re: "objtrm" problem probably found (was Re: Stuck in "objtrm")

1999-07-12 Thread Matthew Dillon
: :Based on general computer architecture principles, I'd say that a lock :prefix is likely to become more expensive[1], whilst a function call :will become cheaper[2] over time. :... : :[1] A locked instruction implies a synchronous RMW cycle. In order :to meet write-ordering guarantees (wit

Re: "objtrm" problem probably found (was Re: Stuck in "objtrm")

1999-07-12 Thread Matthew Dillon
: :I'm not sure there's any reason why you shouldn't. If you changed the :semantics of a stack segment so that memory addresses below the stack :pointer were irrelevant, you could implement a small, 0-cycle, on-chip :stack (that overflowed into memory). I don't know whether this :semantic chang

Re: "objtrm" problem probably found (was Re: Stuck in "objtrm")

1999-07-12 Thread Peter Jeremy
Matthew Dillon <[EMAIL PROTECTED]> wrote: >The change in code flow used to be the expensive piece, but not any >more. You typically either see a branch prediction cache (Intel) >offering a best-case of 0-cycle latency, or a single-cycle latency >that is slot-fillable (MIPS). In

Re: "objtrm" problem probably found (was Re: Stuck in "objtrm")

1999-07-12 Thread Peter Jeremy
Mike Smith <[EMAIL PROTECTED]> wrote: >> Although function calls are more expensive than inline code, >> they aren't necessarily a lot more so, and function calls to >> non-locked RMW operations are certainly much cheaper than >> inline locked RMW operations. > >This is a fairly key statement in c

Re: "objtrm" problem probably found (was Re: Stuck in "objtrm")

1999-07-12 Thread Matthew Dillon
:I assumed too much in asking the question; I was specifically :interested in indirect function calls, since this has a direct impact :on method-style implementations. Branch prediction caches are typically PC-sensitive. An indirect method call will never be as fast as a direct call,

Re: "objtrm" problem probably found (was Re: Stuck in "objtrm")

1999-07-12 Thread Mike Smith
> > : > :> Although function calls are more expensive than inline code, > :> they aren't necessarily a lot more so, and function calls to > :> non-locked RMW operations are certainly much cheaper than > :> inline locked RMW operations. > : > :This is a fairly key statement in context, and an opin

Re: "objtrm" problem probably found (was Re: Stuck in "objtrm")

1999-07-12 Thread Matthew Dillon
: :> Although function calls are more expensive than inline code, :> they aren't necessarily a lot more so, and function calls to :> non-locked RMW operations are certainly much cheaper than :> inline locked RMW operations. : :This is a fairly key statement in context, and an opinion here would

Re: "objtrm" problem probably found (was Re: Stuck in "objtrm")

1999-07-12 Thread Mike Smith
> Although function calls are more expensive than inline code, > they aren't necessarily a lot more so, and function calls to > non-locked RMW operations are certainly much cheaper than > inline locked RMW operations. This is a fairly key statement in context, and an opinion here would count for

Re: "objtrm" problem probably found (was Re: Stuck in "objtrm")

1999-07-12 Thread Poul-Henning Kamp
In message <[EMAIL PROTECTED]>, John-Mark Gurney writes: >Matthew Dillon scribbled this message on Jul 12: >> p.s. I'm pretty sure that the lock prefix costs nothing on a UP system, >> and probably wouldn't be noticed on an SMP system either because the >> write-allocation overhead is

Re: "objtrm" problem probably found (was Re: Stuck in "objtrm")

1999-07-12 Thread Luoqi Chen
> do away with the lock prefix on non-SMP machines. I don't know if the > SMP variable is accessible from within the i386/include/atomic.h header > file, though. > SMP is globally defined (in opt_global.h). -lq

Re: "objtrm" problem probably found (was Re: Stuck in "objtrm")

1999-07-12 Thread Mikhail Teterin
Mike Haertel once wrote: > Anyway, taking all that into account, I still agree with Dillon that > it is a better software solution to allow the same loadable drivers to > work for both UP and MP systems whenever possible. What's wrong, again with /modules and /modules.smp? If some third party

Re: "objtrm" problem probably found (was Re: Stuck in "objtrm")

1999-07-12 Thread Matthew Dillon
Here we are:

Empty loop
    mode 0   9.21 ns/loop nproc=1 lcks=EMPTY
Tight loop, 1 and 2 processes, with and without lock prefix
    mode 1  16.48 ns/loop nproc=1 lcks=no
    mode 2  23.65 ns/loop nproc=2 lcks=no
    mode 3  93.02 ns/loop nproc=1 lcks=yes
    mod
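A loop of the kind being timed here could be sketched as below. This is not the test program from the message, which is not shown in the archive; the function name sketch_ns_per_loop is hypothetical, it covers only the single-process case, and it uses the standard clock() call, which reports CPU time at coarse resolution. GCC/Clang inline assembler is assumed, with a builtin fallback on non-x86 machines.

```c
#include <stdint.h>
#include <time.h>

/* Increment *counter `iters` times, locked or not; return ns per loop. */
static double sketch_ns_per_loop(volatile uint32_t *counter, long iters, int locked)
{
    clock_t start = clock();
    for (long i = 0; i < iters; i++) {
#if defined(__i386__) || defined(__x86_64__)
        if (locked)
            __asm__ __volatile__("lock; incl %0" : "+m" (*counter) : : "cc");
        else
            __asm__ __volatile__("incl %0" : "+m" (*counter) : : "cc");
#else
        /* Assumption for non-x86 builds: always use the atomic builtin. */
        (void)locked;
        (void)__atomic_fetch_add(counter, 1, __ATOMIC_SEQ_CST);
#endif
    }
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC * 1e9 / (double)iters;
}
```

Calling it once with locked set to 0 and once with 1, over a few million iterations each, gives a rough per-iteration comparison. The branch inside the loop adds a small constant cost to both cases, so only the difference between the two figures is meaningful.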

Re: "objtrm" problem probably found (was Re: Stuck in "objtrm")

1999-07-12 Thread Mike Haertel
You might think that, due to MESI state bits in the cache and bus coherency protocols, locks are "free". Unfortunately, the lock prefix has a measurable cost on a UP system, at least on P6 and later processors. The reason is that the locked memory operation is an "at-retirement" operation,

Re: "objtrm" problem probably found (was Re: Stuck in "objtrm")

1999-07-12 Thread Matthew Dillon
:actually, I'm not so sure, it guarantees that NO other bus operation :will succeed while this is happening... what happens if a pci bus :mastering card makes a modification to this value? sure, it normally :won't happen, but it can... and w/o the lock prefix, this CAN happen :from what I unders

Re: "objtrm" problem probably found (was Re: Stuck in "objtrm")

1999-07-12 Thread John-Mark Gurney
Matthew Dillon scribbled this message on Jul 12: > p.s. I'm pretty sure that the lock prefix costs nothing on a UP system, > and probably wouldn't be noticed on an SMP system either because the > write-allocation overhead is already pretty bad. But I haven't tested > it. actuall

Re: "objtrm" problem probably found (was Re: Stuck in "objtrm")

1999-07-12 Thread Matthew Dillon
:>p.s. I'm pretty sure that the lock prefix costs nothing on a UP system, :>and probably wouldn't be noticed on an SMP system either because the :>write-allocation overhead is already pretty bad. But I haven't tested :>it. : :it's actually quite expensive in terms of bus bandwidt

Re: "objtrm" problem probably found (was Re: Stuck in "objtrm")

1999-07-12 Thread Poul-Henning Kamp
>p.s. I'm pretty sure that the lock prefix costs nothing on a UP system, >and probably wouldn't be noticed on an SMP system either because the >write-allocation overhead is already pretty bad. But I haven't tested >it. it's actually quite expensive in terms of bus bandwidth bec

Re: "objtrm" problem probably found (was Re: Stuck in "objtrm")

1999-07-12 Thread Matthew Dillon
:> We don't need the lock prefix for the current SMP implementation. A lock :> prefix would be needed in a multithreaded implementation but should not be :> added unless the kernel is an SMP kernel otherwise UP performance would :> suffer. :> :> -- :> Doug Rabson Mail: [

Re: "objtrm" problem probably found (was Re: Stuck in "objtrm")

1999-07-12 Thread Bruce Evans
>I was under the impression that a locked instruction was essentially free >at runtime, with the sole exception of being one byte larger. No, they are very expensive, at least when done in a minimal loop (8 cycles on my P5/133 UP and 16 cycles on my Celeron/450). ISTR Steve Passe saying that the

Re: "objtrm" problem probably found (was Re: Stuck in "objtrm")

1999-07-12 Thread Luoqi Chen
> We don't need the lock prefix for the current SMP implementation. A lock > prefix would be needed in a multithreaded implementation but should not be > added unless the kernel is an SMP kernel otherwise UP performance would > suffer. > > -- > Doug Rabson Mail: [EMAIL

Re: "objtrm" problem probably found (was Re: Stuck in "objtrm")

1999-07-12 Thread Peter Wemm
Doug Rabson wrote: > On Mon, 12 Jul 1999, Peter Jeremy wrote: > > > Mike Haertel <[EMAIL PROTECTED]> wrote: > > >Um. FYI on x86, even if the compiler generates the RMW > > >form "addl $1, foo", it's not atomic. If you want it to > > >be atomic you have to precede the opcode with a LOCK > > >pre

Re: "objtrm" problem probably found (was Re: Stuck in "objtrm")

1999-07-12 Thread Oliver Fromme
Doug Rabson wrote in list.freebsd-current: > On Mon, 12 Jul 1999, Peter Jeremy wrote: > > That said, it should be fairly simple to change Matt's new in-line > > assembler versions to insert LOCK prefixes when building an SMP > > kernel. (Although I don't know that this is necessary yet, given

Re: "objtrm" problem probably found (was Re: Stuck in "objtrm")

1999-07-12 Thread Doug Rabson
On Mon, 12 Jul 1999, Peter Jeremy wrote: > Mike Haertel <[EMAIL PROTECTED]> wrote: > >Um. FYI on x86, even if the compiler generates the RMW > >form "addl $1, foo", it's not atomic. If you want it to > >be atomic you have to precede the opcode with a LOCK > >prefix 0xF0. > > I'd noticed that p

Re: "objtrm" problem probably found (was Re: Stuck in "objtrm")

1999-07-12 Thread Doug Rabson
On Sun, 11 Jul 1999, Mike Haertel wrote: > >On Sat, 10 Jul 1999, Matthew Dillon wrote: > >> > >> The supposedly atomic functions in i386/include/atomic.h are not > >> as atomic as was previously thought :-): > >> > >> #define atomic_add_short(P, V) (*(u_short*)(P) += (V)) > >[.

Re: "objtrm" problem probably found (was Re: Stuck in "objtrm")

1999-07-11 Thread Matthew Dillon
: :That said, it should be fairly simple to change Matt's new in-line :assembler versions to insert LOCK prefixes when building an SMP :kernel. (Although I don't know that this is necessary yet, given :the `Big Giant Lock'). : :There remains the problem of locating all the operations in the kerne
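Inserting the LOCK prefix only when building an SMP kernel, as suggested in the quoted text, could be done at compile time along these lines. This is a sketch rather than the committed change: the macro SKETCH_MPLOCKED and the function sketch_add_int_build are hypothetical names, the SMP symbol is assumed to be visible to the header (another message in this thread says it is globally defined in opt_global.h), and GCC/Clang inline assembler is assumed.

```c
#include <stdint.h>

/* Expand to a LOCK prefix only when the kernel is built with SMP defined. */
#ifdef SMP
#define SKETCH_MPLOCKED "lock; "
#else
#define SKETCH_MPLOCKED ""
#endif

/* Add v to *p with one RMW instruction; locked only in SMP builds. */
static inline void sketch_add_int_build(volatile uint32_t *p, uint32_t v)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ __volatile__(SKETCH_MPLOCKED "addl %1, %0"
                         : "+m" (*p) : "ir" (v) : "cc");
#else
    /* Assumption for non-x86 builds: use the GCC/Clang atomic builtin. */
    (void)__atomic_fetch_add(p, v, __ATOMIC_SEQ_CST);
#endif
}
```

A UP build pays nothing for the prefix this way, at the price that code compiled for UP (including loadable modules) is not safe to run on an SMP kernel, which is the concern raised elsewhere in the thread.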

Re: "objtrm" problem probably found (was Re: Stuck in "objtrm")

1999-07-11 Thread Peter Jeremy
Mike Haertel <[EMAIL PROTECTED]> wrote: >Um. FYI on x86, even if the compiler generates the RMW >form "addl $1, foo", it's not atomic. If you want it to >be atomic you have to precede the opcode with a LOCK >prefix 0xF0. I'd noticed that point as well. The top of sys/i386/include/atomic.h _doe

Re: "objtrm" problem probably found (was Re: Stuck in "objtrm")

1999-07-11 Thread Mike Haertel
>On Sat, 10 Jul 1999, Matthew Dillon wrote: >> >> The supposedly atomic functions in i386/include/atomic.h are not >> as atomic as was previously thought :-): >> >> #define atomic_add_short(P, V) (*(u_short*)(P) += (V)) >[...] > >Before I fixed this stuff for the alpha, the += e

Re: "objtrm" problem probably found (was Re: Stuck in "objtrm")

1999-07-11 Thread Doug Rabson
On Sun, 11 Jul 1999, Alan Cox wrote: > On Sun, Jul 11, 1999 at 08:12:52AM +0100, Doug Rabson wrote: > > > > What a nightmare. This must be due to egcs compiling things differently > > from gcc 2.7.1. ... > > Yes, at least for the one case in vm_pageout_flush. (I checked > the analogous code on

Re: "objtrm" problem probably found (was Re: Stuck in "objtrm")

1999-07-11 Thread Alan Cox
Actually, I should have said swap_pager_getpages and not vm_pageout_flush. Alan

Re: "objtrm" problem probably found (was Re: Stuck in "objtrm")

1999-07-10 Thread Alan Cox
On Sun, Jul 11, 1999 at 08:12:52AM +0100, Doug Rabson wrote: > > What a nightmare. This must be due to egcs compiling things differently > from gcc 2.7.1. ... Yes, at least for the one case in vm_pageout_flush. (I checked the analogous code on a 3.x-STABLE system and it appears to be fine for t

Re: "objtrm" problem probably found (was Re: Stuck in "objtrm")

1999-07-10 Thread Doug Rabson
On Sat, 10 Jul 1999, Matthew Dillon wrote: > > The supposedly atomic functions in i386/include/atomic.h are not > as atomic as was previously thought :-): > > #define atomic_add_short(P, V) (*(u_short*)(P) += (V)) > > I looked at that kinda funny. But C doesn't guarantee

"objtrm" problem probably found (was Re: Stuck in "objtrm")

1999-07-10 Thread Matthew Dillon
The supposedly atomic functions in i386/include/atomic.h are not as atomic as was previously thought :-): #define atomic_add_short(P, V) (*(u_short*)(P) += (V)) I looked at that kinda funny. But C doesn't guarantee a RMW opcode for a "+=" !!!. Alan found an example s
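To make the problem concrete: the compiler is free to turn the "+=" in that macro into a separate load, add, and store, and an interrupt arriving between them can lose an update. One way to force a single read-modify-write instruction is inline assembler, sketched below. The function name sketch_atomic_add_short is hypothetical and this is not the replacement that went into the tree; GCC/Clang syntax is assumed, and the non-x86 branch exists only so the example builds elsewhere.

```c
#include <stdint.h>

/* Add v to the 16-bit value at p using exactly one RMW instruction. */
static inline void sketch_atomic_add_short(volatile uint16_t *p, uint16_t v)
{
#if defined(__i386__) || defined(__x86_64__)
    /*
     * One "addw" to memory cannot be split by an interrupt on the same
     * CPU. Without a LOCK prefix it is still not atomic across CPUs.
     */
    __asm__ __volatile__("addw %w1, %0" : "+m" (*p) : "ir" (v) : "cc");
#else
    /* Assumption for non-x86 builds: use the GCC/Clang atomic builtin. */
    (void)__atomic_fetch_add(p, v, __ATOMIC_RELAXED);
#endif
}
```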