On Thu, Jun 23, 2011 at 5:35 PM, Florian Pflug <f...@phlo.org> wrote:
>> Well, I'm sure there is some effect, but my experiments seem to
>> indicate that it's not a very important one.  Again, please feel free
>> to provide contrary evidence.  I think the basic issue is that - in
>> the best possible case - padding the LWLocks so that you don't have
>> two locks sharing a cache line can reduce contention on the busier
>> lock by at most 2x.  (The less busy lock may get a larger reduction,
>> but that may not help you much.)  If what you really need is for
>> contention to decrease by 1000x, you're just not moving the needle.
>
> Agreed. OTOH, adding a few dummy entries to the LWLocks array to separate
> the most heavily contested LWLocks from the others might still be
> worthwhile.
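[A minimal sketch of the padding idea under discussion, assuming a 64-byte
cache line; the type names and sizes here are illustrative, not PostgreSQL's
actual definitions. Each lock slot is rounded up to a full cache line, so no
two hot locks can share one:]

    #include <stdint.h>

    #define CACHE_LINE_SIZE 64          /* assumed line size; varies by CPU */

    typedef struct MyLWLock
    {
        volatile uint8_t  mutex;        /* spinlock protecting the fields below */
        volatile uint8_t  exclusive;    /* nonzero if held exclusively */
        volatile uint32_t shared;       /* count of shared holders */
        /* wait-queue pointers elided for brevity */
    } MyLWLock;

    typedef union MyLWLockPadded
    {
        MyLWLock lock;
        char     pad[CACHE_LINE_SIZE];  /* rounds each slot up to a full line */
    } MyLWLockPadded;

    /* The lock array is then built from padded slots, so adjacent entries
     * no longer ping-pong the same cache line between CPUs. */
    static MyLWLockPadded lock_array[128];

[Florian's cheaper variant - inserting a few dummy entries in front of only the
hottest locks - buys the same separation for those locks without growing the
whole array.]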
Hey, if we can show that it works, sign me up.

>> That's why the basic fast-relation-lock patch helps so much:
>> it replaces a system where every lock request results in contention
>> with a system where NONE of them do.
>>
>> I tried rewriting the LWLocks using CAS.  It actually seems to make
>> things slightly worse on the tests I've done so far, perhaps because I
>> didn't make it respect spins_per_delay.  Perhaps fetch-and-add would
>> be better, but I'm not holding my breath.  Everything I'm seeing so
>> far leads me to the belief that we need to get rid of the contention
>> altogether, not just contend more quickly.
>
> Is there a patch available? How did you do the slow path (i.e. the
> case where there's contention and you need to block)? It seems to
> me that without some kernel support like futexes it's impossible
> to do better than LWLocks already do, because any simpler scheme
> like
>
>     while (atomic_inc_post(lock) > 0) {
>         atomic_dec(lock);
>         block(lock);
>     }
>
> for the shared-locker case suffers from a race condition (the lock
> might be released before you actually block()).

Attached...

> The idea would be to start out with something trivial like the above.
> Maybe with an #if for compilers which have something like GCC's
> __sync_synchronize(). We could then gradually add implementations
> for specific architectures, hopefully done by people who actually
> own the hardware and can test.

Yes.  But if we go that route, then we have to also support a code path
for architectures for which we don't have that support.  That's going
to be more work, so I don't want to do it until we have a case where
there is a good, clear benefit.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
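[For illustration only - this is not the attached patch: a sketch of why the
futex-style slow path Florian alludes to does not suffer from the race in the
user-space-only loop. FUTEX_WAIT re-checks the lock word inside the kernel and
returns immediately if it has already changed, which is exactly the atomic
check-then-block the loop above lacks. Linux and GCC __sync builtins are
assumed, and an exclusive lock is shown to keep the sketch small:]

    #include <linux/futex.h>
    #include <sys/syscall.h>
    #include <unistd.h>
    #include <stdint.h>

    /* lock word: 0 = free, 1 = held (no waiter tracking, for brevity) */

    static void lock_acquire(int32_t *lock)
    {
        while (__sync_lock_test_and_set(lock, 1) != 0)
        {
            /* Sleep only if *lock is still 1; the kernel performs this
             * comparison atomically with respect to the wake below, so a
             * release that happens "between" our check and our sleep just
             * makes FUTEX_WAIT return immediately, and we retry. */
            syscall(SYS_futex, lock, FUTEX_WAIT, 1, NULL, NULL, 0);
        }
    }

    static void lock_release(int32_t *lock)
    {
        __sync_lock_release(lock);      /* write 0 with release semantics */
        /* Wake one waiter.  A real implementation would track whether any
         * waiters exist so it could skip this syscall when uncontended. */
        syscall(SYS_futex, lock, FUTEX_WAKE, 1, NULL, NULL, 0);
    }

[Without kernel support of this kind, blocking falls back to something like the
semaphore-based scheme LWLocks already use, which is Florian's point that a
simpler pure user-space design is hard to improve on.]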
lwlock-v1.patch
Description: Binary data