[PERFORM] MIT benchmarks pgsql multicore (up to 48) performance
Hi,

To whom it may concern: http://pdos.csail.mit.edu/mosbench/

They tested with 8.3.9; I wonder what results 9.0 would give.

Best regards, and keep up the good work,

Hakan
Re: [PERFORM] Issue for partitioning with extra check constraints
> And your point is? The design center for the current setup is maybe 5
> or 10 partitions. We didn't intend it to be used for more partitions
> than you might have spindles to spread the data across.

Where did that come from? It certainly wasn't anywhere when the feature was introduced. Simon intended for this version of partitioning to scale to 100-200 partitions (and it does, provided that you dump all other table constraints), and partitioning has nothing to do with spindles. I think you're getting it mixed up with tablespaces.

The main reason for partitioning is ease of maintenance (VACUUM, dropping partitions, etc.), not any kind of I/O optimization.

I'd like to add the following statement to our docs on partitioning, in section 5.9.4:

=
Constraint exclusion is tested against every CHECK constraint on the partitions, even CHECK constraints which have nothing to do with the partitioning scheme. This can add significant extra planner time, especially if your partitions have CHECK constraints which are costly to evaluate. For performance, it can be a good idea to eliminate all extra CHECK constraints on partitions, or to re-implement them as triggers.
=

> In case you haven't noticed, we have very finite
> amounts of manpower that's competent to do planner surgery.

Point.

--
Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com
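To illustrate the proposed doc text with a concrete schema (all names here are hypothetical, using the 8.x inheritance style of partitioning): every CHECK constraint on each child table is fed to the constraint-exclusion prover, whether or not it mentions the partitioning column.

    CREATE TABLE measurements (
        logdate  date    NOT NULL,
        reading  numeric NOT NULL
    );

    CREATE TABLE measurements_2010_09 (
        -- the partitioning constraint, which exclusion actually needs:
        CHECK (logdate >= DATE '2010-09-01' AND logdate < DATE '2010-10-01'),
        -- an unrelated constraint, which the planner must nonetheless
        -- examine for every query against the parent:
        CHECK (reading >= 0)
    ) INHERITS (measurements);

    SET constraint_exclusion = on;

    -- Planning this query tests the CHECK constraints on every child
    -- before pruning, so extra constraints add planner time:
    EXPLAIN SELECT * FROM measurements WHERE logdate = DATE '2010-09-15';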
Re: [PERFORM] Issue for partitioning with extra check constraints
On Mon, 2010-10-04 at 11:34 -0700, Josh Berkus wrote:
> > And your point is? The design center for the current setup is maybe 5
> > or 10 partitions. We didn't intend it to be used for more partitions
> > than you might have spindles to spread the data across.
>
> Where did that come from?

Yeah, that is a bit odd. I don't recall any discussion of such a weird limitation.

> It certainly wasn't anywhere when the feature
> was introduced. Simon intended for this version of partitioning to
> scale to 100-200 partitions (and it does, provided that you dump all
> other table constraints), and partitioning has nothing to do with
> spindles. I think you're getting it mixed up with tablespaces.

Great! That would be an excellent addition.

> The main reason for partitioning is ease of maintenance (VACUUM,
> dropping partitions, etc.) not any kind of I/O optimization.

Well, that is certainly "a" main reason, but it is not "the" main reason. We have lots of customers using it to manage very large amounts of data using the constraint exclusion features (and gaining from the smaller index sizes).

JD

--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt
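To make the maintenance argument concrete (reusing the hypothetical schema from the sketch above): retiring a month of data from a partitioned table is a quick catalog operation, where the unpartitioned equivalent is a bulk DELETE that creates dead tuples and leaves the cleanup to VACUUM.

    -- partitioned: drop the old child; near-instant, and the space
    -- is returned immediately:
    DROP TABLE measurements_2009_09;

    -- unpartitioned equivalent: scans the table, bloats it with dead
    -- tuples, and leaves the rest of the work to VACUUM:
    DELETE FROM measurements WHERE logdate < DATE '2009-10-01';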
Re: [PERFORM] [HACKERS] MIT benchmarks pgsql multicore (up to 48) performance
Dan,

(BTW, the OpenSQL Conference is going to be at MIT in 2 weeks. Think anyone from the MOSBENCH team could attend? http://www.opensqlcamp.org/Main_Page)

> The big takeaway for -hackers, I think, is that lock manager
> performance is going to be an issue for large multicore systems, and
> the uncontended cases need to be lock-free. That includes cases where
> multiple threads are trying to acquire the same lock in compatible
> modes.

Yes; we were aware of this due to work Jignesh did at Sun on TPC-E.

> Currently even acquiring a shared heavyweight lock requires taking out
> an exclusive LWLock on the partition, and acquiring shared LWLocks
> requires acquiring a spinlock. All of this gets more expensive on
> multicores, where even acquiring spinlocks can take longer than the
> work being done in the critical section.

Certainly. The question has always been how to fix it without breaking major features and endangering data integrity.

> Note that their implementation of the lock manager omits some features
> for simplicity, like deadlock detection, 2PC, and probably any
> semblance of portability. (These are the sort of things we're allowed
> to do in the research world! :-)

Well, nice that you did! We'd never have that much time to experiment with non-production stuff as a group in the project. So now we have a theoretical solution, parts of which we can maybe implement in some watered-down form.

> The other major bottleneck they ran into was a kernel one: reading from
> the heap file requires a couple lseek operations, and Linux acquires a
> mutex on the inode to do that. The proper place to fix this is
> certainly in the kernel but it may be possible to work around in
> Postgres.

Or we could complain to Kernel.org; they've been fairly responsive in the past. Too bad this didn't get posted earlier; I just got back from LinuxCon. Do you know someone who can speak technically to this issue? I can put them in touch with the Linux geeks in charge of that part of the kernel code.

--
Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com
Re: [PERFORM] MIT benchmarks pgsql multicore (up to 48) performance
On Mon, Oct 4, 2010 at 8:44 AM, Hakan Kocaman wrote:
> Hi,
> To whom it may concern: http://pdos.csail.mit.edu/mosbench/
> They tested with 8.3.9; I wonder what results 9.0 would give.
> Best regards, and keep up the good work

They mention that these tests were run on the older 8xxx-series Opterons, which have much slower memory and HyperTransport speeds as well. I wonder how much better the newer 6xxx-series Magny-Cours would have done on it...

When I tested some simple benchmarks like pgbench, I got scalability right up to 48 processes on our 48-core Magny-Cours machines. Still, there's lots of room for improvement in the kernel and pgsql.

--
To understand recursion, one must first understand recursion.
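For reference, a run of the sort described might look like the following (the database name "bench", the scale factor, and the duration are all arbitrary choices here; the -j threads option needs the pgbench shipped with 8.4 or later):

    pgbench -i -s 100 bench            # initialize at scale factor 100
    pgbench -c 48 -j 48 -T 300 bench   # 48 clients, 48 worker threads, 5 minutes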
Re: [PERFORM] How does PG know if data is in memory?
2010/10/4 Greg Smith :
> Craig Ringer wrote:
>> If some kind of cache awareness was to be added, I'd be interested in
>> seeing a "hotness" measure that tracked how heavily a given
>> relation/index has been accessed and how much has been read from it
>> recently. A sort of age-scaled blocks-per-second measure that includes
>> both cached and uncached (disk) reads. This would let the planner know
>> how likely parts of a given index/relation are to be cached in memory
>> without imposing the cost of tracking the cache in detail. I'm still
>> not sure it'd be all that useful, though...
>
> Yup, that's one of the design ideas scribbled in my notes, as is the
> idea of what someone dubbed a "heat map" that tracked which parts of
> the relation were actually the ones in RAM, the other issue you
> mentioned. The problem facing a lot of development possibilities in
> this area is that we don't have any continuous benchmarking of
> complicated plans going on right now. So if something really innovative
> is done, there's really no automatic way to test the result and then
> see what types of plans it improves and what it makes worse. Until
> there's some better performance regression work like that around,
> development on the optimizer has to favor being very conservative.

* Tracking a specific block is not very easy because of readahead. You end up measuring whether a block was in memory at the moment you requested it physically, not at the moment the first seek/read happened. It is still an interesting stat, IMHO; I wonder how much value it can add for the planner.

* If the planner knows more about the OS cache, it can guess effective_cache_size on its own, which is probably already nice to have.

An extract from the postgres code:

 * We use an approximation proposed by Mackert and Lohman, "Index Scans
 * Using a Finite LRU Buffer: A Validated I/O Model", ACM Transactions
 * on Database Systems, Vol. 14, No. 3, September 1989, Pages 401-424.

The planner uses that in conjunction with effective_cache_size to guess whether it is interesting to scan the index. The question is whether this model is still valid given more precise knowledge of the OS page cache, and also whether it matches how different systems like Windows and Linux handle their page caches.

Hooks around cost estimation should make it possible to write a module that rethinks that part of the planner and makes it use the cache statistics. I wonder whether adding such hooks to core would impact its performance. In any case, that is probably the easiest and shortest way to test the behavior.

--
Cédric Villemain
2ndQuadrant
http://2ndQuadrant.fr/ PostgreSQL : Expertise, Formation et Support
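For reference, the approximation in question (as described in the comments in src/backend/optimizer/path/costsize.c; this is quoted from memory, so check the source) estimates the page fetches PF for an index scan, where T is the number of pages in the table, Ns the number of tuples fetched (tuple count N times selectivity s), and b the effective cache size in pages:

\[
PF =
\begin{cases}
\min\left(\dfrac{2\,T\,Ns}{2T + Ns},\; T\right) & \text{when } T \le b \\[8pt]
\dfrac{2\,T\,Ns}{2T + Ns} & \text{when } T > b \text{ and } Ns \le \dfrac{2Tb}{2T - b} \\[8pt]
b + \left(Ns - \dfrac{2Tb}{2T - b}\right)\dfrac{T - b}{T} & \text{when } T > b \text{ and } Ns > \dfrac{2Tb}{2T - b}
\end{cases}
\]

The first case is the fully-cached regime (fetches are capped at T); the other two model the growing miss rate once the fetch count outruns the buffer, which is exactly the part that a real measurement of the OS page cache could refine.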
Re: [PERFORM] How does PG know if data is in memory?
On 10/04/2010 04:22 AM, Greg Smith wrote:
> I had a brain-storming session on this subject with a few of the
> hackers in the community in this area a while back which I haven't had
> a chance to do something with yet (it exists only as a pile of
> scribbled notes so far). There are a couple of ways to collect data on
> what's in the database and OS cache, and a couple of ways to then
> expose that data to the optimizer. But that needs to be done very
> carefully, almost certainly as only a manual process at first, because
> something that's producing cache feedback all of the time will cause
> plans to change all the time, too. Where I suspect this is going is
> that we may end up tracking various statistics over time, then
> periodically providing a way to export a mass of "typical % cached"
> data back to the optimizer for use in plan cost estimation purposes.
> But the idea of monitoring continuously and always planning based on
> the most recent data available has some stability issues, both from a
> "too many unpredictable plan changes" and a "bad short-term feedback
> loop" perspective, as mentioned by Tom and Kevin already.

Why not monitor the distribution of response times, rather than "cached" vs. not?

That:
a) avoids the issue of discovering what was a cache hit,
b) deals neatly with multilevel caching, and
c) feeds directly into cost estimation.

Cheers,
Jeremy
Re: [PERFORM] Issue for partitioning with extra check constraints
Josh Berkus writes:
>> And your point is? The design center for the current setup is maybe 5
>> or 10 partitions. We didn't intend it to be used for more partitions
>> than you might have spindles to spread the data across.

> Where did that come from? It certainly wasn't anywhere when the feature
> was introduced. Simon intended for this version of partitioning to
> scale to 100-200 partitions (and it does, provided that you dump all
> other table constraints), and partitioning has nothing to do with
> spindles. I think you're getting it mixed up with tablespaces.

[ shrug... ] If Simon thought that, he obviously hadn't done any careful study of the planner's performance. You can maybe get that far as long as the partitions have just very simple constraints, but anything nontrivial won't scale. As you found out.

regards, tom lane