Re: [PERFORM] CPU bound
On 12/13/2010 10:43 AM, Royce Ausburn wrote:
> Hi all,
>
> I notice that when restoring a DB on a laptop with an SSD, typically postgres
> is maxing out a CPU - even during a COPY. I wonder, what is postgres usually
> doing with the CPU? I would have thought the disk would usually be the
> bottleneck in the DB, but occasionally it's not. We're embarking on a new DB
> server project and it'd be helpful to understand where the CPU is likely to
> be the bottleneck.

A few thoughts:

- Pg isn't capable of using more than one core for a single task, so if you
have one really big job, you'll more easily start struggling on CPU. Restores
appear to be a pain point here, though recent work has been done to address
that.

- Even with pg_dump/pg_restore's parallel restore, you can't use more than one
core to do work for a single COPY or other individual operation. You can only
parallelize down to the table level at the moment.

- Pg's design has always focused on rotating media. It can make sense to trade
increased CPU costs for reduced I/O when disk storage is slow relative to
CPU/RAM. There aren't, AFAIK, many controls beyond the random/sequential I/O
cost knobs to get Pg to save CPU at the cost of more I/O when opportunities to
do so appear.

- Pg's CPU load depends a lot on the data types and table structures in use.
What are your tables like? Do they have indexes added at the end, or are they
created with indexes and then populated with rows? The former is MUCH faster
(see the sketch after this message). Are they full of NUMERIC fields? Those
seem to be incredibly slow compared to int/float/etc, which is hardly
surprising given their storage format and how they work.

--
Craig Ringer

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
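To illustrate the load-then-index pattern, a minimal sketch (the table,
column, and file names here are hypothetical):

  -- create the table bare, with no indexes to maintain during the load
  CREATE TABLE measurements (id int, reading float8, taken_at timestamptz);

  -- bulk load: each row only has to be appended to the heap
  COPY measurements FROM '/tmp/measurements.csv' WITH CSV;

  -- build the index afterwards, in one pass over the loaded data
  CREATE INDEX measurements_taken_at_idx ON measurements (taken_at);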
Re: [PERFORM] CPU bound
On 12/12/10 6:43 PM, Royce Ausburn wrote:
> Hi all,
>
> I notice that when restoring a DB on a laptop with an SSD, typically postgres
> is maxing out a CPU - even during a COPY. I wonder, what is postgres usually
> doing with the CPU? I would have thought the disk would usually be the
> bottleneck in the DB, but occasionally it's not. We're embarking on a new DB
> server project and it'd be helpful to understand where the CPU is likely to
> be the bottleneck.

That's pretty normal; as soon as you get decent disk, especially something
like an SSD with a RAM cache, you become CPU-bound. COPY does a LOT of parsing
and data manipulation. Index building, of course, is almost pure CPU if you
have a decent amount of RAM available.

If you're restoring from a pg_dump file and have several cores available, I
suggest using parallel pg_restore.

--
Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
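For a dump made with pg_dump's custom format, a parallel restore looks
something like this (database and file names are hypothetical; -j should
roughly match the number of cores):

  pg_dump -Fc sourcedb > sourcedb.dump
  pg_restore -j 4 -d targetdb sourcedb.dump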
Re: [PERFORM] Hardware recommendations
> -----Original Message-----
> From: Greg Smith [mailto:g...@2ndquadrant.com]
> Sent: Saturday, December 11, 2010 2:18 AM
> To: Benjamin Krajmalnik
> Cc: pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] Hardware recommendations
>
> What sort of total read/write rates are you seeing when iostat is
> showing the system 85% busy? That's a useful number to note as an
> estimate of just how random the workload is.

I did a vacuum full of the highly bloated, constantly accessed tables, which
has improved the situation significantly. I am rarely seeing over 75% busy
right now, but these are some values from the high-busy periods, taken with
iostat -x 5:

  %b   w/s    kw/s
  71   344    7644
  81   392    8880
  79   393    9526
  75   443   10245
  80   436   10157
  76   392    8438

> Have you increased checkpoint parameters like checkpoint_segments? You
> need to avoid having checkpoints too often if you're going to try and
> use 4GB of memory for shared_buffers.

Yes, I have it configured at 1024 checkpoint_segments, 5min timeout, 0.9
completion_target.

> It's nice to put the logs onto a separate disk because it lets you
> measure exactly how much I/O is going to them, relative to the
> database. It's not really necessary though; with 14 disks you'll be at
> the range where you can mix them together and things should still be
> fine.

Thx. I will place them in their own RAID1 (or mirror if I end up going to
ZFS).

> > On the processor front, are there advantages to going to X series
> > processors as opposed to the E series (especially since I am I/O
> > bound)? Is anyone running this type of hardware, especially on
> > FreeBSD? Any opinions, especially concerning the Areca controllers
> > which they use?
>
> It sounds like you should be saving your hardware dollars for more RAM
> and disks, not getting faster processors. The Areca controllers are
> fast and pretty reliable under Linux. I'm not aware of anyone using
> them for PostgreSQL in production on FreeBSD. Aberdeen may have enough
> customers doing that to give you a good opinion on how stable that is
> likely to be; they're pretty straight as vendors go. You'd want to
> make sure to stress test that hardware/software combo as early as
> possible regardless; it's generally a good idea, and you wouldn't be
> running a really popular combination.

Thx. That was my overall plan - that's why I am opting for drives, cache on
the controller, and memory.

> --
> Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
> PostgreSQL Training, Services and Support   www.2ndQuadrant.us
> "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
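For reference, the checkpoint settings described above correspond to these
postgresql.conf lines:

  checkpoint_segments = 1024
  checkpoint_timeout = 5min
  checkpoint_completion_target = 0.9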
Re: [PERFORM] Tunning Postgres
2010/12/8 salima:
> Hi all,
>
> I have a question about tuning the Postgres configuration file.
>
> My application has many simultaneous accesses, with peaks of 2,000 or even
> more. Since it is a web application it is hard to settle on a
> "max_connections" value, not to mention that the rest of the parameters
> depend on it.
>
> I have a server dedicated to Postgres and would like to know the best
> configuration, or a suggested one, for this machine and application:
>
> DELL server
> 2 Intel Xeon processors, 3.6 GHz
> 4 GB RAM
> 2 x 320 GB HDs (RAID1)
> Operating system: Linux - CentOS

I think you'll need to post this message in English to get much help here.
Or you could try:

https://listas.postgresql.org.br/cgi-bin/mailman/listinfo/pgbr-geral

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Hardware recommendations
> -----Original Message-----
> From: pgsql-performance-ow...@postgresql.org [mailto:pgsql-performance-
> ow...@postgresql.org] On Behalf Of Benjamin Krajmalnik
> Sent: Monday, December 13, 2010 1:45 PM
> To: Greg Smith
> Cc: pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] Hardware recommendations
>
> > -----Original Message-----
> > From: Greg Smith [mailto:g...@2ndquadrant.com]
> > Sent: Saturday, December 11, 2010 2:18 AM
> > To: Benjamin Krajmalnik
> > Cc: pgsql-performance@postgresql.org
> > Subject: Re: [PERFORM] Hardware recommendations
> >
> > Have you increased checkpoint parameters like checkpoint_segments?
> > You need to avoid having checkpoints too often if you're going to
> > try and use 4GB of memory for shared_buffers.
>
> Yes, I have it configured at 1024 checkpoint_segments, 5min timeout, 0.9
> completion_target.

I would consider bumping that checkpoint timeout duration to a bit longer
and see if that helps any, if you are still looking for knobs to fiddle with.
YMMV.

-Mark

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
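That adjustment is a one-line postgresql.conf change; the value below is only
an illustration, since Mark suggests nothing more specific than "a bit
longer":

  checkpoint_timeout = 15min    # hypothetical value; was 5min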
Re: [PERFORM] Tunning Postgres
Try going through the archives first, because your question has probably been
answered many times already (although there is no definitive answer as to
what server PostgreSQL would need to fit your purpose). Also, this is an
English-language list. If you prefer to ask questions in Brazilian
Portuguese, try postgresql.org.br.

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Help with bulk read performance
On Nov 1, 2010, at 9:15 AM, Dan Schaffer wrote:
> We have an application that needs to do bulk reads of ENTIRE Postgres tables
> very quickly (i.e. select * from table). We have observed that such
> sequential scans run two orders of magnitude slower than observed raw disk
> reads (5 MB/s versus 100 MB/s). Part of this is due to the storage overhead
> we have observed in Postgres. In the example below, it takes 1 GB to store
> 350 MB of nominal data. However, that suggests we would expect to get 35
> MB/s bulk read rates.
>
> Observations using iostat and top during these bulk reads suggest that the
> queries are CPU bound, not I/O bound. In fact, repeating the queries yields
> similar response times. Presumably if it were an I/O issue the response
> times would be much shorter the second time through, with the benefit of
> caching.
>
> We have tried these simple queries using psql, JDBC, pl/java stored
> procedures, and libpq. In all cases the client code ran on the same box as
> the server. We have experimented with Postgres 8.1, 8.3 and 9.0.
>
> We also tried playing around with some of the server tuning parameters,
> such as shared_buffers, to no avail.
>
> Here is uname -a for a machine we have tested on:
>
> Linux nevs-bdb1.fsl.noaa.gov 2.6.18-194.17.1.el5 #1 SMP Mon Sep 20 07:12:06
> EDT 2010 x86_64 x86_64 x86_64 GNU/Linux
>
> A sample dataset that reproduces these results looks like the following
> (there are no indexes):
>
> Table "bulk_performance.counts"
>  Column |  Type   | Modifiers
> --------+---------+-----------
>  i1     | integer |
>  i2     | integer |
>  i3     | integer |
>  i4     | integer |
>
> There are 22 million rows in this case.
>
> We HAVE observed that summation queries run considerably faster. In this
> case,
>
> select sum(i1), sum(i2), sum(i3), sum(i4) from bulk_performance.counts
>
> runs at 35 MB/s.
>
> Our business logic does operations on the resulting data such that the
> output is several orders of magnitude smaller than the input. So we had
> hoped that by putting our business logic into stored procedures (and thus
> drastically reducing the amount of data flowing to the client) our
> throughput would go way up. This did not happen.
>
> So our questions are as follows:
>
> Is there any way, using stored procedures (maybe C code that calls SPI
> directly) or some other approach, to get close to the expected 35 MB/s
> doing these bulk reads? Or is this the price we have to pay for using SQL
> instead of some NoSQL solution? (We actually tried Tokyo Cabinet and found
> it to perform quite well. However, it does not measure up to Postgres in
> terms of replication, data interrogation, community support, acceptance,
> etc.)

Have you by chance tried EXPLAIN ANALYZE SELECT * FROM
bulk_performance.counts? That will throw away the query results, which
removes client-server considerations.

Also, when you tested raw disk IO, did you do it with an 8k block size?
That's the default size of a Postgres block, so all of its IO is done that
way.

What does iostat show you? Are you getting a decent number of read
requests/second?

--
Jim C. Nasby, Database Architect   j...@nasby.net
512.569.9461 (cell)                http://jim.nasby.net

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
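Concretely, the two checks Jim suggests might look like this. The first runs
the scan entirely server-side and discards the result set:

  EXPLAIN ANALYZE SELECT * FROM bulk_performance.counts;

And a raw-read baseline at Postgres' 8k block size (the device name is
hypothetical; substitute your data disk):

  dd if=/dev/sda of=/dev/null bs=8k count=131072    # reads ~1 GB in 8k blocks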