Re: [PERFORM] PostgreSQL 9.0.4 blocking in lseek?
On 28/12/2011 19:41, Matteo Beccati wrote: > On 28/12/2011 19:07, Claudio Freire wrote: >> On Wed, Dec 28, 2011 at 3:02 PM, Matteo Beccati wrote: >>> The query eventually completed in more than 18h. For comparison a normal >>> run doesn't take more than 1m for that specific step. >>> >>> Do you think that bad stats and suboptimal plan alone could explain such >>> a behaviour? >> >> Did you get the explain analyze output? > > Unfortunately I stopped it as I thought it wasn't going to return > anything meaningful. I've restarted the import process and it will break > right before the problematic query. Let's see if I can get any more info > tomorrow. So, I'm running again the EXPLAIN ANALYZE, although I don't expect it to return anytime soon. However I've discovered a few typos in the index creation. If we add it to the fact that row estimates are off for this specific query, I can understand that the chosen plan might have been way far from optimal with some badly picked statistics. This is the explain analyze of the query with proper indexes in place. As you can see estimates are still off, even though run time is ~20s: http://explain.depesz.com/s/1UY For comparison, here is the old explain output: http://explain.depesz.com/s/TqD The case is closed and as Tom pointed out already the lseek-only activity is due to the fact that the table is fully cached in the shared buffers and a sequential scan inside a nested loop is consistent with it. Sorry for the noise. Cheers -- Matteo Beccati Development & Consulting - http://www.beccati.com/ -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Postgresql Replication Performance
On Thu, Dec 29, 2011 at 3:33 AM, sgupta wrote: > I am doing POC on Posgtresql replication. I am using latest version of > postgresql i.e. 9.1. There are multiple replication solutions avaliable in > the market (PGCluster, Pgpool-II, Slony-I). Postgresql also provide in-built > replication solutions (Streaming replication, Warm Standby and hot standby). > I am confused which solution is best for the financial application for which > I am doing POC. The application will write around 160 million records with > row size of 2.5 KB in database. My questions is for following scenarios > which replication solution will be suitable: > > If I would require replication for backup purpose only > If I would require to scale the reads > If I would require High Avaliability and Consistency > Also It will be very helpful if you can share the perfomance or experience > with postgresql replication solutions. The built in HS/SR integrates with the postgres engine (over the WAL system) at a very low level and is going to be generally faster and more robust. More importantly, it has a very low administrative overhead -- the underlying mechanism of log shipping has been tweaked and refined continually since PITR was released in 8.0. Once you've done it a few times, it's a five minute procedure to replicate a database (not counting, heh, the base database copy). The main disadvantage of HS/SR is inflexibility: you get an exact replica of a database cluster. Slony (which is a trigger based system) and pgpool (which is statement replication) can do a lot of funky things that hs/sr can't do -- so they definitely fill a niche depending on what your requirements are. merlin -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] parse - bind take more time than execute
MirrorX wrote: > -the table is this -> > \d configurations > Table "public.configurationcontext" One of my concerns was that you might actually be selecting against a view rather than a table, and the above doesn't reassure me that you're not. Why the difference between "configurations" and "configurationcontext"? -Kevin -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Postgresql Replication Performance
On 12/29/2011 11:33 AM, sgupta wrote: I am doing POC on Posgtresql replication. I am using latest version of postgresql i.e. 9.1. There are multiple replication solutions avaliable in the market (PGCluster, Pgpool-II, Slony-I). Postgresql also provide in-built replication solutions (Streaming replication, Warm Standby and hot standby). I am confused which solution is best for the financial application for which I am doing POC. The application will write around 160 million records with row size of 2.5 KB in database. My questions is for following scenarios which replication solution will be suitable: If I would require replication for backup purpose only If I would require to scale the reads If I would require High Avaliability and Consistency Also It will be very helpful if you can share the perfomance or experience with postgresql replication solutions. Thanks -- View this message in context: http://postgresql.1045698.n5.nabble.com/Postgresql-Replication-Performance-tp5107278p5107278.html Sent from the PostgreSQL - performance mailing list archive at Nabble.com. Which replication solution will be suitable depends on your needs and database architecture. Starting with PGCluster, I can say only, that PGCluster last released in 2005 year, so you can not use it with Postgres 9.0 =) Slony-I is a good solution if you want to have cascade replication from Slave to Slave or you want to replicate only several parts of your database (because Slony performs table level like replication) PGPool-II is an advanced load balancer and pooling solution. Which also has replication support. Pgpool-II is query based replication utility, which performs queries on several database servers. If you are looking for performance and stability I do not recommend using PGPool as replication software. Postgres Streaming replication is WAL based replication, so using this type of replication you will have absolutely identical database servers, what is best choice for HA and scaling reads. Also this choice is not practically affecting performance, because it is not adding any latency to database layer. Also you could read about difference between Slony and Streaming replications here http://scanningpages.wordpress.com/2010/10/09/9-0-streaming-replication-vs-slony/ -- Best regards Aleksej Trofimov
Re: [PERFORM] parse - bind take more time than execute
thx for your reply :) -the timings come from the log -the table is this -> \d configurations Table "public.configurationcontext" Column | Type | Modifiers ---++--- id| numeric(18,0) | not null category | numeric(18,0) | pr_oid| numeric(18,0) | var_attrs | character varying(255) | num_value | numeric(18,0) | Indexes: "pk_configurations" PRIMARY KEY, btree (id) "conf_index" btree (category, pr_oid, num_value) and one query is this -> SELECT * FROM configurations WHERE pr_oid=$1 AND num_value=$2 -the table has only 2500 rows -this messages used to appear a lot after i created a new index for the 2 columns mentioned above in the query, since i thought that the 3-column index wouldnt be of much help since the first column was not defined in the query. now i have dropped this extra index and i see much less records in the log about the bind/parse phase of the query -the server has 4 cores, 12 GB ram, and fata disks. the settings from the query are these -> name | current_setting -+-- version | PostgreSQL 8.4.7 on x86_64-redhat-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-50), 64-bit archive_command | cp -i %p /var/lib/pgsql/wals/%f http://postgresql.1045698.n5.nabble.com/parse-bind-take-more-time-than-execute-tp5102940p5107985.html Sent from the PostgreSQL - performance mailing list archive at Nabble.com. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Postgresql Replication Performance
On 12/29/2011 05:00 PM, Claudio Freire wrote: Second, is that WAL activity on streaming replication or WAL shipping is documented to contain more data than on non-replicated setups. What is not clear is how much more data. This not only affects our network bandwidth estimations, but also I/O load on the master server, slowing writes (and some reads that cannot happen on the slave). Our database has about 2MB/s writes on the WAL array, we had about 160 IOPS in average when replications was switched off, and 165-170 IOPS in replication. This I think could be explained with statistical error, so we have not experienced any I/O load on our master server since replication was configured. -- Best regards Aleksej Trofimov -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] Postgresql Replication Performance
I am doing POC on Posgtresql replication. I am using latest version of postgresql i.e. 9.1. There are multiple replication solutions avaliable in the market (PGCluster, Pgpool-II, Slony-I). Postgresql also provide in-built replication solutions (Streaming replication, Warm Standby and hot standby). I am confused which solution is best for the financial application for which I am doing POC. The application will write around 160 million records with row size of 2.5 KB in database. My questions is for following scenarios which replication solution will be suitable: If I would require replication for backup purpose only If I would require to scale the reads If I would require High Avaliability and Consistency Also It will be very helpful if you can share the perfomance or experience with postgresql replication solutions. Thanks -- View this message in context: http://postgresql.1045698.n5.nabble.com/Postgresql-Replication-Performance-tp5107278p5107278.html Sent from the PostgreSQL - performance mailing list archive at Nabble.com. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Postgresql Replication Performance
On Thu, Dec 29, 2011 at 11:33 AM, Aleksej Trofimov wrote: > Postgres Streaming replication is WAL based replication, so using this type > of replication you will have absolutely identical database servers, what is > best choice for HA and scaling reads. Also this choice is not practically > affecting performance, because it is not adding any latency to database > layer. Let me chime in, because I'm in a similar situation. I'm preparing a POC WAL-replicated environment, and testing up until now has been inconclusive since we lack the kind of hardware in our test environment. I know I should require it, testing on similar hardware is the only way to get reliable results, but getting the budget approved would take way too long, and right now we're in a hurry to scale reads. So getting the hardware is not an option, my option is asking those who have the experience :-) I gather WAL replication introduces only a few possible bottlenecks. First, network bandwidth between master and slaves, and my app does write a lot - our monitoring tools show, today, an average of 1MB/s writes on the WAL array, with peaks exceeding 8MB/s, which can easily saturate our lowly 100Mb/s links. No worries, we can upgrade to 1Gb/s links. Second, is that WAL activity on streaming replication or WAL shipping is documented to contain more data than on non-replicated setups. What is not clear is how much more data. This not only affects our network bandwidth estimations, but also I/O load on the master server, slowing writes (and some reads that cannot happen on the slave). So, my question is, in your experience, how much of an increase in WAL activity can be expected? -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance