Re: [PERFORM] PostgreSQL 9.0.4 blocking in lseek?

2011-12-29 Thread Matteo Beccati
On 28/12/2011 19:41, Matteo Beccati wrote:
> On 28/12/2011 19:07, Claudio Freire wrote:
>> On Wed, Dec 28, 2011 at 3:02 PM, Matteo Beccati  wrote:
>>> The query eventually completed in more than 18h. For comparison a normal
>>> run doesn't take more than 1m for that specific step.
>>>
>>> Do you think that bad stats and suboptimal plan alone could explain such
>>> a behaviour?
>>
>> Did you get the explain analyze output?
> 
> Unfortunately I stopped it as I thought it wasn't going to return
> anything meaningful. I've restarted the import process and it will break
> right before the problematic query. Let's see if I can get any more info
> tomorrow.

So, I'm running again the EXPLAIN ANALYZE, although I don't expect it to
return anytime soon.

However I've discovered a few typos in the index creation. If we add it
to the fact that row estimates are off for this specific query, I can
understand that the chosen plan might have been way far from optimal
with some badly picked statistics.

This is the explain analyze of the query with proper indexes in place.
As you can see estimates are still off, even though run time is ~20s:

http://explain.depesz.com/s/1UY

For comparison, here is the old explain output:

http://explain.depesz.com/s/TqD

The case is closed and as Tom pointed out already the lseek-only
activity is due to the fact that the table is fully cached in the shared
buffers and a sequential scan inside a nested loop is consistent with it.

Sorry for the noise.


Cheers
-- 
Matteo Beccati

Development & Consulting - http://www.beccati.com/

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Postgresql Replication Performance

2011-12-29 Thread Merlin Moncure
On Thu, Dec 29, 2011 at 3:33 AM, sgupta  wrote:
> I am doing POC on Posgtresql replication. I am using latest version of
> postgresql i.e. 9.1. There are multiple replication solutions avaliable in
> the market (PGCluster, Pgpool-II, Slony-I). Postgresql also provide in-built
> replication solutions (Streaming replication, Warm Standby and hot standby).
> I am confused which solution is best for the financial application for which
> I am doing POC. The application will write around 160 million records with
> row size of 2.5 KB in database. My questions is for following scenarios
> which replication solution will be suitable:
>
> If I would require replication for backup purpose only
> If I would require to scale the reads
> If I would require High Avaliability and Consistency
> Also It will be very helpful if you can share the perfomance or experience
> with postgresql replication solutions.

The built in HS/SR integrates with the postgres engine (over the WAL
system) at a very low level and is going to be generally faster and
more robust.  More importantly, it has a very low administrative
overhead -- the underlying mechanism of log shipping has been tweaked
and refined continually since PITR was released in 8.0.  Once you've
done it a few times, it's a five minute procedure to replicate a
database (not counting, heh, the base database copy).

The main disadvantage of HS/SR is inflexibility: you get an exact
replica of a database cluster.  Slony (which is a trigger based
system) and pgpool (which is statement replication) can do a lot of
funky things that hs/sr can't do -- so they definitely fill a niche
depending on what your requirements are.

merlin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] parse - bind take more time than execute

2011-12-29 Thread Kevin Grittner
MirrorX  wrote:
 
> -the table is this ->
> \d configurations
> Table "public.configurationcontext"
 
One of my concerns was that you might actually be selecting against a
view rather than a table, and the above doesn't reassure me that
you're not.  Why the difference between "configurations" and
"configurationcontext"?
 
-Kevin


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Postgresql Replication Performance

2011-12-29 Thread Aleksej Trofimov

On 12/29/2011 11:33 AM, sgupta wrote:

I am doing POC on Posgtresql replication. I am using latest version of
postgresql i.e. 9.1. There are multiple replication solutions avaliable in
the market (PGCluster, Pgpool-II, Slony-I). Postgresql also provide in-built
replication solutions (Streaming replication, Warm Standby and hot standby).
I am confused which solution is best for the financial application for which
I am doing POC. The application will write around 160 million records with
row size of 2.5 KB in database. My questions is for following scenarios
which replication solution will be suitable:

If I would require replication for backup purpose only
If I would require to scale the reads
If I would require High Avaliability and Consistency
Also It will be very helpful if you can share the perfomance or experience
with postgresql replication solutions.

Thanks

--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Postgresql-Replication-Performance-tp5107278p5107278.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.



Which replication solution will be suitable depends on your needs and 
database architecture.
Starting with PGCluster, I can say only, that PGCluster last released in 
2005 year, so you can not use it with Postgres 9.0 =)
Slony-I is a good solution if you want to have cascade replication from 
Slave to Slave or you want to replicate only several parts of your 
database (because Slony performs table level like replication)
PGPool-II is an advanced load balancer and pooling solution. Which also 
has replication support. Pgpool-II is query based replication utility, 
which performs queries on several database servers. If you are looking 
for performance and stability I do not recommend using PGPool as 
replication software.
Postgres Streaming replication is WAL based replication, so using this 
type of replication you will have absolutely identical database servers, 
what is best choice for HA and scaling reads. Also this choice is not 
practically affecting performance, because it is not adding any latency 
to database layer.


Also you could read about difference between Slony and Streaming 
replications here 
http://scanningpages.wordpress.com/2010/10/09/9-0-streaming-replication-vs-slony/



--
Best regards

Aleksej Trofimov



Re: [PERFORM] parse - bind take more time than execute

2011-12-29 Thread MirrorX
thx for your reply :)

-the timings come from the log
-the table is this ->
\d configurations
  Table "public.configurationcontext"
  Column   |  Type  | Modifiers 
---++---
 id| numeric(18,0)  | not null
 category  | numeric(18,0)  | 
 pr_oid| numeric(18,0)  | 
 var_attrs | character varying(255) | 
 num_value | numeric(18,0)  | 
Indexes:
"pk_configurations" PRIMARY KEY, btree (id)
"conf_index" btree (category, pr_oid, num_value)

and one query is this ->
SELECT * FROM configurations WHERE pr_oid=$1 AND num_value=$2

-the table has only 2500 rows
-this messages used to appear a lot after i created a new index for the 2
columns mentioned above in the query, since i thought that the 3-column
index wouldnt be of much help since the first column was not defined in the
query. now i have dropped this extra index and i see much less records in
the log about the bind/parse phase of the query

-the server has 4 cores, 12 GB ram, and fata disks. the settings from the
query are these ->
  name   |  
  
current_setting  
-+--
 version | PostgreSQL 8.4.7 on
x86_64-redhat-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20080704 (Red Hat
4.1.2-50), 64-bit
 archive_command | cp -i %p /var/lib/pgsql/wals/%f
http://postgresql.1045698.n5.nabble.com/parse-bind-take-more-time-than-execute-tp5102940p5107985.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Postgresql Replication Performance

2011-12-29 Thread Aleksej Trofimov

On 12/29/2011 05:00 PM, Claudio Freire wrote:

Second, is that WAL activity on streaming replication or WAL shipping
is documented to contain more data than on non-replicated setups. What
is not clear is how much more data. This not only affects our network
bandwidth estimations, but also I/O load on the master server, slowing
writes (and some reads that cannot happen on the slave).
Our database has about 2MB/s writes on the WAL array, we had about 160 
IOPS in average when replications was switched off, and 165-170 IOPS in 
replication. This I think could be explained with statistical error, so 
we have not experienced any I/O load on our master server since 
replication was configured.



--
Best regards

Aleksej Trofimov


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Postgresql Replication Performance

2011-12-29 Thread sgupta
I am doing POC on Posgtresql replication. I am using latest version of
postgresql i.e. 9.1. There are multiple replication solutions avaliable in
the market (PGCluster, Pgpool-II, Slony-I). Postgresql also provide in-built
replication solutions (Streaming replication, Warm Standby and hot standby).
I am confused which solution is best for the financial application for which
I am doing POC. The application will write around 160 million records with
row size of 2.5 KB in database. My questions is for following scenarios
which replication solution will be suitable:

If I would require replication for backup purpose only
If I would require to scale the reads
If I would require High Avaliability and Consistency
Also It will be very helpful if you can share the perfomance or experience
with postgresql replication solutions.

Thanks

--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Postgresql-Replication-Performance-tp5107278p5107278.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Postgresql Replication Performance

2011-12-29 Thread Claudio Freire
On Thu, Dec 29, 2011 at 11:33 AM, Aleksej Trofimov
 wrote:
> Postgres Streaming replication is WAL based replication, so using this type
> of replication you will have absolutely identical database servers, what is
> best choice for HA and scaling reads. Also this choice is not practically
> affecting performance, because it is not adding any latency to database
> layer.

Let me chime in, because I'm in a similar situation. I'm preparing a
POC WAL-replicated environment, and testing up until now has been
inconclusive since we lack the kind of hardware in our test
environment. I know I should require it, testing on similar hardware
is the only way to get reliable results, but getting the budget
approved would take way too long, and right now we're in a hurry to
scale reads.

So getting the hardware is not an option, my option is asking those
who have the experience :-)

I gather WAL replication introduces only a few possible bottlenecks.

First, network bandwidth between master and slaves, and my app does
write a lot - our monitoring tools show, today, an average of 1MB/s
writes on the WAL array, with peaks exceeding 8MB/s, which can easily
saturate our lowly 100Mb/s links. No worries, we can upgrade to 1Gb/s
links.

Second, is that WAL activity on streaming replication or WAL shipping
is documented to contain more data than on non-replicated setups. What
is not clear is how much more data. This not only affects our network
bandwidth estimations, but also I/O load on the master server, slowing
writes (and some reads that cannot happen on the slave).

So, my question is, in your experience, how much of an increase in WAL
activity can be expected?

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance