[PERFORM] 700K Inserts in transaction

2013-02-14 Thread Asmir Mustafic

Hi everybody!
I'm new to the mailing list, and I have a little question.


The tables are:
postalcodes (place_id, code),  PK(place_id, code), 600K rows
places (id, name),  PK(id), INDEX(name), 3M rows

I have to insert another 600K rows into the postalcodes table, in a single
transaction, omitting duplicates.


The insert query is a prepared statement like this:

INSERT INTO postalcodes (place_id, code)
SELECT places.id, :code
FROM places
LEFT JOIN postalcodes ON (postalcodes.place_id = places.id
    AND postalcodes.code = :code)
WHERE places.name = :name AND postalcodes.place_id IS NULL

Inserting rows works well (3000 queries per second), but when I reach
30K executed statements, the insert rate slows down to 500-1000
queries per second.


Committing every 20K inserts keeps the insert rate at 3000 queries
per second.


Is there a limit on the number of inserts in a transaction?







[PERFORM] PG_XLOG 27028 files running out of space

2013-02-14 Thread Tory M Blue
My Postgres DB ran out of space. I have 27028 files in the pg_xlog
directory. I'm unclear what happened; this has been running flawlessly for
years. I do have archiving turned on, with an archive command that runs
every 10 minutes.

I'm not sure how to go about cleaning this up. I got the DB back up, but
I've only got 6 GB free on this drive, and it's going to blow up if I can't
relieve some of the pressure from this directory, which is over 220 GB.

What are my options?

Thanks

Postgres 9.1.6
slon 2.1.2

Tory


Re: [PERFORM] PG_XLOG 27028 files running out of space

2013-02-14 Thread Ian Lawrence Barwick
2013/2/14 Tory M Blue 

> My Postgres DB ran out of space. I have 27028 files in the pg_xlog
> directory. I'm unclear what happened; this has been running flawlessly for
> years. I do have archiving turned on, with an archive command that runs
> every 10 minutes.
>
> I'm not sure how to go about cleaning this up. I got the DB back up, but
> I've only got 6 GB free on this drive, and it's going to blow up if I can't
> relieve some of the pressure from this directory, which is over 220 GB.
>
> What are my options?
>
> Thanks
>
> Postgres 9.1.6
> slon 2.1.2


I can't give any advice right now, but I'd suggest posting more details of
your setup, including as much of your postgresql.conf file as possible
(especially the checkpoint_* and archive_* settings) and also the output of
pg_controldata.
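
For instance, something like this would gather both (a sketch; the
configuration and data directory paths are illustrative and depend on your
install):

grep -E '^(checkpoint|archive)_' /etc/postgresql/9.1/main/postgresql.conf
pg_controldata /var/lib/postgresql/9.1/main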

Ian Barwick


Re: [PERFORM] PG_XLOG 27028 files running out of space

2013-02-14 Thread Tory M Blue
On Thu, Feb 14, 2013 at 3:01 AM, Ian Lawrence Barwick wrote:

> 2013/2/14 Tory M Blue 
>
>> My Postgres DB ran out of space. I have 27028 files in the pg_xlog
>> directory. I'm unclear what happened; this has been running flawlessly for
>> years. I do have archiving turned on, with an archive command that runs
>> every 10 minutes.
>>
>> I'm not sure how to go about cleaning this up. I got the DB back up, but
>> I've only got 6 GB free on this drive, and it's going to blow up if I can't
>> relieve some of the pressure from this directory, which is over 220 GB.
>>
>> What are my options?
>>
>> Thanks
>>
>> Postgres 9.1.6
>> slon 2.1.2
>
>
> I can't give any advice right now, but I'd suggest posting more details of
> your
> setup, including as much of your postgresql.conf file as possible
>  (especially
> the checkpoint_* and archive_* settings) and also the output of
> pg_controldata.
>
> Ian Barwick
>

Thanks Ian

I figured it out and figured out a way around it for now.

My archive destination had its ownership changed, and thus the archive
command could not write to the directory. I didn't catch this until, well,
it was too late. So: 225 GB and 27000 files later.

I found a few writeups on how to clear this up by temporarily using the
command "true" as the archive command, which quickly releases a bunch of WAL
files from the pg_xlog directory. That worked, and now that I know what the
cause was, I should be able to restore my pg_archive PITR configs and be
good to go.
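
For the record, the workaround amounts to something like this (a sketch; the
data directory path is illustrative, and the real archive_command must be
restored afterwards or WAL archiving is silently lost):

# in postgresql.conf, temporarily pretend archiving always succeeds
archive_command = '/bin/true'
# then reload and wait for a checkpoint so old WAL segments get recycled:
# pg_ctl reload -D /var/lib/pgsql/9.1/data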

This is definitely one of those bullets I would rather not have taken, but
the damage appears to be minimal (thank you, Postgres).

Thanks again
Tory


Re: [PERFORM] PG_XLOG 27028 files running out of space

2013-02-14 Thread Heikki Linnakangas

On 14.02.2013 12:49, Tory M Blue wrote:

My Postgres DB ran out of space. I have 27028 files in the pg_xlog
directory. I'm unclear what happened; this has been running flawlessly for
years. I do have archiving turned on, with an archive command that runs
every 10 minutes.

I'm not sure how to go about cleaning this up. I got the DB back up, but
I've only got 6 GB free on this drive, and it's going to blow up if I can't
relieve some of the pressure from this directory, which is over 220 GB.

What are my options?


You'll need to delete some of the oldest xlog files to release disk
space. But first you need to make sure you don't delete any files that
are still needed, and figure out what got you into this situation in the
first place.


You say that you "run an archive command every 10 minutes". What do you 
mean by that? archive_command specified in postgresql.conf is executed 
automatically by the system, so you don't need to, and should not, run 
it manually. After archive_command has run successfully, and the 
system doesn't need the WAL file for recovery anymore (i.e. after the 
next checkpoint), the system will delete the archived file to release 
disk space. Clearly that hasn't been working in your system for some 
reason. If archive_command doesn't succeed, i.e. it returns a non-zero 
return code, the system will keep retrying forever until it succeeds, 
without deleting the file. Have you checked the logs for any 
archive_command errors?


To get out of the immediate trouble, run "pg_controldata", and make note 
of this line:


Latest checkpoint's REDO WAL file: 00010001

Anything older than that file is not needed for recovery. You can delete 
those, if you have them safely archived.
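
For instance (a sketch; the data directory path and the file name shown are
illustrative):

$ pg_controldata /var/lib/pgsql/9.1/data | grep 'REDO WAL file'
Latest checkpoint's REDO WAL file:    000000010000000A00000042
$ ls /var/lib/pgsql/9.1/data/pg_xlog | sort | head
# anything sorting strictly before the REDO file is a removal candidate,
# provided it has been safely archived; double-check before deleting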


- Heikki




Re: [PERFORM] PG_XLOG 27028 files running out of space

2013-02-14 Thread Albe Laurenz
Tory M Blue wrote:
> My Postgres DB ran out of space. I have 27028 files in the pg_xlog
> directory. I'm unclear what happened; this has been running flawlessly
> for years. I do have archiving turned on, with an archive command that
> runs every 10 minutes.
>
> I'm not sure how to go about cleaning this up. I got the DB back up, but
> I've only got 6 GB free on this drive, and it's going to blow up if I
> can't relieve some of the pressure from this directory, which is over
> 220 GB.

> Postgres 9.1.6
> slon 2.1.2

Are there any messages in the log file?
Are you sure that archiving works, i.e. do WAL files
show up in your archive location?

The most likely explanation for what you observe is that
archive_command returns a non-zero result (fails).
That would lead to a message in the log.
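
A quick check might look like this (a sketch; the log directory varies by
installation):

grep -i 'archive command failed' /var/lib/pgsql/9.1/data/pg_log/*.log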

Yours,
Laurenz Albe




Re: [PERFORM] PG_XLOG 27028 files running out of space

2013-02-14 Thread Tory M Blue
On Thu, Feb 14, 2013 at 3:08 AM, Heikki Linnakangas  wrote:

> On 14.02.2013 12:49, Tory M Blue wrote:
>
>> My Postgres DB ran out of space. I have 27028 files in the pg_xlog
>> directory. I'm unclear what happened; this has been running flawlessly
>> for years. I do have archiving turned on, with an archive command that
>> runs every 10 minutes.
>>
>> I'm not sure how to go about cleaning this up. I got the DB back up, but
>> I've only got 6 GB free on this drive, and it's going to blow up if I
>> can't relieve some of the pressure from this directory, which is over
>> 220 GB.
>>
>> What are my options?
>>
>
> You'll need to delete some of the oldest xlog files to release disk space.
> But first you need to make sure you don't delete any files that are still
> needed, and figure out what got you into this situation in the first place.
>
> You say that you "run an archive command every 10 minutes". What do you
> mean by that? archive_command specified in postgresql.conf is executed
> automatically by the system, so you don't need to, and should not, run it
> manually. After archive_command has run successfully, and the system
> doesn't need the WAL file for recovery anymore (i.e. after the next
> checkpoint), the system will delete the archived file to release disk
> space. Clearly that hasn't been working in your system for some reason. If
> archive_command doesn't succeed, i.e. it returns a non-zero return code,
> the system will keep retrying forever until it succeeds, without deleting
> the file. Have you checked the logs for any archive_command errors?
>
> To get out of the immediate trouble, run "pg_controldata", and make note
> of this line:
>
> Latest checkpoint's REDO WAL file: 00010001
>
> Anything older than that file is not needed for recovery. You can delete
> those, if you have them safely archived.
>
> - Heikki
>

Thanks Heikki,

Yes, I misspoke about the archive command, sorry; that was a timeout, and in
my haste/disorientation I misread and misspoke. So I'm clear on that.

I'm also over my issue after discovering the problem, but pg_controldata is
something I could have used initially in my panic, so I've added that
command to my toolbox and appreciate the response!

Thanks
Tory


Re: [PERFORM] 700K Inserts in transaction

2013-02-14 Thread nik9000
Are the duplicates evenly distributed?  You might have started on a big chunk 
of dupes.

I'd go about this by loading the new data into a new table, removing the
dupes, then inserting all the new data into the old table. That way you have
more granular information about the process. You can do the initial load
with COPY if you need it, and you can remove the dupes outside of a
transaction, as sketched below.
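
A minimal sketch of that approach, reusing the table and column names from
the original post (the staging table and file path are hypothetical):

-- load the incoming data into a staging table
CREATE TEMP TABLE postalcodes_in (name text, code text);
COPY postalcodes_in FROM '/path/to/postalcodes.csv' WITH CSV;

-- resolve names to place ids, skipping pairs that already exist
INSERT INTO postalcodes (place_id, code)
SELECT DISTINCT p.id, i.code
FROM postalcodes_in i
JOIN places p ON p.name = i.name
WHERE NOT EXISTS (
    SELECT 1 FROM postalcodes pc
    WHERE pc.place_id = p.id AND pc.code = i.code
);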

Nik

Sent from my iPhone

On Feb 14, 2013, at 5:28 AM, Asmir Mustafic  wrote:

> Hi everybody!
> I'm new to the mailing list, and I have a little question.
>
> The tables are:
> postalcodes (place_id, code),  PK(place_id, code), 600K rows
> places (id, name),  PK(id), INDEX(name), 3M rows
>
> I have to insert another 600K rows into the postalcodes table, in a single
> transaction, omitting duplicates.
>
> The insert query is a prepared statement like this:
>
> INSERT INTO postalcodes (place_id, code)
> SELECT places.id, :code
> FROM places
> LEFT JOIN postalcodes ON (postalcodes.place_id = places.id
>     AND postalcodes.code = :code)
> WHERE places.name = :name AND postalcodes.place_id IS NULL
>
> Inserting rows works well (3000 queries per second), but when I reach 30K
> of executed statements, the insert rate slows down to 500-1000 queries per
> second.
>
> Committing every 20K inserts keeps the insert rate at 3000 queries per
> second.
>
> Is there a limit on the number of inserts in a transaction?




Re: [PERFORM] High CPU usage / load average after upgrading to Ubuntu 12.04

2013-02-14 Thread Merlin Moncure
On Tue, Feb 12, 2013 at 11:25 AM, Dan Kogan  wrote:
> Hello,
>
> We upgraded from Ubuntu 11.04 to Ubuntu 12.04 and almost immediately
> observed increased CPU usage and significantly higher load average on our
> database server.
>
> At the time we were on Postgres 9.0.5.  We decided to upgrade to Postgres
> 9.2 to see if that resolves the issue, but unfortunately it did not.
>
> Just for illustration purposes, below are a few links to CPU and load
> graphs pre and post upgrade.
>
> https://s3.amazonaws.com/iqtell.ops/Load+Average+Post+Upgrade.png
> https://s3.amazonaws.com/iqtell.ops/Load+Average+Pre+Upgrade.png
> https://s3.amazonaws.com/iqtell.ops/Server+CPU+Post+Upgrade.png
> https://s3.amazonaws.com/iqtell.ops/Server+CPU+Pre+Upgrade.png
>
> We also tried tweaking kernel parameters as mentioned here -
> http://www.postgresql.org/message-id/50e4aab1.9040...@optionshouse.com -
> but have not seen any improvement.
>
> Any advice on how to trace what could be causing the change in CPU usage
> and load average is appreciated.
>
> Our Postgres version is:
>
> PostgreSQL 9.2.2 on x86_64-unknown-linux-gnu, compiled by gcc
> (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3, 64-bit
>
> OS:
>
> Linux ip-10-189-175-25 3.2.0-37-virtual #58-Ubuntu SMP Thu Jan 24 15:48:03
> UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
>
> Hardware (this is an Amazon EC2 High-Memory Quadruple Extra Large
> instance):
>
> 8-core Intel(R) Xeon(R) CPU E5-2665 0 @ 2.40GHz
> 68 GB RAM
> RAID10 with 8 drives using xfs
> Drives are EBS with provisioned IOPS, 1000 IOPS each
>
> Postgres configuration:
>
> archive_command = rsync -a %p slave:/var/lib/postgresql/replication_load/%f
> archive_mode = on
> checkpoint_completion_target = 0.9
> checkpoint_segments = 64
> checkpoint_timeout = 30min
> default_text_search_config = pg_catalog.english
> external_pid_file = /var/run/postgresql/9.2-main.pid
> lc_messages = en_US.UTF-8
> lc_monetary = en_US.UTF-8
> lc_numeric = en_US.UTF-8
> lc_time = en_US.UTF-8
> listen_addresses = *
> log_checkpoints = on
> log_destination = stderr
> log_line_prefix = %t [%p]: [%l-1]
> log_min_duration_statement = 500
> max_connections = 300
> max_stack_depth = 2MB
> max_wal_senders = 5
> shared_buffers = 4GB
> synchronous_commit = off
> unix_socket_directory = /var/run/postgresql
> wal_keep_segments = 128
> wal_level = hot_standby
> work_mem = 8MB

does your application have a lot of concurrency?  history has shown
that postgres is highly sensitive to changes in the o/s scheduler
(which changes a lot from release to release).

also check this:
zone reclaim 
(http://frosty-postgres.blogspot.com/2012/08/postgresql-numa-and-zone-reclaim-mode.html)
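
checking it is a one-liner; 0 means zone reclaim is off:

cat /proc/sys/vm/zone_reclaim_mode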

merlin




[PERFORM] Surprising no use of indexes - low performance

2013-02-14 Thread Nicolas Charles

Hello,

I've been struggling to understand what's happening with my
database/queries for several days, and I'm turning to higher minds for a
logical answer.

I'm dealing with a fairly large database containing log information, which
I crunch to get data out of, with several indexes on it that I hoped were
sensible.


\d ruddersysevents
              Table "public.ruddersysevents"
       Column       |           Type           |                  Modifiers
--------------------+--------------------------+----------------------------------------------
 id                 | integer                  | not null default nextval('serial'::regclass)
 executiondate      | timestamp with time zone | not null
 nodeid             | text                     | not null
 directiveid        | text                     | not null
 ruleid             | text                     | not null
 serial             | integer                  | not null
 component          | text                     | not null
 keyvalue           | text                     |
 executiontimestamp | timestamp with time zone | not null
 eventtype          | character varying(64)    |
 policy             | text                     |
 msg                | text                     |
Indexes:
    "ruddersysevents_pkey" PRIMARY KEY, btree (id)
    "component_idx" btree (component)
    "configurationruleid_idx" btree (ruleid)
    "executiontimestamp_idx" btree (executiontimestamp)
    "keyvalue_idx" btree (keyvalue)
    "nodeid_idx" btree (nodeid)
Check constraints:
    "ruddersysevents_component_check" CHECK (component <> ''::text)
    "ruddersysevents_configurationruleid_check" CHECK (ruleid <> ''::text)
    "ruddersysevents_nodeid_check" CHECK (nodeid <> ''::text)
    "ruddersysevents_policyinstanceid_check" CHECK (directiveid <> ''::text)



It contains 11018592 entries, with the following patterns:
108492 distinct executiontimestamp
14 distinct nodeid
59 distinct directiveid
26 distinct ruleid
35 distinct serial

Related table/index sizes are:
            relation            |  size
--------------------------------+---------
 public.ruddersysevents         | 3190 MB
 public.nodeid_idx              | 614 MB
 public.configurationruleid_idx | 592 MB
 public.ruddersysevents_pkey    | 236 MB
 public.executiontimestamp_idx  | 236 MB


I'm crunching the data by looking for each
nodeid/ruleid/directiveid/serial combination with an executiontimestamp in
a given interval:


explain analyze select executiondate, nodeid, ruleid, directiveid, serial,
component, keyValue, executionTimeStamp, eventtype, policy, msg
from RudderSysEvents
where 1=1
  and nodeId = '31264061-5ecb-4891-9aa4-83824178f43d'
  and ruleId = '61713ff1-aa6f-4c86-b3cb-7012bee707dd'
  and serial = 10
  and executiontimestamp between
      to_timestamp('2012-11-22 16:00:16.005', 'YYYY-MM-DD HH24:MI:SS.MS')
      and to_timestamp('2013-01-25 18:53:52.467', 'YYYY-MM-DD HH24:MI:SS.MS')
ORDER BY executionTimeStamp asc;

 Sort  (cost=293125.41..293135.03 rows=3848 width=252) (actual time=28628.922..28647.952 rows=62403 loops=1)
   Sort Key: executiontimestamp
   Sort Method:  external merge  Disk: 17480kB
   ->  Bitmap Heap Scan on ruddersysevents  (cost=74359.66..292896.27 rows=3848 width=252) (actual time=1243.150..28338.927 rows=62403 loops=1)
         Recheck Cond: ((nodeid = '31264061-5ecb-4891-9aa4-83824178f43d'::text) AND (ruleid = '61713ff1-aa6f-4c86-b3cb-7012bee707dd'::text))
         Filter: ((serial = 10) AND (executiontimestamp >= to_timestamp('2012-11-22 16:00:16.005'::text, 'YYYY-MM-DD HH24:MI:SS.MS'::text)) AND (executiontimestamp <= to_timestamp('2013-01-25 18:53:52.467'::text, 'YYYY-MM-DD HH24:MI:SS.MS'::text)))
         ->  BitmapAnd  (cost=74359.66..74359.66 rows=90079 width=0) (actual time=1228.610..1228.610 rows=0 loops=1)
               ->  Bitmap Index Scan on nodeid_idx  (cost=0.00..25795.17 rows=716237 width=0) (actual time=421.365..421.365 rows=690503 loops=1)
                     Index Cond: (nodeid = '31264061-5ecb-4891-9aa4-83824178f43d'::text)
               ->  Bitmap Index Scan on configurationruleid_idx  (cost=0.00..48562.32 rows=1386538 width=0) (actual time=794.490..794.490 rows=1381391 loops=1)
                     Index Cond: (ruleid = '61713ff1-aa6f-4c86-b3cb-7012bee707dd'::text)
 Total runtime: 28657.352 ms



I'm surprised that the executiontimestamp index is not used, since that
range filter seems to be where most of the query time is spent.

For all my tests, I removed all the incoming logs, so that this table
receives only selects and no writes.


I'm using Postgres 8.4, on a quite smallish VM with some other processes
running, with the following non-default configuration:

shared_buffers = 112MB
work_mem = 8MB
maintenance_work_mem = 48MB
max_stack_depth = 3MB
wal_buffers = 1MB
effective_cache_size = 128MB
checkpoint_segments = 6

Increasing the shared_buffers to 384MB, 1GB or 1500MB didn't improve the
performance (less than 10%). I would have expected it to improve, since the
indexes would all fit in RAM.

Re: [PERFORM] High CPU usage / load average after upgrading to Ubuntu 12.04

2013-02-14 Thread Dan Kogan
Thanks for the info.
Our application does have a lot of concurrency.  We checked the zone
reclaim parameter and it is turned off (that was the default; we did not
have to change it).

Dan

-Original Message-
From: Merlin Moncure [mailto:mmonc...@gmail.com] 
Sent: Thursday, February 14, 2013 9:08 AM
To: Dan Kogan
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] High CPU usage / load average after upgrading to Ubuntu 
12.04

On Tue, Feb 12, 2013 at 11:25 AM, Dan Kogan  wrote:
> Hello,
>
>
>
> We upgraded from Ubuntu 11.04 to Ubuntu 12.04 and almost immediately 
> obeserved increased CPU usage and significantly higher load average on 
> our database server.
>
> At the time we were on Postgres 9.0.5.  We decided to upgrade to 
> Postgres
> 9.2 to see if that resolves the issue, but unfortunately it did not.
>
>
>
> Just for illustration purposes, below are a few links to cpu and load 
> graphs pre and post upgrade.
>
>
>
> https://s3.amazonaws.com/iqtell.ops/Load+Average+Post+Upgrade.png
>
> https://s3.amazonaws.com/iqtell.ops/Load+Average+Pre+Upgrade.png
>
>
>
> https://s3.amazonaws.com/iqtell.ops/Server+CPU+Post+Upgrade.png
>
> https://s3.amazonaws.com/iqtell.ops/Server+CPU+Pre+Upgrade.png
>
>
>
> We also tried tweaking kernel parameters as mentioned here - 
> http://www.postgresql.org/message-id/50e4aab1.9040...@optionshouse.com
> , but have not seen any improvement.
>
>
>
>
>
> Any advice on how to trace what could be causing the change in CPU 
> usage and load average is appreciated.
>
>
>
> Our postgres version is:
>
>
>
> PostgreSQL 9.2.2 on x86_64-unknown-linux-gnu, compiled by gcc 
> (Ubuntu/Linaro
> 4.6.3-1ubuntu5) 4.6.3, 64-bit
>
>
>
> OS:
>
>
>
> Linux ip-10-189-175-25 3.2.0-37-virtual #58-Ubuntu SMP Thu Jan 24 
> 15:48:03 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
>
>
>
> Hardware (this an Amazon Ec2 High memory quadruple extra large instance):
>
>
>
> 8 core Intel(R) Xeon(R) CPU E5-2665 0 @ 2.40GHz
>
> 68 GB RAM
>
> RAID10 with 8 drives using xfs
>
> Drives are EBS with provisioned IOPS, with 1000 iops each
>
>
>
> Postgres Configuration:
>
>
>
> archive_command = rsync -a %p 
> slave:/var/lib/postgresql/replication_load/%f
>
> archive_mode = on
>
> checkpoint_completion_target = 0.9
>
> checkpoint_segments = 64
>
> checkpoint_timeout = 30min
>
> default_text_search_config = pg_catalog.english
>
> external_pid_file = /var/run/postgresql/9.2-main.pid
>
> lc_messages = en_US.UTF-8
>
> lc_monetary = en_US.UTF-8
>
> lc_numeric = en_US.UTF-8
>
> lc_time = en_US.UTF-8
>
> listen_addresses = *
>
> log_checkpoints=on
>
> log_destination=stderr
>
> log_line_prefix = %t [%p]: [%l-1]
>
> log_min_duration_statement =500
>
> max_connections=300
>
> max_stack_depth=2MB
>
> max_wal_senders=5
>
> shared_buffers=4GB
>
> synchronous_commit=off
>
> unix_socket_directory=/var/run/postgresql
>
> wal_keep_segments=128
>
> wal_level=hot_standby
>
> work_mem=8MB

does your application have a lot of concurrency?  history has shown that 
postgres is highly sensitive to changes in the o/s scheduler (which changes a 
lot from release to release).

also check this:
zone reclaim 
(http://frosty-postgres.blogspot.com/2012/08/postgresql-numa-and-zone-reclaim-mode.html)

merlin




Re: [PERFORM] Surprising no use of indexes - low performance

2013-02-14 Thread Ireneusz Pluta


On 2013-02-14 16:35, Nicolas Charles wrote:
I'm crunching the data by looking for each nodeid/ruleid/directiveid/serial
combination with an executiontimestamp in a given interval:

explain analyze select executiondate, nodeid, ruleid, directiveid, serial,
component, keyValue, executionTimeStamp, eventtype, policy, msg
from RudderSysEvents
where 1=1
  and nodeId = '31264061-5ecb-4891-9aa4-83824178f43d'
  and ruleId = '61713ff1-aa6f-4c86-b3cb-7012bee707dd'
  and serial = 10
  and executiontimestamp between
      to_timestamp('2012-11-22 16:00:16.005', 'YYYY-MM-DD HH24:MI:SS.MS')
      and to_timestamp('2013-01-25 18:53:52.467', 'YYYY-MM-DD HH24:MI:SS.MS')
ORDER BY executionTimeStamp asc;

 Sort  (cost=293125.41..293135.03 rows=3848 width=252) (actual time=28628.922..28647.952 rows=62403 loops=1)
   Sort Key: executiontimestamp
   Sort Method:  external merge  Disk: 17480kB
   ->  Bitmap Heap Scan on ruddersysevents  (cost=74359.66..292896.27 rows=3848 width=252) (actual time=1243.150..28338.927 rows=62403 loops=1)
         Recheck Cond: ((nodeid = '31264061-5ecb-4891-9aa4-83824178f43d'::text) AND (ruleid = '61713ff1-aa6f-4c86-b3cb-7012bee707dd'::text))
         Filter: ((serial = 10) AND (executiontimestamp >= to_timestamp('2012-11-22 16:00:16.005'::text, 'YYYY-MM-DD HH24:MI:SS.MS'::text)) AND (executiontimestamp <= to_timestamp('2013-01-25 18:53:52.467'::text, 'YYYY-MM-DD HH24:MI:SS.MS'::text)))
         ->  BitmapAnd  (cost=74359.66..74359.66 rows=90079 width=0) (actual time=1228.610..1228.610 rows=0 loops=1)
               ->  Bitmap Index Scan on nodeid_idx  (cost=0.00..25795.17 rows=716237 width=0) (actual time=421.365..421.365 rows=690503 loops=1)
                     Index Cond: (nodeid = '31264061-5ecb-4891-9aa4-83824178f43d'::text)
               ->  Bitmap Index Scan on configurationruleid_idx  (cost=0.00..48562.32 rows=1386538 width=0) (actual time=794.490..794.490 rows=1381391 loops=1)
                     Index Cond: (ruleid = '61713ff1-aa6f-4c86-b3cb-7012bee707dd'::text)
 Total runtime: 28657.352 ms

I'm surprised that the executiontimestamp index is not used, since that
range filter seems to be where most of the query time is spent.


This usage pattern is quite similar to one I used to have problems with. The
key problem here is that the planner wants to BitmapAnd indexes that span
the whole table, across all timestamp values, even though you are interested
in only a narrow timestamp window, and it is quite aggressive about using
the bitmap scan feature. So the planner needs to be directed more precisely.


You could try the above again with:

SET enable_bitmapscan TO off;

It helped in my case.

You may also try enclosing the timestamp condition in a "preselecting" CTE
and doing the finer filtering outside of it, like:


with p as (
    select * from RudderSysEvents
    where executiontimestamp between '2012-11-22 16:00:16.005'
                                 and '2013-01-25 18:53:52.467'
)
select executiondate, nodeid, ruleid, directiveid, serial, component,
       keyValue, executionTimeStamp, eventtype, policy, msg
from p
where nodeId = '31264061-5ecb-4891-9aa4-83824178f43d'
  and ruleId = '61713ff1-aa6f-4c86-b3cb-7012bee707dd'
  and serial = 10;


As a side note, I think that all your indexes except the timestamp one are
unnecessary, because of the low distribution of their values and, as you
see, the confusion they cause the planner.


Eventually, you could add one of the columns (whichever one is always used
for filtering) as the second column of a two-column index together with the
timestamp, and move its filter inside the CTE part.
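
A minimal sketch, assuming nodeid is the column that is always filtered on
(the index name is illustrative):

CREATE INDEX timestamp_nodeid_idx ON ruddersysevents (executiontimestamp, nodeid);

The CTE above would then also include the nodeId condition, so the
preselection does more of the work.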


HTH,
Irek.




Re: [PERFORM] High CPU usage / load average after upgrading to Ubuntu 12.04

2013-02-14 Thread Josh Berkus
On 02/13/2013 05:30 PM, Dan Kogan wrote:
> Just to be clear - I was describing the current situation in our production.
>
> We were running pgbench on different Ubuntu versions today.  I don't have
> a 12.04 setup at the moment, but I do have 12.10, which seems to be
> performing about the same as 12.04 in our tests with pgbench.
> Running pgbench with 8 jobs and 32 clients resulted in a load average of
> about 15, and TPS was 51350.

What size database?

> 
> Question - how many cores does your server have?  Ours has 8 cores.

32

I suppose I could throw multiple pgbenches at it.  I just don't see the
load numbers as unusual, but I don't have a similar pre-12.04 server to
compare with.
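
For reference, a run like the one described could be reproduced along these
lines (a sketch; the database name and duration are assumptions, and the
scale factor comes from Dan's follow-up below):

pgbench -i -s 3600 bench
pgbench -j 8 -c 32 -T 300 bench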


-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [PERFORM] Surprising no use of indexes - low performance

2013-02-14 Thread Jeff Janes
On Thu, Feb 14, 2013 at 7:35 AM, Nicolas Charles
 wrote:
>
> It contains 11018592 entries, with the following patterns:
> 108492 distinct executiontimestamp
> 14 distinct nodeid
> 59 distinct directiveid
> 26 distinct ruleid
> 35 distinct serial

How many entries fall within a typical query interval of executiontimestamp?
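
A quick way to measure that (a sketch, using the interval from the posted
query):

SELECT count(*) FROM ruddersysevents
WHERE executiontimestamp BETWEEN '2012-11-22 16:00:16.005'
                             AND '2013-01-25 18:53:52.467';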

...
>
> I'm surprised that the executiontimestamp index is not used, since it seems
> to be where most of the query time is spent.

I do not draw that conclusion from your posted information.  Can you
highlight the parts of it that lead you to this conclusion?

> For all my tests, I removed all the incoming logs, so that this table has
> only selects and no writes
>
> I'm using Postgres 8.4, on a quite smallish VM, with some process runnings,

A lot of improvements have been made since 8.4 which would make this
kind of thing easier to figure out.  What is smallish?

> with the following non default configuration
> shared_buffers = 112MB
> work_mem = 8MB
> maintenance_work_mem = 48MB
> max_stack_depth = 3MB
> wal_buffers = 1MB
> effective_cache_size = 128MB

effective_cache_size seems small unless you expect to have a lot of
this type of query running simultaneously, assuming you have at least
4GB of RAM, which I'm guessing you do based on your next comments.

> checkpoint_segments = 6
>
> Increasing the shared_buffers to 384, 1GB or 1500MB didn't improve the
> performances (less than 10%). I would have expected it to improve, since the
> indexes would all fit in RAM

If the indexes fit in RAM, they fit in RAM.  If anything, increasing
shared_buffers could make it harder to fit them entirely in RAM.  If
your shared buffers undergo a lot of churn, then the OS cache and the
shared buffers tend to uselessly mirror each other, meaning there is
less space for non-redundant pages.

>
> create index composite_idx on ruddersysevents (executiontimestamp, ruleid,
> serial, nodeid);

I wouldn't expect this to work well for this particular query.  Since
the leading column is used in a range test, the following columns
cannot be used efficiently in the index structure.  You should put the
equality-tested columns at the front of the index and the range-tested
one at the end of it.
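
For example, a sketch of the suggested column order (the index name is
illustrative):

CREATE INDEX eq_first_idx ON ruddersysevents
    (nodeid, ruleid, serial, executiontimestamp);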


>
> 2/ Removing nodeid from the index did lower again the perf
> create index composite2_idx on ruddersysevents (executiontimestamp, ruleid,
> serial);


I doubt that 84888.349 vs 83717.901 is really a meaningful difference.

> 3/ Removing executiontimestamp from the composite index makes the query
> performs better at the begining of its uses (around 17 secondes), but over
> time it degrades (I'm logging query longer than 20 secondes, and there are
> very rare in the first half of the batch, and getting more and more common
> at the end) to what is below

If the batch processing adds data, it is not surprising the query
slows down.  It looks like it is still faster at the end then the
previous two cases, right?


> So my question is :
> "Why *not* indexing the column which is not used makes the query slower over
> time, while not slowing the application?"

I don't know what column you are referring to here.  But it sounds
like you think that dropping the leading column from an index is a
minor change.  It is not.  It makes a fundamentally different index.

Cheers,

Jeff




Re: [PERFORM] High CPU usage / load average after upgrading to Ubuntu 12.04

2013-02-14 Thread Dan Kogan
We used a scale factor of 3600.
Yeah, maybe other people see similar load averages; we were not sure.
However, we saw a clear difference right after the upgrade.
We are trying to determine whether it makes sense for us to go back to
11.04, or maybe there is something here we are missing.

-Original Message-
From: pgsql-performance-ow...@postgresql.org 
[mailto:pgsql-performance-ow...@postgresql.org] On Behalf Of Josh Berkus
Sent: Thursday, February 14, 2013 1:38 PM
To: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] High CPU usage / load average after upgrading to Ubuntu 
12.04

On 02/13/2013 05:30 PM, Dan Kogan wrote:
> Just to be clear - I was describing the current situation in our production.
>
> We were running pgbench on different Ubuntu versions today.  I don't have
> a 12.04 setup at the moment, but I do have 12.10, which seems to be
> performing about the same as 12.04 in our tests with pgbench.
> Running pgbench with 8 jobs and 32 clients resulted in a load average of
> about 15, and TPS was 51350.

What size database?

> 
> Question - how many cores does your server have?  Ours has 8 cores.

32

I suppose I could throw multiple pgbenches at it.  I just don't see the load
numbers as unusual, but I don't have a similar pre-12.04 server to compare
with.


--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [PERFORM] High CPU usage / load average after upgrading to Ubuntu 12.04

2013-02-14 Thread Josh Berkus
On 02/14/2013 12:41 PM, Dan Kogan wrote:
> We used a scale factor of 3600.
> Yeah, maybe other people see similar load averages; we were not sure.
> However, we saw a clear difference right after the upgrade.
> We are trying to determine whether it makes sense for us to go back to
> 11.04, or maybe there is something here we are missing.

Well, I'm seeing a higher system % on CPU than I expect (around 15% on
each core), and a MUCH higher context-switch rate than I expect (up to
500K).  Is that anything like you're seeing?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [PERFORM] High CPU usage / load average after upgrading to Ubuntu 12.04

2013-02-14 Thread Dan Kogan
Yes, we are seeing a higher system % on the CPU; I'm not sure how to
quantify it in terms of % right now - will check into that tomorrow.
We were not checking the context-switch numbers during our benchmark; will
check that tomorrow as well.

-Original Message-
From: pgsql-performance-ow...@postgresql.org 
[mailto:pgsql-performance-ow...@postgresql.org] On Behalf Of Josh Berkus
Sent: Thursday, February 14, 2013 6:58 PM
To: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] High CPU usage / load average after upgrading to Ubuntu 
12.04

On 02/14/2013 12:41 PM, Dan Kogan wrote:
> We used a scale factor of 3600.
> Yeah, maybe other people see similar load averages; we were not sure.
> However, we saw a clear difference right after the upgrade.
> We are trying to determine whether it makes sense for us to go back to
> 11.04, or maybe there is something here we are missing.

Well, I'm seeing a higher system % on CPU than I expect (around 15% on each
core), and a MUCH higher context-switch rate than I expect (up to 500K).
 Is that anything like you're seeing?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [PERFORM] High CPU usage / load average after upgrading to Ubuntu 12.04

2013-02-14 Thread Scott Marlowe
If you run your benchmarks for more than a few minutes, I highly
recommend enabling sysstat service data collection; then you can look
at it after the fact with sar.  VERY useful stuff, both for
benchmarking and post-mortem analysis on live servers.
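
On Debian/Ubuntu, enabling it looks roughly like this (a sketch; file
locations vary by distribution):

sudo sed -i 's/ENABLED="false"/ENABLED="true"/' /etc/default/sysstat
sudo service sysstat restart
sar -u    # CPU utilization history
sar -w    # context switches per second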

On Thu, Feb 14, 2013 at 9:32 PM, Dan Kogan  wrote:
> Yes, we are seeing a higher system % on the CPU; I'm not sure how to
> quantify it in terms of % right now - will check into that tomorrow.
> We were not checking the context-switch numbers during our benchmark; will
> check that tomorrow as well.
>
> -Original Message-
> From: pgsql-performance-ow...@postgresql.org 
> [mailto:pgsql-performance-ow...@postgresql.org] On Behalf Of Josh Berkus
> Sent: Thursday, February 14, 2013 6:58 PM
> To: pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] High CPU usage / load average after upgrading to 
> Ubuntu 12.04
>
> On 02/14/2013 12:41 PM, Dan Kogan wrote:
>> We used a scale factor of 3600.
>> Yeah, maybe other people see similar load averages; we were not sure.
>> However, we saw a clear difference right after the upgrade.
>> We are trying to determine whether it makes sense for us to go back to
>> 11.04, or maybe there is something here we are missing.
>
> Well, I'm seeing a higher system % on CPU than I expect (around 15% on each
> core), and a MUCH higher context-switch rate than I expect (up to 500K).
>  Is that anything like you're seeing?
>
> --
> Josh Berkus
> PostgreSQL Experts Inc.
> http://pgexperts.com
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance



-- 
To understand recursion, one must first understand recursion.

