Re: [PERFORM] High load average with PostgreSQL 7.4.2 on debian/ibm eserver.
On Tue, 29 Jun 2004 17:55:37 +0200, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> The performance issue I'm experiencing here is somewhat weird - the server
> gets a high average load (from 5 up to 15, 8 on average). Standard
> performance monitoring utilities (like top) show that the CPUs are not
> loaded (below 20%, often near zero).

So ... you never actually say what the performance issue you experience is. Having a high load average is not necessarily a performance issue. What is it that you want to fix?
[PERFORM] Reading recommendations
Hi

I am looking for some references to literature. While we have used PostgreSQL in the past for a fair number of smaller projects, we are now starting to use it on a larger scale and hence need to get into issues of performance optimisation and tuning. While I am OK with using the EXPLAIN features, I get quite insecure when facing things like the options in the postgresql.conf file. For example, the man page on the fsync option tells me to "read the detailed documentation before using this!" I then read the Admin guide, where I am told that the benefits of this feature are a matter of debate, which leaves me with little help as to how to make up my mind on the issue. So I turn to this mailing list, but on starting to read the archives I realise that, compared to the standard of knowledge here, I am as wise as a baby.

I have read most of Bruce Momjian's book on PostgreSQL (can I update my 2001 edition somehow? ;-) and I have Sams' PostgreSQL Developer's Handbook (which is OK too, but offers little to nothing on operational issues). I have read most of the admin (and user) guide, but it does not help me really understand the issues:

CPU_INDEX_TUPLE_COST (floating point)
Sets the query optimizer's estimate of the cost of processing each index tuple during an index scan. This is measured as a fraction of the cost of a sequential page fetch.

No idea what this means! (And should I feel bad for it?) I am an application programmer with a little UNIX know-how. What books or sources are out there that I can buy/download, and that I should read, to get to grips with the more advanced issues of running PostgreSQL?

More on what we do (for those interested): We use PostgreSQL, mainly with its PostGIS extension, as the database backend for Zope-based applications. Adding PostGIS features is what has started to cause a noticeable increase in the server load.

We're using the GIS-enabled system on this platform:
PostgreSQL 7.3.4
PostGIS 0.8
Zope 2.7.5
Python 2.3.5
(Database-based functions are written in PL/PGSQL, not Python!)
on a 2-CPU (450MHz Intel P3) Compaq box (some Proliant flavour)
with a SCSI 4-disk RAID system (mirrored and striped)
SunOS 5.8 (which I think is Solaris 8)

The server is administrated by my host (co-located). We cannot easily upgrade to a newer version of Solaris, because we could not find a driver for the disk controller used in this server. (And our host did not manage to write/patch one up.)

As a business, we are creating and operating on-line communities (for an example go to http://www.theguidlife.net), not only from a technical point of view, but also supporting the communities in producing content.

BTW, if you are a SQL/Python programmer in (or near) Lanarkshire, Scotland, we have a vacancy. ;-)

Cheers

Marc

--
Marc Burgauer
Sharedbase Ltd
http://www.sharedbase.com
Creating and supporting on-line communities
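For what it's worth, the planner cost settings the Admin guide describes can be inspected and experimented with per session, which makes them less intimidating than they look in postgresql.conf. A minimal sketch (mytable/id are invented names; the value set is purely illustrative, not a recommendation):

  -- current value (a fraction of the cost of a sequential page fetch)
  SHOW cpu_index_tuple_cost;

  -- change it for this session only and watch how plans react
  SET cpu_index_tuple_cost = 0.01;
  EXPLAIN SELECT * FROM mytable WHERE id = 42;

  -- back to the server-wide setting
  RESET cpu_index_tuple_cost;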
[PERFORM] tuning Postgres for large data import (using Copy from)
Hello,

I'd like to tune Postgres for a large data import (using COPY FROM). Here are a few steps already done:

1) use 3 different disks for:
   -1: source data
   -2: index tablespaces
   -3: data tablespaces
2) define all foreign keys as initially deferred
3) tune some parameters:
   max_connections = 20
   shared_buffers = 3
   work_mem = 8192
   maintenance_work_mem = 32768
   checkpoint_segments = 12
   (I also modified the kernel accordingly)
4) run VACUUM regularly

The server runs RedHat and has 1GB RAM.

In production (which may run on a better server), I plan to:
- import a few million rows per day,
- keep up to ca. 100 million rows in the db,
- delete older data.

I've seen a few postings on hash/btree indexes which say that hash indexes do not work very well in Postgres; currently I only use btree indexes. Could I gain performance by using hash indexes as well?

How does Postgres handle concurrent COPY FROM on: the same table / different tables?

I'd be glad of any further suggestions on how to further increase my performance.

Marc
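As a minimal sketch of the import pattern described above (table and file names are invented; the file path is server-side):

  BEGIN;
  -- foreign keys were declared INITIALLY DEFERRED, so they are checked at COMMIT
  COPY measurements FROM '/data/source/measurements_20050509.csv' WITH DELIMITER ';';
  COMMIT;

  -- afterwards, keep planner statistics and the free space map up to date
  ANALYZE measurements;
  VACUUM measurements;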
Re: [PERFORM] Optimising queries involving unions
Hi,

I'm using a workaround for this kind of issue.

Consider:

select A from
  (select B from T1 where C
   union
   select B from T2 where C
   union
   select B from T3 where C) foo
where D

In your case:

SELECT u.txt
FROM (
  SELECT l.id, l.txt FROM largetable1 l, smalltable t WHERE t.id = l.id AND t.foo = 'bar'
  UNION ALL
  SELECT l.id, l.txt FROM largetable2 l, smalltable t WHERE t.id = l.id AND t.foo = 'bar'
) u

versus:

select A from foo where C and D

(A, B, C, D being everything you want; C and D may also include GROUP BY, ORDER BY, ...)

The first version will be handled correctly by the optimiser, whereas in the second version Postgres will first build the UNION and then run the query on it.

I'm having large tables with identical structure, one per day. Instead of defining a view on all tables, I'm using functions that "distribute" my query on all tables. The only issue is that I need to define a type that matches the result structure, and each return type needs its own function.

Example (the first parameter is a schema name, the four next correspond to A, B, C, D):

create type T_i2_vc1 as (int_1 int, int_2 int, vc_1 varchar);

CREATE OR REPLACE FUNCTION vq_T_i2_vc1(varchar, varchar, varchar, varchar, varchar)
RETURNS setof T_i2_vc1 AS $$
DECLARE
  result T_i2_vc1%rowtype;
  mviews RECORD;
  sql varchar;
  counter int;
BEGIN
  select into counter 1;
  -- loop on all daily tables
  FOR mviews IN SELECT distinct this_day FROM daylist order by plainday desc LOOP
    IF counter = 1 THEN
      select INTO sql 'SELECT '||mviews.this_day||' AS plainday, '||$2||' FROM '||$3||'_'||mviews.plainday||' WHERE '||$4;
    ELSE
      select INTO sql sql||' UNION ALL SELECT '||mviews.this_day||' AS plainday, '||$2||' FROM '||$3||'_'||mviews.plainday||' WHERE '||$4;
    END IF;
    select into counter counter+1;
  END LOOP;
  select INTO sql 'SELECT '||$1||' FROM ('||sql||') foo '||$5;
  for result in EXECUTE (sql) LOOP
    return NEXT result;
  end loop;
  return;
END;
$$ LANGUAGE plpgsql;

Note: in your case the function should have a further parameter to join largetable(n) to smalltable in the "sub queries".

HTH,

Marc

> I've got a query that I think the query optimiser should be able
> to work its magic on but it doesn't! I've had a look around and
> asked on the IRC channel and found that the current code doesn't
> attempt to optimise for what I'm asking it to do at the moment.
> Here's a bad example:
>
> SELECT u.txt
> FROM smalltable t, (
>   SELECT id, txt FROM largetable1
>   UNION ALL
>   SELECT id, txt FROM largetable2) u
> WHERE t.id = u.id
> AND t.foo = 'bar';
>
> I was hoping that "smalltable" would get moved up into the union,
> but it doesn't at the moment and the database does a LOT of extra
> work. In this case, I can manually do quite a couple of transforms
> to move things around and it does the right thing:
>
> SELECT txt
> FROM (
>   SELECT l.id as lid, r.id as rid, r.foo, l.txt
>   FROM largetable1 l, smalltable r
>   UNION ALL
>   SELECT l.id as lid, r.id as rid, r.foo, l.txt
>   FROM largetable2 l, smalltable r)
> WHERE foo = 'bar'
> AND lid = rid
>
> The optimiser is intelligent enough to move the where clauses up
> into the union and end up with a reasonably optimal query.
> Unfortunately, in real life, the query is much larger and reorganising
> everything manually isn't really feasible!
[PERFORM] TIP 9: the planner will ignore... & datatypes
TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match

But INT2, INT4, INT8 and "SERIAL" are all considered to be one and the same datatype here, am I right?

Thanks,

Marc
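For illustration (SERIAL is not a separate type - it is an int4 column with a sequence default - while int2, int4 and int8 are three distinct types to the planner), here is a made-up example of the kind of mismatch the tip is about and the usual cast workaround for pre-8.0 planners; orders/customers and their column types are invented:

  -- orders.customer_id is int8, customers.id is int4 (invented schema)
  -- before 8.0 the cross-type comparison keeps the planner from using
  -- the index on orders.customer_id
  SELECT o.*
  FROM orders o, customers c
  WHERE o.customer_id = c.id
    AND c.name = 'ACME';

  -- casting one side so both datatypes match lets the index be considered
  SELECT o.*
  FROM orders o, customers c
  WHERE o.customer_id = c.id::int8
    AND c.name = 'ACME';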
[PERFORM] Query limitations (size, number of UNIONs ...)
Hello,

I've split my data into daily tables to keep them at an acceptable size. Now I have quite complex queries which may become very long if I need to query a large number of daily tables.

I've just made a first test which resulted in a query 15KB big and containing 63 UNIONs. The query plan in PGAdmin is about 100KB big, with 800 lines :-)

The performance is not that bad, but I'm wondering if there are some PostgreSQL limitations I should take care of with this strategy.

Thanks,

Marc
[PERFORM] Slow Query
 Left Join  (cost=2.33..13056.04 rows=149982 width=83) (actual time=0.406..953.781 rows=138491 loops=1)
   Merge Cond: (("outer".assetid)::text = "inner"."?column4?")
   ->  Index Scan using sq_ast_pkey on sq_ast a  (cost=0.00..14952.78 rows=149982 width=48) (actual time=0.154..388.872 rows=138488 loops=1)
   ->  Sort  (cost=2.33..2.43 rows=37 width=43) (actual time=0.235..0.264 rows=37 loops=1)
         Sort Key: (u.assetid)::text
         ->  Seq Scan on sq_ast_url u  (cost=0.00..1.37 rows=37 width=43) (actual time=0.036..0.103 rows=37 loops=1)
   ->  Index Scan using sq_ast_path_ast on sq_ast_path ap  (cost=0.00..1926.18 rows=42071 width=23) (actual time=0.110..105.918 rows=42661 loops=1)
   ->  Sort  (cost=249.05..249.14 rows=36 width=21) (actual time=0.310..0.324 rows=13 loops=1)
         Sort Key: (l.minorid)::text
         ->  Index Scan using sq_ast_lnk_majorid on sq_ast_lnk l  (cost=0.00..248.12 rows=36 width=21) (actual time=0.141..0.282 rows=13 loops=1)
               Index Cond: ((majorid)::text = '2'::text)
               Filter: (link_type <= 2)
   ->  Index Scan using sq_ast_lnk_tree_linkid on sq_ast_lnk_tree lt  (cost=0.00..11.41 rows=2 width=8) (actual time=0.043..0.045 rows=1 loops=13)
         Index Cond: ("outer".linkid = lt.linkid)
 Total runtime: 2170.165 ms
(22 rows)

The DESC of the sq_ast table:

future_v3_schema=# \d sq_ast
                       Table "public.sq_ast"
        Column         |            Type             | Modifiers
-----------------------+-----------------------------+----------------------------------------------
 assetid               | character varying(15)       | not null
 type_code             | character varying(100)      | not null
 version               | character varying(20)       | not null default '0.0.0'::character varying
 name                  | character varying(255)      | not null default ''::character varying
 short_name            | character varying(255)      | not null default ''::character varying
 status                | integer                     | not null default 1
 languages             | character varying(50)       | not null default ''::character varying
 charset               | character varying(50)       | not null default ''::character varying
 force_secure          | character(1)                | not null default '0'::bpchar
 created               | timestamp without time zone | not null
 created_userid        | character varying(255)      | not null
 updated               | timestamp without time zone | not null
 updated_userid        | character varying(255)      | not null
 published             | timestamp without time zone |
 published_userid      | character varying(255)      |
 status_changed        | timestamp without time zone |
 status_changed_userid | character varying(255)      |
Indexes:
    "sq_asset_pkey" primary key, btree (assetid)
    "sq_ast_created" btree (created)
    "sq_ast_name" btree (name)
    "sq_ast_published" btree (published)
    "sq_ast_type_code" btree (type_code)
    "sq_ast_updated" btree (updated)

Any ideas?

--
Marc McIntyre
MySource Matrix Lead Developer
Re: [PERFORM] Looking for tips
Hi,

I have a similar application, but instead of adding new items to the db one at a time, I retrieve new IDs from a sequence (actually only every 10'000 times) and write a csv file from perl. When finished, I load all new records in one run with COPY.

hth,

Marc Mamin

From: [EMAIL PROTECTED] on behalf of Oliver Crosby
Sent: Wed 7/20/2005 3:50 AM
To: PFC
Cc: Sven Willenberger; Dawid Kuroczko; Kevin Grittner; [EMAIL PROTECTED]; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Looking for tips

Sorry for the lack of specifics...

We have a file generated as a list of events, one per line. Suppose lines 1,2,3,5,7,11,etc were related, then the last one would specify that it's the last event. Gradually this gets assembled by a perl script and when the last event is encountered, it gets inserted into the db. For a given table, let's say it's of the form (a,b,c) where 'a' is a pkey, 'b' is indexed, and 'c' is other related information. The most common 'b' values are cached locally with the perl script to save us having to query the db. So what we end up having is:

if 'b' exists in cache, use cached 'a' value and continue
else if 'b' exists in the db, use the associated 'a' value and continue
else add a new line with 'b', return the new 'a' and continue

The local cache was a huge time saver with mysql. I've tried making a plpgsql function that handles everything in one step on the db side, but it didn't show any improvement. Time permitting, I'll try some new approaches with changing the scripts and queries, though right now I was just hoping to tune postgresql.conf to work better with the hardware available.

Thanks to everyone for your help. Very much appreciated.
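The SQL side of the approach described at the top of this message can be sketched roughly as follows (sequence, table and file names are invented; generate_series() and CSV mode assume 8.0 or later):

  -- fetch a block of 10000 ids in one round trip instead of one nextval() per row;
  -- the client script hands these out itself while it assembles the batch
  SELECT nextval('items_id_seq') FROM generate_series(1, 10000);

  -- the script writes the finished rows to a CSV file and loads them in one go
  COPY items (a, b, c) FROM '/tmp/items_batch.csv' WITH CSV;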
[PERFORM] insert performance for win32
Hi,

I usually use PostgreSQL coupled with Linux, but I have to use Windows for a particular project. So I wanted to do some tests to know if the performance would be acceptable (I don't need PostgreSQL to be as fast with Windows as with Linux, but it has to be usable...). I started with trying to do lots of inserts, and I'm quite astonished by the catastrophic results...

The test: the computer was the same (my workstation, a PIV Dell with a SATA disk), dual boot. The Windows OS is XP. Both OSes run PostgreSQL 8.0.3. Both PostgreSQL clusters (Windows and Linux) have the same tuning (shared_buffers=2, wal_buffers=128, checkpoint_segments=10). Before each test, the clusters are vacuum analyzed and the test database is recreated.

The script is quite dumb:

BEGIN;
CREATE TABLE test (col1 serial, col2 text);
INSERT INTO test (col2) values ('test');
INSERT INTO test (col2) values ('test');
INSERT INTO test (col2) values ('test');
INSERT INTO test (col2) values ('test');
INSERT INTO test (col2) values ('test');
.. 500,000 times
Then COMMIT.

I know it isn't realistic, but I needed to start with something :)

The results are as follows:
Linux: 1'9''
Windows: 9'38''

What I've tried to solve it, and didn't work:
- deactivate the antivirus on Windows
- fsync=no
- raise the checkpoint_segments value (32)
- remove hyperthreading (who knows...)

I don't know what could cause this (I'm not a Windows admin... at all). All I see is a very high kernel load during the execution of this script, but I can't determine where it comes from. I'd like to know if this is a known problem, if there is something I can do, etc...

Thanks a lot.
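If it helps to narrow things down, the same volume can also be loaded as a single statement, which takes the per-statement client/server round trip out of the measurement (test2 is just an invented table name; generate_series() is available from 8.0 on). This is only a suggestion, not part of the original test:

  BEGIN;
  CREATE TABLE test2 (col1 serial, col2 text);
  -- same row count as the script above, but as one statement:
  -- mostly measures executor and WAL cost, not per-statement overhead
  INSERT INTO test2 (col2) SELECT 'test' FROM generate_series(1, 500000);
  COMMIT;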
Re: [PERFORM] insert performance for win32
> In my experience win32 is on par with linux generally, with a few gotchas on
> either side. Are your times with fsync=no? It's much harder to give an
> apples-to-apples comparison with fsync=on for various reasons.

It is with fsync=off on Windows, fsync=on on Linux.

> Are you running stats_command_string=on? Try disabling and compare
> results.

Deactivated on Windows, activated on Linux.

> Is your loading app running locally or on the server?

Yes

> I am very interested in discovering sources of high cpu load problems
> on win32. If you are still having problems could you get a gprof
> profile together? There is a recent thread on win32-hackers discussing
> how to do this.

I'll give it a look.

> Merlin
Re: [PERFORM] insert performance for win32
On Tuesday 06 September 2005 19:11, Merlin Moncure wrote:
> > This makes me wonder if we are looking in the wrong place. Maybe the
> > problem is coming from psql? More results to follow.
>
> problem is not coming from psql.
>
> One thing I did notice is that in a 250k insert transaction the insert time
> grows with the number of records inserted. Time to insert the first 50k recs is
> about 27 sec and the last 50k recs is 77 sec. I also confirmed that the size of
> the table is not playing a role here.
>
> Marc, can you do select timeofday() every 50k recs from linux? Also a
> gprof trace from linux would be helpful.

Here's the timeofday... I'll do the gprof as soon as I can. Every 50,000 rows:

Wed Sep 07 13:58:13.860378 2005 CEST
Wed Sep 07 13:58:20.926983 2005 CEST
Wed Sep 07 13:58:27.928385 2005 CEST
Wed Sep 07 13:58:35.472813 2005 CEST
Wed Sep 07 13:58:42.825709 2005 CEST
Wed Sep 07 13:58:50.789486 2005 CEST
Wed Sep 07 13:58:57.553869 2005 CEST
Wed Sep 07 13:59:04.298136 2005 CEST
Wed Sep 07 13:59:11.066059 2005 CEST
Wed Sep 07 13:59:19.368694 2005 CEST

> Merlin
Re: [PERFORM] insert performance for win32
On Wednesday 02 November 2005 14:54, Magnus Hagander wrote:
> > > I've done the tests with rc1. This is still as slow on windows ... about
> > > 6-10 times slower than linux (via IP socket). (depending on using prepared
> > > queries, etc...)
> > >
> > > By the way, we've tried to insert into the windows database from a linux
> > > psql client, via the network. In this configuration, inserting is only
> > > about 2 times slower than inserting locally (the linux client had a
> > > slower CPU: 1700MHz against 3000).
> > > Could it be related to a problem in the windows psql client?
> >
> > [OK, I'm bringing this back on-list, and bringing it to
> > QingQing's attention, who I secretly hope is the right person
> > to be looking at this problem :)]
> >
> > Just to recap, Marc and I have been looking at the performance
> > disparity between windows and linux for a single transaction,
> > statement by statement insert on a very narrow table with no
> > keys from a remote client. Marc's observations showed (and I
> > verified) that windows is much slower in this case than it
> > should be. I gprof'ed both the psql client and the server
> > during the insert and didn't see anything seriously out of
> > order... unfortunately QQ's latest win32 performance tweaks
> > haven't helped.
> >
> > Marc's observation that switching to a linux client drops
> > the time down drastically is really interesting!
>
> Could this be a case of the network being slow, as we've seen a couple
> of times before? And if you run psql on the local box, you get it
> double.
>
> Do you get a speed difference between the local windows box and a remote
> windows box?
>
> //Magnus

The Windows-Windows test is local (via the loopback interface).
The Linux (client) - Windows (server) test is via the network (100Mbits).
I can't test with 2 Windows boxes... I haven't got that many (all machines are Linux, except the test one...)
Re: [PERFORM] Sort performance on large tables
I have run into this type of query problem as well. I solved it in my application with the following type of query:

SELECT tlid
FROM completechain AS o
WHERE not exists (
  SELECT 1
  FROM completechain
  WHERE tlid=o.tlid and ogc_fid!=o.ogc_fid
);

This assumes of course that you have an index on tlid.

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Charlie Savage
> Sent: Tuesday, November 08, 2005 2:05 AM
> To: pgsql-performance@postgresql.org
> Subject: [PERFORM] Sort performance on large tables
>
> Hi everyone,
>
> I have a question about the performance of sort.
>
> Setup: Dell Dimension 3000, Suse 10, 1GB ram, PostgreSQL 8.1 RC 1 with
> PostGIS, 1 built-in 80 GB IDE drive, 1 SATA Seagate 400GB drive. The IDE
> drive has the OS and the WAL files, the SATA drive the database.
> From hdparm the max IO for the IDE drive is about 50Mb/s and the SATA
> drive is about 65Mb/s. Thus a very low-end machine - but it is used just
> for development (i.e., it is not a production machine) and the only thing
> it does is run a PostgresSQL database.
>
> I have a staging table called completechain that holds US tiger data
> (i.e., streets and addresses for the US). The table is approximately
> 18GB. It's big because there is a lot of data, but also because the
> table is not normalized (it comes that way).
>
> I want to extract data out of the file, with the most important values
> being stored in a column called tlid. The tlid field is an integer, and
> the values are 98% unique. There is a second column called ogc_fid
> which is unique (it is a serial field). I need to extract out unique
> TLIDs (it doesn't matter which duplicate I get rid of). To do this I am
> running this query:
>
> SELECT tlid, min(ogc_fid)
> FROM completechain
> GROUP BY tlid;
>
> The results from explain analyze are:
>
> "GroupAggregate  (cost=10400373.80..11361807.88 rows=48071704 width=8) (actual time=7311682.715..8315746.835 rows=47599910 loops=1)"
> "  ->  Sort  (cost=10400373.80..10520553.06 rows=48071704 width=8) (actual time=7311682.682..7972304.777 rows=48199165 loops=1)"
> "        Sort Key: tlid"
> "        ->  Seq Scan on completechain  (cost=0.00..2228584.04 rows=48071704 width=8) (actual time=27.514..773245.046 rows=48199165 loops=1)"
> "Total runtime: 8486057.185 ms"
>
> Doing a similar query produces the same results:
>
> SELECT DISTINCT ON (tlid) tlid, ogc_fid FROM completechain;
>
> Note it takes over 10 times longer to do the sort than the full
> sequential scan.
>
> Should I expect results like this? I realize that the computer is quite
> low-end and is very IO bound for this query, but I'm still surprised
> that the sort operation takes so long.
>
> Out of curiosity, I setup an Oracle database on the same machine with
> the same data and ran the same query. Oracle was over an order of
> magnitude faster. Looking at its query plan, it avoided the sort by
> using "HASH GROUP BY." Does such a construct exist in PostgreSQL (I see
> only hash joins)?
>
> Also as an experiment I forced Oracle to do a sort by running this query:
>
> SELECT tlid, min(ogc_fid)
> FROM completechain
> GROUP BY tlid
> ORDER BY tlid;
>
> Even with this, it was more than a magnitude faster than Postgresql.
> Which makes me think I have somehow misconfigured postgresql (see the
> relevant parts of postgresql.conf below).
>
> Any idea/help appreciated.
> Thanks,
>
> Charlie
>
> #---------------------------------------------------------------------
> # RESOURCE USAGE (except WAL)
> #---------------------------------------------------------------------
>
> shared_buffers = 4              # 4 buffers * 8192 bytes/buffer = 327,680,000 bytes
> #shared_buffers = 1000          # min 16 or max_connections*2, 8KB each
>
> temp_buffers = 5000
> #temp_buffers = 1000            # min 100, 8KB each
> #max_prepared_transactions = 5  # can be 0 or more
> # note: increasing max_prepared_transactions costs ~600 bytes of shared memory
> # per transaction slot, plus lock space (see max_locks_per_transaction).
>
> work_mem = 16384                # in Kb
> #work_mem = 1024                # min 64, size in KB
>
> maintenance_work_mem = 262144   # in kb
> #maintenance_work_mem = 16384   # min 1024, size in KB
> #max_stack_depth = 2048         # min 100, size in KB
>
> # - Free Space Map -
>
> max_fsm_pages = 6
> #max_fsm_pages = 2              # min max_fsm_relations*16, 6 bytes each
>
> #max_fsm_relations = 1000       # min 100, ~70 bytes each
>
> # - Kernel Resource Usage -
>
> #max_files_per_process = 1000   # min 25
> #preload_libraries = ''
>
> # - Cost-Based Vacuum Delay -
>
> #vacuum_cost_delay = 0          #
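The hash-based equivalent (HashAggregate) does exist in 8.1; whether the planner picks it depends mostly on whether the estimated hash table fits into work_mem. A possible experiment - only a sketch, and with ~48 million distinct tlid values the hash table may simply not fit in 1GB of RAM, in which case the sort is unavoidable:

  -- session-local only; value in KB (~512MB here), purely illustrative
  SET work_mem = 524288;

  EXPLAIN
  SELECT tlid, min(ogc_fid)
  FROM completechain
  GROUP BY tlid;
  -- if the estimated groups fit into work_mem, the plan switches from
  -- Sort + GroupAggregate to HashAggregate

  RESET work_mem;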
[PERFORM] sort/limit across union all
We have a large DB with partitioned tables in postgres. We have had trouble with an ORDER/LIMIT type query. The order and limit are not pushed down to the sub-tables.

CREATE TABLE base (
  foo int
);

CREATE TABLE bar_0 (
  extra int
) INHERITS (base);
ALTER TABLE bar_0 ADD PRIMARY KEY (foo);

-- repeated for bar_0 ... bar_40

SELECT foo FROM base ORDER BY foo LIMIT 10;

is really slow. What is required to make the query planner generate the following instead... (a code change, I know, but how hard would it be?)

SELECT foo FROM (
  (SELECT * FROM bar_0 ORDER BY foo LIMIT 10)
  UNION ALL
  (SELECT * FROM bar_1 ORDER BY foo LIMIT 10)
) AS base ORDER BY foo LIMIT 10;
[PERFORM] partitioning
Hi,

I've been working on trying to partition a big table (I've never partitioned a table in any other database till now). Everything went ok, except one query that didn't work afterwards.

I've put the partition description, indexes, etc ..., and the explain plan attached.

The query is extremely fast without partitioning (index scan backwards on the primary key).

The query is: "select * from logs order by id desc limit 100;"

id is the primary key. It is indexed on all partitions.

But the explain plan does a full table scan on all partitions. While I think I understand why it is doing this plan right now, is there something that could be done to optimize this case? Or put a warning in the docs about this kind of behaviour. I guess normally someone would partition to get faster queries :)

Anyway, I thought I should mention this, as it has been quite a surprise.

CREATE TABLE logs_150 (CHECK ( id_machine = 150)) INHERITS (logs) TABLESPACE data_logs;
CREATE TABLE logs_163 (CHECK ( id_machine = 163)) INHERITS (logs) TABLESPACE data_logs;
CREATE TABLE logs_289 (CHECK ( id_machine = 289)) INHERITS (logs) TABLESPACE data_logs;
CREATE TABLE logs_319 (CHECK ( id_machine = 319)) INHERITS (logs) TABLESPACE data_logs;
CREATE TABLE logs_238 (CHECK ( id_machine = 238)) INHERITS (logs) TABLESPACE data_logs;
CREATE TABLE logs_148 (CHECK ( id_machine = 148)) INHERITS (logs) TABLESPACE data_logs;
CREATE TABLE logs_176 (CHECK ( id_machine = 176)) INHERITS (logs) TABLESPACE data_logs;
CREATE TABLE logs_164 (CHECK ( id_machine = 164)) INHERITS (logs) TABLESPACE data_logs;
CREATE TABLE logs_316 (CHECK ( id_machine = 316)) INHERITS (logs) TABLESPACE data_logs;
CREATE TABLE logs_313 (CHECK ( id_machine = 313)) INHERITS (logs) TABLESPACE data_logs;
CREATE TABLE logs_217 (CHECK ( id_machine = 217)) INHERITS (logs) TABLESPACE data_logs;
CREATE TABLE logs_167 (CHECK ( id_machine = 167)) INHERITS (logs) TABLESPACE data_logs;
CREATE TABLE logs_287 (CHECK ( id_machine = 287)) INHERITS (logs) TABLESPACE data_logs;
CREATE TABLE logs_279 (CHECK ( id_machine = 279)) INHERITS (logs) TABLESPACE data_logs;
CREATE TABLE logs_248 (CHECK ( id_machine = 248)) INHERITS (logs) TABLESPACE data_logs;
CREATE TABLE logs_280 (CHECK ( id_machine = 280)) INHERITS (logs) TABLESPACE data_logs;
CREATE TABLE logs_165 (CHECK ( id_machine = 165)) INHERITS (logs) TABLESPACE data_logs;
CREATE TABLE logs_330 (CHECK ( id_machine = 330)) INHERITS (logs) TABLESPACE data_logs;
CREATE TABLE logs_149 (CHECK ( id_machine = 149)) INHERITS (logs) TABLESPACE data_logs;
CREATE TABLE logs_157 (CHECK ( id_machine = 157)) INHERITS (logs) TABLESPACE data_logs;
CREATE TABLE logs_272 (CHECK ( id_machine = 272)) INHERITS (logs) TABLESPACE data_logs;
CREATE TABLE logs_311 (CHECK ( id_machine = 311)) INHERITS (logs) TABLESPACE data_logs;
CREATE TABLE logs_309 (CHECK ( id_machine = 309)) INHERITS (logs) TABLESPACE data_logs;
CREATE TABLE logs_318 (CHECK ( id_machine = 318)) INHERITS (logs) TABLESPACE data_logs;

CREATE UNIQUE INDEX logs_150_pkey ON logs_150 (id) TABLESPACE index_logs;
CREATE INDEX date_150 ON logs_150 (date) TABLESPACE index_logs;
CREATE INDEX event_150 ON logs_150 (evenement) TABLESPACE index_logs;
...
logs=> explain select * from logs order by id desc limit 100;
                                        QUERY PLAN
-------------------------------------------------------------------------------------------
 Limit  (cost=16524647.29..16524647.54 rows=100 width=295)
   ->  Sort  (cost=16524647.29..16568367.11 rows=17487927 width=295)
         Sort Key: public.logs.id
         ->  Result  (cost=0.00..827622.27 rows=17487927 width=295)
               ->  Append  (cost=0.00..827622.27 rows=17487927 width=295)
                     ->  Seq Scan on logs  (cost=0.00..826232.78 rows=17451978 width=165)
                     ->  Seq Scan on logs_150 logs  (cost=0.00..199.04 rows=6104 width=144)
                     ->  Seq Scan on logs_163 logs  (cost=0.00..261.79 rows=7079 width=169)
                     ->  Seq Scan on logs_289 logs  (cost=0.00..428.93 rows=10693 width=200)
                     ->  Seq Scan on logs_319 logs  (cost=0.00..31.92 rows=992 width=129)
                     ->  Seq Scan on logs_238 logs  (cost=0.00..28.01 rows=701 width=199)
                     ->  Seq Scan on logs_148 logs  (cost=0.00..80.15 rows=2015 width=195)
                     ->  Seq Scan on logs_176 logs  (cost=0.00..12.40 rows=240 width=295)
                     ->  Seq Scan on logs_164 logs  (cost=0.00..12.40 rows=240 width=295)
                     ->  Seq Scan on logs_316 logs  (cost=0.00..12.40 rows=240 width=295)
                     ->  Seq Scan on logs_313 logs  (cost=0.00..12.40 rows=240 width=295)
                     ->  Seq Scan on logs_217 logs  (cost=0.00..12.40 rows=240 width=295)
                     ->  Seq Scan on logs_167 logs  (cost=0.00..57.36 rows=1536 width=170)
                     ->  Seq Scan on logs_287 logs  (cost=
Re: [PERFORM] partitioning
Yes, that's how I solved it... and I totally agree that it's hard for the planner to guess what to do on the partitions. But maybe there should be something in the docs explaining the limitations ...

I'm only asking for the biggest 100 ids from the table, so I thought maybe the planner would take the 100 biggest from all partitions or something like that and return me the 100 biggest from those results. It didn't, and that's quite logical. What I meant is that I understand why the planner chooses this plan, but maybe it should be written somewhere in the docs that some plans will be worse after partitioning.

On Tuesday 13 December 2005 12:50, you wrote:
> I just saw that there is no where clause in the query that you had
> fed to explain plan.
> You need to include a where clause based on the id_machine column to see the
> effect.
>
> On 12/13/05, Pandurangan R S <[EMAIL PROTECTED]> wrote:
> > Did you set constraint_exclusion = true in postgresql.conf file?
> >
> > On 12/13/05, Marc Cousin <[EMAIL PROTECTED]> wrote:
> > > Hi,
> > >
> > > I've been working on trying to partition a big table (I've never
> > > partitioned a table in any other database till now).
> > > Everything went ok, except one query that didn't work afterwards.
> > >
> > > I've put the partition description, indexes, etc ..., and the explain
> > > plan attached.
> > >
> > > The query is extremely fast without partitioning (index scan backwards on
> > > the primary key).
> > >
> > > The query is : "select * from logs order by id desc limit 100;"
> > > id is the primary key.
> > >
> > > It is indexed on all partitions.
> > >
> > > But the explain plan does a full table scan on all partitions.
> > >
> > > While I think I understand why it is doing this plan right now, is
> > > there something that could be done to optimize this case ? Or put a
> > > warning in the docs about this kind of behaviour. I guess normally
> > > someone would partition to get faster queries :)
> > >
> > > Anyway, I thought I should mention this, as it has been quite a
> > > surprise.
>
> --
> Regards
> Pandu
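For reference, the kind of hand-written workaround referred to above looks roughly like this (partition names as in the earlier message; each branch needs its own parentheses so that its ORDER BY/LIMIT applies to that partition alone and can use that partition's primary key index):

  (SELECT * FROM ONLY logs   ORDER BY id DESC LIMIT 100)
  UNION ALL
  (SELECT * FROM logs_150    ORDER BY id DESC LIMIT 100)
  UNION ALL
  (SELECT * FROM logs_163    ORDER BY id DESC LIMIT 100)
  -- ... one branch per partition ...
  ORDER BY id DESC
  LIMIT 100;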
Re: [PERFORM] Scaling SELECT:s with the number of disks on a stripe
Hello Peter,

If you are dealing with timed data or similar, you may consider partitioning your table(s).

In order to deal with large data volumes, I've built a "logical" partition system, whereby the target partition is defined by the date of my data (the date is part of the filenames that I import...).

Instead of using the Postgres partitioning framework, I keep the table boundaries within a reference table. Then I've built a function that takes the different query parameters as arguments (column list, where clause...). This function retrieves the list of tables to query from my reference table and builds the final query, binding the different subqueries from each partition with "UNION ALL". It also requires an additional reference table that describes the table columns (data type, behaviour, e.g. groupable, summable...).

This allowed me to replace many "delete" statements with "drop table" statements, which is probably the main advantage of the solution. The biggest issue was the implementation time ;-) but I'm really happy with the resulting performance.

HTH,

Marc

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Peter Schuller
Sent: Friday, March 30, 2007 7:17 AM
To: pgsql-performance@postgresql.org
Subject: [PERFORM] Scaling SELECT:s with the number of disks on a stripe

Hello,

I am looking to use PostgreSQL for storing some very simple flat data, mostly in a single table. The amount of data will be in the hundreds of gigabytes range. Each row is on the order of 100-300 bytes in size; in other words, small enough that I am expecting disk I/O to be seek bound (even if PostgreSQL reads a full pg page at a time, since a page is significantly smaller than the stripe size of the volume).

The only important performance characteristics are insertion/deletion performance, and the performance of trivial SELECT queries whose WHERE clause tests equality on one of the columns.

Other than absolute performance, an important goal is to be able to scale fairly linearly with the number of underlying disk drives. We are fully willing to take a disk seek per item selected, as long as it scales.

To this end I have been doing some benchmarking to see whether the plan is going to be feasible. On a 12 disk hardware stripe, insertion performance does scale somewhat with concurrent inserters. However, I am seeing surprising effects with SELECT:s: a single selecter generates the same amount of disk activity as two concurrent selecters (I was easily expecting about twice).

The query is simple:

SELECT * FROM test WHERE value = 'xxx' LIMIT 1000;

No ordering, no joins, no nothing. Selecting concurrently with two different values of 'xxx' yields the same amount of disk activity (never any significant CPU activity). Note that the total amount of data is too large to fit in RAM (> 500 million rows), and the number of distinct values in the value column is 10000. The column in the WHERE clause is indexed.

So my first question is - why am I not seeing this scaling? The absolute amount of disk activity with a single selecter is consistent with what I would expect from a SINGLE disk, which is completely expected since I never thought PostgreSQL would introduce disk I/O concurrency on its own. But this means that adding additional readers doing random-access reads *should* scale very well with 12 underlying disks in a stripe.

(Note that I have seen fairly similar results on other RAID variants too, including software RAID5 (yes yes I know), in addition to the hardware stripe.)
These tests have been done on Linux 2.6.19.3 and PostgreSQL 8.1.

Secondly, I am seeing a query plan switch after a certain threshold. Observe:

perftest=# explain select * from test where val='7433' limit 1000;
                                      QUERY PLAN
-----------------------------------------------------------------------------------------
 Limit  (cost=0.00..4016.50 rows=1000 width=143)
   ->  Index Scan using test_val_ix on test  (cost=0.00..206620.88 rows=51443 width=143)
         Index Cond: ((val)::text = '7433'::text)
(3 rows)

Now increasing to a limit of 10000:

perftest=# explain select * from test where val='7433' limit 10000;
                                      QUERY PLAN
------------------------------------------------------------------------------------------
 Limit  (cost=360.05..38393.36 rows=10000 width=143)
   ->  Bitmap Heap Scan on test  (cost=360.05..196014.82 rows=51443 width=143)
         Recheck Cond: ((val)::text = '7433'::text)
         ->  Bitmap Index Scan on test_val_ix  (cost=0.00..360.05 rows=51443 width=0)
               Index Cond: ((val)::text = '7433'::text)
(5 rows)

The interesting part is that the latter query is entirely CPU bound (no disk I/O at all) for an extended period of time before even beginning to read data from disk. And wh
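A very reduced sketch of the "logical partitioning" scheme described at the top of this message (all names are invented): a reference table stores each partition's name and date boundaries, and a PL/pgSQL function glues the relevant partitions together with UNION ALL.

  CREATE TABLE partition_map (
    table_name  text PRIMARY KEY,
    from_date   date NOT NULL,
    to_date     date NOT NULL
  );

  CREATE OR REPLACE FUNCTION build_query(cols text, whereclause text,
                                         d_from date, d_to date)
  RETURNS text AS $$
  DECLARE
    r   record;
    sql text := '';
  BEGIN
    -- collect every partition overlapping the requested date range
    FOR r IN SELECT table_name FROM partition_map
             WHERE from_date <= d_to AND to_date >= d_from LOOP
      IF sql <> '' THEN
        sql := sql || ' UNION ALL ';
      END IF;
      sql := sql || 'SELECT ' || cols || ' FROM ' || r.table_name
                 || ' WHERE ' || whereclause;
    END LOOP;
    RETURN sql;   -- the caller EXECUTEs it, e.g. from a set-returning function
  END;
  $$ LANGUAGE plpgsql;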
Re: [PERFORM] Planner doing seqscan before indexed join
You may try to change the planner's opinion using subqueries. Something like:

select *
from eventactivity,
  (select * from keyword_incidents, eventmain, eventgeo
   where eventmain.incidentid = keyword_incidents.incidentid
     and eventgeo.incidentid = keyword_incidents.incidentid
     and (recordtext like '%JOSE CHAVEZ%')
  ) foo
where eventactivity.incidentid = foo.incidentid
order by foo.entrydate limit 10000;

HTH,

Marc

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Dan Harris
Sent: Thursday, March 29, 2007 4:22 AM
To: PostgreSQL Performance
Subject: [PERFORM] Planner doing seqscan before indexed join

8.0.3 - Linux 2.6.18.. Freshly vacuumed and analyzed

This database has been humming along fine for a while now, but I've got one of those sticky queries that is taking much too long to finish.

After some digging, I've found that the planner is choosing to apply a necessary seq scan to the table. Unfortunately, it's scanning the whole table, when it seems that it could have joined it to a smaller table first and reduced the amount of rows it would have to scan dramatically (70 million to about 5,000).

The table "eventactivity" has about 70 million rows in it, with an index on "incidentid". The table "keyword_incidents" is a temporary table and has incidentid as its primary key. It contains 5125 rows. eventmain and eventgeo both have 2.1 million rows.

My hope is that I can convince the planner to do the join to keyword_incidents *first* and then do the seq scan for the LIKE condition. Instead, it seems that it's seqscanning the whole 70 million rows first and then doing the join, which takes a lot longer than I'd like to wait for it. Or, maybe I'm misreading the explain output?

Thanks again

-Dan

Here's the query:

explain analyze
select * from keyword_incidents, eventactivity, eventmain, eventgeo
where eventmain.incidentid = keyword_incidents.incidentid
  and eventgeo.incidentid = keyword_incidents.incidentid
  and eventactivity.incidentid = keyword_incidents.incidentid
  and (recordtext like '%JOSE CHAVEZ%')
order by eventmain.entrydate limit 10000;

 Limit  (cost=2388918.07..2388918.08 rows=1 width=455) (actual time=81771.186..81771.292 rows=26 loops=1)
   ->  Sort  (cost=2388918.07..2388918.08 rows=1 width=455) (actual time=81771.180..81771.215 rows=26 loops=1)
         Sort Key: eventmain.entrydate
         ->  Nested Loop  (cost=0.00..2388918.06 rows=1 width=455) (actual time=357.389..81770.982 rows=26 loops=1)
               ->  Nested Loop  (cost=0.00..2388913.27 rows=1 width=230) (actual time=357.292..81767.385 rows=26 loops=1)
                     ->  Nested Loop  (cost=0.00..2388909.33 rows=1 width=122) (actual time=357.226..81764.501 rows=26 loops=1)
                           ->  Seq Scan on eventactivity  (cost=0.00..2388874.46 rows=7 width=84) (actual time=357.147..81762.582 rows=27 loops=1)
                                 Filter: ((recordtext)::text ~~ '%JOSE CHAVEZ%'::text)
                           ->  Index Scan using keyword_incidentid_pkey on keyword_incidents  (cost=0.00..4.97 rows=1 width=38) (actual time=0.034..0.036 rows=1 loops=27)
                                 Index Cond: (("outer".incidentid)::text = (keyword_incidents.incidentid)::text)
                     ->  Index Scan using eventgeo_incidentid_idx on eventgeo  (cost=0.00..3.93 rows=1 width=108) (actual time=0.076..0.081 rows=1 loops=26)
                           Index Cond: (("outer".incidentid)::text = (eventgeo.incidentid)::text)
               ->  Index Scan using eventmain_incidentid_idx on eventmain  (cost=0.00..4.78 rows=1 width=225) (actual time=0.069..0.075 rows=1 loops=26)
                     Index Cond: (("outer".incidentid)::text = (eventmain.incidentid)::text)
 Total runtime: 81771.529 ms
(15 rows)
[PERFORM] copy from performance on large tables with indexes
Hello,

Postgres: 8.2
OS: Linux
4 CPUs, 4 GB RAM, RAID 1, 32 bit system
work_mem: 600 MB

I have some tables which may become quite large (currently up to 6 GB). I initially fill them using COPY FROM (files).

The import is fast enough as I only have a primary key on the table: about 18 minutes (over 300 MB/minute).

Then I need 5 additional indexes on it. Creation time: 30 minutes.

Subsequently I compute some aggregations which need 4 hours and 30 minutes additional time.

And now the problem: if I get additional data for the table, the import becomes much slower due to the indexes (about 30 times slower!). The performance degradation is probably due to the fact that all the indexes are too large to be kept in memory. Moreover, I guess that the index fill factors are too high (90%). During this second import, I have about 20% iowait time.

The usual solution is to drop the indexes before the second import and rebuild them afterwards, but I feel unconfident doing this as I don't know how the system will react if some SELECT statements occur while the indexes are missing. I can hardly avoid this.

So my idea for the second import process:

1) make a copy of the table:
   create table B as select * from table A;
   alter table B add constraint B_pk primary key (id);

2) import the new data into table B
   copy B from file;

3) create the required indexes on B
   create index Bix_1 on B ..
   create index Bix_2 on B ..
   create index Bix_3 on B ..
   create index Bix_4 on B ..

4) replace table A with table B
   alter table A rename to A_trash;
   alter table B rename to A;
   drop table A_trash;
   (and rename the indexes to get the original state)

This seems to work, but with side effects: the only objects that refer to the tables are functions and indexes. If a function is called within the same session before and after the table renaming, the second attempt fails (or uses the table A_trash if it still exists). So I should close the session and start a new one before further processing. Errors in other live sessions are acceptable, but maybe you know a way to avoid them?

And now a few questions :-)

- do you see any issue that would prevent this workflow from working?
- is there any other side effect to take care of?
- what is the maximum acceptable value for the parameter work_mem for my configuration (see the complete configuration below)?
- has anybody built a similar workflow?
- could this be a feature request to extend the capabilities of COPY FROM?

Thanks for your time and attention,

Marc Mamin
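A condensed sketch of steps 1 to 4 above as plain SQL (index names, column names and the file path are invented; the renames are grouped in one transaction so a table called A stays visible to concurrent readers at all times):

  -- 1) copy the current content without the secondary indexes
  CREATE TABLE B AS SELECT * FROM A;
  ALTER TABLE B ADD CONSTRAINT B_pk PRIMARY KEY (id);

  -- 2) import the new data
  COPY B FROM '/path/to/new_data.csv' WITH CSV;

  -- 3) build the secondary indexes on the complete data set
  CREATE INDEX Bix_1 ON B (col1);
  CREATE INDEX Bix_2 ON B (col2);

  -- 4) swap the tables; functions with plans already cached on the old
  --    table are the side effect discussed above
  BEGIN;
  ALTER TABLE A RENAME TO A_trash;
  ALTER TABLE B RENAME TO A;
  COMMIT;
  DROP TABLE A_trash;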
[PERFORM] tuning a function to insert/retrieve values from a reference table
Hello,

I have a simple table id/value, and a function that returns the id of a given value, inserting the latter if not yet present. The probability that a value already exists within the reference table is very high.

Different db users may have their own reference table with different content, but as the table definition is identical, I've defined a public function to maintain these tables.

Can I optimize this function with:

a) remove the EXCEPTION clause (is there an underlying lock that prevents concurrent inserts?)

b) declare the function as being IMMUTABLE?
   - although it may insert a new row, the returned id is invariant for a given user
   (I don't really understand the holdability of immutable functions; are the results cached only for the lifetime of a prepared statement?, or can they be shared by different sessions?)

Thanks,

Marc

-- Table definition:

create table ref_table (
  id serial NOT NULL,
  v varchar NOT NULL,
  constraint ref_table_pk primary key (id)
) without oids;

create unique index ref_table_uk on ref_table(v);

-- Function:

CREATE OR REPLACE FUNCTION public.get_or_insert_value("varchar") RETURNS INT AS $BODY$
DECLARE
  id_value INT;
BEGIN
  SELECT INTO id_value id FROM ref_table WHERE v = $1;
  IF FOUND THEN
    RETURN id_value;
  ELSE
    -- new value to be inserted
    DECLARE
      rec record;
    BEGIN
      FOR rec in INSERT INTO ref_table (v) VALUES ($1) RETURNING id LOOP
        return rec.id;
      END LOOP;
    EXCEPTION -- concurrent access ?
      WHEN unique_violation THEN
        RETURN (SELECT id FROM ref_table WHERE v = $1);
    END;
  END IF;
END;
$BODY$ LANGUAGE 'plpgsql' VOLATILE;
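A small usage sketch of the function above (the values are arbitrary; fact_table and import_data are invented staging/target tables):

  -- look up (or create) a reference value; a repeated call returns the same id
  SELECT public.get_or_insert_value('red');
  SELECT public.get_or_insert_value('red');

  -- typical use while importing: resolve ids on the fly
  INSERT INTO fact_table (colour_id, amount)
  SELECT public.get_or_insert_value(colour), amount
  FROM import_data;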
[PERFORM] Weird row estimate
Hi,

I'm having a weird problem with a query. I've simplified it to get the significant part (see end of message).

The point is, I've got a simple

SELECT field FROM table WHERE 'condition1'
Estimated returned rows: 5453

Then

SELECT field FROM table WHERE 'condition2'
Estimated returned rows: 705

Then

SELECT field FROM table WHERE 'condition1' OR 'condition2'
Estimated returned rows: 143998

Condition2 is a bit complicated (it's a subquery). Nevertheless, shouldn't the third estimate be smaller than or equal to the sum of the two others?

Postgresql is 8.2.4 on Linux, stats are up to date.

show default_statistics_target;
 default_statistics_target
---------------------------
 1000

Any ideas?

explain analyze SELECT stc.CMD_ID FROM STOL_STC stc WHERE (stc.STC_DATE>='2007-07-05' AND stc.STC_DATEPLAN<='2007-07-05');
                                               QUERY PLAN
--------------------------------------------------------------------------------------------------------------
 Seq Scan on stol_stc stc  (cost=0.00..24265.15 rows=5453 width=8) (actual time=17.186..100.941 rows=721 loops=1)
   Filter: ((stc_date >= '2007-07-05'::date) AND (stc_dateplan <= '2007-07-05'::date))
 Total runtime: 101.656 ms
(3 rows)

explain analyze SELECT stc.CMD_ID FROM STOL_STC stc WHERE stc.STC_ID IN
  (SELECT STC_ID FROM STOL_TRJ
   WHERE TRJ_DATEARRT>='2007-07-05' AND TRJ_DATEDEPT>=TRJ_DATEARRT
     AND (TRJ_DATEDEPT<='2007-07-05' OR TRJ_DATECREAT<='2007-07-05'));
                                               QUERY PLAN
--------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=4649.62..10079.52 rows=705 width=8) (actual time=6.266..13.037 rows=640 loops=1)
   ->  HashAggregate  (cost=4649.62..4657.13 rows=751 width=8) (actual time=6.242..6.975 rows=648 loops=1)
         ->  Index Scan using stol_trj_fk5 on stol_trj  (cost=0.00..4647.61 rows=803 width=8) (actual time=0.055..4.901 rows=688 loops=1)
               Index Cond: (trj_datearrt >= '2007-07-05'::date)
               Filter: ((trj_datedept >= trj_datearrt) AND ((trj_datedept <= '2007-07-05'::date) OR (trj_datecreat <= '2007-07-05'::date)))
   ->  Index Scan using stol_stc_pk on stol_stc stc  (cost=0.00..7.21 rows=1 width=16) (actual time=0.004..0.005 rows=1 loops=648)
         Index Cond: (stc.stc_id = stol_trj.stc_id)
 Total runtime: 13.765 ms
(8 rows)

explain analyze SELECT stc.CMD_ID FROM STOL_STC stc WHERE (stc.STC_DATE>='2007-07-05' AND stc.STC_DATEPLAN<='2007-07-05') OR (stc.STC_ID IN
  (SELECT STC_ID FROM STOL_TRJ
   WHERE TRJ_DATEARRT>='2007-07-05' AND TRJ_DATEDEPT>=TRJ_DATEARRT
     AND (TRJ_DATEDEPT<='2007-07-05' OR TRJ_DATECREAT<='2007-07-05')));
                                               QUERY PLAN
--------------------------------------------------------------------------------------------------------------
 Seq Scan on stol_stc stc  (cost=4649.62..29621.12 rows=143998 width=8) (actual time=21.564..146.365 rows=1048 loops=1)
   Filter: (((stc_date >= '2007-07-05'::date) AND (stc_dateplan <= '2007-07-05'::date)) OR (hashed subplan))
   SubPlan
     ->  Index Scan using stol_trj_fk5 on stol_trj  (cost=0.00..4647.61 rows=803 width=8) (actual time=0.054..4.941 rows=688 loops=1)
           Index Cond: (trj_datearrt >= '2007-07-05'::date)
           Filter: ((trj_datedept >= trj_datearrt) AND ((trj_datedept <= '2007-07-05'::date) OR (trj_datecreat <= '2007-07-05'::date)))
 Total runtime: 147.407 ms

SELECT count(*) from stol_stc;
 count
--------
 140960
(1 row)
Re: [PERFORM] Weird row estimate
On Wednesday 11 July 2007 22:35:31, Tom Lane wrote:
> Marc Cousin <[EMAIL PROTECTED]> writes:
> > Nevertheless, shouldn't the third estimate be smaller than or equal to the sum
> > of the two others ?
>
> The planner's estimation for subplan conditions is pretty primitive
> compared to joinable conditions. When you add the OR, it's no longer
> possible to treat the IN like a join, and everything gets an order of
> magnitude dumber :-(
>
> It might be worth trying this as a UNION of the two simple queries.

Yes, it's much better on this query with a UNION. The problem is that this is only a small part of the query, and there are several nested INs with an OR condition... But at least now I understand where it comes from.

Thanks a lot.
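For reference, the UNION form of the simplified query from the previous message would look roughly like this (same tables and conditions as above; UNION removes duplicate CMD_IDs, which is close to but not exactly the same as the original OR, so it has to be checked against the real query):

  SELECT stc.CMD_ID
  FROM STOL_STC stc
  WHERE stc.STC_DATE >= '2007-07-05'
    AND stc.STC_DATEPLAN <= '2007-07-05'
  UNION
  SELECT stc.CMD_ID
  FROM STOL_STC stc
  WHERE stc.STC_ID IN (SELECT STC_ID
                       FROM STOL_TRJ
                       WHERE TRJ_DATEARRT >= '2007-07-05'
                         AND TRJ_DATEDEPT >= TRJ_DATEARRT
                         AND (TRJ_DATEDEPT <= '2007-07-05'
                              OR TRJ_DATECREAT <= '2007-07-05'));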
[PERFORM] Postgres configuration for 64 CPUs, 128 GB RAM...
Postgres configuration for 64 CPUs, 128 GB RAM...

Hello,

We have the opportunity to benchmark our application on a large server. I have to prepare the Postgres configuration and I'd appreciate some comments on it, as I am not experienced with servers of such a scale. Moreover, the configuration should be fail-proof as I won't be able to attend the tests.

Our application (java + perl) and Postgres will run on the same server, whereas the application activity is low when Postgres has large transactions to process.

There is a large gap between our current production server (Linux, 4GB RAM, 4 cpus) and the benchmark server; one of the targets of this benchmark is to verify the scalability of our application.

And you have no reason to be envious, as the server doesn't belong to us :-)

Thanks for your comments,

Marc Mamin

Postgres version: 8.2.1

Server Specifications:
----------------------

Sun SPARC Enterprise M8000 Server:
http://www.sun.com/servers/highend/m8000/specs.xml

File system:
http://en.wikipedia.org/wiki/ZFS

Planned configuration:

# we don't expect more than 150 parallel connections,
# but I suspect a leak in our application that lets some idle connections open
max_connections = 2000
ssl = off

# maximum allowed
shared_buffers = 262143

# on our current best production server with 4GB RAM (not dedicated to Postgres), work_mem is set to 600 MB
# this limitation is probably the bottleneck for our application as the files in pgsql_tmp grow up to 15 GB
# during large aggregations (we have a locking mechanism to avoid parallel processing of such transactions)
work_mem = 31457280             # (30 GB)

# index creation time is also an issue for us; the process is locking other large processes too.
# our largest table so far is 13 GB + 11 GB indexes
maintenance_work_mem = 31457280 # (30 GB)

# more than the max number of tables + indexes expected during the benchmark
max_fsm_relations = 10
max_fsm_pages = 180

# don't know if I should modify this.
# seems to be sufficient on our production servers
max_stack_depth = 2MB

# vacuum will be done by hand between each test session
autovacuum = off

# required to analyse the benchmark
log_min_duration_statement = 1000

max_prepared_transactions = 100

# seems to be required to drop schemas/roles containing large numbers of objects
max_locks_per_transaction = 128

# I use the defaults for the bgwriter as I couldn't find recommendations on those
#bgwriter_delay = 200ms           # 10-1ms between rounds
#bgwriter_lru_percent = 1.0       # 0-100% of LRU buffers scanned/round
#bgwriter_lru_maxpages = 5        # 0-1000 buffers max written/round
#bgwriter_all_percent = 0.333     # 0-100% of all buffers scanned/round
#bgwriter_all_maxpages = 5        # 0-1000 buffers max written/round

# WAL
fsync = on
# use default
#wal_sync_method

# we are using 32 on our production system
wal_buffers = 64

# we didn't make any testing with this parameter until now, but this shouldn't be a relevant
# point as our performance focus is on large transactions
commit_delay = 0

# CHECKPOINT
# xlog will be on a separate disk
checkpoint_segments = 256
checkpoint_timeout = 5min
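Since work_mem can also be changed per session, one option (only a sketch, with an illustrative value taken from the discussion and assuming a 64-bit build) is to keep a moderate server-wide setting and raise it just around the large aggregations:

  -- in the session that runs the nightly aggregation only
  SET work_mem = 3145728;   -- in KB, i.e. ~3 GB
  -- ... run the aggregation queries ...
  RESET work_mem;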
Re: [PERFORM] Postgres configuration for 64 CPUs, 128 GB RAM...
Hello,

Thank you for all your comments and recommendations.

I'm aware that the conditions for this benchmark are not ideal, mostly due to the lack of time to prepare it. We will also need an additional benchmark on a less powerful - more realistic - server to better understand the scalability of our application.

Our application is based on java and generates dynamic reports from log file content. Dynamic means here that a report will be calculated from the postgres data the first time it is requested (it will then be cached). Java is used to drive the data preparation and to handle/generate the report requests. This is much more an OLAP system than an OLTP, at least for our performance concern.

Data preparation:

1) parsing the log files with heavy use of perl (regular expressions) to generate csv files. Prepared statements also maintain reference tables in the DB. Postgres performance is not an issue for this first step.

2) loading the csv files with COPY. As around 70% of the data to load come in a single daily table, we don't allow concurrent jobs for this step. We have between a few and a few hundred files to load into a single table; they are processed one after the other. A primary key is always defined; for the case when the required indexes are already built and when the new data are above a given size, we use a "shadow" table instead (without the indexes), build the indexes after the import and then replace the live table with the shadow one. For example, we have a table of 13 GB + 11 GB indexes (5 pieces).

Performance:
a) is there an "ideal" size to consider for our csv files (100 x 10 MB or better 1 x 1 GB)?
b) maintenance_work_mem: I'll use around 1 GB as recommended by Stefan

3) Data aggregation. This is the heaviest part for Postgres. On our current system some queries need over one hour, with phases of around 100% cpu use alternating with times of heavy i/o load when temporary results are written/read to disk (pgsql_tmp). During the aggregation, other postgres activities are low (at least they should be) as this should take place at night. Currently we have a locking mechanism to avoid having more than one of such queries running concurrently. This may be too strict for the benchmark server, but it better reflects our current hardware capabilities.

Performance: here we should favour a single huge transaction and consider a low probability of having another transaction requiring large sort space. Considering this, is it reasonable to define work_mem as being 3GB (I guess I should raise this parameter dynamically before running the aggregation queries)?

4) Queries (report generation)

We have only a few requests which are not satisfying, while requiring large sort operations. The data are structured in different aggregation levels (minutes, hours, days) with logical time-based partitions in order to limit the data size to compute for a given report. Moreover, we can scale our infrastructure by using different or dedicated Postgres servers for different customers. Smaller customers may share the same instance, each of them having its own schema (the lock mechanism for large aggregations applies to a whole Postgres instance, not to a single customer). The benchmark will help us to plan such distribution.

During the benchmark, we will probably not have more than 50 non-idle connections simultaneously. It is a bit too early for us to fine tune this part.
The benchmark will mainly focus on steps 1 to 3.

During the benchmark, the DB will reach a size of about 400 GB, simulating 3 different customers, with the data quite equally split into 3 schemas.

I will post our configuration(s) later on.

Thanks again for all your valuable input.

Marc Mamin
Re: [PERFORM] join tables vs. denormalization by trigger
Hello,

I had a similar issue and - after testing - decided to merge tables B and C into a single table. In my case the resulting table contains a large proportion of nulls, which limits the size increase...

You'll have to do some testing with your data to evaluate the performance gain.

Hope this helps,

Marc

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Walter Mauritz
Sent: Tuesday, September 04, 2007 8:53 PM
To: pgsql-performance@postgresql.org
Subject: [PERFORM] join tables vs. denormalization by trigger

Hi,

I wonder about the differences in performance between two scenarios:

Background:
Table A, ~50,000 records
Table B, ~3,000,000 records (~20 cols)
Table C, ~30,000,000 records (~10 cols)
a query every 3 sec., with LIMIT 10

Table C depends on Table B which depends on Table A, int8 foreign key, btree index

* consider it a read-only scenario (data loaded only at night, with time for vacuum analyze daily)
* I'm required to show records from Table C, but also with some (~5 cols) info from Table B
* the where clause always contains the foreign key to Table A
* the where clause may contain a further 1-10 search parameters

Scenario A) simply inner join Table B + C

Scenario B) with use of a trigger on insert/update I could push the required information from table B down to table C
-> so I would only need to select from table C.

My questions:
1) From your experience ... how much faster (approximately), in percent, do you regard Scenario B compared to A?
2) any other tips for such a read-only scenario

Thx for any attention :-)

Walter
[PERFORM] performance regression with Linux 2.6.33 and glibc 2.12
Hi. I hope I'm not going to expose an already known problem, but I couldn't find it in the mailing list archives (I only found http://archives.postgresql.org/pgsql-hackers/2009-12/msg01543.php). On one of my (non production) machines, I've just seen a very big performance regression (I was doing a very simple insert test). I had an 'old' 8.4 postgresql compiled a few months ago, performing very well, and my 'bleeding edge' 9.0, doing the same insert very slowly. I managed to find the cause of the regression: with Linux 2.6.33, O_DSYNC is now available. With glibc 2.12, O_DSYNC is available in userspace. Having both (they are both very new, 2.12 isn't even official on the glibc website), my new build defaulted to open_datasync. The problem is that it is much slower. I tested it on 2 small machines (no big raid, just basic machines, with SATA or software RAID).

Here is the trivial test. The configuration is the default configuration, just after initdb:

CREATE TABLE test (a int);
CREATE INDEX idxtest on test (a);

with wal_sync_method = open_datasync (new default)

marc=# INSERT INTO test SELECT generate_series(1,10);
INSERT 0 10
Time: 16083,912 ms

with wal_sync_method = fdatasync (old default)

marc=# INSERT INTO test SELECT generate_series(1,10);
INSERT 0 10
Time: 954,000 ms

Doing synthetic benchmarks with test_fsync (open_datasync performance, glibc 2.12, 2.6.34, 1 SATA drive):

Simple 8k write timing:
write                         0.037511

Compare file sync methods using one 8k write:
open_datasync write          56.998797
open_sync write             168.653995
write, fdatasync             55.359279
write, fsync                166.854911

Compare file sync methods using two 8k writes:
open_datasync write, write  113.342738
open_sync write, write      339.066883
write, write, fdatasync      57.336820
write, write, fsync         166.847923

Compare open_sync sizes:
16k open_sync write         169.423723
2 8k open_sync writes       336.457119

Compare fsync times on write() and new file descriptors (if the times are similar, fsync() can sync data written on a different descriptor):
write, fsync, close         166.264048
write, close, fsync         168.702035

This is it, I just wanted to raise an alert on this: the degradation was 16-fold with this test. We won't see Linux 2.6.33 + glibc 2.12 in production for months (I hope), but shouldn't PostgreSQL use fdatasync by default on Linux, seeing these results? By the way, I re-did my tests with 2.6.33, 2.6.34 and 2.6.35-rc1 and got the exact same result (O_DSYNC there, obviously, but also the performance degradation). Cheers Marc -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
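For anyone wanting to check or work around this on an affected build, a minimal sketch (the setting can also simply be put in postgresql.conf before starting the cluster):

    -- see what the build defaulted to
    SHOW wal_sync_method;

    -- in postgresql.conf, force the old Linux default back:
    --   wal_sync_method = fdatasync
    -- then reload:
    SELECT pg_reload_conf();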
Re: [PERFORM] performance regression with Linux 2.6.33 and glibc 2.12
The Friday 04 June 2010 15:59:05, Tom Lane wrote : > Marc Cousin writes: > > I hope I'm not going to expose an already known problem, but I couldn't > > find it mailing list archives (I only found > > http://archives.postgresql.org/pgsql- hackers/2009-12/msg01543.php). > > You sure this isn't the well-known "ext4 actually implements fsync > where ext3 didn't" issue? > > regards, tom lane Everything is ext4. So I should have fsync working with write barriers on all the tests. I don't think this problem is of the same kind: I think it is really because of O_DSYNC appearing on 2.6.33, and PostgreSQL using it by default now. If my filesystem was lying to me about barriers, I should take no more performance hit with open_datasync than with fdatasync, should I ? -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] CREATE INDEX as bottleneck
Hello, in the last few years we have successfully managed to cope with our data growth by using partitioning and splitting large aggregation tasks across multiple threads. The partitioning is done logically by our application server, thus avoiding trigger overhead. There are a few places in our data flow where we have to wait for index creation before being able to distribute the process across multiple threads again. With the expected growth, CREATE INDEX will probably become a severe bottleneck for us. Is there any chance of seeing major improvement to it in the medium term? I guess the question is naive, but why can't Postgres use multiple threads for a large sort operation? best regards, Marc Mamin -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
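Not an answer to the multi-threading question, but for a single-process build the main knob is maintenance_work_mem, set per session just for the build. A minimal sketch with hypothetical table/index names and a value that assumes the RAM is actually available:

    SET maintenance_work_mem = '2GB';   -- only for this session's index build
    CREATE INDEX stats_key_logtime_idx ON stats_daily (key, logtime);
    RESET maintenance_work_mem;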
Re: [PERFORM] CREATE INDEX as bottleneck
No, CONCURRENTLY is there to improve table availability during index creation, but it degrades performance. best regards, Marc Mamin -Original Message- From: Alex Hunsaker [mailto:bada...@gmail.com] Sent: Donnerstag, 11. November 2010 19:55 To: Marc Mamin Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] CREATE INDEX as bottleneck On Thu, Nov 11, 2010 at 06:41, Marc Mamin wrote: > There are a few places in our data flow where we have to wait for index > creation before being able to distribute the process on multiple threads > again. Would CREATE INDEX CONCURRENTLY help here? -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] anti-join chosen even when slower than old plan
Hello, just a short thought: would it be conceivable to compare the planner's estimates with the actual results and somehow log the worst cases? a) to help the DBA locate bad statistics and queries b) as an additional information source for the planner This could possibly adjust the parameters of your formula on the fly. best regards, Marc Mamin -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
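Part of (a) can already be approximated with the contrib module auto_explain (available since 8.4), which logs the plans - including estimated vs. actual row counts - of statements exceeding a threshold. A minimal sketch, with made-up threshold values:

    LOAD 'auto_explain';
    SET auto_explain.log_min_duration = '5s';  -- log plans of statements slower than this
    SET auto_explain.log_analyze = on;         -- include actual rows next to the estimates
    -- it can also be enabled server-wide via shared_preload_libraries in postgresql.conf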
Re: [PERFORM] hashed subplan 5000x slower than two sequential operations
Hello, are the table freshly analyzed, with a sufficient default_statistics_target ? You may try to get a better plan while rewriting the query as an UNION to get rid of the OR clause. Something like (not tested): SELECT contexts.context_key FROM contexts JOIN articles ON (articles.context_key=contexts.context_key) WHERE contexts.parent_key = 392210 AND articles.indexed UNION SELECT context_key FROM ( SELECT contexts.context_key FROM contexts JOIN collection_data ON ( contexts.context_key = collection_data .context_key) WHERE collection_data.collection_context_key = 392210) ) foo JOIN articles ON (foo.context_key=contexts.context_key) WHERE articles.indexed ; I've had one similar problem where there was no way for the planner to notice that the query would systematically return very few rows. Here, my last resort was to disable some planner methods within the given transaction. regards, Marc Mamin -Ursprüngliche Nachricht- Von: pgsql-performance-ow...@postgresql.org im Auftrag von Shrirang Chitnis Gesendet: Mi 12/8/2010 8:05 An: Bryce Nesbitt; pgsql-performance@postgresql.org Betreff: Re: [PERFORM] hashed subplan 5000x slower than two sequential operations Bryce, The two queries are different: You are looking for contexts.context_key in first query WHERE (contexts.parent_key = 392210 OR contexts.context_key IN (SELECT collection_data.context_key FROM collection_data WHERE collection_data.collection_context_key = 392210) but second query has context.parent_key WHERE (contexts.parent_key = 392210 OR contexts.parent_key IN (392210,392210,395073,1304250)) Is the contexts.context_key an indexed field? contexts.parent_key certainly seems to be. HTH, Shrirang Chitnis Sr. Manager, Applications Development HOV Services Office: (866) 808-0935 Ext: 39210 shrirang.chit...@hovservices.com www.hovservices.com The information contained in this message, including any attachments, is attorney privileged and/or confidential information intended only for the use of the individual or entity named as addressee. The review, dissemination, distribution or copying of this communication by or to anyone other than the intended addressee is strictly prohibited. If you have received this communication in error, please immediately notify the sender by replying to the message and destroy all copies of the original message. -Original Message- From: pgsql-performance-ow...@postgresql.org [mailto:pgsql-performance-ow...@postgresql.org] On Behalf Of Bryce Nesbitt Sent: Thursday, December 09, 2010 12:24 AM To: pgsql-performance@postgresql.org Subject: [PERFORM] hashed subplan 5000x slower than two sequential operations Can you help me understand how to optimize the following. 
There's a subplan which in this case returns 3 rows, but it is really expensive: = explain analyze SELECT contexts.context_key FROM contexts JOIN articles ON (articles.context_key=contexts.context_key) WHERE (contexts.parent_key = 392210 OR contexts.context_key IN (SELECT collection_data.context_key FROM collection_data WHERE collection_data.collection_context_key = 392210) ) AND articles.indexed ; QUERY PLAN Hash Join (cost=83054.41..443755.45 rows=261077 width=4) (actual time=4362.143..6002.808 rows=28 loops=1) Hash Cond: (articles.context_key = contexts.context_key) -> Seq Scan on articles (cost=0.00..345661.91 rows=522136 width=4) (actual time=0.558..3953.002 rows=517356 loops=1) Filter: indexed -> Hash (cost=69921.25..69921.25 rows=800493 width=4) (actual time=829.501..829.501 rows=31 loops=1) -> Seq Scan on contexts (cost=14.31..69921.25 rows=800493 width=4) (actual time=1.641..829.339 rows=31 loops=1) Filter: ((parent_key = 392210) OR (hashed subplan)) SubPlan -> Index Scan using collection_data_context_key_index on collection_data (cost=0.00..14.30 rows=6 width=4) (actual time=0.018..0.023 rows=3 loops=1) Index Cond: (collection_context_key = 392210) Total runtime: 6002.976 ms (11 rows) = explain analyze SELECT contexts.context_key FROM contexts JOIN articles ON (articles.context_key=contexts.context_key) WHERE (contexts.parent_key = 392210 OR contexts.parent_key IN (392210,392210,395073,1304250)) AND articles.indexed ;
Re: [PERFORM] hashed subplan 5000x slower than two sequential operations
Another point: would a conditionl index help ? on articles (context_key) where indexed regards, -Ursprüngliche Nachricht- Von: pgsql-performance-ow...@postgresql.org im Auftrag von Marc Mamin Gesendet: Mi 12/8/2010 9:06 An: Shrirang Chitnis; Bryce Nesbitt; pgsql-performance@postgresql.org Betreff: Re: [PERFORM] hashed subplan 5000x slower than two sequential operations Hello, are the table freshly analyzed, with a sufficient default_statistics_target ? You may try to get a better plan while rewriting the query as an UNION to get rid of the OR clause. Something like (not tested): SELECT contexts.context_key FROM contexts JOIN articles ON (articles.context_key=contexts.context_key) WHERE contexts.parent_key = 392210 AND articles.indexed UNION SELECT context_key FROM ( SELECT contexts.context_key FROM contexts JOIN collection_data ON ( contexts.context_key = collection_data .context_key) WHERE collection_data.collection_context_key = 392210) ) foo JOIN articles ON (foo.context_key=contexts.context_key) WHERE articles.indexed ; I've had one similar problem where there was no way for the planner to notice that the query would systematically return very few rows. Here, my last resort was to disable some planner methods within the given transaction. regards, Marc Mamin -Ursprüngliche Nachricht- Von: pgsql-performance-ow...@postgresql.org im Auftrag von Shrirang Chitnis Gesendet: Mi 12/8/2010 8:05 An: Bryce Nesbitt; pgsql-performance@postgresql.org Betreff: Re: [PERFORM] hashed subplan 5000x slower than two sequential operations Bryce, The two queries are different: You are looking for contexts.context_key in first query WHERE (contexts.parent_key = 392210 OR contexts.context_key IN (SELECT collection_data.context_key FROM collection_data WHERE collection_data.collection_context_key = 392210) but second query has context.parent_key WHERE (contexts.parent_key = 392210 OR contexts.parent_key IN (392210,392210,395073,1304250)) Is the contexts.context_key an indexed field? contexts.parent_key certainly seems to be. HTH, Shrirang Chitnis Sr. Manager, Applications Development HOV Services Office: (866) 808-0935 Ext: 39210 shrirang.chit...@hovservices.com www.hovservices.com The information contained in this message, including any attachments, is attorney privileged and/or confidential information intended only for the use of the individual or entity named as addressee. The review, dissemination, distribution or copying of this communication by or to anyone other than the intended addressee is strictly prohibited. If you have received this communication in error, please immediately notify the sender by replying to the message and destroy all copies of the original message. -Original Message- From: pgsql-performance-ow...@postgresql.org [mailto:pgsql-performance-ow...@postgresql.org] On Behalf Of Bryce Nesbitt Sent: Thursday, December 09, 2010 12:24 AM To: pgsql-performance@postgresql.org Subject: [PERFORM] hashed subplan 5000x slower than two sequential operations Can you help me understand how to optimize the following. 
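For what it's worth, the conditional (partial) index suggested above would look roughly like this (the index name is made up):

    CREATE INDEX articles_context_key_indexed ON articles (context_key) WHERE indexed;

It only pays off if a meaningful fraction of the articles rows has indexed = false, so that the partial index stays noticeably smaller than a full one.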
There's a subplan which in this case returns 3 rows, but it is really expensive: = explain analyze SELECT contexts.context_key FROM contexts JOIN articles ON (articles.context_key=contexts.context_key) WHERE (contexts.parent_key = 392210 OR contexts.context_key IN (SELECT collection_data.context_key FROM collection_data WHERE collection_data.collection_context_key = 392210) ) AND articles.indexed ; QUERY PLAN Hash Join (cost=83054.41..443755.45 rows=261077 width=4) (actual time=4362.143..6002.808 rows=28 loops=1) Hash Cond: (articles.context_key = contexts.context_key) -> Seq Scan on articles (cost=0.00..345661.91 rows=522136 width=4) (actual time=0.558..3953.002 rows=517356 loops=1) Filter: indexed -> Hash (cost=69921.25..69921.25 rows=800493 width=4) (actual time=829.501..829.501 rows=31 loops=1) -> Seq Scan on contexts (cost=14.31..69921.25 rows=800493 width=4) (actual time=1.641..829.339 rows=31 loops=1) Filter: ((parent_key = 392210) OR (hashed subplan)) SubPlan -> Index Scan using collection_data_context_key_index on collection_data (cost=0.00..14.30 rows=6 width=4) (actual time=0.018..0.023 rows=3 loops=1) Index Cond: (collection_context_key = 392210) Total runtime: 6002.976 ms (11 rows) =
Re: [PERFORM] UNION and bad performance
Hello, UNION will remove all duplicates, so the result additionally has to be sorted. Anyway, for performance issues you should always start the investigation with EXPLAIN ANALYZE. regards, Marc Mamin From: pgsql-performance-ow...@postgresql.org [mailto:pgsql-performance-ow...@postgresql.org] On Behalf Of Olivier Pala Sent: Donnerstag, 9. Dezember 2010 11:52 To: pgsql-performance@postgresql.org Cc: Olivier Pala Subject: [PERFORM] UNION and bad performance Hi, I have a performance problem with a UNION query. First I have this view : SELECT a,b,c FROM table1, table2 WHERE jointure AND condition Result : 150 000 records in 1~2 s Then, I add a UNION into the same view : SELECT a,b,c FROM table1, table2 WHERE jointure AND condition UNION SELECT a,b,c FROM table3 Result : 150 200 records in 6~7 s Why do I have bad performance for only 200 added records ? Thanks DBMS : Postgres 8.3 and 8.4
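If duplicates between the two branches cannot occur (or do not matter), UNION ALL avoids the duplicate-removal sort entirely. A sketch mirroring the pseudo-query above (jointure and condition stand in for the real join and filter clauses):

    SELECT a, b, c FROM table1, table2 WHERE jointure AND condition
    UNION ALL
    SELECT a, b, c FROM table3;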
Re: [PERFORM] Major performance problem after upgrade from 8.3 to 8.4
Hi, I had a similar problem with many left joins. Reading about planner optimization, I tried editing postgresql.conf, uncommenting the line join_collapse_limit = 8 and setting it to 1, which disables collapsing of explicit joins. My query was taking 2000 s in 8.4 while the same query took 2 ms in 8.3. Now it is working fast in 8.4. Best regards, Marc -- View this message in context: http://postgresql.1045698.n5.nabble.com/Major-performance-problem-after-upgrade-from-8-3-to-8-4-tp2796390p3329435.html Sent from the PostgreSQL - performance mailing list archive at Nabble.com. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
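The same effect can be limited to the problematic query instead of being changed globally in postgresql.conf, which is usually safer; a sketch:

    BEGIN;
    SET LOCAL join_collapse_limit = 1;  -- planner keeps the explicit JOIN order of this query
    -- run the many-left-join query here
    COMMIT;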
Re: [PERFORM] inheritance: planning time vs children number vs column number
The Monday 28 February 2011 13:57:45, Heikki Linnakangas wrote : > On 28.02.2011 11:38, Marc Cousin wrote: > > I've been facing a very large (more than 15 seconds) planning time in a > > partitioned configuration. The amount of partitions wasn't completely > > crazy, around 500, not in the thousands. The problem was that there were > > nearly 1000 columns in the parent table (very special use case, there is > > a reason for this application for having these many columns). The check > > constraint was extremely simple (for each child, 1 column = 1 constant, > > always the same column). > > > > As I was surprised by this very large planning time, I have been trying > > to study the variation of planning time against several parameters: - > > number of columns > > - number of children tables > > - constraint exclusion's value (partition or off) > > > > What (I think) I measured is that the planning time seems to be O(n^2) > > for the number of columns, and O(n^2) for the number of children tables. > > > > Constraint exclusion had a limited impact on planning time (it added > > between 20% and 100% planning time when there were many columns). > > Testing here with a table with 1000 columns and 100 partitions, about > 80% of the planning time is looking up the statistics on attribute > width, to calculate average tuple width. I don't see O(n^2) behavior, > though, it seems linear. It is only based on experimentation, for my part, of course… If you measure the planning time, modifying either the columns or the partitions number, the square root of the planning time is almost perfectly proportional with the parameter you're playing with.
Re: [PERFORM] inheritance: planning time vs children number vs column number
The Monday 28 February 2011 16:35:37, Tom Lane wrote : > Marc Cousin writes: > > The Monday 28 February 2011 13:57:45, Heikki Linnakangas wrote : > >> Testing here with a table with 1000 columns and 100 partitions, about > >> 80% of the planning time is looking up the statistics on attribute > >> width, to calculate average tuple width. I don't see O(n^2) behavior, > >> though, it seems linear. > > > > It is only based on experimentation, for my part, of course⊠> > > > If you measure the planning time, modifying either the columns or the > > partitions number, the square root of the planning time is almost > > perfectly proportional with the parameter you're playing with. > > Could we see a concrete example demonstrating that? I agree with Heikki > that it's not obvious what you are testing that would have such behavior. > I can think of places that would have O(N^2) behavior in the length of > the targetlist, but it seems unlikely that they'd come to dominate > runtime at a mere 1000 columns. > > regards, tom lane I feel a little silly not having provided a test case from the start… A script doing a complete test is attached to this email. It's doing a simple CREATE TABLE test_father (col0 int,col1 int,col2 int,col3 int,col4 int,col5 int,col6 int,col7 int,col8 int,col9 int,col10 in t,col11 int,col12 int,col13 int,col14 int,col15 int,col16 int,col17 int,col18 int,col19 int,col20 int,col21 int,col22 int,co l23 int,…) Followed by 600 CREATE TABLE test_child_0 (CHECK (col0=0)) INHERITS (test_father); And a single SELECT col0 FROM test_father WHERE col0=0; Here are my results (from the same machine). I've done it with 600 partitions, to have big planning times. If you need a smaller one (this one takes nearly ten minutes to run) tell me. COLS:100 PARTITIONS:600 Time : 513,764 ms (sqrt : 22.6) COLS:200 PARTITIONS:600 Time : 906,214 ms (sqrt : 30.1) COLS:300 PARTITIONS:600 Time : 2255,390 ms (sqrt : 47.48) COLS:400 PARTITIONS:600 Time : 4816,820 ms (sqrt : 69.4) COLS:500 PARTITIONS:600 Time : 5736,602 ms (sqrt : 75.73) COLS:600 PARTITIONS:600 Time : 7659,617 ms (sqrt : 87.51) COLS:700 PARTITIONS:600 Time : 9313,260 ms (sqrt : 96.5) COLS:800 PARTITIONS:600 Time : 13700,353 ms (sqrt : 117.04) COLS:900 PARTITIONS:600 Time : 13914,765 ms (sqrt : 117.95) COLS:1000 PARTITIONS:600 Time : 20335,750 ms (sqrt : 142.6) COLS:1100 PARTITIONS:600 Time : 21048,958 ms (sqrt : 145.08) COLS:1200 PARTITIONS:600 Time : 27619,559 ms (sqrt : 166.18) COLS:1300 PARTITIONS:600 Time : 31357,353 ms (sqrt : 177.08) COLS:1400 PARTITIONS:600 Time : 34435,711 ms (sqrt : 185.57) COLS:1500 PARTITIONS:600 Time : 38954,676 ms (sqrt : 197.37) As for my previous results, these ones are on a machine doing a bit of other work, so some values may be a bit offset, and it's only one measure each time anyway. The CSV file I sent from the first email is obtained running the exact same commands, but playing on both columns and partitions, and averaged over 3 measures. Regards. script.sql.bz2 Description: application/bzip -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] inheritance: planning time vs children number vs column number
Le mardi 01 mars 2011 07:20:19, Tom Lane a écrit : > Marc Cousin writes: > > The Monday 28 February 2011 16:35:37, Tom Lane wrote : > >> Could we see a concrete example demonstrating that? I agree with Heikki > >> that it's not obvious what you are testing that would have such > >> behavior. I can think of places that would have O(N^2) behavior in the > >> length of the targetlist, but it seems unlikely that they'd come to > >> dominate runtime at a mere 1000 columns. > > > > I feel a little silly not having provided a test case from the startق� > > > > A script doing a complete test is attached to this email. > > I did some oprofile analysis of this test case. It's spending > essentially all its time in SearchCatCache, on failed searches of > pg_statistic. The cache accumulates negative entries for each probed > column, and then the searches take time proportional to the number of > entries, so indeed there is an O(N^2) behavior --- but N is the number > of columns times number of tables in your test case, not just the number > of columns. > > The cache is a hash table, so ideally the search time would be more or > less constant as the table grows, but to make that happen we'd need to > reallocate with more buckets as the table grows, and catcache.c doesn't > do that yet. We've seen a few cases that make that look worth doing, > but they tend to be pretty extreme, like this one. > > It's worth pointing out that the only reason this effect is dominating > the runtime is that you don't have any statistics for these toy test > tables. If you did, the cycles spent using those entries would dwarf > the lookup costs, I think. So it's hard to get excited about doing > anything based on this test case --- it's likely the bottleneck would be > somewhere else entirely if you'd bothered to load up some data. > > regards, tom lane Yes, for the same test case, with a bit of data in every partition and statistics up to date, planning time goes from 20 seconds to 125ms for the 600 children/1000 columns case. Which is of course more than acceptable. Now I've got to check it's the same problem on the real environment. I think it has quite a few empty partitions, so no statistics for them… Thanks a lot. Marc -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] inheritance: planning time vs children number vs column number
The Tuesday 01 March 2011 16:33:51, Tom Lane wrote : > Marc Cousin writes: > > Le mardi 01 mars 2011 07:20:19, Tom Lane a écrit : > >> It's worth pointing out that the only reason this effect is dominating > >> the runtime is that you don't have any statistics for these toy test > >> tables. If you did, the cycles spent using those entries would dwarf > >> the lookup costs, I think. So it's hard to get excited about doing > >> anything based on this test case --- it's likely the bottleneck would be > >> somewhere else entirely if you'd bothered to load up some data. > > > > Yes, for the same test case, with a bit of data in every partition and > > statistics up to date, planning time goes from 20 seconds to 125ms for > > the 600 children/1000 columns case. Which is of course more than > > acceptable. > > [ scratches head ... ] Actually, I was expecting the runtime to go up > not down. Maybe there's something else strange going on here. > > regards, tom lane Then, what can I do to help ?
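In case it is useful to someone hitting the same symptom: the workaround that made the difference here was simply making sure the child tables have statistics once they contain data (reusing the test script's naming):

    -- after loading some data into the children:
    ANALYZE test_child_0;
    ANALYZE test_child_1;
    -- or just analyze everything:
    ANALYZE;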
[PERFORM] partitioning and locking problems
We have a large database system designed around partitioning. Our application is characterized with - terabytes of data - billions of rows in dozens of base tables (and 100s of paritions) - 24x7 insert load of new data that cannot be stopped, data is time sensitive. - periodic reports that can have long running queries with query times measured in hours We have 2 classes of "maintenance" activities that are causing us problems: - periodically we need to change an insert rule on a view to point to a different partition. - periodically we need to delete data that is no longer needed. Performed via truncate. Under both these circumstances (truncate and create / replace rule) the locking behaviour of these commands can cause locking problems for us. The scenario is best illustrated as a series of steps: 1- long running report is running on view 2- continuous inserters into view into a table via a rule 3- truncate or rule change occurs, taking an exclusive lock. Must wait for #1 to finish. 4- new reports and inserters must now wait for #3. 5- now everyone is waiting for a single query in #1. Results in loss of insert data granularity (important for our application). Would like to understand the implications of changing postgres' code/locking for rule changes and truncate to not require locking out select statements? The following is a simplified schema to help illustrate the problem. create table a_1 ( pkey int primary key ); create table a_2 ( pkey int primary key ); create view a as select * from a_1 union all select * from a_2; create function change_rule(int) returns void as ' begin execute ''create or replace rule insert as on insert to a do instead insert into a_''||$1||''(pkey) values(NEW.pkey)''; end; ' language plpgsql; -- change rule, execute something like the following periodically select change_rule(1); We've looked at the code and the rule changes appear "easy" but we are concerned about the required changes for truncate. Thanks Marc ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] partitioning and locking problems
Tom, Do you mean it would be impossible to change the code so that existing selects continue to use the pre-truncated table until they commit? Or would it just require a more extensive change? The update/insert rule change appears to be more doable, no? > -Original Message- > From: Tom Lane [mailto:[EMAIL PROTECTED] > Sent: Wednesday, February 01, 2006 12:50 AM > To: Marc Morin > Cc: pgsql-performance@postgresql.org > Subject: Re: [PERFORM] partitioning and locking problems > > "Marc Morin" <[EMAIL PROTECTED]> writes: > > Would like to understand the implications of changing postgres' > > code/locking for rule changes and truncate to not require > locking out > > select statements? > > It won't work... > > regards, tom lane > > ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] partitioning and locking problems
Using a separate lock table is what we've decided to do in this particular case to serialize #1 and #3. Inserters don't take this lock and as such will not be stalled. > -Original Message- > From: Markus Schaber [mailto:[EMAIL PROTECTED] > Sent: Thursday, February 02, 2006 7:44 AM > To: Marc Morin > Cc: pgsql-performance@postgresql.org > Subject: Re: [PERFORM] partitioning and locking problems > > Hi, Marc, > > Marc Morin wrote: > > > 1- long running report is running on view > > 2- continuous inserters into view into a table via a rule > > 3- truncate or rule change occurs, taking an exclusive lock. > > Must wait for #1 to finish. > > 4- new reports and inserters must now wait for #3. > > 5- now everyone is waiting for a single query in #1. Results > > in loss of insert data granularity (important for our application). > > Apart from having two separate views (one for report, one for > insert) as Richard suggested: > > If you have fixed times for #3, don't start any #1 that won't > finish before it's time for #3. > > You could also use the LOCK command on an empty lock table at > the beginning of each #1 or #3 transaction to prevent #3 from > getting the view lock before #1 is finished. > > > HTH, > Markus > > -- > Markus Schaber | Logical Tracking&Tracing International AG > Dipl. Inf. | Software Development GIS > > Fight against software patents in EU! www.ffii.org > www.nosoftwarepatents.org > > ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
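For reference, a minimal sketch of the lock-table approach described above, using the simplified schema from the first message (the lock table name is made up). Reports take a shared lock, maintenance takes an exclusive one, and inserters never touch the lock table, so they are never stalled:

    CREATE TABLE report_maintenance_lock ();   -- empty table, used only for locking

    -- long running report (#1):
    BEGIN;
    LOCK TABLE report_maintenance_lock IN SHARE MODE;
    SELECT count(*) FROM a;                    -- the long report query against the view
    COMMIT;

    -- truncate / rule change (#3), waits for running reports but not for inserters:
    BEGIN;
    LOCK TABLE report_maintenance_lock IN ACCESS EXCLUSIVE MODE;
    TRUNCATE a_1;                              -- or: SELECT change_rule(2);
    COMMIT;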
Re: [PERFORM] partitioning and locking problems
All good ideas, unfortunately, we can't change the inserting applicatin code easily. > -Original Message- > From: Simon Riggs [mailto:[EMAIL PROTECTED] > Sent: Tuesday, February 07, 2006 5:09 PM > To: Marc Morin > Cc: Markus Schaber; pgsql-performance@postgresql.org > Subject: Re: [PERFORM] partitioning and locking problems > > On Thu, 2006-02-02 at 11:27 -0500, Marc Morin wrote: > > > > > 1- long running report is running on view > > > > 2- continuous inserters into view into a table > via a rule > > > > 3- truncate or rule change occurs, taking an > exclusive lock. > > > > Must wait for #1 to finish. > > > > 4- new reports and inserters must now wait for #3. > > > > 5- now everyone is waiting for a single query > in #1. Results > > > > in loss of insert data granularity (important for our > application). > > > Using a separate lock table is what we've decided to do in this > > particular case to serialize #1 and #3. Inserters don't take this > > lock and as such will not be stalled. > > Would it not be simpler to have the Inserters change from one > table to another either upon command, on a fixed timing cycle > or even better based upon one of the inserted values > (Logdate?) (or all 3?). (Requires changes in the application > layer: 3GL or db functions). > > The truncates can wait until the data has stopped being used. > > I'd be disinclined to using the locking system as a scheduling tool. > > Best Regards, Simon Riggs > > > ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] 10+hrs vs 15min because of just one index
We've done a lot of testing on large DBs with a lot of "inserts" and have a few comments. The updates are "treated" as a large "insert", as we all know, from pg's point of view. We've run into 2 classes of problems: excessive WAL checkpoints and the effects of low correlation.

The WAL logs the full 8K block for the first modification, then only the changes. This can be the source of "undesirable" behaviour during large batch inserts like this. From your config, a checkpoint will be forced when

(checkpoint_segments * 16M) < rows * (8K/N * h + (1-h) * 8K) * B

where h is the "hit rate", or correlation between the update scan and the index. Do you have a sense of what this is? In the limits, we have 100% correlation or 0% correlation. N reflects the lower cost of putting just the change in the WAL entry; not sure what this is, but the cost is small, I am assuming, say N=100. B is the average number of blocks changed per updated row (assume B=1.1 for your case; heap and serial index have very high correlation).

In the 0% correlation case, each updated row will cause the index update to read/modify a block. The modified block will be entirely written to the WAL log. After (30 * 16M) / (8K) / 1.1 ~ 55k rows, a checkpoint will be forced and all modified blocks in shared buffers will be written out.

Try increasing checkpoint_segments to 300 and see if that makes a difference. If so, the excessive WAL checkpoints are your issue. If performance is exactly the same, then I would assume that you have close to 0% correlation between the rows in the heap and the index.

Can you increase shared_buffers? With a low correlation index, the only solution is to hold the working set of blocks in memory. Also, make sure that the checkpoint segments are big enough for you to modify the blocks in place; you don't want checkpoints occurring in the middle of the update. Note that the more updates you do, the larger the table/indexes become and the worse the problem gets. Vacuuming the table is an "answer", but unfortunately it tends to decrease correlation, from our observations. :-(

From our observations, dropping indexes and rebuilding them is not always practical; it depends on your application, since the table will be exclusively locked during the transaction due to the drop index. I haven't looked at pg's code for creating an index, but I seriously suspect it's doing an external sort and then inserting into the index. Such operations would have 100% correlation from the index insert point of view, and the "sort" could be in memory or the tape variety (more efficient I/O pattern).

In summary: the number of indexes, index correlation, pg's multi-versioning, shared_buffers and checkpoint_segments are interconnected in weird and wonderful ways... We have seldom found "simple" solutions to performance problems. Marc

> -Original Message- > From: [EMAIL PROTECTED] > [mailto:[EMAIL PROTECTED] On Behalf Of > Aaron Turner > Sent: Friday, February 10, 2006 3:17 AM > To: pgsql-performance@postgresql.org > Subject: [PERFORM] 10+hrs vs 15min because of just one index > > So I'm trying to figure out how to optimize my PG install > (8.0.3) to get better performance without dropping one of my indexes. > > Basically, I have a table of 5M records with 3 columns: > > pri_key (SERIAL) > data char(48) > groupid integer > > there is an additional unique index on the data column. > > The problem is that when I update the groupid column for all > the records, the query takes over 10hrs (after that I just > canceled the update). Looking at iostat, top, vmstat shows > I'm horribly disk IO bound (for data not WAL, CPU 85-90% > iowait) and not swapping. 
> > Dropping the unique index on data (which isn't used in the > query), running the update and recreating the index runs in > under 15 min. > Hence it's pretty clear to me that the index is the problem > and there's really nothing worth optimizing in my query. > > As I understand from #postgresql, doing an UPDATE on one > column causes all indexes for the effected row to have to be > updated due to the way PG replaces the old row with a new one > for updates. This seems to explain why dropping the unique > index on data solves the performance problem. > > interesting settings: > shared_buffers = 32768 > maintenance_work_mem = 262144 > fsync = true > wal_sync_method = open_sync > wal_buffers = 512 > checkpoint_segments = 30 > effective_cache_size = 1 > work_mem = (1024 i think?) > > box: > Linux 2.6.9-11EL (CentOS 4.1) > 2x Xeon 3.4 HT > 2GB of RAM (but Apache and other services are running) > 4 disk raid 10 (74G Raptor) for data > 4 disk raid 10 (7200rpm) for WAL > > other then throwing more spindles at the problem, any suggestions? >
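A sketch of the two configuration changes suggested above, applied to the 8.0-era postgresql.conf quoted in the original message; the values are only the ones mentioned in the discussion and assume the box actually has the RAM to spare (it is also running Apache):

    # postgresql.conf
    checkpoint_segments = 300   # was 30; far fewer forced checkpoints during the bulk UPDATE
    shared_buffers = 65536      # 8 kB pages (~512 MB); only if enough free RAM remains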
[PERFORM] Trouble managing planner for timestamptz columns
We have large tables that hold statistics based on time. They are of the form:

CREATE TABLE stats (
    id serial primary key,
    logtime timestamptz,
    d1 int,
    s1 bigint
);
CREATE INDEX idx ON stats(logtime);

Some of these tables have new data inserted at a rate of 500,000+ rows/hour. The entire table will grow to being 10's to 100's of millions of rows in size. (Yes, we are also partitioning these; it's the size of an individual partition that we're talking about.) We tend to analyze these tables every day or so, and this doesn't always prove to be sufficient.

Our application is a reporting application and the end users typically like to query the newest data the most. As such, the queries are of the form...

select * from stats inner join dimension_d1 using (d1) where logtime between X and Y and d1.something = value;

This usually results in a hash join (good thing) where the dimension table is loaded into the hash table and it index scans stats using the idx index. The trouble starts when both X and Y are times "after" the last analyze. This restriction clause is outside the range of values in the histogram created by the last analyze. Postgres's estimate of the number of rows returned here is usually very low and incorrect, as you'd expect... Trouble can occur when the planner "flips" its decision and decides to hash join by loading the results of the index scan on idx into the hash table instead of the dimension table.

Since the table is so large and the system is busy (disk not idle at all), doing an analyze on this table in the production system can take 1/2 hour! (statistics target set to 100). We can't "afford" to analyze more often...

It certainly would be nice if postgres could somehow understand that some columns are "dynamic" and that their histograms could be stretched to the maximal values, or could use some other technique for estimating rows to the right of the range of values in the histogram... Or have some concept of error bars on its planner decisions. Suggestions? Comments? Marc ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Trouble managing planner for timestamptz columns
Well this analyze just took 12 minutes... Stats target of 100. # time psql xxx xxx -c "analyze elem_trafficstats_1" ANALYZE real12m1.070s user0m0.001s sys 0m0.015s A large table, but by far, not the largest... Have about 1 dozen or so tables like this, so analyzing them will take 3-4 hours of time... No weird datatypes, just bigints for facts, timestamptz and ints for dimensions. My problem is not the analyze itself, it's the fact that our db is really busy doing stuff Analyze I/O is competing... I am random I/O bound like crazy. If I set the stats target to 10, I get # time psql xxx -c "set session default_statistics_target to 10;analyze elem_trafficstats_1" ANALYZE real2m15.733s user0m0.009s sys 0m2.255s Better, but not sure what side affect this would have. > -Original Message- > From: Tom Lane [mailto:[EMAIL PROTECTED] > Sent: Friday, March 10, 2006 1:31 PM > To: Marc Morin > Cc: pgsql-performance@postgresql.org > Subject: Re: [PERFORM] Trouble managing planner for > timestamptz columns > > "Marc Morin" <[EMAIL PROTECTED]> writes: > > We tend to analyze these tables every day or so and this doesn't > > always prove to be sufficient > > Seems to me you just stated your problem. Instead of having > the planner make wild extrapolations, why not set up a cron > job to analyze these tables more often? Or use autovacuum > which will do it for you. > > > Since the table is so large and the system is busy (disk > not idle at > > all), doing an analyze on this table in the production > system can take > > 1/2 hour! (statistics collector set to 100). > > I'd believe that for vacuum analyze, but analyze alone should > be cheap. > Have you perhaps got some weird datatypes in the table? > Maybe you should back off the stats target a bit? > > We do support analyzing selected columns, so you might try > something like a cron job analyzing only the timestamp > column, with a suitably low stats target for that column. > This would yield numbers far more reliable than any > extrapolation the planner could do. > > regards, tom lane > ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
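A sketch of the per-column approach suggested in the quoted reply, using the table name from the timing above and the logtime column from the simplified schema earlier in the thread (adjust to the real column name):

    -- cheap statistics target for just the timestamp column:
    ALTER TABLE elem_trafficstats_1 ALTER COLUMN logtime SET STATISTICS 10;
    -- re-analyze only that column, frequently (e.g. from cron):
    ANALYZE elem_trafficstats_1 (logtime);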
Re: [PERFORM] Query runs too long for indexed tables
Wondering if Update firma1.rid set toode=null where toode is not null and not exists(select 1 from firma1.toode where toode=rid.toode); Would be faster... Problem appears to be the seqscan of seqscan... No? > -Original Message- > From: [EMAIL PROTECTED] > [mailto:[EMAIL PROTECTED] On Behalf Of > Scott Marlowe > Sent: Tuesday, April 04, 2006 3:49 PM > To: Andrus > Cc: pgsql-performance@postgresql.org > Subject: Re: [PERFORM] Query runs too long for indexed tables > > On Tue, 2006-04-04 at 14:37, Andrus wrote: > > I have relatively small tables (toode and rid) in fast server. > > Both tables are indexed on toode field. > > > > Following query takes long time to run. > > toode field type is char(20). It is difficult to change > this field type. > > > > Any idea how to speed up this query ? > > > > UPDATE firma1.rid SET toode=NULL > >WHERE toode IS NOT NULL AND > >toode NOT IN (SELECT TOODE FROM firma1.TOODE); > > > > Query returned successfully: 0 rows affected, 594813 ms > execution time. > > > > explain window shows: > > > > Seq Scan on rid (cost=2581.07..20862553.77 rows=51848 width=1207) > > Filter: ((toode IS NOT NULL) AND (NOT (subplan))) > > SubPlan > > -> Materialize (cost=2581.07..2944.41 rows=14734 width=84) > > -> Seq Scan on toode (cost=0.00..2350.34 rows=14734 > > width=84) > > Let me guess, you've updated it a lot and aren't familiar with Vacuum? > > run a vacuum full on your database. schedule a vacuum (plain > one) to run every so often (hours or days are a good interval > for most folks) > > If that's NOT your problem, then please, let us know. > > ---(end of > broadcast)--- > TIP 2: Don't 'kill -9' the postmaster > ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Query runs too long for indexed tables
Explain analyze would be nice ;-) > -Original Message- > From: [EMAIL PROTECTED] > [mailto:[EMAIL PROTECTED] On Behalf Of Andrus > Sent: Tuesday, April 04, 2006 3:37 PM > To: pgsql-performance@postgresql.org > Subject: [PERFORM] Query runs too long for indexed tables > > I have relatively small tables (toode and rid) in fast server. > Both tables are indexed on toode field. > > Following query takes long time to run. > toode field type is char(20). It is difficult to change this > field type. > > Any idea how to speed up this query ? > > UPDATE firma1.rid SET toode=NULL >WHERE toode IS NOT NULL AND >toode NOT IN (SELECT TOODE FROM firma1.TOODE); > > Query returned successfully: 0 rows affected, 594813 ms > execution time. > > explain window shows: > > Seq Scan on rid (cost=2581.07..20862553.77 rows=51848 width=1207) > Filter: ((toode IS NOT NULL) AND (NOT (subplan))) > SubPlan > -> Materialize (cost=2581.07..2944.41 rows=14734 width=84) > -> Seq Scan on toode (cost=0.00..2350.34 > rows=14734 width=84) > > > Andrus. > > > > ---(end of > broadcast)--- > TIP 5: don't forget to increase your free space map settings > ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Inserts optimization?
I hope I'm not going to say stupid things, but here's what i know (or i think i know :) ) about bacula + postgresql If I remember correctly (I allready discussed this with Kern Sibbald a while ago), bacula does each insert in its own transaction : that's how the program is done, and of course it works ok with mysql and MyIsam tables, as mysql doesn't have transactions with myisam... So, you'll probably end up being slowed down by WAL fsyncs ... and you won't have a lot of solutions. Maybe you should start with trying to set fsync=no as a test to confirm that (you should have a lot of iowaits right now if you haven't disabled fsync). For now, I only could get good performance with bacula and postgresql when disabling fsync... On Thursday 13 April 2006 20:45, Francisco Reyes wrote: > Chris writes: > > If you can, use copy instead: > > http://www.postgresql.org/docs/8.1/interactive/sql-copy.html > > I am familiar with copy. > Can't use it in this scenario. > > The data is coming from a program called Bacula (Backup server). > It is not static data. > > ---(end of broadcast)--- > TIP 2: Don't 'kill -9' the postmaster ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[PERFORM] LIKE query problem
I'm having a problem with a simple query, that finds children of a node, using a materialized path to the node. The query: select n1.id from nodes n1, nodes n2 where n1.path like n2.path || '%' and n2.id = 14; QUERY PLAN --- Nested Loop (cost=0.00..120256.56 rows=17517 width=4) (actual time=0.901..953.485 rows=7 loops=1) Join Filter: (("inner".path)::text ~~ (("outer".path)::text || '%'::text)) -> Index Scan using nodes_id on nodes n2 (cost=0.00..35.08 rows=11 width=34) (actual time=0.050..0.059 rows=1 loops=1) Index Cond: (id = 14) -> Seq Scan on nodes n1 (cost=0.00..6151.89 rows=318489 width=38) (actual time=0.010..479.479 rows=318489 loops=1) Total runtime: 953.551 ms (6 rows) I've tried re-writing the query, which results in a different plan: select id from nodes where path like ( select path from nodes where id = 14 limit 1 ) || '%'; QUERY PLAN Seq Scan on nodes (cost=3.19..7747.52 rows=1592 width=4) (actual time=0.230..226.311 rows=7 loops=1) Filter: ((path)::text ~~ (($0)::text || '%'::text)) InitPlan -> Limit (cost=0.00..3.19 rows=1 width=34) (actual time=0.018..0.019 rows=1 loops=1) -> Index Scan using nodes_id on nodes (cost=0.00..35.08 rows=11 width=34) (actual time=0.016..0.016 rows=1 loops=1) Index Cond: (id = 14) Total runtime: 226.381 ms (7 rows) While the plan looks a little better, the estimated rows are woefully inaccurate for some reason, resulting in a seq scan on nodes. If I perform the nested select in the second query separately, then use the result in the outer select, it's extremely fast: test=# select path from nodes where id = 14; path /3/13/ (1 row) Time: 0.555 ms test=# select id from nodes where path like '/3/13/%'; id - 14 169012 15 16 17 169219 169220 (7 rows) Time: 1.062 ms I've vacuum full analyzed. PG version is 8.1.4 The nodes table is as follows: test=# \d nodes Table "public.nodes" Column | Type | Modifiers +-+--- id | integer | not null path | character varying(2000) | not null depth | integer | not null Indexes: "nodes_pkey" PRIMARY KEY, btree (id, path) "nodes_id" btree (id) "nodes_path" btree (path) test# select count(*) from nodes; count 318489 Is there a way to perform this efficiently in one query ? ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] LIKE query problem
Thanks Tom, Is that documented somewhere? I can't seem to see any mention of it in the docs. Tom Lane wrote: Marc McIntyre <[EMAIL PROTECTED]> writes: ... Is there a way to perform this efficiently in one query ? No, because you're hoping for an indexscan optimization of a LIKE query, and that can only happen if the pattern is a plan-time constant. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly ---(end of broadcast)--- TIP 6: explain analyze is your friend
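Given that constraint, one workaround is to fetch the path first and issue the LIKE with a literal pattern - either as two statements from the application, or wrapped in a plpgsql function that builds the query with EXECUTE so the pattern is a constant at plan time. A rough 8.1-compatible sketch (the function name is made up, and it still needs a LIKE-capable index on path, i.e. C locale or an extra varchar_pattern_ops index):

    CREATE OR REPLACE FUNCTION children_of(node_id integer) RETURNS SETOF integer AS $$
    DECLARE
        p text;
        r record;
    BEGIN
        SELECT path INTO p FROM nodes WHERE id = node_id;
        FOR r IN EXECUTE
            'SELECT id FROM nodes WHERE path LIKE ' || quote_literal(p || '%')
        LOOP
            RETURN NEXT r.id;
        END LOOP;
        RETURN;
    END;
    $$ LANGUAGE plpgsql;

    SELECT * FROM children_of(14);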
[PERFORM] Decreasing BLKSZ
Our application has a number of inserters posting rows of network statistics into a database. This is occurring continuously. The following is an example of a stats table (simplified, but it maintains the key concepts):

CREATE TABLE stats (
    logtime timestamptz,
    key int,
    stat1 bigint,
    stat2 bigint,
    stat3 bigint,
    PRIMARY KEY (key, logtime)
);
CREATE INDEX x ON stats(logtime);

There are on the order of 1M unique values for "key", and a new row for each key value will be inserted, say, every 15 minutes. These rows are divided up between a number of different inserting elements, but that isn't relevant. The problem is that the insert pattern has low correlation with the (key,logtime) index. In this case, we would need >1M blocks in my shared_buffer space to prevent a read-modify-write type of pattern happening during the inserts (given a large enough database).

I am wondering about lowering the BLKSZ value so that the total working set of blocks required can be maintained in my shared buffers. Our database only has 8G of memory, and we would likely need to reduce BLKSZ to 512. Any comments on other effects or gotchas with lowering the size of BLKSZ? Currently, our database is thrashing its cache of blocks and we're getting only ~100 inserts/second; every insert results in an evict-read-modify operation. Ideally, we'd like to keep the entire working set of blocks in memory across insert periods so that the I/O looks more like writes of full blocks. Thanks Marc
Re: [PERFORM] Decreasing BLKSZ
> Would it be possible to change the primary key to > (logtime,key)? This could help keeping the "working window" small. No, the application accessing the data wants all the rows between start and end time for a particular key value. > > Secondly, the real working set is smaller, as the rows are > all inserted at the end of the table, filling each page until > it's full, so only the last pages are accessed. There's no > relation between the index order, and the order of data on > disk, unless you CLUSTER. I'd theorizing that my problem is in updating the index itself and not the heap. Insert order Refers to the order by which the applications are inserting the rows and as such, the order by Which the index is being updated. This in turn, is causing the b-tree to be traverse. Problem Is the working set of blocks at the bottom of the btree is too big for my cache. > > > Any comment on other affects or gotchas with lowering the size of > > BLKSZ? Currently, our database is thrashing its cache of blocks we > > we're getting only ~100 inserts/second, every insert results in a > > evict-read-modify operation. > > I'm not shure that's the correct diagnosis. > > Do you have one transaction per insert? Every transaction > means a forced sync to the disk, so you won't get more than > about 100-200 commits per second, depending on your actual > disk rotation speed. No, an insert consists of roughly 10,000+ rows per transaction block. > > To improve concurrency of the "numer of inserters" running in > parallel, try to tweak the config variables commit_delay and > commit_sibling, so you get a higher overall throughput at > cost of an increased delay per connection, and increase the > number of inserters. Using sensible tweaking, the throughput > should scale nearly linear with the number of backens. :-) I don't think this will help us here due to large transactions already. > > If feasible for your application, you can also bundle several > log entries into a single transaction. If you're CPU bound, > you can use COPY instead of INSERT or (if you can wait for > 8.2) the new multi-row INSERT to further improve performance, > but I doubt that you're CPU bound. > > The only way to "really" get over the sync limit is to have > (at least) the WAL on a battery backed ram / SSD media that > has no "spinning disk" > physical limit, or abandon crash safety by turning fsync off. Again, problem is not with WAL writing, already on it's own raid1 disk pair. The I/O pattern we see is about 1-2% load on WAL and 100% load on the array holding the indexes and tables. Throughput is very low, something like 150k-200K bytes/second of real rows being deposited on the disk. The disks are busy seeking all over the disk platter to fetch a block, add a single row, then seek to another spot and write back a previously dirty buffer > > Thanks, > Markus. > -- > Markus Schaber | Logical Tracking&Tracing International AG > Dipl. Inf. | Software Development GIS > > Fight against software patents in Europe! www.ffii.org > www.nosoftwarepatents.org > ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Decreasing BLKSZ
> > The bottom line here is likely to be "you need more RAM" :-( Yup. Just trying to get a handle on what I can do if I need more than 16G of RAM... That's as much as I can put in the installed base of servers, 100s of them. > > I wonder whether there is a way to use table partitioning to > make the insert pattern more localized? We'd need to know a > lot more about your insertion patterns to guess how, though. > > regards, tom lane We're doing partitioning as well. > ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Decreasing BLKSZ
Yes, that is our application. We have implemented both scenarios... 1- partitions loaded without indexes on them.. And build index "when partition is full". Slow to drill down into incomplete partitions. 2- paritions with index as loaded. Slow, on insert (problem mentioned) but good to drill down So, I'd like my cake and eat it too... :-) I'd like to have my indexes built as rows are inserted into the partition so help with the drill down... > -Original Message- > From: Bucky Jordan [mailto:[EMAIL PROTECTED] > Sent: Tuesday, September 26, 2006 5:26 PM > To: Marc Morin; Tom Lane > Cc: Markus Schaber; pgsql-performance@postgresql.org > Subject: RE: [PERFORM] Decreasing BLKSZ > > > > The bottom line here is likely to be "you need more RAM" :-( > > > > Yup. Just trying to get a handle on what I can do if I > need more than > > 16G Of ram... That's as much as I can put on the installed based of > > servers 100s of them. > > > > > > > > I wonder whether there is a way to use table partitioning to make > > > the insert pattern more localized? We'd need to know a lot more > > > about your insertion patterns to guess how, though. > > > > > > regards, tom lane > > > > We're doing partitioning as well. > > > > I'm guessing that you basically have a data collection > application that sends in lots of records, and a reporting > application that wants summaries of the data? So, if I > understand the problem correctly, you don't have enough ram > (or may not in the future) to index the data as it comes in. > > Not sure how much you can change the design, but what about > either updating a summary table(s) as the records come in > (trigger, part of the transaction, or do it in the > application) or, index periodically? In otherwords, load a > partition (say a day's worth) then index that partition all > at once. If you're doing real-time analysis that might not > work so well though, but the summary tables should. > > I assume the application generates unique records on its own > due to the timestamp, so this isn't really about checking for > constraint violations? If so, you can probably do away with > the index on the tables that you're running the inserts on. > > - Bucky > ---(end of broadcast)--- TIP 6: explain analyze is your friend
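For reference, a minimal sketch of scenario 1 (the partition name and date-based split are made up; the stats definition is the simplified one from the first message of the thread):

    -- a new partition starts life without indexes:
    CREATE TABLE stats_20060926 () INHERITS (stats);
    -- ... bulk inserts / COPY into stats_20060926 while it is the active partition ...

    -- once the partition is full, build the indexes in one pass:
    ALTER TABLE stats_20060926 ADD PRIMARY KEY (key, logtime);
    CREATE INDEX stats_20060926_logtime ON stats_20060926 (logtime);

Scenario 2 is the same except that the primary key and index are created right after the CREATE TABLE, which is what makes drill-down work on the partition still being loaded, at the cost of the insert-time index maintenance discussed above.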
Re: [PERFORM] Optimicing Postgres for SunSolaris10 on V240
Hi... Bacula does no transaction right now, so every insert is done separately with autocommit. Moreover, the insert loop for the main table is done by several individual queries to insert data in several tables (filename, dir, then file), so this is slow. There's work underway to speed that up, using a big COPY to a temp table, then queries to dispatch the records in the right places as fast as possible. The patch has been made, but as it is a noticeable change in the core, will take some time to be integrated... See the thread about that in the bacula devel list a few weeks ago... Anyhow, our benchmark for now shows a 10-20 times speedup with postgresql, fsync stays on, and it becomes faster than mysql, and scales with the number of cpus... I cannot tell when/if it will be included, but there's work on this. For now, the only thing you can do is fsync=off, knowing you're taking a chance with the data (but it's not that big a problem, as it's only bacula's database, and can be rebuilt from the tapes or from a dump...) or a writeback disk controller. On Friday 17 November 2006 19:44, Josh Berkus wrote: > Berner, > > First, I've corrected you e-mail so that it goes to the list, and not to > me directly. > > > I use my PostgreSQL 8.0.4 as Catalogue-Database for Bacula. > > Bacula is a Backupsoftware. > > Yes. The lead contributor to Bacula is a active PostgreSQL project > participant; I'll see if he'll look into your issue. > > > When I backing up System (lot of very small Files) then PostgreSQL seams > > to by the bottleneck by inserting Catalogueinformation of every single > > File. The System on which Postgres runs is a Sun Solaris 10 Server on a > > Sun Fire V240 with 1GB RAM, 1CPU (SUNW,UltraSPARC-IIIi at 1.3GHz), 2 > > Ultra SCSI-3 Disks 73GB at 10k RPM which are in Raid1 (Solaris Softraid). > > > > Can someone gif me a hint for compiling PostgreSQL or configuring the > > Database. > > > > fsync is already disabled.. > > This is a bad idea if you care about your database. > > So, PostgreSQL 8.1 is now official supported by Sun and ships with > Solaris 10 update 2 or later. It is recommended that you use that > rather and an out-of-date version. Second, see > www.powerpostgresql.com/PerfList > > --Josh Berkus > > > ---(end of broadcast)--- > TIP 2: Don't 'kill -9' the postmaster ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
[PERFORM] Problem with large query
Hi. I hope I'm not asking a too trivial question here... I'm having trouble with a (quite big) query, and can't find a way to make it faster. Here is the information : Tables : sces_vte -> 2753539 rows sces_art -> 602327 sces_fsf -> 8126 sces_frc -> 7763 sces_tps -> 38 sces_gtr -> 35 Query : === SELECT sces_gtr_art.gtr_cod, sces_gtr_art.gtr_lib, sces_frc_art.fou_cod, sces_frc_art.fou_lib, sces_tps.tps_annee_mois, TO_NUMBER('200401','99'), TO_NUMBER('200405','99'), sces_tps.tps_libc, sum(sces_vte.vte_mnt), sum(sces_vte.vte_qte), sum(sces_vte.vte_ton), sces_famille.fsf_codfam, sces_famille.fsf_lib, sces_s_famille.fsf_codsfm, sces_s_famille.fsf_lib FROM sces_vte, sces_art, sces_fsf sces_famille, sces_fsf sces_s_famille, sces_frc sces_frc_art, sces_tps, sces_gtr sces_gtr_art WHERE ( sces_famille.fsf_codfam=sces_s_famille.fsf_codfam ) AND ( sces_famille.fsf_codseg= 0 and sces_famille.fsf_codsfm = 0 ) AND ( sces_vte.tps_annee_mois=sces_tps.tps_annee_mois ) AND ( sces_vte.art_cod=sces_art.art_cod and sces_vte.dos_cod=sces_art.dos_cod ) AND ( sces_gtr_art.gtr_cod=sces_frc_art.gtr_cod ) AND ( sces_frc_art.gtr_cod=sces_art.gtr_cod and sces_frc_art.fou_cod=sces_art.fou_cod ) AND ( sces_s_famille.fsf_codfam=sces_art.fsf_codfam and sces_s_famille.fsf_codsfm=sces_art.fsf_codsfm ) AND ( sces_s_famille.fsf_codseg = 0 ) AND ( ( ( ( sces_tps.tps_annee_mois ) >= ( TO_NUMBER('200401','99') ) and ( sces_tps.tps_annee_mois ) <= ( TO_NUMBER('200405','99') ) ) OR ( ( sces_tps.tps_annee_mois ) >= ( TO_NUMBER('200401','99') )-100 and ( sces_tps.tps_annee_mois ) <= ( TO_NUMBER('200405','99') )-100 ) ) AND ( sces_gtr_art.gtr_cod in (2)) ) GROUP BY sces_gtr_art.gtr_cod, sces_gtr_art.gtr_lib, sces_frc_art.fou_cod, sces_frc_art.fou_lib, sces_tps.tps_annee_mois, TO_NUMBER('200401','99'), TO_NUMBER('200405','99'), sces_tps.tps_libc, sces_famille.fsf_codfam, sces_famille.fsf_lib, sces_s_famille.fsf_codsfm, sces_s_famille.fsf_lib Explain Analyze Plan : GroupAggregate (cost=27161.91..27938.72 rows=16354 width=280) (actual time=484509.210..544436.148 rows=4115 loops=1) -> Sort (cost=27161.91..27202.79 rows=16354 width=280) (actual time=484496.188..485334.151 rows=799758 loops=1) Sort Key: sces_gtr_art.gtr_cod, sces_gtr_art.gtr_lib, sces_frc_art.fou_cod, sces_frc_art.fou_lib, sces_tps.tps_annee_mois, 200401::numeric, 200405::numeric, sces_tps.tps_libc, sces_famille.fsf_codfam, sces_famille.fsf_lib, sces_s_famille.fsf_codsfm, sces_s_famille.fsf_lib -> Merge Join (cost=25727.79..26017.34 rows=16354 width=280) (actual time=58945.821..69321.146 rows=799758 loops=1) Merge Cond: (("outer".fsf_codfam = "inner".fsf_codfam) AND ("outer".fsf_codsfm = "inner".fsf_codsfm)) -> Sort (cost=301.36..304.60 rows=1298 width=83) (actual time=27.926..28.256 rows=332 loops=1) Sort Key: sces_s_famille.fsf_codfam, sces_s_famille.fsf_codsfm -> Seq Scan on sces_fsf sces_s_famille (cost=0.00..234.24 rows=1298 width=83) (actual time=0.042..19.124 rows=1341 loops=1) Filter: (fsf_codseg = 0::numeric) -> Sort (cost=25426.43..25448.05 rows=8646 width=225) (actual time=58917.106..59693.810 rows=799758 loops=1) Sort Key: sces_art.fsf_codfam, sces_art.fsf_codsfm -> Merge Join (cost=24726.32..24861.08 rows=8646 width=225) (actual time=19036.709..29404.943 rows=799758 loops=1) Merge Cond: ("outer".tps_annee_mois = "inner".tps_annee_mois) -> Sort (cost=2.49..2.53 rows=17 width=23) (actual time=0.401..0.428 rows=20 loops=1) Sort Key: sces_tps.tps_annee_mois -> Seq Scan on sces_tps (cost=0.00..2.14 rows=17 width=23) (actual time=0.068..0.333 rows=20 loops=1) Filter: 
(((tps_annee_mois >= 200301::numeric) OR (tps_annee_mois >= 200401::numeric)) AND ((tps_annee_mois <= 200305::numeric) OR (tps_annee_mois >= 200401::numeric)) AND ((tps_annee_mois >= 200301::numeric) OR (tps_annee_mois <= 200405::numeric)) AND ((tps_annee_mois <= 200305::numeric) OR (tps_annee_mois <= 200405::numeric))) -> Sort (cost=24723.83..24747.97 rows=9656 width=214) (actual time=19036.223..19917.214 rows=799757 loops=1) Sort Key: sces_vte.tps_annee_mois -> Nested Loop (cost=21825.09..24084.74 rows=9656 width=214) (actual time=417.603..8644.294 rows=399879 loops=1) -> Nested Loop (cost=21825.09..21837.50 rows=373 width=195) (actual time=417.444..672.741 rows=14158 loops=1) -> Seq Scan on sces_gtr sces_gtr_art
Re: [PERFORM] Problem with large query
The query has been generated by Business Objects... I'll try to suggest to the developers that they remove this constant (if they can)... The fields used by the sort are of type numeric(6,0) or (10,0)... Would it be better if the fields were integer or anything else? On Wednesday 08 September 2004 16:40, you wrote: > Marc Cousin <[EMAIL PROTECTED]> writes: > > I'm having trouble with a (quite big) query, and can't find a way to make it > > faster. > > Seems like it might help if the thing could use a HashAggregate instead > of sort/group. Numeric is not hashable, so having those TO_NUMBER > constants in GROUP BY destroys this option instantly ... but why in the > world are you grouping by constants anyway? You didn't say what the > datatypes of the other columns were... > >regards, tom lane > ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Problem with large query
On Wednesday 08 September 2004 16:56, you wrote: > Marc Cousin <[EMAIL PROTECTED]> writes: > > The query has been generated by Business Objects... I'll try to suggest to the > > developers that they remove this constant (if they can)... > > The fields used by the sort are of type numeric(6,0) or (10,0)... > > Would it be better if the fields were integer or anything else? > > integer or bigint would be a WHOLE lot faster. I'd venture that > comparing two numerics is order of a hundred times slower than > comparing two integers. > > Even if you don't want to change the fields on-disk, you might think > about casting them all to int/bigint in the query. > > Another thing that might or might not be easy is to change the order of > the GROUP BY items so that the fields with the largest number of > distinct values are listed first. If two rows are distinct at the first > column, the sorting comparison doesn't even have to look at the > remaining columns ... > >regards, tom lane > Thanks. I've just had confirmation that they can remove the two constants (already gained 100 seconds thanks to that). I've tried the cast, and got down to 72 seconds. So now we're going to try to convert the fields to int or bigint. Thanks a lot for your help and time. ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
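For illustration, a cut-down version of the cast idea above, using a few of the columns from the original query (the real query keeps all its joins and columns; the column types are assumed to be numeric as described in the thread, and this only shows where the casts go):

    SELECT t.tps_annee_mois::int,
           a.fsf_codfam::int,
           sum(v.vte_mnt)
    FROM sces_vte v
    JOIN sces_tps t ON v.tps_annee_mois = t.tps_annee_mois
    JOIN sces_art a ON v.art_cod = a.art_cod AND v.dos_cod = a.dos_cod
    GROUP BY t.tps_annee_mois::int, a.fsf_codfam::int;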
Re: [PERFORM] Does PostgreSQL run with Oracle?
On Fri, 15 Oct 2004 11:54:44 -0500, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote: > My basic question to the community is "is PostgreSQL approximately as fast > as Oracle?" > > I don't want benchmarks, they're BS. I want a gut feel from this community > because I know many of you are in mixed shops that run both products, or > have had experience with both. That all depends on exactly what your application needs to do. There are many more features that Oracle has and postgres doesn't than vice versa. If you need to do something with your data that isn't possible to do as efficiently without one of those features, then yes postgresql can be much slower. If you don't need any such features, it can be ballpark, until you start getting into fairly hefty hardware. ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Optimizing PostgreSQL for Windows
Although I'm not an expert on this stuff, but 32 MB of shared buffers seems quite low to me, even for a windows machine. I'm running postgres 8.2 on my workstation with 2GB of ram and an AMD x64 3500+ with shared_buffer set to 256MB without any trouble an it's running fine, even on large datasets and other applications running. In my experience, shared_buffers are more important than work_mem. Have you tried increasing default_statistic_targets (eg to 200 or more) and after that running "analyze" on your tables or the entire database? Marc Christian Rengstl wrote: > Hi list, > > I have the following query: > select t.a1, t.a2 from table1 t inner join table2 s > using(id) where t.pid='xyz' and s.chromosome=9 order by s.pos; > > With the following output from analyze: > "Sort (cost=35075.03..35077.51 rows=991 width=14) (actual > time=33313.718..33321.935 rows=22599 loops=1)" > " Sort Key: s.pos" > " -> Hash Join (cost=7851.48..35025.71 rows=991 width=14) (actual > time=256.513..33249.701 rows=22599 loops=1)" > "Hash Cond: ((t.id)::text = (s.id)::text)" > "-> Bitmap Heap Scan on table1 t (cost=388.25..27357.57 > rows=22286 width=23) (actual time=112.595..32989.663 rows=22864 > loops=1)" > " Recheck Cond: ((pid)::text = 'xyz'::text)" > " -> Bitmap Index Scan on idx_table1 (cost=0.00..382.67 > rows=22286 width=0) (actual time=103.790..103.790 rows=22864 loops=1)" > "Index Cond: ((pid)::text = 'xyz'::text)" > "-> Hash (cost=7180.62..7180.62 rows=22609 width=17) (actual > time=143.867..143.867 rows=22864 loops=1)" > " -> Bitmap Heap Scan on table2 s (cost=333.00..7180.62 > rows=22609 width=17) (actual time=108.715..126.637 rows=22864 loops=1)" > "Recheck Cond: ((chromosome)::text = '9'::text)" > "-> Bitmap Index Scan on idx_table2 > (cost=0.00..327.35 rows=22609 width=0) (actual time=108.608..108.608 > rows=22864 loops=1)" > " Index Cond: ((chromosome)::text = > '9'::text)" > > My OS is Windows 2003 with 4GB Ram and Xeon Duo with 3.2 GHz; > shared_buffers is set to 32MB (as I read it should be fairly low on > Windows) and work_mem is set to 2500MB, but nevertheless the query takes > about 38 seconds to finish. The table "table1" contains approx. 3 > million tuples and table2 approx. 500.000 tuples. If anyone could give > an advice on either how to optimize the settings in postgresql.conf or > anything else to make this query run faster, I really would appreciate. > > > > > Christian Rengstl M.A. > Klinik und Poliklinik für Innere Medizin II > Kardiologie - Forschung > Universitätsklinikum Regensburg > B3 1.388 > Franz-Josef-Strauss-Allee 11 > 93053 Regensburg > Tel.: +49-941-944-7230 > > > > > ---(end of broadcast)--- > TIP 2: Don't 'kill -9' the postmaster > > > -- Marc Schablewski click:ware Informationstechnik GmbH ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
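A hedged sketch of the suggestions above for the Windows box; the values are only starting points, not a recommendation, and shared_buffers needs a server restart:

    -- postgresql.conf:
    --   shared_buffers = 256MB              -- instead of 32MB
    --   work_mem = 16MB                     -- per sort/hash per backend, so 2500MB is risky with 40 connections
    --   default_statistics_target = 200     -- instead of the default 10
    -- then refresh the planner statistics:
    ANALYZE;
    -- or raise the statistics only for the columns used in the join/filter:
    ALTER TABLE table1 ALTER COLUMN pid SET STATISTICS 200;
    ALTER TABLE table2 ALTER COLUMN chromosome SET STATISTICS 200;
    ANALYZE table1;
    ANALYZE table2;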
[PERFORM] strange plan choice
Hi, This occurs on postgresql 8.2.5. I'm a bit at loss with the plan chosen for a query : The query is this one : SELECT SULY_SAOEN.SAOEN_ID, SULY_SDCEN.SDCEN_REF, SULY_SDCEN.SDCEN_LIB, CSTD_UTI.UTI_NOM, CSTD_UTI.UTI_LIBC, SULY_SAOEN.SAOEN_DTDERNENVOI, SULY_SDCEN.SDCEN_DTLIMAP, SULY_PFOUR.PFOUR_RAISON, SULY_SDCEN.PGTC_CODE FROM SULY_SDCEN inner join SULY_SDDEN on (SULY_SDCEN.SDCEN_ID=SULY_SDDEN.SDCEN_ID) inner join SULY_SAOEN on (SULY_SAOEN.SDDEN_ID=SULY_SDDEN.SDDEN_ID) inner join CSTD_UTI on (CSTD_UTI.UTI_CODE=SULY_SDDEN.SDDEN_RESPPROS) inner join SULY_PFOUR on (SULY_PFOUR.PFOUR_ID=SULY_SAOEN.PFOUR_ID) WHERE SULY_SDCEN.PGTC_CODE = '403' AND SULY_SDDEN.PBURE_ID IN (41) AND SULY_SAOEN.SAOEN_ID IN ( SELECT TmpAoen.SAOEN_ID FROM SULY_SAOPR TmpAopr LEFT JOIN SULY_SOFPR TmpOfpr ON (TmpOfpr.SAOPR_ID = TmpAopr.SAOPR_ID),SULY_SAOEN TmpAoen WHERE TmpAopr.SAOEN_ID= TmpAoen.SAOEN_ID AND (SOFPR_DEMCOMP = 1 OR (SAOPR_DTENV IS NOT NULL AND SAOPR_DTREPONSE IS NULL)) ) The plan I get is : QUERY PLAN Nested Loop (cost=65.91..2395.16 rows=6 width=142) (actual time=696.212..2566.996 rows=2 loops=1) -> Nested Loop IN Join (cost=65.91..2391.95 rows=6 width=124) (actual time=696.189..2566.957 rows=2 loops=1) Join Filter: (suly_saoen.saoen_id = tmpaopr.saoen_id) -> Nested Loop (cost=10.84..34.21 rows=6 width=124) (actual time=0.233..0.617 rows=30 loops=1) -> Nested Loop (cost=10.84..29.00 rows=2 width=108) (actual time=0.223..0.419 rows=2 loops=1) -> Hash Join (cost=10.84..24.44 rows=2 width=87) (actual time=0.207..0.372 rows=2 loops=1) Hash Cond: (suly_sdden.sdcen_id = suly_sdcen.sdcen_id) -> Seq Scan on suly_sdden (cost=0.00..13.36 rows=58 width=27) (actual time=0.012..0.163 rows=58 loops=1) Filter: (pbure_id = 41) -> Hash (cost=10.74..10.74 rows=8 width=76) (actual time=0.129..0.129 rows=8 loops=1) -> Seq Scan on suly_sdcen (cost=0.00..10.74 rows=8 width=76) (actual time=0.017..0.113 rows=8 loops=1) Filter: ((pgtc_code)::text = '403'::text) -> Index Scan using pk_cstd_uti on cstd_uti (cost=0.00..2.27 rows=1 width=42) (actual time=0.015..0.017 rows=1 loops=2) Index Cond: ((cstd_uti.uti_code)::text = (suly_sdden.sdden_resppros)::text) -> Index Scan using ass_saoen_sdden_fk on suly_saoen (cost=0.00..2.54 rows=5 width=32) (actual time=0.007..0.049 rows=15 loops=2) Index Cond: (suly_saoen.sdden_id = suly_sdden.sdden_id) -> Hash Join (cost=55.07..2629.62 rows=8952 width=16) (actual time=0.119..82.680 rows=3202 loops=30) Hash Cond: (tmpaopr.saoen_id = tmpaoen.saoen_id) -> Merge Left Join (cost=0.00..2451.46 rows=8952 width=8) (actual time=0.027..76.229 rows=3202 loops=30) Merge Cond: (tmpaopr.saopr_id = tmpofpr.saopr_id) Filter: ((tmpofpr.sofpr_demcomp = 1::numeric) OR ((tmpaopr.saopr_dtenv IS NOT NULL) AND (tmpaopr.saopr_dtreponse IS NULL))) -> Index Scan using pk_suly_saopr on suly_saopr tmpaopr (cost=0.00..1193.49 rows=15412 width=32) (actual time=0.012..19.431 rows=14401 loops=30) -> Index Scan using ass_saopr_sofpr_fk on suly_sofpr tmpofpr (cost=0.00..998.90 rows=14718 width=16) (actual time=0.010..18.377 rows=13752 loops=30) -> Hash (cost=38.92..38.92 rows=1292 width=8) (actual time=2.654..2.654 rows=1292 loops=1) -> Seq Scan on suly_saoen tmpaoen (cost=0.00..38.92 rows=1292 width=8) (actual time=0.006..1.322 rows=1292 loops=1) -> Index Scan using pk_suly_pfour on suly_pfour (cost=0.00..0.52 rows=1 width=34) (actual time=0.010..0.011 rows=1 loops=2) Index Cond: (suly_pfour.pfour_id = suly_saoen.pfour_id) Total runtime: 2567.225 ms (28 lignes) What I don't understand is the Nested Loop IN. 
If I understand correctly, the consequence is that the bottom part (hash joins) is done 30 times ? Why not just once ? If I remove SULY_SDCEN.PGTC_CODE = '403', the query becomes 25 times faster. - Nested Loop (cost=2766.40..2879.44 rows=175 width=142) (actual time=121.927..123.996 rows=120 loops=1) -> Hash Join (cost=2766.40..2785.92 rows=175 width=124) (actual time=121.881
Re: [PERFORM] Delete performance again
Hi, Maybe you can try this syntax. I'm not sure, but it may perform better: delete from company_alias USING comprm where company_alias.company_id = comprm.id Cheers, Marc -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Query optimization
Le Sunday 30 November 2008 19:45:11 tmp, vous avez écrit : > I am struggeling with the following query which fetches a random subset > of 200 questions that matches certain tags within certain languages. > However, the query takes forever to evaluate, even though I have a > "limit 200" appended. Any ideas on how to optimize it? > > QUERY: > > SELECT distinct q.question_id >FROM question_tags qt, questions q >WHERE q.question_id = qt.question_id > AND q.STATUS = 1 > AND not q.author_id = 105 > AND ((qt.language_id = 5 and qt.tag_id in > (1101,917,734,550,367,183)) or (qt.language_id = 4 and qt.tag_id in > (856,428)) or (qt.language_id = > 3 and qt.tag_id in (1156,1028,899,771,642,514,385,257,128)) or > (qt.language_id = 2 and qt.tag_id in > (1193,1101,1009,917,826,734,642,550,458,367,275,183,91))) > and q.question_id not in (413) >LIMIT 200 > > EXPLAIN ANALYZE: = > > Limit (cost=1.50..1267.27 rows=200 width=4) (actual > time=278.169..880.934 rows=200 loops=1) > -> Unique (cost=1.50..317614.50 rows=50185 width=4) (actual > time=278.165..880.843 rows=200 loops=1) > -> Merge Join (cost=1.50..317489.04 rows=50185 width=4) > (actual time=278.162..880.579 rows=441 loops=1) > Merge Cond: (qt.question_id = q.question_id) > -> Index Scan using question_tags_question_id on > question_tags qt (cost=0.00..301256.96 rows=82051 width=4) (actual > time=24.171..146.811 rows=6067 loops=1) > Filter: (((language_id = 5) AND (tag_id = ANY > ('{1101,917,734,550,367,183}'::integer[]))) OR ((language_id = 4) AND > (tag_id = ANY ('{856,428}'::integer[]))) OR ((language_id = 3) AND > (tag_id = ANY ('{1156,1028,899,771,642,514,385,257,128}'::integer[]))) > OR ((language_id = 2) AND (tag_id = ANY > ('{1193,1101,1009,917,826,734,642,550,458,367,275,183,91}'::integer[] > -> Index Scan using questions_pkey on questions q > (cost=0.00..15464.12 rows=83488 width=4) (actual time=222.956..731.737 > rows=1000 loops=1) > Filter: ((q.author_id <> 105) AND (q.question_id > <> 413) AND (q.status = 1)) > Total runtime: 881.152 ms > (9 rows) First, because of the distinct, the limit 200 wont reduce the work to be done a lot : it will still have to sort for the unique. Only when everything is sorted will it take only the first 200 records. And anyhow it seems there are only 441 rows before doing the distinct, so, at least for this query, it won't change a lot the times. Then it seems to me that you may try to create composed indexes, because there is a lot of filtering after the index scans (that is if you want the query to be faster). Maybe (q.author_id,q.status). For qt you may try (qt.language_id,qt.tag_id)... Hope it helps Cheers -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
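A sketch of the two multi-column indexes suggested above (index names are invented):

    CREATE INDEX questions_author_status_idx ON questions (author_id, status);
    CREATE INDEX question_tags_lang_tag_idx ON question_tags (language_id, tag_id);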
[PERFORM] temp_tablespaces and RAID
Hello, To improve performances, I would like to try moving the temp_tablespaces locations outside of our RAID system. Is it a good practice ? Thanks, Marc Mamin -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
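In case it helps, the mechanics look like this (path and tablespace name are examples only; the directory must already exist, be empty, and be owned by the postgres OS user):

    -- as a superuser, once:
    CREATE TABLESPACE temp_space LOCATION '/mnt/other_disks/pgtemp';
    -- then either per session:
    SET temp_tablespaces = 'temp_space';
    -- or globally in postgresql.conf:
    --   temp_tablespaces = 'temp_space'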
Re: [PERFORM] Very specialised query
Hello, if your data are mostly static and you have a few main objects, maybe you can gain something by defining conditional (partial) indexes for those, plus one for the rest, and then slicing the query:

create index o_1x on X (start,end,id) where object_id = 1
create index o_2x on X (start,end,id) where object_id = 2
create index o_3x on X (start,end,id) where object_id = 3
create index o_4x on X (start,end,id) where object_id = 4
...
create index o_restx on X (start,end,id) where object_id not in (1,2,3,4..)

I'm not sure that putting all in one index and using the BETWEEN clause as in my example is the best method though. Marc Mamin

SELECT l1.id AS id1, l2.id AS id2
FROM location l1, location l2
WHERE l1.objectid = 1
AND (l2.start BETWEEN l1.start AND l1.end
     OR
     l1.start BETWEEN l2.start AND l2.end
    )
AND l1.start <> l2.start -- if required
AND l1.end <> l2.end -- if required
AND l1.id <> l2.id
UNION ALL
... WHERE l1.objectid = 2 ...
UNION ALL
... WHERE l1.objectid not in (1,2,3,4..)
Re: [PERFORM] Very specialised query
>> WHERE (l2.start BETWEEN l1.start AND l1.end >> OR >> l1.start BETWEEN l2.start AND l2.end >> ) >Yes, that's another way to calculate an overlap. However, it turns out to not >be that fast. >The problem is that OR there, which causes a bitmap index scan, as the leaf of >a nested loop join, >which can be rather slow. OK, then splitting these checks into 2 queries with UNION is better. But I have often read that BETWEEN is faster than using 2 comparison operators. Here I guess that a combined index on (start,end) makes sense: .. WHERE l2.start BETWEEN l1.start AND l1.end .. UNION .. WHERE l1.start BETWEEN l2.start AND l2.end .. The first clause is equivalent to AND l1.start <= l2.end AND l1.end >= l2.start AND l1.start <= l2.start I don't know how you need to deal with the boundary conditions... Marc Mamin -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
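For completeness, the combined index mentioned above would look something like this, assuming the location table from the earlier message ("end" has to be double-quoted because END is a reserved word):

    CREATE INDEX location_start_end_idx ON location (start, "end", id);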
Re: [PERFORM] Very specialised query
Hello Matthew, Another idea: Are your objects limited to some smaller ranges of your whole interval? If so, you might be able to reduce the ranges to search by using an additional table holding the min(start) and max(end) of each object... Marc Mamin
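A possible shape for such a helper table, again assuming the location table from earlier in the thread; it would of course need to be refreshed (or maintained by trigger) whenever location changes:

    CREATE TABLE object_range AS
      SELECT objectid,
             min(start) AS range_start,
             max("end") AS range_end
      FROM location
      GROUP BY objectid;

    CREATE INDEX object_range_objectid_idx ON object_range (objectid);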
Re: [PERFORM] Scalability in postgres
It's not that trivial with Oracle either. I guess you had to use shared servers to get to that amount of sessions. They're most of the time not activated by default (dispatchers is at 0). Granted, they are part of the 'main' product, so you just have to set up dispatchers, shared servers, circuits, etc ... but there is still setup to do : dispatchers are (if I recall correctly) a completely manual parameter (and every dispatcher can only drive a certain amount of sessions, dependant on the operating system), where shared servers is a bit more dynamic, but still uses processes (so you may have to tweak max processes also). What I mean to say is that Oracle does something quite alike PostgreSQL + a connection pooler, even if it's more advanced (it's a shared memory structure that is used to send messages between dispatchers and shared servers). Or did you mean that you had thousands of sessions in dedicated mode ? On Wednesday 03 June 2009 20:13:39 Dimitri wrote: > Just to say you don't need a mega server to keep thousands connections > with Oracle, it's just trivial, nor CPU affinity and other stuff you > may or may not need with Sybase :-) > > Regarding PostgreSQL, I think it'll only benefit to have an integrated > connection pooler as it'll make happy all populations anyway: > - those who don't like the idea may always disable it :-) > - those who have a lot but mostly inactive sessions will be happy to > simplify session pooling > - those who really seeking for the most optimal workload on their > servers will be happy twice: if there are any PG scalability limits, > integrated pooler will be in most cases more performant than external; > if there are no PG scalability limits - it'll still help to size PG > most optimally according a HW or OS capacities.. > > Rgds, > -Dimitri > > On 6/3/09, Kevin Grittner wrote: > > Dimitri wrote: > >> Few weeks ago tested a customer application on 16 cores with Oracle: > >> - 20,000 sessions in total > >> - 70,000 queries/sec > >> > >> without any problem on a mid-range Sun box + Solaris 10.. > > > > I'm not sure what point you are trying to make. Could you elaborate? > > > > (If it's that Oracle doesn't need an external connection pool, then > > are you advocating that PostgreSQL include that in the base product?) > > > > -Kevin -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] Very big insert/join performance problem (bacula)
e a big budget for this). There was no swapping of linux So all of this makes me think there is a cost evaluation problem in this setup : with the default values, postgresql seems to underestimate the cost of sorting here (the row estimates were good, no problem with that). PostgreSQL seems to think that at around 1 million records in file it should go with a hash join on filename and path, so we go on hashing the 17 million records of path, the 80 millions of filename, then joining and inserting into file (we're talking about sorting around 15 GB for each of these despools in parallel). Temporarily I moved the problem at a bit higher sizes of batch by changing random_page_cost to 0.02 and seq_page_cost to 0.01, but I feel like an apprentice sorcerer with this, as I told postgreSQL that fetching rows from disk are much cheaper than they are. These values are, I think, completely abnormal. Doing this, I got the change of plan at around 8 million. And had 2 of them at 9 millions at the same time this weekend, and both of the took 24 hours, while the nested loops before the join (for inserts in path and filename) did their work in minutes... So, finally, to my questions : - Is it normal that PostgreSQL is this off base on these queries (sorry I don't have the plans, if they are required I'll do my best to get some, but they really are the two obvious plans for this kind of query). What could make it choose the hash join for too small batch tables ? - Is changing the 2 costs the way to go ? - Is there a way to tell postgreSQL that it's more costly to sort than it thinks ? (instead of telling it that fetching data from disk doesn't cost anything). Here are the other non-default values from my configuration : shared_buffers = 2GB work_mem = 64MB maintenance_work_mem = 256MB max_fsm_pages = 1500 # There are quite big deletes with bacula ... effective_cache_size = 800MB default_statistics_target = 1000 PostgreSQL is 8.3.5 on Debian Lenny I'm sorry for this very long email, I tried to be as precise as I could, but don't hesitate to ask me more. Thanks for helping. Marc Cousin -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Very big insert/join performance problem (bacula)
We regularly do all of dbcheck. This is our real configuration; there really are lots of servers and lots of files (500 million files backed up every month). But thanks for mentioning that. The thing is, we're trying to improve bacula with postgresql in order to make it able to cope with this kind of volume. So we are looking for things to improve in bacula and in postgresql tuning to make it cope with the queries mentioned (or to rewrite the queries or the way the inserts are done; that may not be a problem either) On Monday 13 July 2009 16:37:06 SystemManagement wrote: > Hi, > just a remark, as the number of entries seems to be very high: > Did you ever activate bacula's program dbcheck Option 16? > > Regards > > Reiner -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Very big insert/join performance problem (bacula)
> > While this is not your questions, I still noticed you seem to be on 8.3 - > it might be a bit faster to use GROUP BY instead of DISTINCT. It didn't do a big difference, I already tried that before for this query. Anyway, as you said, it's not the query having problems :) > Your effective_cache_size is really small for the system you seem to have - > its the size of IO caching your os is doing and uses no resources itself. > And 800MB of that on a system with that amount of data seems a bit unlikely > ;-) > > Using `free` you can see the amount of io caching your OS is doing atm. in > the 'cached' column. > > That possibly might tip some plans in a direction you prefer. > > What kind of machine are you running this on? I played with this parameter too, and it didn't influence the plan. Anyway, the doc says it's the OS cache available for one query, and there may be a lot of insert queries at the same time, so I chose to be conservative with this value. I tried it with 8GB too, the plans were the same. The OS cache is around 8-10GB by the way. The machine is a dell PE2900, with 6 disks dedicated to this database (raid 10 config) -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Very big insert/join performance problem (bacula)
Le Tuesday 14 July 2009 10:15:21, vous avez écrit : > Marc Cousin wrote: > >> Your effective_cache_size is really small for the system you seem to > >> have - its the size of IO caching your os is doing and uses no resources > >> itself. And 800MB of that on a system with that amount of data seems a > >> bit unlikely ;-) > >> > >> Using `free` you can see the amount of io caching your OS is doing atm. > >> in the 'cached' column. > >> > >> That possibly might tip some plans in a direction you prefer. > >> > >> What kind of machine are you running this on? > > > > I played with this parameter too, and it didn't influence the plan. > > Anyway, the doc says it's the OS cache available for one query, > > No they don't. I'm guessing you're getting mixed up with work_mem. I'm not (from the docs) : effective_cache_size (integer) Sets the planner's assumption about the effective size of the disk cache that is available to a single query I trust you, of course, but then I think maybe this should be rephrased in the doc then, because I understand it like I said ... I always had a doubt about this sentence, and that's why I tried both 800MB and 8GB for this parameter. > > > and there may be a lot of > > > > insert queries at the same time, so I chose to be conservative with this > > value. I tried it with 8GB too, the plans were the same. > > > > The OS cache is around 8-10GB by the way. > > That's what you need to set effective_cache_size to then. Ok but that doesn't change a thing for this query (I had a doubt on this parameter and tried with both 800MB and 8GB) -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Very big insert/join performance problem (bacula)
Le Tuesday 14 July 2009 10:23:25, Richard Huxton a écrit : > Marc Cousin wrote: > > Temporarily I moved the problem at a bit higher sizes of batch by > > changing random_page_cost to 0.02 and seq_page_cost to 0.01, but I feel > > like an apprentice sorcerer with this, as I told postgreSQL that fetching > > rows from disk are much cheaper than they are. These values are, I think, > > completely abnormal. > > They certainly don't have anything to do with reality. Try putting them > back to (say) seq_page_cost=1 and random_page_cost=2. That's the first thing I tried (it seemed more sensible), and it didn't work. I can't put them back to these values for more than one test query, the server really died before I changed the settings. > > > So, finally, to my questions : > > - Is it normal that PostgreSQL is this off base on these queries (sorry I > > don't have the plans, if they are required I'll do my best to get some, > > but they really are the two obvious plans for this kind of query). What > > could make it choose the hash join for too small batch tables ? > > No point in speculating without plans. Ok, I'll try to have them tomorrow. > > > - Is changing the 2 costs the way to go ? > > Not the way you have. That's what I thought, and the reason I posted :) > > > - Is there a way to tell postgreSQL that it's more costly to sort than it > > thinks ? (instead of telling it that fetching data from disk doesn't cost > > anything). > > That's what the configuration settings do. But if you put a couple way > off from reality it'll be pure chance if it gets any estimates right. > > > Here are the other non-default values from my configuration : > > > > shared_buffers = 2GB > > work_mem = 64MB > > Set this *much* higher when you are running your bulk imports. You can > do it per-connection. Try 256MB, 512MB, 1GB (but keep an eye on total > memory used). I'll try that. But anyhow, I've got much better performance when not doing the hash join. I'll get back with the plans as soon as possible. > > > maintenance_work_mem = 256MB > > max_fsm_pages = 1500 # There are quite big deletes with bacula ... > > effective_cache_size = 800MB > > See other emails on this one. > > > default_statistics_target = 1000 > > Probably don't need this for all columns, but it won't cause problems > with these queries. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
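The per-connection work_mem change suggested above would be tried like this (the value is just one of those mentioned; each sort or hash of each backend may grab that much memory, so it should not go into postgresql.conf globally):

    -- only in the session doing the big despool:
    SET work_mem = '512MB';
    -- ... run the insert/select for this batch ...
    RESET work_mem;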
Re: [PERFORM] Very big insert/join performance problem (bacula)
Total runtime: 42371.362 ms The thing is that this query is ten times faster, but it's not the main point : this query stays reasonably fast even when there are 20 of it running simultaneously. Of course, as it's faster, it also has less tendancy to pile up than the other one does. When I get 10-20 of the first one running at the same time, the queries get extremely slow (I guess they are fighting for accessing the sort disk, because I see a lot of smaller IOs instead of the big and nice IOs I see when only one of these queries runs). The IO subsystem seems to degrade very much when there is a lot of concurrent activity on this server. For instance, last weekend, we had to 8 million simultaneous backups, with the hash join plan. It took 24 hours for them to complete. If they had been alone on the server, it would have taken around 1 hour for each of them. Of course, with these smaller cost values, there is still a batch size when the plans goes back to the first one. Plan 3 seq_page_cost to 1, random_page_cost to 2. Plan is the same as Plan 1. - Hash Join (cost=3923961.69..5131416.88 rows=1286440 width=91) Hash Cond: (batch.name = filename.name) -> Hash Join (cost=880144.31..1286270.06 rows=1286440 width=102) Hash Cond: (batch.path = path.path) -> Seq Scan on temp_mc batch (cost=0.00..49550.40 rows=1286440 width=189) -> Hash (cost=425488.36..425488.36 rows=16747036 width=92) -> Seq Scan on path (cost=0.00..425488.36 rows=16747036 width=92) -> Hash (cost=1436989.50..1436989.50 rows=79105350 width=35) -> Seq Scan on filename (cost=0.00..1436989.50 rows=79105350 width=35) (9 rows) Plan 4: seq_page_cost to 1, random_page_cost back to 4, raise work_mem to 512MB. Same as Plan 1 Estimated cost hasn't changed. Is this normal ? - Hash Join (cost=3923961.69..5131416.88 rows=1286440 width=91) Hash Cond: (batch.name = filename.name) -> Hash Join (cost=880144.31..1286270.06 rows=1286440 width=102) Hash Cond: (batch.path = path.path) -> Seq Scan on temp_mc batch (cost=0.00..49550.40 rows=1286440 width=189) -> Hash (cost=425488.36..425488.36 rows=16747036 width=92) -> Seq Scan on path (cost=0.00..425488.36 rows=16747036 width=92) -> Hash (cost=1436989.50..1436989.50 rows=79105350 width=35) -> Seq Scan on filename (cost=0.00..1436989.50 rows=79105350 width=35) (9 rows) Maybe this one would scale a bit better, as there would be less sort files ? I couldn't execute it and get reliable times (sorry, the production period has started). If necessary, I can run it again tomorrow. I had to cancel the query after more than 15 minutes, to let the server do it's regular work. There are other things I am thinking of : maybe it would be better to have sort space on another (and not DBRD'ded) raid set ? we have a quite cheap setup right now for the database, and I think maybe this would help scale better. I can get a filesystem in another volume group, which is not used that much for now. Anyway, thanks for all the ideas you could have on this. Marc. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Very big insert/join performance problem (bacula)
Le Wednesday 15 July 2009 15:45:01, Alvaro Herrera a écrit : > Marc Cousin escribió: > > There are other things I am thinking of : maybe it would be better to > > have sort space on another (and not DBRD'ded) raid set ? we have a quite > > cheap setup right now for the database, and I think maybe this would help > > scale better. I can get a filesystem in another volume group, which is > > not used that much for now. > > You know, that's the first thing it came to me when I read you're using > DRDB. Have you tried setting temp_tablespace to a non-replicated disk? I wish I could easily. I'm not entitled to tune the database, only to give directives. I've given this one, but I don't know when it will be done. I'll keep you informed on this one, but I don't have my hopes too high. As mentionned before, I tried to deactivate DRBD (still using the DRBD device, but not connected to the other node, so it has almost no effect). It didn't change much (performance was a bit (around 20% better). Anyway, the thing is that : - big sorts kill my machine when there are more that 5 of them. I think it is a system problem (raid, filesystem, linux tuning, don't really know, I'll have to dig into this, but it will be complicated, for human reasons :) ) - the plan through nested loops is faster anyway, and I think it's because there is only a small fraction of filename and path that is used (most files backed up have the same name or path, as we save 600 machines with mostly 2 OSes, linux and windows), so the hot parts of these 2 tables are extremely likely to be in the database or linux cache (buffer hit rate was 97% in the example provided). Moreover, the first two queries of the insert procedure fill the cache for us... -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Very big insert/join performance problem (bacula)
Le Thursday 16 July 2009 01:56:37, Devin Ben-Hur a écrit : > Marc Cousin wrote: > > This mail contains the asked plans : > > Plan 1 > > around 1 million records to insert, seq_page_cost 1, random_page_cost 4 > > > > -> Hash (cost=425486.72..425486.72 rows=16746972 width=92) > > (actual time=23184.196..23184.196 rows=16732049 loops=1) -> Seq Scan on > > path (cost=0.00..425486.72 rows=16746972 width=92) (actual > > time=0.004..7318.850 rows=16732049 loops=1) > > > >-> Hash (cost=1436976.15..1436976.15 rows=79104615 width=35) (actual > > time=210831.840..210831.840 rows=79094418 loops=1) -> Seq Scan on > > filename (cost=0.00..1436976.15 rows=79104615 width=35) (actual > > time=46.324..148887.662 rows=79094418 loops=1) > > This doesn't address the cost driving plan question, but I think it's a > bit puzzling that a seq scan of 17M 92-byte rows completes in 7 secs, > while a seqscan of 79M 35-byte rows takes 149secs. It's about 4:1 row > ratio, less than 2:1 byte ratio, but a 20:1 time ratio. Perhaps there's > some terrible bloat on filename that's not present on path? If that seq > scan time on filename were proportionate to path this plan would > complete about two minutes faster (making it only 6 times slower instead > of 9 :). Much simpler than that I think : there is a bigger percentage of path that is used all the time than of filename. The database used is the production database, so there were other insert queries running a few minutes before I got this plan. But I'll give it a look today and come back with bloat and cache information on these 2 tables. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Very big insert/join performance problem (bacula)
On Thursday 16 July 2009 07:20:18 Marc Cousin wrote: > Le Thursday 16 July 2009 01:56:37, Devin Ben-Hur a écrit : > > Marc Cousin wrote: > > > This mail contains the asked plans : > > > Plan 1 > > > around 1 million records to insert, seq_page_cost 1, random_page_cost 4 > > > > > > -> Hash (cost=425486.72..425486.72 rows=16746972 width=92) > > > (actual time=23184.196..23184.196 rows=16732049 loops=1) -> Seq Scan > > > on path (cost=0.00..425486.72 rows=16746972 width=92) (actual > > > time=0.004..7318.850 rows=16732049 loops=1) > > > > > >-> Hash (cost=1436976.15..1436976.15 rows=79104615 width=35) > > > (actual time=210831.840..210831.840 rows=79094418 loops=1) -> Seq Scan > > > on filename (cost=0.00..1436976.15 rows=79104615 width=35) (actual > > > time=46.324..148887.662 rows=79094418 loops=1) > > > > This doesn't address the cost driving plan question, but I think it's a > > bit puzzling that a seq scan of 17M 92-byte rows completes in 7 secs, > > while a seqscan of 79M 35-byte rows takes 149secs. It's about 4:1 row > > ratio, less than 2:1 byte ratio, but a 20:1 time ratio. Perhaps there's > > some terrible bloat on filename that's not present on path? If that seq > > scan time on filename were proportionate to path this plan would > > complete about two minutes faster (making it only 6 times slower instead > > of 9 :). > > Much simpler than that I think : there is a bigger percentage of path that > is used all the time than of filename. The database used is the production > database, so there were other insert queries running a few minutes before I > got this plan. > > But I'll give it a look today and come back with bloat and cache > information on these 2 tables. Here are the stats for filename : SELECT * from pgstattuple('public.filename'); table_len | tuple_count | tuple_len | tuple_percent | dead_tuple_count | dead_tuple_len | dead_tuple_percent | free_space | free_percent +-++---+--++++-- 5308719104 |79338344 | 4717466438 | 88.86 |0 | 0 | 0 | 11883396 | 0.22 So I guess it's not bloated. I checked in the cache, the times displayed before were with path in the cache. filename couldn't stay in the cache, as it's too big. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Very big insert/join performance problem (bacula)
On Thursday 16 July 2009 22:07:25, Kevin Grittner wrote: > Marc Cousin wrote: > > the hot parts of these 2 tables are extremely likely to be in the > > database or linux cache (buffer hit rate was 97% in the example > > provided). Moreover, the first two queries of the insert procedure > > fill the cache for us... > > This would be why the optimizer does the best job estimating the > relative costs of various plans when you set the random_page_cost and > seq_page_cost very low. > > -Kevin OK, so to sum it up, should I keep these values (I hate doing this :) )? Would there be a way to evaluate them approximately from the expected buffer hit ratio of the query? -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Very big insert/join performance problem (bacula)
Le Thursday 16 July 2009 23:54:54, Kevin Grittner a écrit : > Marc Cousin wrote: > > to sum it up, should I keep these values (I hate doing this :) ) ? > > Many people need to set the random_page_cost and/or seq_page_cost to > reflect the overall affect of caching on the active portion of the > data. We set our fully-cached databases to 0.1 for both. Databases > with less caching usually wind up at 2 and 1. We have one database > which does best at 0.5 and 0.3. My advice is to experiment and try to > find a pair of settings which works well for most or all of your > queries. If you have a few which need a different setting, you can > set a special value right before running the query, but I've always > been able to avoid that (thankfully). > > > Would there be a way to approximately evaluate them regarding to > > the expected buffer hit ratio of the query ? > > Nothing query-specific except setting them on the connection right > before the query (and setting them back or discarding the connection > afterward). Well, that and making sure that effective_cache_size > reflects reality. > > -Kevin OK, thanks a lot. A last thing : As mentionned in another mail from the thread (from Richard Huxton), I felt this message in the documentation a bit misleading : effective_cache_size (integer) Sets the planner's assumption about the effective size of the disk cache that is available to a single query I don't really know what the 'a single query' means. I interpreted that as 'divide it by the amount of queries typically running in parallel on the database'. Maybe it should be rephrased ? (I may not be the one misunderstanding it). -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
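A sketch of the "set it right before the query" approach Kevin describes, using SET LOCAL so the change cannot leak outside the transaction (the values are among those discussed in the thread, not a recommendation):

    BEGIN;
    SET LOCAL seq_page_cost = 0.3;
    SET LOCAL random_page_cost = 0.5;
    -- run the query that benefits from "mostly cached" costing here
    COMMIT;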
Re: [PERFORM] Very big insert/join performance problem (bacula)
> It really has very little impact. It only affects index scans, and > even then only if effective_cache_size is less than the size of the > table. > > Essentially, when this kicks in, it models the effect that if you are > index scanning a table much larger than the size of your cache, you > might have to reread some blocks that you previously read in during > *that same index scan*. OK, thanks for clearing that up for me. Still, I think the doc could be improved on this point (sorry to be a bit obsessed with that, but I'm one of the French translators, so I like the doc to be perfect :) ) -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Why is PostgreSQL so slow on Windows ( Postgres 8.3.7) version
The few 'obvious' things I see : ID and POLLID aren't of the same type (numeric vs bigint) TTIME isn't indexed. And as a general matter, you should stick to native datatypes if you don't need numeric. But as said in the other answer, maybe you should redo this schema and use more consistent datatypes Anyway, from what I remenber, it's not advised to set up shared buffers that high for windows (I don't do so much windows myself, so maybe someone will be better informed). Anyway you can start by correcting the schema… On Friday 31 July 2009 07:45:55 pari krishnan wrote: > Dear All, > > > We are > using Postgres 8.3.7 in our java application. We are doing performances > tuning and load testing in our setup. we have noticed that ,some of our > queries to the database taking long time to return the results.Please > find our setup details belows. > > We observed that postgres is running in windows is slower than the linux . > > Machine & Database Details : > > Windows configuration: > 4 GB RAM > 4*1.6 GHZ > windows 2008 server standard edition > > Postgresql configuration: > > shared_buffers: 1 GB > Effective_cache_size: 2GB > fsync: off (even we tested this parameter is on ,we observed the same > slowness ) > > > Database Details : > > Postgres Database : PostgreSQL 8.3.7.1 > Driver Version: PostgreSQL 8.3 JDBC4 with SSL (build 604) > We are using 40 database connections. > > > We have few tables which will be having more amount data.While running > our application STATSDATA table will be created daily with table name > with date. > like as STATSDATA8_21_2009 > > Schema for STATSDATA table > > create table STATSDATA8_21_2009( > POLLID Numeric(19), > INSTANCE varchar(100), > TTIME Numeric(19), > VAL Numeric(13)) ;CREATE INDEX POLLID%_ndx on STATSDATA%(POLLID) > > Schema for PolledData > > create table PolledData( > "NAME" varchar(50) NOT NULL , > "ID" BIGINT NOT NULL , > "AGENT" varchar(50) NOT NULL , > "COMMUNITY" varchar(100) NOT NULL , > "PERIOD" INTEGER NOT NULL, > "ACTIVE" varchar(10), > "OID" varchar(200) NOT NULL, > "LOGDIRECTLY" varchar(10), > "LOGFILE" varchar(100), > "SSAVE" varchar(10), > "THRESHOLD" varchar(10), > "ISMULTIPLEPOLLEDDATA" varchar(10), > "PREVIOUSSEVERITY" INTEGER, > "NUMERICTYPE" INTEGER, > "SAVEABSOLUTES" varchar(10), > "TIMEAVG" varchar(10), > "PORT" INTEGER, > "WEBNMS" varchar(100), > "GROUPNAME" varchar(100), > "LASTCOUNTERVALUE" BIGINT , > "LASTTIMEVALUE" BIGINT , > "TIMEVAL" BIGINT NOT NULL , > "POLICYNAME" varchar(100), > "THRESHOLDLIST" varchar(200), > "DNSNAME" varchar(100), > "SUFFIX" varchar(20), > "STATSDATATABLENAME" varchar(100), > "POLLERNAME" varchar(200), > "FAILURECOUNT" INTEGER, > "FAILURETHRESHOLD" INTEGER, > "PARENTOBJ" varchar(100), > "PROTOCOL" varchar(50), > "SAVEPOLLCOUNT" INTEGER, > "CURRENTSAVECOUNT" INTEGER, > "SAVEONTHRESHOLD" varchar(10), > "SNMPVERSION" varchar(10), > "USERNAME" varchar(30), > "CONTEXTNAME" varchar(30), > PRIMARY KEY ("ID","NAME","AGENT","OID"), > index PolledData0_ndx ( "NAME"), > index PolledData1_ndx ( "AGENT"), > index PolledData2_ndx ( "OID"), > index PolledData3_ndx ( "ID"), > index PolledData4_ndx ( "PARENTOBJ"), > ) > > > We > have 300k row's in PolledData Table.In each STATSDATA table ,we have > almost 12 to 13 million rows. Every one minute interval ,we insert data > into to STATSDATA table. In our application ,we use insert and select > query to STATSDATA table at regular interval. Please let us know why > the below query takes more time to return the results. 
is there any > thing we need to do to tune the postgres database ? > > > > > Please find explain analyze output. > > > First Query : > > postgres=# explain analyze SELECT COUNT(*) FROM ( SELECT ID, > PolledData.AGENT, N AME, INSTANCE, TTIME, VAL FROM PolledData, > STATSDATA8_21_2009 WHERE ( ( PolledDa ta.ID=STATSDATA8_21_2009.POLLID) AND > ( ( TTIME >= 1250838027454) AND ( TTIME <= 1250838079654) ) ) ) t1; > QUERY > PLAN > > --- >- -- > Aggregate (cost=773897.12..773897.13 rows=1 width=0) (actual > time=17818.410..1 7818.412 rows=1 loops=1) >-> Merge Join (cost=717526.23..767505.06 rows=2556821 width=0) (actual > time =17560.469..17801.790 rows=13721 loops=1) > Merge Cond: (statsdata8_21_2009.pollid = > ((polleddata.id)::numeric)) -> Sort (cost=69708.44..69742.49 rows=13619 > width=8) (actual time=239 2.659..2416.093 rows=13721 loops=1) >Sort Key: statsdata8_21_2009.pollid >Sort Method: quicksort Memory: 792kB >-> Seq Scan on statsdata8_21_2009 (cost=0.00..68773.27 > rows=136 19 width=8) (actual time=0.077..2333.132 rows=13721 loops=1) > Filter: ((ttime >= 1250838027454::numeric) AND (ttime > <= 12 50838079654::numeric)) > -> Materialize (cost=64781
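A hedged sketch of the two schema-level points above (an index that covers the TTIME filter, and join keys of the same type); the index name follows the naming pattern of the posted schema, everything else is guesswork:

    CREATE INDEX pollid_ttime_8_21_2009_ndx
      ON statsdata8_21_2009 (pollid, ttime);
    ANALYZE statsdata8_21_2009;

    -- and, for future daily tables, native integer types so POLLID matches
    -- PolledData."ID" (bigint) without a cast:
    -- CREATE TABLE statsdata8_22_2009 (
    --     pollid   bigint,
    --     instance varchar(100),
    --     ttime    bigint,
    --     val      numeric(13)
    -- );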
Re: [PERFORM] optimizing query with multiple aggregates
Hello, I didn't try it, but following should be slightly faster: COUNT( CASE WHEN field >= x AND field < y THEN true END) intead of SUM( CASE WHEN field >= x AND field < y THEN 1 ELSE 0 END) HTH, Marc Mamin From: pgsql-performance-ow...@postgresql.org [mailto:pgsql-performance-ow...@postgresql.org] On Behalf Of Nikolas Everett Sent: Thursday, October 22, 2009 4:48 AM To: Doug Cole Cc: pgsql-performance Subject: Re: [PERFORM] optimizing query with multiple aggregates So you've got a query like: SELECT SUM(CASE WHEN field >= 0 AND field < 10 THEN 1 ELSE 0 END) as zeroToTen, SUM(CASE WHEN field >= 10 AND field < 20 THEN 1 ELSE 0 END) as tenToTwenty, SUM(CASE WHEN field >= 20 AND field < 30 THEN 1 ELSE 0 END) as tenToTwenty, ... FROM bigtable My guess is this forcing a whole bunch of if checks and your getting cpu bound. Could you try something like: SELECT SUM(CASE WHEN field >= 0 AND field < 10 THEN count ELSE 0 END) as zeroToTen, SUM(CASE WHEN field >= 10 AND field < 20 THEN count ELSE 0 END) as tenToTwenty, SUM(CASE WHEN field >= 20 AND field < 30 THEN count ELSE 0 END) as tenToTwenty, ... FROM (SELECT field, count(*) FROM bigtable GROUP BY field) which will allow a hash aggregate? You'd do a hash aggregate on the whole table which should be quick and then you'd summarize your bins. This all supposes that you don't want to just query postgres's column statistics. On Wed, Oct 21, 2009 at 10:21 PM, Doug Cole wrote: On Wed, Oct 21, 2009 at 5:39 PM, Merlin Moncure wrote: > > On Wed, Oct 21, 2009 at 6:51 PM, Doug Cole wrote: > > I have a reporting query that is taking nearly all of it's time in aggregate > > functions and I'm trying to figure out how to optimize it. The query takes > > approximately 170ms when run with "select *", but when run with all the > > aggregate functions the query takes 18 seconds. The slowness comes from our > > attempt to find distribution data using selects of the form: > > > > SUM(CASE WHEN field >= x AND field < y THEN 1 ELSE 0 END) > > > > repeated across many different x,y values and fields to build out several > > histograms of the data. The main culprit appears to be the CASE statement, > > but I'm not sure what to use instead. I'm sure other people have had > > similar queries and I was wondering what methods they used to build out data > > like this? > > have you tried: > > count(*) where field >= x AND field < y; > > ?? > > merlin Unless I'm misunderstanding you, that would require breaking each bin into a separate sql statement and since I'm trying to calculate more than 100 bins between the different fields any improvement in the aggregate functions would be overwhelmed by the cost of the actual query, which is about 170ms. Thanks, Doug -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Query improvement
> On Mon, May 2, 2011 at 10:54 PM, Mark wrote: > > but the result have been worst than before. By the way is there a posibility > > to create beeter query with same effect? > > I have tried more queries, but this has got best performance yet. > > Well, this seems to be the worst part: > >(SELECT page_id FROM mediawiki.page WHERE page_id IN >(SELECT page_id FROM mediawiki.page > WHERE (titlevector @@ (to_tsquery('fotbal' >OR page_id IN >(SELECT p.page_id from mediawiki.page p,mediawiki.revision r, >(SELECT old_id FROM mediawiki.pagecontent >WHERE (textvector @@ (to_tsquery('fotbal' ss >WHERE (p.page_id=r.rev_page AND r.rev_id=ss.old_id))) > 'OR' statements often generate complicated plans. You should try to rewrite your Query with a n UNION clause. Using explicit joins may also help the planner: SELECT page_id FROM mediawiki.page WHERE (titlevector @@ (to_tsquery('fotbal'))) UNION SELECT p.page_id FROM mediawiki.page p JOIN mediawiki.revision r on (p.page_id=r.rev_page) JOIN mediawiki.pagecontent ss on (r.rev_id=ss.old_id) WHERE (ss.textvector @@ (to_tsquery('fotbal'))) HTH, Marc Mamin -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Postgres for a "data warehouse", 5-10 TB
Hi, > (see attachment) under high concurency you may expect that your data is already in. In such a case you better do nothing at all: begin select dat=a_dat from t where id=a_id into test: if test is null then begin insert into t (id, dat) values (a_id, a_dat); exception when unique_violation then update t set dat = a_dat where id = a_id and dat <> a_dat; return 0; end; elsif not test then update t set dat = a_dat where id = a_id; return 0; end if; return 1; best regards, Marc Mamin -Ursprüngliche Nachricht- Von: pgsql-performance-ow...@postgresql.org im Auftrag von Robert Klemme Gesendet: Di 9/13/2011 6:34 An: Marti Raudsepp Cc: pgsql-performance@postgresql.org Betreff: Re: [PERFORM] Postgres for a "data warehouse", 5-10 TB On Tue, Sep 13, 2011 at 5:13 PM, Marti Raudsepp wrote: > On Tue, Sep 13, 2011 at 00:26, Robert Klemme > wrote: >> In the case of PG this particular example will work: >> 1. TX inserts new PK row >> 2. TX tries to insert same PK row => blocks >> 1. TX commits >> 2. TX fails with PK violation >> 2. TX does the update (if the error is caught) > > That goes against the point I was making in my earlier comment. In > order to implement this error-catching logic, you'll have to allocate > a new subtransaction (transaction ID) for EVERY ROW you insert. I don't think so. You only need to catch the error (see attachment). Or does this create a sub transaction? > If > you're going to be loading billions of rows this way, you will invoke > the wrath of the "vacuum freeze" process, which will seq-scan all > older tables and re-write every row that it hasn't touched yet. You'll > survive it if your database is a few GB in size, but in the terabyte > land that's unacceptable. Transaction IDs are a scarce resource there. Certainly. But it's not needed as far as I can see. > In addition, such blocking will limit the parallelism you will get > from multiple inserters. Yes, I mentioned the speed issue. But regardless of the solution for MySQL's "INSERT..ON DUPLICATE KEY UPDATE" which Igor mentioned you will have the locking problem anyhow if you plan to insert concurrently into the same table and be robust. Kind regards robert -- remember.guy do |as, often| as.you_can - without end http://blog.rubybestpractices.com/
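The plpgsql sketched above, written out as a self-contained function for a hypothetical table t(id integer primary key, dat numeric); it is only a sketch of the "insert, fall back to update on unique_violation" idea, not production code:

    CREATE OR REPLACE FUNCTION merge_t(a_id integer, a_dat numeric)
    RETURNS integer AS $$
    DECLARE
        test boolean;
    BEGIN
        -- under high concurrency the row is often already there with the same value
        SELECT dat = a_dat INTO test FROM t WHERE id = a_id;
        IF test IS NULL THEN
            -- no row found (or dat is NULL): try to insert it
            BEGIN
                INSERT INTO t (id, dat) VALUES (a_id, a_dat);
            EXCEPTION WHEN unique_violation THEN
                -- someone inserted it concurrently: update instead
                UPDATE t SET dat = a_dat WHERE id = a_id AND dat <> a_dat;
                RETURN 0;
            END;
        ELSIF NOT test THEN
            -- the row exists but with a different value
            UPDATE t SET dat = a_dat WHERE id = a_id;
            RETURN 0;
        END IF;
        RETURN 1;  -- row freshly inserted, or already up to date
    END;
    $$ LANGUAGE plpgsql;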
Re: [PERFORM] overzealous sorting?
Le Mon, 26 Sep 2011 16:28:15 +1000, anthony.ship...@symstream.com a écrit : > In Mammoth Replicator (PG 8.3) I have a table described as > >Table "public.tevent_cdr" > Column | Type | > Modifiers > +--+ > event_id | integer | not null default > nextval(('event_id_seq'::text)::regclass) timestamp | timestamp > with time zone | not null classification | character varying| > not null area | character varying| not null > kind | character varying| > device_id | integer | > device_name| character varying| > fleet_id | integer | > fleet_name | character varying| > customer_id| integer | > customer_name | character varying| > event | text | > Indexes: > "tevent_cdr_event_id" UNIQUE, btree (event_id) > "tevent_cdr_timestamp" btree ("timestamp") > Check constraints: > "tevent_cdr_classification_check" CHECK (classification::text > = 'cdr'::text) > Inherits: tevent > > > This simple query puzzles me. Why does it need to sort the records? > Don't they come from the index in order? > > "explain analyze select * from tevent_cdr where timestamp >= > '2011-09-09 12:00:00.00+0' and timestamp < '2011-09-09 > 13:00:00.00+0' and classification = 'cdr' order by timestamp;" > > > QUERY PLAN > > Sort (cost=9270.93..9277.12 rows=2477 width=588) (actual > time=9.219..11.489 rows=2480 loops=1) > Sort Key: "timestamp" > Sort Method: quicksort Memory: 2564kB > -> Bitmap Heap Scan on tevent_cdr (cost=57.93..9131.30 > rows=2477 width=588) (actual time=0.440..3.923 rows=2480 loops=1) > Recheck Cond: (("timestamp" >= '2011-09-09 > 22:00:00+10'::timestamp with time zone) AND ("timestamp" < > '2011-09-09 23:00:00+10'::timestamp with time zone)) > Filter: ((classification)::text = 'cdr'::text) > -> Bitmap Index Scan on tevent_cdr_timestamp > (cost=0.00..57.31 rows=2477 width=0) (actual time=0.404..0.404 > rows=2480 loops=1) > Index Cond: (("timestamp" >= '2011-09-09 > 22:00:00+10'::timestamp with time zone) AND ("timestamp" < > '2011-09-09 23:00:00+10'::timestamp with time zone)) > Total runtime: 13.847 ms > (9 rows) Because Index Scans are sorted, not Bitmap Index Scans, which builds a list of pages to visit, to be then visited by the Bitmap Heap Scan step. Marc. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] overzealous sorting?
Le Tue, 27 Sep 2011 12:45:00 +1000, anthony.ship...@symstream.com a écrit : > On Monday 26 September 2011 19:39, Marc Cousin wrote: > > Because Index Scans are sorted, not Bitmap Index Scans, which > > builds a list of pages to visit, to be then visited by the Bitmap > > Heap Scan step. > > > > Marc. > > Where does this bitmap index scan come from? It seems to negate the > advantages of b-tree indexes described in the section "Indexes and > ORDER BY" of the manual. If I do "set enable_bitmapscan = off;" the > query runs a bit faster although with a larger time range it reverts > to a sequential scan. > Bitmap Index Scan is just another way to use a btree index. It is often used when a bigger part of a table is required, as it costs more than plain index scan to retrieve a few records, but less when a lot of records are needed. Your tests show that index scans are a bit faster on this query. But it is probably true only when most needed data is cached, which is probably your case, as you are doing tests using the same query all the time. The bitmap index scan is probably cheaper when data isn't in cache. You could also see the bitmap index scan as safer, as it won't perform as bad when data is not cached (less random IO) :) The thing is, the optimizer doesn't know if your data will be in cache when you will run your query… if you are sure most of your data is in the cache most of the time, you could try to tune random_page_cost (lower it) to reflect that data is cached. But if the win is small on this query, it may not be worth it. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
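A minimal sketch of how that tuning could be tried per session before touching postgresql.conf (the value is only an example to experiment with):

    set random_page_cost = 2;   -- default is 4; lower values make random I/O look cheaper to the planner
    -- re-run the EXPLAIN ANALYZE from the earlier message and compare the chosen plan
    reset random_page_cost;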
Re: [PERFORM] overzealous sorting?
Le Tue, 27 Sep 2011 19:05:09 +1000, anthony.ship...@symstream.com a écrit : > On Tuesday 27 September 2011 18:54, Marc Cousin wrote: > > The thing is, the optimizer doesn't know if your data will be in > > cache when you will run your query… if you are sure most of your > > data is in the cache most of the time, you could try to tune > > random_page_cost (lower it) to reflect that data is cached. But if > > the win is small on this query, it may not be worth it. > > What I really want is to just read a sequence of records in timestamp > order between two timestamps. The number of records to be read may be > in the millions totalling more than 1GB of data so I'm trying to read > them a slice at a time but I can't get PG to do just this. > > If I use offset and limit to grab a slice of the records from a large > timestamp range then PG will grab all of the records in the range, > sort them on disk and return just the slice I want. This is absurdly > slow. > > The query that I've shown is one of a sequence of queries with the > timestamp range progressing in steps of 1 hour through the timestamp > range. All I want PG to do is find the range in the index, find the > matching records in the table and return them. All of the planner's > cleverness just seems to get in the way. > Maybe you should try using a cursor, if you don't know where you'll stop. This associated with a very low cursor_tuple_fraction will probably give you what you want (a fast start plan). -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
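A sketch of that cursor approach (check that cursor_tuple_fraction is available in your version; the fraction and the fetch size below are only illustrative values):

    begin;
    set local cursor_tuple_fraction = 0.01;   -- favour fast-start plans for this cursor
    declare cdr_cur cursor for
        select * from tevent_cdr
        where "timestamp" >= '2011-09-09 12:00:00.00+0'
          and classification = 'cdr'
        order by "timestamp";
    fetch 1000 from cdr_cur;   -- repeat until fewer rows than requested come back
    close cdr_cur;
    commit;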
Re: [PERFORM] Postgres array parser
Hello, For such cases (see below), it would be nice to have an unnest function that only affect the first array dimension. Something like unnest(ARRAY[[1,2],[2,3]], SLICE=1) => unnest -- [1,2] [2,3] With this function, I imagine that following sql function might beat the plpgsql FOREACH version. CREATE OR REPLACE FUNCTION input_value_un (in_inputs numeric[], in_input_nr numeric) RETURNS numeric AS $BODY$ SELECT u[1][2] FROM unnest($1, SLICE =1) u WHERE u[1][1]=in_input_nr LIMIT 1; $BODY$ LANGUAGE sql IMMUTABLE; best regards, Marc Mamin > -Original Message- > From: pgsql-performance-ow...@postgresql.org [mailto:pgsql-performance- > ow...@postgresql.org] On Behalf Of Pavel Stehule > Sent: Dienstag, 13. Dezember 2011 15:43 > To: Aleksej Trofimov > Cc: pgsql-performance@postgresql.org > Subject: Re: [PERFORM] Postgres array parser > > Hello > > 2011/12/13 Aleksej Trofimov : > > We have tried foreach syntax, but we have noticed performance > degradation: > > Function with for: 203ms > > Function with foreach: ~250ms: > > > > there is functions code: > > CREATE OR REPLACE FUNCTION input_value_fe(in_inputs numeric[], > in_input_nr > > numeric) > > RETURNS numeric AS > > $BODY$ > > declare i numeric[]; > > BEGIN > > FOREACH i SLICE 1 IN ARRAY in_inputs > > LOOP > > if i[1] = in_input_nr then > > return i[2]; > > end if; > > END LOOP; > > > > return null; > > END; > > $BODY$ > > LANGUAGE plpgsql VOLATILE > > COST 100; > > > > CREATE OR REPLACE FUNCTION input_value(in_inputs numeric[], > in_input_nr > > numeric) > > RETURNS numeric AS > > $BODY$ > > declare > > size int; > > BEGIN > > size = array_upper(in_inputs, 1); > > IF size IS NOT NULL THEN > > > > FOR i IN 1 .. size LOOP > > if in_inputs[i][1] = in_input_nr then > > return in_inputs[i][2]; > > end if; > > END LOOP; > > END IF; > > > > return null; > > END; > > $BODY$ > > LANGUAGE plpgsql VOLATILE > > COST 100; > > > > > > On 12/13/2011 04:02 PM, Pavel Stehule wrote: > >> > >> Hello > >> > >> do you know FOREACH IN ARRAY statement in 9.1 > >> > >> this significantly accelerate iteration over array > >> > >> > >> http://www.depesz.com/index.php/2011/03/07/waiting-for-9-1-foreach- > in-array/ > >> > >> > >> > >> 2011/12/13 Aleksej Trofimov: > >>> > >>> Hello, I wanted to ask according such a problem which we had faced > with. > >>> We are widely using postgres arrays like key->value array by doing > like > >>> this: > >>> > >>> {{1,5},{2,6},{3,7}} > >>> > >>> where 1,2,3 are keys, and 5,6,7 are values. In our pgSql functions > we are > >>> using self written array_input(array::numeric[], key::numeric) > function > >>> which makes a loop on whole array and searches for key like > >>> FOR i IN 1 .. size LOOP > >>> if array[i][1] = key then > >>> return array[i][2]; > >>> end if; > >>> END LOOP; > >>> > >>> But this was a good solution until our arrays and database had > grown. So > >>> now > >>> FOR loop takes a lot of time to find value of an array. > >>> > >>> And my question is, how this problem of performance could be > solved? We > >>> had > >>> tried pgperl for string parsing, but it takes much more time than > our > >>> current solution. Also we are thinking about self-written C++ > function, > >>> may > >>> be someone had implemented this algorithm before? 
> >>> > >> you can use indexes or you can use hstore > >> > >> Regards > >> > >> Pavel Stehule > >> > >>> -- > >>> Best regards > >>> > >>> Aleksej Trofimov > >>> > >>> > >>> -- > >>> Sent via pgsql-performance mailing list > >>> (pgsql-performance@postgresql.org) > >>> To make changes to your subscription: > >>> http://www.postgresql.org/mailpref/pgsql-performance > > > > > > It is strange - on my comp FOREACH is about 2x faster > > postgres=# select inp
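Coming back to the unnest(..., SLICE=1) idea at the top of this message: until something like that exists, a similar plain-SQL variant can be written with generate_subscripts() (available since 8.4). The function name is made up and this is only a sketch, not benchmarked against the plpgsql versions above:

    CREATE OR REPLACE FUNCTION input_value_gs(in_inputs numeric[], in_input_nr numeric)
    RETURNS numeric AS
    $BODY$
        SELECT ($1)[i][2]
        FROM generate_subscripts($1, 1) AS i
        WHERE ($1)[i][1] = $2
        LIMIT 1;
    $BODY$
    LANGUAGE sql IMMUTABLE;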
Re: [PERFORM] Postgres array parser
> Yes, it would be great, but I haven't found such a function, which > splits 2 dimensional array into rows =) Maybe we'll modify existing > function, but unfortunately we have tried hstore type and function in > postgres and we see a significant performance improvements. So we only > need to convert existing data into hstore and I think this is a good > solution. I haven't tested hstore yet, but I would be interested to find out if it still better perform with custom "numeric" aggregates on the hstore values. I've made a short "proof of concept" test with a custom key/value type to achieve such an aggregation. Something like: SELECT x, distinct_sum( (currency,amount)::keyval ) overview FROM ... GROUP BY x x currency amount a EUR 15.0 a EUR5.0 a CHF7.5 b USD 12.0 => x overview - ---- a {(EUR,20.0), (CHF,7.5)} b {(USD,10.0)} regards, Marc Mamin > On 12/14/2011 11:21 AM, Marc Mamin wrote: > > Hello, > > > > For such cases (see below), it would be nice to have an unnest > function that only affect the first array dimension. > > > > Something like > > > > unnest(ARRAY[[1,2],[2,3]], SLICE=1) > > => > > unnest > > -- > > [1,2] > > [2,3] > > > > > > With this function, I imagine that following sql function > > might beat the plpgsql FOREACH version. > > > > > > CREATE OR REPLACE FUNCTION input_value_un (in_inputs numeric[], > in_input_nr numeric) > > RETURNS numeric AS > > $BODY$ > > > >SELECT u[1][2] > >FROM unnest($1, SLICE =1) u > >WHERE u[1][1]=in_input_nr > >LIMIT 1; > > > > $BODY$ > > LANGUAGE sql IMMUTABLE; > > > > > > > > best regards, > > > > Marc Mamin > > > > > >> -Original Message- > >> From: pgsql-performance-ow...@postgresql.org [mailto:pgsql- > performance- > >> ow...@postgresql.org] On Behalf Of Pavel Stehule > >> Sent: Dienstag, 13. Dezember 2011 15:43 > >> To: Aleksej Trofimov > >> Cc: pgsql-performance@postgresql.org > >> Subject: Re: [PERFORM] Postgres array parser > >> > >> Hello > >> > >> 2011/12/13 Aleksej Trofimov: > >>> We have tried foreach syntax, but we have noticed performance > >> degradation: > >>> Function with for: 203ms > >>> Function with foreach: ~250ms: > >>> > >>> there is functions code: > >>> CREATE OR REPLACE FUNCTION input_value_fe(in_inputs numeric[], > >> in_input_nr > >>> numeric) > >>> RETURNS numeric AS > >>> $BODY$ > >>> declare i numeric[]; > >>> BEGIN > >>> FOREACH i SLICE 1 IN ARRAY in_inputs > >>> LOOP > >>> if i[1] = in_input_nr then > >>> return i[2]; > >>> end if; > >>> END LOOP; > >>> > >>> return null; > >>> END; > >>> $BODY$ > >>> LANGUAGE plpgsql VOLATILE > >>> COST 100; > >>> > >>> CREATE OR REPLACE FUNCTION input_value(in_inputs numeric[], > >> in_input_nr > >>> numeric) > >>> RETURNS numeric AS > >>> $BODY$ > >>> declare > >>> size int; > >>> BEGIN > >>> size = array_upper(in_inputs, 1); > >>> IF size IS NOT NULL THEN > >>> > >>> FOR i IN 1 .. size LOOP > >>> if in_inputs[i][1] = in_input_nr then > >>> return in_inputs[i][2]; > >>> end if; > >>> END LOOP; > >>> END IF; > >>> > >>> return null; > >>> END; > >>> $BODY$ > >>> LANGUAGE plpgsql VOLATILE > >>> COST 100; > >>> > >>> > >>> On 12/13/2011 04:02 PM, Pavel Stehule wrote: > >>>> Hello > >>>> > >>>> do you know FOREACH IN ARRAY statement in 9.1 > >>>> > >>>> this significantly accelerate iteration over array > >>>> > >>>> > >>>> http://www.depesz.com/index.php/2011/03/07/waiting-for-9-1- > foreach- > >> in-array/ > >>>> > >>>> > >>>> 2011/12/13 Aleksej Trofimov: > >>>>> Hello, I wanted to
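To make the hstore alternative discussed above a bit more concrete, a small sketch (PostgreSQL 9.1, requires the hstore extension; keys and values are stored as text, hence the cast back to numeric):

    CREATE EXTENSION hstore;   -- once per database

    -- the array {{1,5},{2,6},{3,7}} becomes '1=>5, 2=>6, 3=>7'
    SELECT ('1=>5, 2=>6, 3=>7'::hstore -> '2')::numeric;   -- returns 6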
Re: [PERFORM] Duplicate deletion optimizations
Hi Samuel! On 6 January 2012 20:02, Samuel Gendler wrote: > Have you considered doing the insert by doing a bulk insert into a temp > table and then pulling rows that don't exist across to the final table in > one query and updating rows that do exist in another query? I did a very > brief scan of the SO thread and didn't see it suggested. Something like > this: > > update stats_5mn set count = count + t.count > from temp_table t > where stats_5mn.t_value = t.t_value and stats_5mn.t_record and > stats_5mn.output_id = t.output_id; > > insert into stats_5mn > select * from temp_table t > where not exists ( > select 1 from stats_5mn s > where s.t_value = t.t_value and s.t_record = t.t_record and s.output_id = > t.output_id > ); > > drop table temp_table; Am I right to assume that the update/insert needs to be placed into a begin / end transaction block if such batch uploads might happen concurrently? Doesn't seem to be the case for this question here, but I like the solution and wonder if it works under more general circumstances. What's the overhead of creating and dropping a temporary table? Is it only worth doing this for a large number of inserted/updated elements? What if the number of inserts/updates is only a dozen at a time for a large table (>10M entries)? Thanks, Marc -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
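For reference, a sketch of the whole batch wrapped into a single transaction, assuming stats_5mn has exactly the key columns plus count used in the quoted queries. ON COMMIT DROP removes the temp table automatically, the COPY line is only a placeholder, and the t_record comparison that is incomplete in the quoted UPDATE is filled in here:

    begin;
    create temp table temp_table (like stats_5mn) on commit drop;
    -- bulk load the new rows, e.g.:
    -- \copy temp_table from 'batch.csv' with csv
    update stats_5mn set count = stats_5mn.count + t.count
    from temp_table t
    where stats_5mn.t_value   = t.t_value
      and stats_5mn.t_record  = t.t_record
      and stats_5mn.output_id = t.output_id;
    insert into stats_5mn
    select * from temp_table t
    where not exists (
        select 1 from stats_5mn s
        where s.t_value = t.t_value
          and s.t_record = t.t_record
          and s.output_id = t.output_id);
    commit;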
Re: [PERFORM] Duplicate deletion optimizations
On 6 January 2012 20:38, Samuel Gendler wrote: > On Fri, Jan 6, 2012 at 12:22 PM, Marc Eberhard > wrote: >> On 6 January 2012 20:02, Samuel Gendler wrote: >> > Have you considered doing the insert by doing a bulk insert into a temp >> > table and then pulling rows that don't exist across to the final table >> > in >> > one query and updating rows that do exist in another query? I did a >> > very >> > brief scan of the SO thread and didn't see it suggested. Something like >> > this: >> > >> > update stats_5mn set count = count + t.count >> > from temp_table t >> > where stats_5mn.t_value = t.t_value and stats_5mn.t_record and >> > stats_5mn.output_id = t.output_id; >> > >> > insert into stats_5mn >> > select * from temp_table t >> > where not exists ( >> > select 1 from stats_5mn s >> > where s.t_value = t.t_value and s.t_record = t.t_record and s.output_id >> > = >> > t.output_id >> > ); >> > >> > drop table temp_table; >> >> Am I right to assume that the update/insert needs to be placed into a >> begin / end transaction block if such batch uploads might happen >> concurrently? Doesn't seem to be the case for this question here, but >> I like the solution and wonder if it works under more general >> circumstances. > > > yes, assuming you are concerned about making the insertion atomic. > Obviously, a failure in the second query after success in the 1st query > would be problematic outside of a transaction, since any attempt to repeat > the entire operation would result in repeated updates. True, but I was more concerned about concurrency, where a second upsert inserts an element between update/insert from the first. That would then skip the element in the first upsert as it is neither updated (doesn't exist at that point in time) nor inserted (does exists at that later point). Or would that be impossible anyway? >> What's the overhead of creating and dropping a temporary table? Is it >> only worth doing this for a large number of inserted/updated elements? >> What if the number of inserts/updates is only a dozen at a time for a >> large table (>10M entries)? > > pretty minimal, but enough that doing a handful of rows at a time probably > wouldn't be worth it. You'd surely get index usage on a plain insert in > such a case, so I'd probably just use an upsert stored proc for doing small > numbers of rows - unless you are doing large numbers of inserts, just a few > at a time. In that case, I'd try to accumulate them and then do them in > bulk. Those are tough questions to answer without a specific context. My > real answer is 'try it and see.' You'll always get an answer that is > specific to your exact circumstance that way. It's a fairly tricky problem. I have a number of sensors producing energy data about every 5 minutes, but at random times between 1 and 15 minutes. I can't change that as that's the way the hardware of the sensors works. These feed into another unit, which accumulates them and forwards them in batches over the Internet to my PostgreSQL database server every few minutes (again at random times outside my control and with random batch sizes). To make things worse, if the Internet connection between the unit and the database server fails, it will send the latest data first to provide a quick update to the current values and then send the backlog of stored values. Thus, data do not always arrive in correct time order. At the moment I only look at the latest data for each sensor and these should be as close to real time as possible. 
Thus, collecting data for some time to get a larger size for a batch update isn't preferable. What I want to do, and this is where the upsert problem starts, is to build a table with energy values at fixed times. These should be calculated as a linear interpolation between the nearest reported values from the sensors. Please note each sensor is reporting a measured energy value (not instant power), which always increases monotonically with time. To compare the performance of the different devices that are measured, I need to have the energy values at the same time and not at the random times when the sensors report. This also allows the calculation of average power for the devices by taking the difference of the energy values over longer periods, like 30 minutes. What I simply haven't got my head around is how to do this in an efficient way. When new values arrive, the table of interpolated values needs to be updated. For some times, there will already be values in the table, but for other times there won't. Thus, the upsert. If there w
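For the interpolation step described above, a rough sketch of how a single grid value could be computed; all table, column and function names here are assumptions, not taken from the thread, and the energy column is assumed to be numeric:

    create or replace function interpolated_energy(p_sensor integer, p_t0 timestamptz)
    returns numeric as
    $$
        -- returns NULL when there is no reading on one side of p_t0
        select ( prev.energy
                 + (next.energy - prev.energy)
                   * extract(epoch from ($2 - prev.ts))
                   / nullif(extract(epoch from (next.ts - prev.ts)), 0) )::numeric
        from (select ts, energy from sensor_readings
              where sensor_id = $1 and ts <= $2
              order by ts desc limit 1) prev,
             (select ts, energy from sensor_readings
              where sensor_id = $1 and ts > $2
              order by ts asc limit 1) next;
    $$ language sql stable;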
Re: [PERFORM] Duplicate deletion optimizations
Yes, but it should become a bit slower if you fix your code :-) where t_imp.id is null and test.id=t_imp.id; => where t_imp.id is not null and test.id=t_imp.id; and a partial index on matching rows might help (should be tested): (after the first updat) create index t_imp_ix on t_imp(t_value,t_record,output_id) where t_imp.id is not null. regards, Marc Mamin -Ursprüngliche Nachricht- Von: pgsql-performance-ow...@postgresql.org im Auftrag von Jochen Erwied Gesendet: Sa 1/7/2012 12:57 An: anto...@inaps.org Cc: pgsql-performance@postgresql.org Betreff: Re: [PERFORM] Duplicate deletion optimizations Friday, January 6, 2012, 4:21:06 PM you wrote: >> Every 5 minutes, a process have to insert a few thousand of rows in this >> table, but sometime, the process have to insert an already existing row >> (based on values in the triplet (t_value, t_record, output_id). In this >> case, the row must be updated with the new count value. I've tried some >> solution given on this stackoverflow question [1] but the insertion rate >> is always too low for my needs. I did check the following in a loop, starting with an empty table, and inserting/updating 5 random unique entries. After 15 minutes I've got about 10 million records, each loop takes about 3 seconds. After 30 minutes the table contains approx. 18 million entries, time per loop only slightly increased. After 90 minutes the database has about 30 million entries. The speed has dropped to about 15-20 seconds per loop, but the server is doing lots of other queries in parallel, so with an unloaded server the updates should still take less than 10 seconds. The generator runs in perl, and generates records for a maximum of 100 million different entries: use strict; srand time; my $i = 0; open FD, ">data.in"; for (1..5) { $i += rand(2000); print FD sprintf("%d\t%d\t%d\t%d\n", $i/65536, ($i/256)%255, $i%255, rand(1000)); } close FD; The SQL-script looks like this: \timing on begin; create temp table t_imp(id bigint,t_value integer,t_record integer,output_id integer,count bigint); \copy t_imp (t_value, t_record, output_id, count) from 'data.in' --an index is not really needed, table is in memory anyway --create index t_imp_ix on t_imp(t_value,t_record,output_id); -- find matching rows update t_imp set id=test.id from test where (t_imp.t_value,t_imp.t_record,t_imp.output_id)=(test.t_value,test.t_record,test.output_id); -- update matching rows using primary key update test set count=t_imp.count from t_imp where t_imp.id is null and test.id=t_imp.id; -- insert missing rows insert into test(t_value,t_record,output_id,count) select t_value,t_record,output_id,count from t_imp where id is null; commit; Advantages of this solution: - all updates are done in-place, no index modifications (except for the inserts, of course) - big table only gets inserts - no dead tuples from deletes - completely avoids sequential scans on the big table Tested on my home server (8GB RAM, 3GB shared memory, Dual-Xeon 5110, 1.6 GHz, table and indices stored on a SSD) Table statistics: relid | 14332525 schemaname| public relname | test seq_scan | 8 seq_tup_read | 111541821 idx_scan | 149240169 idx_tup_fetch | 117901695 n_tup_ins | 30280175 n_tup_upd | 0 n_tup_del | 0 n_tup_hot_upd | 0 n_live_tup| 30264431 n_dead_tup| 0 last_vacuum | last_autovacuum | last_analyze | last_autoanalyze | 2012-01-07 12:38:49.593651+01 vacuum_count | 0 autovacuum_count | 0 analyze_count | 0 autoanalyze_count | 31 The sequential scans were from some 'select count(*)' in between. HTH. 
-- Jochen Erwied | home: joc...@erwied.eu +49-208-38800-18, FAX: -19 Sauerbruchstr. 17 | work: j...@mbs-software.de +49-2151-7294-24, FAX: -50 D-45470 Muelheim | mobile: jochen.erw...@vodafone.de +49-173-5404164 -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
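Spelled out, the correction suggested at the top of this message against the quoted script, together with the proposed partial index (untested, as noted):

    -- partial index on the rows that found a match in the first UPDATE
    -- (to be created after that UPDATE)
    create index t_imp_ix on t_imp (t_value, t_record, output_id)
    where id is not null;

    -- update matching rows using the primary key
    -- (was: "where t_imp.id is null and test.id=t_imp.id")
    update test set count = t_imp.count
    from t_imp
    where t_imp.id is not null
      and test.id = t_imp.id;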
Re: [PERFORM] Duplicate deletion optimizations
Hi Pierre! On 7 January 2012 12:20, Pierre C wrote: > I'm stuck home with flu, so I'm happy to help ;) [...] > I'll build an example setup to make it clearer... [...] That's almost identical to my tables. :-) > Note that the "distance" field represents the distance (in time) between the > interpolated value and the farthest real data point that was used to > calculate it. Therefore, it can be used as a measure of the quality of the > interpolated point ; if the distance is greater than some threshold, the > value might not be that precise. Nice idea! > Although this query is huge, it's very fast, since it doesn't hit the big > tables with any seq scans (hence the max() and min() tricks to use the > indexes instead). And it can easily be tamed by putting parts of it into stored pgpsql functions. > I love how postgres can blast that huge pile of SQL in, like, 50 ms... Yes, indeed. It's incredible fast. Brilliant! > If there is some overlap between packet data and data already in the log, > you might get some division by zero errors, in this case you'll need to > apply a DISTINCT somewhere (or simply replace the UNION ALL with an UNION, > which might be wiser anyway...) I do have a unique constraint on the actual table to prevent duplicate data in case of retransmission after a failed connect. It's easy enough to delete the rows from packet that already exist in the main table with a short one line SQL delete statement before the interpolation and merge. > Tada. :- > Enjoy ! I certainly will. Many thanks for those great lines of SQL! Hope you recover from your flu quickly! All the best, Marc -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
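A sketch of such a pre-merge cleanup with DELETE ... USING; the table and column names are made up here, since the real schema is not shown in the thread:

    delete from packet p
    using main_log m
    where m.sensor_id = p.sensor_id
      and m.ts = p.ts;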
[PERFORM] How to remove a table statistics ?
Hello,

I have a weird table upon which queries are much faster when no statistics have been collected.

Is there a way to delete the statistics information for a table? I've tried ALTER .. SET STATISTICS 0 and then ran ANALYZE, but it seems that the old statistics are kept this way. Can I delete entries directly in pg_statistic? (PostgreSQL 9.1)

Short background info: one of the table's indexes is a GIN index on a tsvector-returning function, which is very costly. Once analyzed, the query planner often ignores this index in favour of another one, hence triggering this function too often. I'll fix that model, but am first looking for a quick way to restore performance on our production servers.

best regards,

Marc Mamin
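For reference, a sketch of the two approaches mentioned (table and column names are placeholders); as the reply below points out, rows removed from pg_statistic are re-created by the next autovacuum/ANALYZE:

    -- per column: stop collecting statistics in future ANALYZE runs
    alter table weird_table alter column tsv_col set statistics 0;
    analyze weird_table;

    -- removing what is already there (superuser only, temporary effect)
    delete from pg_statistic
    where starelid = 'weird_table'::regclass;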
Re: [PERFORM] How to remove a table statistics ?
Hello, Some more tests have shown that removing the statistics just move the performance issue to other places. The main issue here is a bad design, so I'd better focus on this than losing too much time with the current situation. But this raises an interesting question on how/where does Postgres store statistics on functional indexes. in pg_statistics there are information on the column content, but I couldn't find stats on the function result which is fully computed only during the index creation. I guess that the planner would need to know at least the function cost to weight the benefit of such an index. In my case I would set the function cost to 200 ... I have also tried to reduce random_page_cost to "2", and it seems to help in a few cases. (anonymized) explain analyze SELECT min(msoffset) as t, coalesce(pipelinecall_id,-2) as pid from aserrorlist_20120125 l WHERE 1 = 1 AND msoffset >= 132750300 AND my_func('foo',20120125,l.id, l.header_9_10_id, l.categories_id, l.firstline_id) @@ to_aserrcfg_search_tsq($KUKU$lexeme_1 ! lexeme_2$KUKU$) group by ridcount,pipelinecall_id,coalesce(toplevelrid,msoffset::varchar); without stats: http://explain.depesz.com/s/qPg with stats: http://explain.depesz.com/s/88q aserr_20120125_tvi: GIN Index on my_func(.,.,.,.,.,.) best regards, Marc Mamin > -Original Message- > From: pgsql-performance-ow...@postgresql.org [mailto:pgsql-performance- > ow...@postgresql.org] On Behalf Of Josh Berkus > Sent: Dienstag, 31. Januar 2012 19:44 > To: pgsql-performance@postgresql.org > Subject: Re: [PERFORM] How to remove a table statistics ? > > On 1/31/12 3:50 AM, Marc Mamin wrote: > > Hello, > > > > I have a weird table, upon with the queries are much faster when no > > statics were collected. > > > > Is there a way to delete statistics information for a table ? > > I've tried ALTER.. set STATISTICS 0 and then run ANALYZE, but it > seems > > that old statistics are kept this way. > > Can I delete entries directly in pg_statistic ? > > (Postgresql 9.1) > > You can, but it won't do any good; autovaccum will replace them. > > It would be better to fix the actual query plan issue. If you can, > post > the query plans with and without statistics (EXPLAIN ANALYZE, please) > here. > > -- > Josh Berkus > PostgreSQL Experts Inc. > http://pgexperts.com > > -- > Sent via pgsql-performance mailing list (pgsql- > performa...@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-performance -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
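The function cost mentioned above can be declared directly on the function; the argument list below is only a guess based on the call in the query, so treat this as a sketch:

    alter function my_func(text, integer, bigint, integer, integer, integer) cost 200;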
[PERFORM] text search: tablescan cost for a tsvector
Hello, I have quite systematically better performance with the text search when I disable the statistics collection for the tsvector column. So I wonder if such statistics ever make sense. Here a testcase: The table contains 200'000 tsvector, whereas the lexeme 'fooblablabla' exists in all tsvector: Without statistics, the planner decide as expected for the gin index. After analyze, it switch to a table scan which is also expected, but the query is 3 times slower. My first thought was that the Bitmap Heap Scan was really fast as the searched term is always at the first position. So I repeated the test with an additional search term at the last position, but without significant change: (result from the 6. test below) without analyze: http://explain.depesz.com/s/6At with analyze:http://explain.depesz.com/s/r3B best regards, Marc Mamin Here all my results, always one of the fastest from a few runs. CREATE TABLE tsv_test ( id bigserial NOT NULL, v tsvector ); The test query: explain analyze select id from tsv_test where v @@ 'lexeme3179'::tsquery UNION ALL select id from tsv_test where v @@ 'lexeme5'::tsquery UNION ALL select id from tsv_test where v @@ 'fooblablabla'::tsquery The results A) on first lexeme 1) without indexes without analyze: http://explain.depesz.com/s/bOv 2) alter table tsv_test add constraint tsv_test_pk primary key(id); http://explain.depesz.com/s/9QQ (same as previous); 3) create index tsv_gin on tsv_test using gin(v); http://explain.depesz.com/s/r4M <= fastest 4) ANALYZE tsv_test (id); http://explain.depesz.com/s/MyC (same as previous); 5) ANALYZE tsv_test; http://explain.depesz.com/s/qu3S B) on lastlexeme 6) create table tsv_test2 as select id, v||'zzthisisalongerlexemethisisalongerlexeme'::tsvector from tsv_test; explain analyze select id from tsv_test2 where v @@ 'zzthisisalongerlexemethisisalongerlexeme'::tsquery http://explain.depesz.com/s/6At ANALYZE tsv_test2; http://explain.depesz.com/s/r3B test data: insert into tsv_test (v) select cast('fooblablabla' || ' lexeme'||s%2|| ' lexeme'||s%3|| ' lexeme'||s%4|| ' lexeme'||s%4|| ' lexeme'||s%5|| ' lexeme'||s%6|| ' lexeme'||s%7|| ' lexeme'||s%8|| ' lexeme'||s%9|| ' lexeme'||s%10 || ' lexeme2'||s%11 || ' lexeme3'||s%12 || ' lexeme'||s%11 || ' lexeme2'||s%12 || ' lexeme3'||s%22 || ' lexeme'||s%12 || ' lexeme2'||s%13 || ' lexeme3'||s%32 || ' lexeme'||s%13 || ' lexeme2'||s%14 || ' lexeme3'||s%42 || ' lexeme'||s%14 || ' lexeme2'||s%15 || ' lexeme3'||s%52 || ' lexeme'||s%15 || ' lexeme2'||s%16 || ' lexeme3'||s%62 || ' lexeme'||s%16 || ' lexeme2'||s%17 || ' lexeme3'||s%72 || ' lexeme'||s%17 || ' lexeme2'||s%18 || ' lexeme3'||s%82 || ' lexeme'||s%18 || ' lexeme2'||s%19 || ' lexeme3'||s%92 || ' lexeme'||s%19 || ' lexeme2'||s%10 || ' lexeme3'||s%15 || ' lexeme'||s%12 || ' lexeme2'||s%71 || ' lexeme3'||s%16 || ' lexeme'||s%20 || ' lexeme2'||s%81 || ' lexeme3'||s%17 || ' lexeme'||s%35 || ' lexeme2'||s%91 || ' lexeme3'||s%18 || ' lexeme'||s%100 || ' lexeme2'||s%110 || ' lexeme3'||s%120 || ' lexeme'||s%110 || ' lexeme2'||s%120 || ' lexeme3'||s%220 || ' lexeme'||s%120 || ' lexeme2'||s%130 || ' lexeme3'||s%320 || ' lexeme'||s%130 || ' lexeme2'||s%140 || ' lexeme3'||s%420 || ' lexeme'||s%140 || ' lexeme2'||s%150 || ' lexeme3'||s%520 || ' lexeme'||s%150 || ' lexeme2'||s%160 || ' lexeme3'||s%620 || ' lexeme'||s%160 || ' lexeme2'||s%170 || ' lexeme3'||s%720 || ' lexeme'||s%170 || ' lexeme2'||s%180 || ' lexeme3'||s%820 || ' lexeme'||s%180 || ' lexeme2'||s%190 || ' lexeme3'||s%920 || ' lexeme'||s%190 || ' lexeme2'||s%100 || ' lexeme3'||s%150 
|| ' lexeme'||s%120 || ' lexeme2'||s%710 || ' lexeme3'||s%160 || ' lexeme'||s%200 || ' lexeme2'||s%810 || ' lexeme3'||s%170 || ' lexeme'||s%350 || ' lexeme2'||s%910 || ' lexeme3'||s%180 as tsvector) FROM generate_series(1,10) s UNION ALL select cast('fooblablabla' || ' thisisalongerlexemethisisalongerlexeme'||s%2|| ' thisisalongerlexemethisisalongerl
Re: [PERFORM] text search: tablescan cost for a tsvector
> From: Robert Haas [mailto:robertmh...@gmail.com]
> Sent: Wed 2/29/2012 7:32
>
> On Mon, Feb 6, 2012 at 6:05 AM, Marc Mamin wrote:
> > without analyze: http://explain.depesz.com/s/6At
> > with analyze:    http://explain.depesz.com/s/r3B
...
> The problem seems to be that the cost estimator doesn't know that
> detoasting is expensive.

Hello,

Tom Lane has started a follow-up thread on the hackers list. Detoasting is indeed the main obstacle, but I've repeated my test using plain storage and the planner still chooses (systematically?) the slowest query. It seems that I bumped into 2 different issues at the same time.

http://archives.postgresql.org/pgsql-hackers/2012-02/msg00896.php

Background: our reporting system offers, amongst other things, time histograms combined with FTS filtering on error occurrences (imported from error logs). It is hence not unusual that given search terms are found within a majority of the documents...

best regards,

Marc Mamin
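For completeness, a sketch of how the column storage can be switched for such a test; this only affects newly written rows, so the test data has to be reloaded afterwards (table and column names as in the earlier testcase):

    alter table tsv_test alter column v set storage plain;   -- tsvector defaults to EXTENDED
    -- reload the test data, then repeat the EXPLAIN ANALYZE runs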
[PERFORM] Partitioning / Strange optimizer behaviour
We have an optimizer problem regarding partitioned tables on 8.4.11. We started partitioning a large table containing approx. 1 billion records. So far, there is only the master table, called edifactmsgpart (which is empty) and 1 partition, called edifactmsgpart_pact. There is a bigint column called emg_id with a btree-index on it. \d edifactmsgpart_pact ... ..."emp_emg_ept_i_pact" btree (emg_id, ept_id) ... gdw=> select relname, reltuples from pg_class where relname in( 'edifactmsgpart', 'edifactmsgpart_pact' ); relname | reltuples -+- edifactmsgpart_pact | 1.03102e+09 edifactmsgpart | 0 a select on the big partition yields a decent plan and performs as expected, lasting only a fraction of a second. gdw=> explain select min( emg_id ) from edifactmsgpart_pact; QUERY PLAN -- Result (cost=2.05..2.06 rows=1 width=0) InitPlan 1 (returns $0) -> Limit (cost=0.00..2.05 rows=1 width=8) -> Index Scan using emp_emg_ept_i_pact on edifactmsgpart_pact (cost=0.00..2109171123.79 rows=1031020672 width=8) Filter: (emg_id IS NOT NULL) gdw=> select min( emg_id ) from edifactmsgpart_pact; min --- 58178 =>>> very fast. a select on the partitioned table, however, yields a... shall we call it "sub-optimal" plan: gdw=> explain select min( emg_id ) from edifactmsgpart; QUERY PLAN -- Aggregate (cost=23521692.03..23521692.04 rows=1 width=8) -> Append (cost=0.00..20944139.42 rows=1031021042 width=8) -> Seq Scan on edifactmsgpart (cost=0.00..13.70 rows=370 width=8) -> Seq Scan on edifactmsgpart_pact edifactmsgpart (cost=0.00..20944125.72 rows=1031020672 width=8) I would expect this to run half an hour or so, completely overloading the server... Any Ideas? Kind regards Marc -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
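A common workaround on 8.4 is to push the min() down to the partitions by hand, so each branch can use its own index; a sketch, to be extended with one UNION ALL branch per additional partition as they are created:

    select min(emg_id) from (
        select min(emg_id) as emg_id from only edifactmsgpart   -- empty master table
        union all
        select min(emg_id) from edifactmsgpart_pact
        -- union all ... one branch per additional partition
    ) s;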
Re: [PERFORM] Partitioning / Strange optimizer behaviour
Thanks for pointing me to that article. I totally forgot that the postgres wiki existed. Updating is not an option at the moment, but we'll probably do so in the future. Until then I can live with the workaround. Kind regards, Marc -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Performance of a large array access by position (tested version 9.1.3)
>> On 22/06/12 09:02, Maxim Boguk wrote: >> May be I completely wrong but I always assumed that the access speed to the >> array element in PostgreSQL should be close to constant time. >> But in tests I found that access speed degrade as O(N) of array size. >> Is that behaviour is correct? > From: pgsql-performance-ow...@postgresql.org On Behalf Of Jesper Krogh > Default column storage is to "compress it, and store in TOAST" with large > values. > This it what is causing the shift. Try to change the column storage of the > column > to EXTERNAL instead and rerun the test. Hello, I've repeated your test in a simplified form: you are right :-( create table t1 ( _array int[]); alter table t1 alter _array set storage external; insert into t1 SELECT ARRAY(SELECT * FROM generate_series(1,5)); create table t2 ( _array int[]); alter table t2 alter _array set storage external; insert into t2 SELECT ARRAY(SELECT * FROM generate_series(1,500)); explain analyze SELECT _array[1] FROM t1; Total runtime: 0.125 ms explain analyze SELECT _array[1] FROM t2; Total runtime: 8.649 ms best regards, Marc Mamin -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance