Re: [PERFORM] two queries and dual cpu (perplexed)

2005-04-21 Thread Jeff
On Apr 21, 2005, at 7:49 AM, Shoaib Burq (VPAC) wrote:
Now I have not touched the $PGDATA/postgresql.conf (as I know very little
about memory tuning). Have run VACUUM & ANALYZE.
You should really, really bump up shared_buffers and given you have 8GB 
of ram this query would likely benefit from more work_mem.
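For reference, a rough sketch of what that might look like in postgresql.conf
on an 8GB box of that era (values are illustrative only, not tuned for this
workload):

    shared_buffers = 20000        # 8KB pages, so roughly 160MB
    work_mem = 65536              # KB per sort/hash operation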

and the time taken is *twice* that for the original. The modification was
minor. The queries do make use of both CPUs:

Is this an IO intensive query?  If running both in parallel results in 
2x the run time and you have sufficient cpus, it would (to me) indicate 
you don't have enough IO bandwidth to satisfy the query.

Can we see an explain analyze of the query?  Could be a bad plan and a 
bad plan will never give good performance.

--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/
---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [PERFORM] two queries and dual cpu (perplexed)

2005-04-22 Thread Jeff
On Apr 21, 2005, at 11:33 PM, Shoaib Burq (VPAC) wrote:
BTW I guess I should mention that I am doing the select count(*) on a 
View.

A bit of a silly question...
but are you actually selecting all the rows from this query in 
production or would it be more selective?  ie select * from bigslowview 
where bah = 'snort'?


Ran the Explain analyse with the nestedloop disabled but it was taking
forever... and killed it after 30mins.
If it takes too long you can run just plain explain (no analyze) and it 
will show you the plan.  This is nearly always instant... it'll give 
you a clue as to if your setting changes did anything.
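For example, with the view mentioned above (the first form only prints the
plan and returns almost instantly; the second actually runs the query):

    explain select count(*) from bigslowview;
    explain analyze select count(*) from bigslowview;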

You may need to end up breaking some parts of this up into subqueries.  
I've had to do this before.  I had one query that just ran too dang 
slow as a join so I modified it into a subquery type deal.  Worked 
great.  However since you are selecting ALL rows I doubt that will help 
much.

Another option may be to use materialized views.  Not sure how 
"dynamic" your data model is. It could help.

--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/
---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [PERFORM] Why is this system swapping?

2005-04-27 Thread Jeff
On Apr 27, 2005, at 1:48 PM, Anjan Dave wrote:
As you can see the system starts utilizing swap at some point, with so 
many processes. Some time ago we had decided to keep the connections 
from the pool open for longer
You've shown the system has used swap but not that it is swapping.  
Having swap in use is fine - there is likely plenty of code and whatnot 
that is not being used so it dumped it out to swap. However if you are 
actively moving data to/from swap that is bad. Very bad. Especially on 
linux.

To tell if you are swapping you need to watch the output of say, vmstat 
1 and look at the si and so columns.
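Something like:

    vmstat 1

and watch the "si" (swap-in) and "so" (swap-out) columns - sustained
non-zero values there mean the box really is moving pages to and from swap,
not just holding stale pages in it.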

Linux is very swap happy and likes to swap things for fun and profit.
--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/
---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [PERFORM] Why is this system swapping?

2005-04-27 Thread Jeff
On Apr 27, 2005, at 2:29 PM, Greg Stark wrote:
I would seriously look at tuning those connection pools down. A lot. If your
server processes are sitting idle over half the time I would at least cut it
by a factor of 2.

Are you (Anjan) using real or fake connection pooling - ie pgpool 
versus php's persistent connections?  I'd strongly recommend looking 
at pgpool.  It does connection pooling correctly (a set of X connections 
shared among the entire box rather than 1 per web server).
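As a very rough sketch of the idea (parameter names here follow pgpool's
sample pgpool.conf and may differ between versions - check yours before
relying on them):

    num_init_children = 32      # total pooled connections for the whole box
    max_pool = 1                # backend connections cached per pool process
    backend_host_name = 'dbhost'
    backend_port = 5432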

--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/
---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [PERFORM] Why is this system swapping?

2005-04-28 Thread Jeff
On Apr 27, 2005, at 7:46 PM, Greg Stark wrote:
In fact I think it's generally superior to having a layer like pgpool 
having
to hand off all your database communication. Having to do an extra 
context
switch to handle every database communication is crazy.

I suppose this depends on how many machines / how much traffic you have.
In one setup I run here I get away with 32 * 4 db connections instead 
of 500 * 4. Pretty simple to see the savings on the db machine. (Yes, 
it is a "bad design" as you said where static & dynamic content are 
served from the same box. However it also saves money since I don't 
need machines sitting around serving up pixel.gif vs 
myBigApplication.cgi)

--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/
---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faq


[PERFORM] Xeon twice the performance of opteron

2011-03-17 Thread Jeff

hey folks,

Running into some odd performance issues between a few of our db  
boxes.  While trying to speed up a query I ran it on another box and  
it was twice as fast.  The plans are identical and various portions of  
the query run in the same amount of time - it all boils down to most  
of the time being spent in a join filter.  The plan is as good as it  
is going to get but the thing that is concerning me, which hopefully  
some folks here may have some insight on, is the very large difference  
in runtime.


three boxes:
	A: Intel(R) Xeon(R) CPU E5345 @ 2.33GHz (runs the query fastest), 4MB cache
	B: Quad-Core AMD Opteron(tm) Processor 2352 (2.1GHz) (main production box, currently middle speed), 512k cache
	C: Quad-Core AMD Opteron(tm) Processor 2378 (2.4GHz) (slowest), 512k cache

A & B are running PG 8.4.2 (yes, I know it desperately needs to be  
upgraded). C was also on 8.4.2 and since it was not in production I  
upgraded it to 8.4.7 and got the same performance as 8.4.2.  The dataset  
on A & B is the same; C is mostly the same, but is missing a couple  
weeks of data (but since this query runs over 3 years of data, that is  
negligible - plus C runs the slowest!)


All three running FC10 with kernel Linux db06  
2.6.27.19-170.2.35.fc10.x86_64 #1 SMP Mon Feb 23 13:00:23 EST 2009  
x86_64 x86_64 x86_64 GNU/Linux


Load is very low on each box. The query is running from shared_buffers  
- no real IO is occurring.


The average timing for the query in question is 90ms on A,  180ms on B  
and 190ms on C.


Now here's where some odd stuff starts piling up: explain analyze  
overhead on said queries:

20ms on A, 50ms on B and 85ms on C(!!)

We had one thought about potential NUMA issues, but doing a series  
(100) of connect, query, disconnect and looking at the timings reveals  
them all to be solid... but even still we wouldn't expect it to be  
that awful.  The smaller cache of the opterons is also a valid argument.


I know we're running an old kernel, I'm tempted to upgrade to see what  
will happen, but at the same time I'm afraid it'll upgrade to a kernel  
with a broken [insert major subsystem here] which has happened before.


Anybody have some insight into this or run into this before?

btw, little more background on the query:

   ->  Nested Loop  (cost=5.87..2763.69 rows=9943 width=0) (actual time=0.571..274.750 rows=766 loops=1)
         Join Filter: (ce.eventdate >= (md.date - '6 days'::interval))
         ->  Nested Loop  (cost=5.87..1717.98 rows=27 width=8) (actual time=0.533..8.301 rows=159 loops=1)
               [stuff removed here]
         ->  Index Scan using xxx_date_idx on xx md  (cost=0.00..19.50 rows=1099 width=8) (actual time=0.023..0.729 rows=951 loops=159)
               Index Cond: (ce.eventdate <= md.date)


On all three boxes that inner nestloop completes in about the same  
amount of time - it is that join filter that is causing the pain and  
agony. (If you are noticing the timing differences, that is because  
the numbers above are the actual numbers, not explain analyze.)  The  
query is pulling up a rolling window of events that occurred on a  
specific date; it pulls up all the data for a period of time.
ce.eventdate is indexed, and is used in the outer nestloop.  Thinking  
more about what is going on, cache thrashing is certainly a possibility.


the amazing explain analyze overhead is also very curious - we all  
know it adds overhead, but 85ms?  Yow.


--
Jeff Trout 
http://www.stuarthamm.net/
http://www.dellsmartexitin.com/




--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Xeon twice the performance of opteron

2011-03-18 Thread Jeff


On Mar 17, 2011, at 9:39 PM, Scott Marlowe wrote:



My experience puts the 23xx series opterons in a same general
neighborhood as the E5300 and a little behind the E5400 series Xeons.
OTOH, the newer Magny Cours Opterons stomp both of those into the
ground.

Do any of those machines have zone.reclaim.mode = 1 ???

i.e.:

sysctl -a|grep zone.reclaim
vm.zone_reclaim_mode = 0

I had a machine that had just high enough interzone communications
cost to get it turned on by default and it slowed it right to a crawl
under pgsql.



It is set to zero on this machine.

I've tried PG compiled on the box itself, same result.

As for power savings, according to cpuinfo all the cores are running  
at 2.1ghz


We had another machine, which typically runs as a web server, running on  
an AMD Opteron(tm) Processor 6128; after switching the speed governor to  
performance (thus bumping cpu speed to 2GHz from 800MHz) query speed  
increased to 100ms - still not as fast as the xeon, but close enough.


I think I'm just hitting some wall of the architecture. I tried  
getting some oprofile love from it but oprofile seems to not work on  
that box. However, it worked on the xeon box:

33995 9.6859  postgres j2date
21925 6.2469  postgres ExecMakeFunctionResultNoSets
20500 5.8409  postgres slot_deform_tuple
17623 5.0212  postgres BitmapHeapNext
13059 3.7208  postgres dt2time
12271 3.4963  postgres slot_getattr
11509

aside from j2date (probably coming up due to that Join filter I'd  
wager) nothing unexpected.



--
Jeff Trout 
http://www.stuarthamm.net/
http://www.dellsmartexitin.com/




--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Intel SSDs that may not suck

2011-03-29 Thread Jeff


On Mar 29, 2011, at 12:13 AM, Merlin Moncure wrote:



My own experience with MLC drives is that write cycle expectations are
more or less as advertised. They do go down (hard), and have to be
monitored. If you are writing a lot of data this can get pretty
expensive although the cost dynamics are getting better and better for
flash. I have no idea what would be precisely prudent, but maybe some
good monitoring tools and phased obsolescence at around 80% duty cycle
might not be a bad starting point.  With hard drives, you can kinda
wait for em to pop and swap em in -- this is NOT a good idea for flash
raid volumes.




we've been running some of our DB's on SSD's (x25m's, we also have a  
pair of x25e's in another box we use for some super hot tables).  They  
have been in production for well over a year (in some cases, nearly a  
couple years) under heavy load.


We're currently being bit in the ass by performance degradation and  
we're working out plans to remedy the situation.  One box has 8 x25m's  
in a R10 behind a P400 controller.  First, the p400 is not that  
powerful and we've run experiments with newer (p812) controllers that  
have been generally positive.   The main symptom we've been seeing is  
write stalls.  Writing will go, then come to a complete halt for 0.5-2  
seconds, then resume.   The fix we're going to do is replace each  
drive in order with the rebuild occurring between each.  Then we do a  
security erase to reset the drive back to completely empty (including  
the "spare" blocks kept around for writes).


Now that all sounds awful and horrible until you get to overall  
performance, especially with reads - you are looking at 20k random  
reads per second with a few disks.  Adding in writes does kick it down  
a notch, but you're still looking at 10k+ iops. That is the current  
trade off.


In general, I wouldn't recommend the cciss stuff with SSD's at this  
time because it makes some things such as security erase, smart and  
other things near impossible. (performance seems ok though) We've got  
some tests planned seeing what we can do with an Areca controller and  
some ssds to see how it goes.


Also note that there is a funky interaction between an MSA70 and SSDs:  
they do not work together. (I'm not sure if HP's official branded  
ssd's have the same issue.)


The write degradation could probably be monitored by looking at svctime  
from sar. We may be implementing that in the near future to detect  
when this creeps up again.



--
Jeff Trout 
http://www.stuarthamm.net/
http://www.dellsmartexitin.com/




--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Intel SSDs that may not suck

2011-03-29 Thread Jeff


On Mar 29, 2011, at 10:16 AM, Jeff wrote:

Now that all sounds awful and horrible until you get to overall  
performance, especially with reads - you are looking at 20k random  
reads per second with a few disks.  Adding in writes does kick it  
down a notch, but you're still looking at 10k+ iops. That is the  
current trade off.




We've been doing a burn in for about 4 days now on an array of 8  
x25m's behind a p812 controller: here's a sample of what it is  
currently doing (I have 10 threads randomly seeking, reading, and 10%  
of the time writing (then fsync'ing) out, using my pgiosim tool which  
I need to update on pgfoundry)


10:25:24 AM  dev104-2   7652.21  109734.51  12375.22  15.96   8.22  1.07  0.12   88.32
10:25:25 AM  dev104-2   7318.52  104948.15  11696.30  15.94   8.62  1.17  0.13   92.50
10:25:26 AM  dev104-2   7871.56  112572.48  13034.86  15.96   8.60  1.09  0.12   91.38
10:25:27 AM  dev104-2   7869.72  111955.96  13592.66  15.95   8.65  1.10  0.12   91.65
10:25:28 AM  dev104-2   7859.41  111920.79  13560.40  15.97   9.32  1.19  0.13   98.91
10:25:29 AM  dev104-2   7285.19  104133.33  12000.00  15.94   8.08  1.11  0.13   92.59
10:25:30 AM  dev104-2   8017.27  114581.82  13250.91  15.94   8.48  1.06  0.11   90.36
10:25:31 AM  dev104-2   8392.45  120030.19  13924.53  15.96   8.90  1.06  0.11   94.34
10:25:32 AM  dev104-2  10173.86  145836.36  16409.09  15.95  10.72  1.05  0.11  113.52
10:25:33 AM  dev104-2   7007.14  100107.94  11688.89  15.95   7.39  1.06  0.11   79.29
10:25:34 AM  dev104-2   8043.27  115076.92  13192.31  15.95   9.09  1.13  0.12   96.15
10:25:35 AM  dev104-2   7409.09  104290.91  13774.55  15.94   8.62  1.16  0.12   90.55


The 2nd to last column is svctime; the first column after dev104-2 is  
TPS.  If I kill the writes off, tps rises quite a bit:
10:26:34 AM  dev104-2  22659.41  361528.71    0.00  15.95  10.57  0.42  0.04  99.01
10:26:35 AM  dev104-2  22479.41  359184.31    7.84  15.98   9.61  0.52  0.04  98.04
10:26:36 AM  dev104-2  21734.29  347230.48    0.00  15.98   9.30  0.43  0.04  95.33
10:26:37 AM  dev104-2  21551.46  344023.30  116.50  15.97   9.56  0.44  0.05  97.09
10:26:38 AM  dev104-2  21964.42  350592.31    0.00  15.96  10.25  0.42  0.04  96.15
10:26:39 AM  dev104-2  22512.75  359294.12    7.84  15.96  10.23  0.50  0.04  98.04
10:26:40 AM  dev104-2  22373.53  357725.49    0.00  15.99   9.52  0.43  0.04  98.04
10:26:41 AM  dev104-2  21436.79  342596.23    0.00  15.98   9.17  0.43  0.04  94.34
10:26:42 AM  dev104-2  22525.49  359749.02   39.22  15.97  10.18  0.45  0.04  98.04



Now to demonstrate "write stalls" on the problematic box:
10:30:49 AM  dev104-3    0.00   0.00     0.00   0.00   0.38    0.00    0.00   35.85
10:30:50 AM  dev104-3    3.03   8.08   258.59  88.00   2.43  635.00  333.33  101.01
10:30:51 AM  dev104-3    4.00   0.00   128.00  32.00   0.67  391.75   92.75   37.10
10:30:52 AM  dev104-3   10.89   0.00    95.05   8.73   1.45  133.55   12.27   13.37
10:30:53 AM  dev104-3    0.00   0.00     0.00   0.00   0.00    0.00    0.00    0.00
10:30:54 AM  dev104-3  155.00   0.00  1488.00   9.60  10.88   70.23    2.92   45.20
10:30:55 AM  dev104-3   10.00   0.00   536.00  53.60   1.66  100.20   45.80   45.80
10:30:56 AM  dev104-3   46.53   0.00   411.88   8.85   3.01   78.51    4.30   20.00
10:30:57 AM  dev104-3   11.00   0.00    96.00   8.73   0.79   72.91   27.00   29.70
10:30:58 AM  dev104-3   12.00   0.00    96.00   8.00   0.79   65.42   11.17   13.40
10:30:59 AM  dev104-3    7.84   7.84    62.75   9.00   0.67   85.38   32.00   25.10
10:31:00 AM  dev104-3    8.00   0.00   224.00  28.00   0.82  102.00   47.12   37.70
10:31:01 AM  dev104-3   20.00   0.00   184.00   9.20   0.24   11.80    1.10    2.20
10:31:02 AM  dev104-3    4.95   0.00    39.60   8.00   0.23   46.00   13.00    6.44
10:31:03 AM  dev104-3    0.00   0.00     0.00   0.00   0.00    0.00    0.00    0.00


That was from a simple dd, not random writes. (Since it is in  
production, I can't really do the random write test as easily.)


Theoretically, a nice rotation of disks would remove that problem.  
Annoying, but it is the price you need to pay.


--
Jeff Trout 
http://www.stuarthamm.net/
http://www.dellsmartexitin.com/




--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)

Re: [PERFORM] Intel SSDs that may not suck

2011-03-29 Thread Jeff


On Mar 29, 2011, at 12:12 PM, Jesper Krogh wrote:



Are you replacing the drives with new ones, or just secure-erasing and  
putting them back in?
What kind of numbers are you drawing out of smartmontools in usage  
figures?
(Also seeing some write-stalls here, on 24 Raid50 volumes of x25m's, and  
have been planning to cycle drives for quite some time, without actually  
getting to it.)



We have some new drives that we are going to use initially, but  
eventually it'll be a secure-erase'd one we replace it with (which  
should perform identically to a new one).


What enclosure & controller are you using on the 24 disk beast?

--
Jeff Trout 
http://www.stuarthamm.net/
http://www.dellsmartexitin.com/




--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] is it possible to make this faster?

2006-05-25 Thread Jeff -
Also, are you sure your numbers are not coming out of the mysql query  
cache?


That might explain some of it - also with Tom seeing comparable  
numbers in his test.


--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/





---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


[PERFORM] SETOF performance

2004-04-05 Thread Jeff
I think it was on this list - someone posted a  message about SETOF 
being slower.  Tom replied saying it was because it needed to create an 
on-disk tuplestore.

I was just looking for some clarification - will a SETOF function 
always write the resulting tuples to disk (not buffering in, say, a 
sort_mem sized buffer)?

I think if that is the case I may need to go back and change some stuff 
around.
I have a procedure that I broke out a bit to make life easier.

Basically it goes

for v_row in
    select blah from function_that_gets_data_from_some_cache()
loop
    rowcount := rowcount + 1;
    return next v_row;
end loop;
if rowcount = 0 then
    [same thing, but we call some_function_that_creates_data_for_cache]
end if;
Doing it this way means I avoid having to deal with it in the client 
and I also avoid having a giant stored procedure. (I like short & sweet 
things)

What I've found for timings is this:

select * from function_that_gets_data_from_some_cache() runs around 1.8 ms
but select * from the_top_level_function() runs around 4.2 ms
(Yes, I know 4.2 ms is fast, but that is not the point).

could this overhead be related to the SETOF tuplestores?

Might it be better to use refcursor or something or bite the bullet and 
live with a giant procedure?

--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/
---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] [ SOLVED ] select count(*) very slow on an already

2004-04-17 Thread Jeff
On Apr 16, 2004, at 4:23 AM, Rajesh Kumar Mallah wrote:



 I am running an update on the same table

 update rfis set inquiry_status='APPROVED' where inquiry_status='a';

 Its running for past 20 mins. and top output is below.
 The PID which is executing the query above is 6712. Can anyone
 tell me why it is in an uninterruptable sleep and does it relate
 to the apparent poor performance? Is it problem with the disk
 hardware. I know at nite this query will run reasonably fast.
I've had this problem recently.  The problem is simply that the disk 
cannot keep up.  Most likely you don't see it at night because traffic 
is lower.  There are only 2 solutions: 1. get more disks 2. write to 
the db less

The machine I was running on had a single(!) disk.  It was a quad xeon 
so there was plenty of cpu.   I'd see 8-9 processes stuck in the "D" 
state.  Doing a simple ls -l somefile would take 10-15 seconds and of 
course, db performance was abysmal.

I had a lowly P2 with a few disks in it that was able to run circles 
around it for the simple fact the machine was not waiting for disk.  
Again, proof that disk is far more important than CPU in a db.

good luck.

--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/
---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


Re: [PERFORM] Wierd context-switching issue on Xeon

2004-04-20 Thread Jeff
On Apr 19, 2004, at 8:01 PM, Tom Lane wrote:
[test case]
Quad P3-700MHz, ServerWorks, pg 7.4.2:
    1 process: 10-30 cs / second
    2 processes: 100k cs / sec
    3 processes: 140k cs / sec
    8 processes: 115k cs / sec
Dual P2-450MHz, non-ServerWorks (PIIX):
    1 process: 15-20 cs / sec
    2 processes: 30k cs / sec
    3 (up to 7) processes: 15k cs / sec
(Yes, I verified that with more processes the cs's drop.)

And finally,

6 cpu sun e4500, solaris 2.6, pg 7.4.2: 1 - 10 processes: hovered 
between 2-3k cs/second (there was other stuff running on the machine as 
well)

Verrry interesting.
I've got a dual G4 at home, but for convenience Apple doesn't ship a 
vmstat that tells context switches

--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/
---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] planner/optimizer question

2004-04-30 Thread Jeff
On Apr 30, 2004, at 3:01 AM, Gary Doades wrote:
[ pg query plan, etc ]
I wonder if other parts of the plan are affecting the speed.
I've recently run into a case where a merge join plan was chosen for 
this query, which took 11 seconds to execute.  Forcing it to pick a 
nested loop join dropped it to 3.  (Updating my 
default_statistics_target to 500 caused the planner to choose nested 
loop join)

So, is the plan really the same?
 A better comparison query may be a simple "select a from mytable 
where a between foo and bar" to get an index scan.  In that case it's a 
straight up, vanilla index scan.  Nothing else getting in the way.

--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/
---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


[PERFORM] Visual Explain

2004-06-16 Thread Jeff
I've known about this tool for a while, but it seems many people do not 
know of its existence and I think it would be useful to a lot of people 
who have a hard time reading explain analyze output. (And even those 
who can read them without blinking.. when you get deep in join hell it 
gets tricky!)

Red Hat Visual Explain - part of Red Hat Database.
It is what the name implies - a graphical (java) program to draw a 
picture of your query plan (along with all the juicy information 
explain analyze provides).   I just tried it out today and after 
upgrading my JDBC to 7.4 it worked fine (If you get a message about SET 
AUTOCOMMIT then you need to upgrade your jdbc jar)

Quite handy for getting a grasp on stupidly large query plans.
http://sources.redhat.com/rhdb/visualexplain.html
I used the CVS version, I have no idea how well the "official" releases 
work.
Anyone else using it?

--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/
---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [PERFORM] Visual Explain

2004-06-17 Thread Jeff
On Jun 17, 2004, at 7:10 AM, Adam Witney wrote:
Will this run on other platforms? OSX maybe?
I've run it on both linux (rh8) and osx (panther).
It's java, so it *should* run anywhere.
It isn't the fastest beast in the world though - it takes a bit of time to 
render the plan.

--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/
---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [PERFORM] Visual Explain

2004-06-17 Thread Jeff
On Jun 17, 2004, at 12:54 PM, Vitaly Belman wrote:
Is it possible to download the Visual Explain only (link)? I only see
that you can download the whole ISO (which I hardly need).
you'll need to snag it out of cvs:
http://sources.redhat.com/rhdb/cvs.html
You can checkout just visual explain so you won't need to grab 
everything.

--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/
---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [PERFORM] Major differences between oracle and postgres performance - what can I do ?

2004-06-18 Thread Jeff
On Jun 18, 2004, at 7:31 AM, Gary Cowell wrote:
The explain output on postgres shows the same
execution with a scan on vers and a sort but the query
time is 78.6 seconds.
Does it run just as slow if you run it again?
It could be a case of the caches being empty
Oracle but I think I've configured comparable
buffering and sort area sizes, certainly there isn't
much physical IO going on in either case.
Configuring PG like Oracle isn't the best thing in the world.  The 
general PG philosophy is to let the OS do all the caching & buffering  
- this is reversed in the Oracle world.  In 7.4 the rule of thumb is no 
more than 10k shared_buffers.. beyond that the overhead of maintaining 
it becomes excessive.  (This isn't really the case in 7.5)

Curiously, what are your sort_mem and shared_buffers settings?
--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/
---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] SQL stupid query plan... terrible performance !

2004-06-27 Thread Jeff
On Jun 27, 2004, at 8:37 PM, Jim wrote:
Hi,
I have one performance issue... and realy have no idea what's going 
on...
When I set enable_seqscan to 0, query2 runs the same way...

upload  =>  60667 entities
uploadfield => 506316 entities
Have you vacuum analyze'd recently?
--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/
---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [PERFORM] Mysterious performance of query because of plsql function in where condition

2004-07-02 Thread Jeff
On Jul 2, 2004, at 3:48 AM, Peter Alberer wrote:
Postgres seems to execute the function "submission_status" for every row
of the submissions table (~1500 rows). The query therefore takes quite a
lot of time, although in fact no row is returned from the assignments table
when the condition package_id=949589 is used.

Well, you need to think of it this way - PG has no idea what the 
function does, so it treats it as a "black box" - thus it has to run it 
for each row to see what it evaluates to - especially since it is in a 
where clause.

If you really want a function there you can use a SQL function instead 
of plpgsql - PG is smart enough to push that function up into your 
query and let the optimizer look at the whole thing.

You can also take a look at the various flags you can use while 
creating functions, such as immutable, strict, etc. - they can help.

--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/
---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [PERFORM] Performance over a LAN

2004-07-23 Thread Jeff
On Jul 23, 2004, at 3:57 AM, William Carney wrote:
I tested the LAN connection by transferring around some large (150 
MByte)
files, and got consistent transfer rates of about 10 MBytes/second in 
both
directions without any problems, which is what I would expect. Netstat 
says
It would be interesting to run something like ntop that can show you 
current network usage... unless you are doing a large COPY the PG 
protocol has a lot of back and forth messages...

--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/
---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] my boss want to migrate to ORACLE

2004-07-30 Thread Jeff
On Jul 28, 2004, at 1:08 PM, Stephane Tessier wrote:
we have a BIG problem of performance, it's slow
Can you isolate which part is slow? (log_min_duration is useful for 
finding your slow running queries)

we use postgres 7.3 for a php security application with approximately 4 
million insertions per day and 4 million deletes and updates
That is pretty heavy write volume.   Are these updates done in batches 
or "now and then"?  If they are done in batches you could speed them up 
by wrapping them inside a transaction.

#shared_buffers = 256   # min max_connections*2 or 16, 8KB each
#shared_buffers = 196000    # min max_connections*2 or 16, 
8KB each
shared_buffers = 128000 # min max_connections*2 or 16, 8KB each

Too much. Generally anything over 10k will stop benefiting you.
#wal_buffers = 8    # min 4, typically 8KB each
Might want to bump this up
#checkpoint_segments = 3    # in logfile segments, min 1, 16MB each
Given your write volume, increase this a bit... oh, 20 or 30 of them 
will help a lot.
But it will use 16*30MB of disk space.
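Pulling the suggestions above together, a rough 7.3-era starting point might
look something like this (illustrative values only - test against your own
load):

    shared_buffers = 10000         # ~80MB; going much higher rarely helps on 7.3
    wal_buffers = 32
    checkpoint_segments = 30       # 16MB each on disk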

Oracle is *NOT* a silver bullet.
It will not instantly make your problems go away.
I'm working on a project porting some things to Oracle and as a test I 
also ported it to Postgres.  And you know what? Postgres is running 
about 30% faster than Oracle.  The Oracle lovers here are not too happy 
with that one :)  Just so you know..

--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/
---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [PERFORM] The black art of postgresql.conf tweaking

2004-08-04 Thread Jeff
On Aug 4, 2004, at 8:45 AM, Paul Serby wrote:
Apache on the Web server can take up to 300 connections and PHP is 
using  pg_pconnect

Postgres is set with the following.
max_connections = 300
shared_buffers = 38400
sort_mem = 12000
But Apache is still maxing out the non-super user connection limit.
Did you restart PG after making that change?
(you need to restart, reload won't change max_connections)
Also, your sort_mem is likely too high (that is the amount of memory 
that can be used PER SORT) and you should back down on shared_buffers. 
(General consensus is don't go over 10k shared buffers.)

Another thing you may want to try is using pgpool and regular 
pg_connect - this way you only have a pool of say, 32 connections to 
the DB that are shared among all apache instances.  This gets rid of 
the need to have hundreds of idle postgres'  sitting around.  
Connecting to pgpool is very fast. We use it in production here and it 
works wonderfully.  And it is 100% transparent to your application.

--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/
---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] Performance Bottleneck

2004-08-08 Thread Jeff
On Aug 8, 2004, at 1:29 AM, Martin Foster wrote:
I am currently making use of Apache::DBI which overrides the 
DBI::disconnect call and keeps a pool of active connections for use 
when need be.   Since it offloads the pooling to the webserver, it 
seems more advantageous than pgpool, which while being able to run on an 
external system is not adding another layer of complexity.

Apache::DBI is not the same sort of a pool as pgpool.  DB connections 
are not shared among all your apache children (A common misconception). 
So if you have 300 apache kids you can have 300 db connections.  
With pgpool connections are  shared among all of them so even though 
you have 300 kids you only have say 32 db connections.

Anyone had any experience with both Apache::DBI and pgpool?   For my 
needs they seem to do essentially the same thing, simply that one is 
invisible to the code while the other requires adding the complexity 
of a proxy.

Both are invisible to the app.  (With pgpool it thinks it is connecting 
to a regular old PG server)

And I've been running pgpool in production for months.  It just sits 
there.  Doesn't take much to set it up or configure it.  Works like a 
champ

--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/
---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] high load caused by I/O - a hint

2004-08-18 Thread Jeff
On Aug 18, 2004, at 4:18 AM, eleven wrote:
Hello,
This is not strictly PostgreSQL performance hint, but may be
helpful to someone with problems like mine.
As I earlier posted, I was experiencing very high load average
on one of my Linux database servers (IBM eServer 345, SCSI disks on 
LSI Logic controller) caused by I/O bottleneck.

We have some 335's (I think they are 335s) and until April or so there 
was a bug in the Fusion MPT driver that would cause it to revert to 
async narrow mode if hardware RAID was enabled on it.  (Performance was 
horrible - NFS on a 100meg network was 10x faster than local disk!) And 
on the upside, when I originally researched the problem they hadn't 
found the bug yet, so there were no others around having issues like 
mine, and trying to figure it out was quite difficult.

I may see if using that acpi=ht makes any difference as well.
--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/
---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [PERFORM] What is the best way to do attribute/values?

2004-08-25 Thread Jeff
On Aug 25, 2004, at 4:22 AM, Mark Kirkwood wrote:
select
pav1.person_id
from
person_attributes_vertical pav1
where
   (pav1.attribute_id = 1
and pav1.value_id in (2,3))
or (pav1.attribute_id = 2
and pav1.value_id in (2,3))
You know..
It may help if you toss in a group by
ie
select pav1.person_id, count(*) from person_attributes_vertical pav1
where (pav1.attribute_id = 1 and pav1.value_id in (2,3)) or ( ... ) or 
(...)
group by pav1.person_id
order by count(*) desc

that should give you the person_id's that matched the most 
criteria
I've used similar things before now that I've thought about it.

If you want an exact match you could put
"having count(*) = $myNumAttributes" in there too.. By definition an 
exact match would match that definition..
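Putting that together with the query above, an exact-match version might look
like this (2 = the number of criteria in this example, and it assumes at most
one row per person per attribute):

    select pav1.person_id
      from person_attributes_vertical pav1
     where (pav1.attribute_id = 1 and pav1.value_id in (2,3))
        or (pav1.attribute_id = 2 and pav1.value_id in (2,3))
     group by pav1.person_id
    having count(*) = 2;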

it has an added side effect of producing "closest matches" when an 
exact match cannot be found... granted you may not want that for a 
dating site : )

"You asked for a blond female, blue eyes.. but I couldn't find any... 
but I *DID* find a brown haired male with brown eyes! Is that good 
enough?"

--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/
---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Caching of Queries

2004-09-27 Thread Jeff
[ discussion of server side result caching ]
and let's not forget PG's major fork it will throw into things: MVCC.
The results of query A may hold true for txn 1, but not txn 2, and so on.
That would have to be taken into account as well and would greatly 
complicate things.

It is always possible to do a "poor man"'s query cache with triggers.. 
which would just leave you with basically a materialized view.
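A minimal sketch of that trigger-maintained approach, with made-up table and
column names just to show the shape of it:

    create table order_totals_mv (customer_id int primary key, total numeric);

    create function refresh_order_totals() returns trigger as '
    begin
        -- recompute the summary row for the affected customer
        delete from order_totals_mv where customer_id = new.customer_id;
        insert into order_totals_mv
            select customer_id, sum(amount)
              from orders
             where customer_id = new.customer_id
             group by customer_id;
        return new;
    end;
    ' language plpgsql;

    create trigger orders_mv_refresh
        after insert or update on orders
        for each row execute procedure refresh_order_totals();

(Deletes, and updates that move a row to a different customer_id, would need
their own handling - this is just the shape of the idea.)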

--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/
---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] Need advice on postgresql.conf settings

2004-11-10 Thread Jeff
On Nov 9, 2004, at 2:01 PM, Shane | SkinnyCorp wrote:
Thanks in advance for anything you can do to help.
The real issue is this, we have THE SAME queries taking anywhere from 
.001 - 90.0 seconds... the server is using 98% of the available RAM at 
all times (because of the persistant connections via php), and I don't 
know what to do.  Every time I change a
I'd recommend strongly ditching the use of pconnect and use pgpool + 
regular connect. It is a terrific combination that provides pool 
connections like how you'd think they should work (a pool of N 
connections to PG shared by Y processes instead of a 1:1 mapping).

curiously, have you noticed any pattern to the slowdown?
It could be induced by a checkpoint or vacuum.
Are you swapping at all?
Are your PHP scripts leaking at all, etc.?
Your load average is high, how does your CPU idle look (if load is 
high, and the cpus are pretty idle that is an indicator of being IO 
bound).

good luck.
--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/
---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] How to speed-up inserts with jdbc

2004-11-10 Thread Jeff
On Nov 10, 2004, at 8:51 AM, Michael Kleiser wrote:
It is running in 10 threads. Each thread makes 100 inserts:
For the 1000 inserts (10 threads x 100 inserts)
we need 8 seconds.
That's 125 inserts / second.
How could we make it faster ?
Batch the inserts up into a transaction.
So you'd have
BEGIN
insert
insert
insert
...
COMMIT
Your numbers will suddenly sky rocket.
--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/
---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [PERFORM] query plan question

2004-11-17 Thread Jeff
On Nov 17, 2004, at 7:32 AM, David Parker wrote:
Oh, I didn't realize that analyze gave that much more info. I've got a
lot to learn about this tuning stuff ;-)
I've attached the output. I see from the new output where the slow 
query
is taking its time (the nested loop at line 10), but I still have no
idea why this plan is getting chosen

Looks like your stats are incorrect on the sparc.
Did you forget to run vacuum analyze on it?
Also, do both db's have the same data loaded?
There are some very different numbers in terms of actual rows floating 
around there...

--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/
---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] scaling beyond 4 processors

2004-12-06 Thread Jeff
On Dec 6, 2004, at 5:18 PM, [EMAIL PROTECTED] wrote:
Hello everyone!
Since our current Postgres server, a quad Xeon system, finally can't 
keep up with our
load anymore we're ready to take the next step.

I'm assuming you've already done as much query tweaking as possible.
and are you sure you are CPU bound and not IO bound?
(Symptoms of IO bound are low cpu usage, high load average, poor 
performance. Many processes in "D" state)

So the question is: Has anyone experiences with running Postgres on 
systems with
more than 4 processors in a production environment? Which systems and
Have you also considered a replicated approach?
--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/
---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [PERFORM] Some Performance Advice Needed

2004-12-23 Thread Jeff
On Dec 23, 2004, at 9:27 AM, Alex wrote:

Running hdparm reported
A) 920mb/s   (SCSI 10k)
B) 270mb/s   (SCSI 10k)
C) 1750mb/s  (IDE  7.2k)

IDE disks lie about write completion (This can be disabled on some 
drives) whereas SCSI drives wait for the data to actually be written 
before they report success.  It is quite
easy to corrupt a PG (Or most any db really) on an IDE drive.  Check 
the archives for more info.
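On Linux the drive write cache can usually be toggled with hdparm, e.g.
(device name is just an example):

    hdparm -W0 /dev/hda    # disable write caching
    hdparm -W1 /dev/hda    # re-enable it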

--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/
---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Some Performance Advice Needed

2004-12-27 Thread Jeff
On Dec 23, 2004, at 4:27 PM, Joshua D. Drake wrote:
IDE disks lie about write completion (This can be disabled on some 
drives) whereas SCSI drives wait for the data to actually be written 
before they report success.  It is quite
easy to corrupt a PG (Or most any db really) on an IDE drive.  Check 
the archives for more info.
Do we have any real info on this? Specifically which drives? Is SATA 
the same way? What about SATA-II?
I am not saying it isn't true (I know it is) but this is a blanket 
statement that may or may not be
true with newer tech.
Scott Marlowe did some tests a while ago on it.  They are likely in the 
archives.
Maybe we can get him to pipe up :)

--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/
---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] Swapping on Solaris

2005-01-19 Thread Jeff
On Jan 19, 2005, at 10:42 AM, Alan Stange wrote:
Kevin Schroeder wrote:
I take that back.  There actually is some paging going on.  I ran sar 
-g 5 10 and when a request was made (totally about 10 DB queries) my 
pgout/s jumped to 5.8 and my ppgout/s jumped to 121.8.  pgfree/s also 
jumped to 121.80.
I'm fairly sure that the pi and po numbers include file IO in Solaris, 
because of the unified VM and file systems.
Curiously, what are your shared_buffers and sort_mem set to?
Perhaps they are too high?
--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/
---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


Re: [PERFORM] Swapping on Solaris

2005-01-19 Thread Jeff
On Jan 19, 2005, at 10:40 AM, Kevin Schroeder wrote:
I may be asking the question the wrong way, but when I start up 
PostgreSQL swap is what gets used the most of.  I've got 1282MB free 
RAM right now and and 515MB swap in use.  Granted, swap file usage 
probably wouldn't be zero, but I would guess that it should be a lot 
lower so something must be keeping PostgreSQL from using the free RAM 
that my system is reporting.  For example, one of my postgres 
processes is 201M in size but only 72M is resident in RAM.  That extra 
130M is available in RAM, according to top, but postgres isn't using 
it.
Can you please give us your exact shared_buffer and sort_mem settings?
This will help greatly.  As a general thing, we say don't use more than 
10k shared bufs unless you have done testing and enjoy a benefit. 
Managing all those buffers isn't free.

I'm also not sure how Solaris reports shared memory usage for apps... a 
lot of that could be shared mem.

Can you watch say, vmstat 1 for a minute or two while PG is running and 
see if you're actually swapping?

--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/
---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [PERFORM] PostgreSQL clustering VS MySQL clustering

2005-01-20 Thread Jeff
On Jan 20, 2005, at 9:36 AM, Hervé Piedvache wrote:
Sorry but I don't agree with this ... Slony is a replication solution 
... I
don't need replication ... what will I do when my database will grow 
up to 50
Gb ... I'll need more than 50 Gb of RAM on each server ???
Slony doesn't use much ram. The mysql clustering product, ndb I believe 
it is called, requires all data fit in RAM. (At least, it used to).  
What you'll need is disk space.

As for a cluster I think you are thinking of multi-master replication.
You should look into what others have said about trying to partition 
data among several boxes and then join the results together.

Or you could fork over  hundreds of thousands of dollars for Oracle's 
RAC.

--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/
---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] Benchmark

2005-02-10 Thread Jeff
On Feb 10, 2005, at 12:49 AM, Jaime Casanova wrote:
Hi guys,
i'm planning to do a comparison between some DBMSs
and postgresql (informix, oracle, m$ sql server,
firebird and even mysql); i'm coordinating with people
in the irc spanish postgresql channel.
2) point me to a good benchmark test or script that
can be used?
The TPC tests are fairly widely accepted.  The thing with a benchmark 
is they are unlikely to simulate your real traffic.  But it is always 
fun to look at numbers

3) any comments?
If you plan on making your results public be very careful with the 
license agreements on the other db's.  I know Oracle forbids the 
release of benchmark numbers without their approval.

--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/
---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faq


Re: [PERFORM] Benchmark

2005-02-11 Thread Jeff
On Feb 11, 2005, at 2:04 AM, Mitch Pirtle wrote:
I did do the research, but couldn't find one instance where someone
was actually taken to task over it. So far it appears to be bluster.
Horrifying to some, but still bluster.
They may not have done that yet, but they _COULD_.  And if they decide 
to, they have more money and power than you likely have and would drive 
you into financial ruin for the rest of your life (even if you are 
correct).   It is a big risk.  I think that clause is in there so MS, 
etc. can't say "Use FooSQL, it's 428% faster than that Oracle POS. Just 
look!"

After using oracle in the last few months..  I can see why they'd want 
to prevent those numbers.. Oracle really isn't that good.  I had been 
under the impression that it was holy smokes amazingly fast.  It just 
isn't.  At least, in my experience it isn't.  but that is another 
story.

--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/
---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


[PERFORM] Possible interesting extra information for explain analyze?

2005-02-25 Thread Jeff
Given some recent posts / irc issues with dead tuple bloat..
And given that a lot of these people have at least been smart enough to 
run explain analyze, would it be A. possible B. useful C. none of the above 
to have the various "scan" nodes of explain analyze also report how many 
invisible / dead tuples they had to disqualify (just to clarify, they 
matched the search criteria, but were invisible due to MVCC rules)?  
Something like:

 Seq Scan on boards  (cost=0.00..686.30 rows=25430 width=0) (actual 
time=8.866..5407.693 rows=18636 loops=1 invisiblerows=8934983098294)

This may help us to point out tuple bloat issues quicker... or it may 
give the developer enough of a clue to search around and find out he 
needs to vacuum... hmm.. but once we have an integrated autovacuum it 
will be a moot point.

Also another thing I started working on back in the day and hope to 
finish when I get time (that is a funny idea) is having explain analyze 
report when a step required the use of temp files.
--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] multi billion row tables: possible or insane?

2005-03-01 Thread Jeff
On Mar 1, 2005, at 4:34 AM, Ramon Bastiaans wrote:
What would be important issues when setting up a database this big, 
and is it at all doable? Or would it be a insane to think about 
storing up to 5-10 billion rows in a postgres database.

Buy a bunch of disks.
And then go out and buy more disks.
When you are done with that - go buy some more disks.
Then buy some ram.
Then buy more disks.
You want the fastest IO possible.
I'd also recommend the opteron route since you can also put heaping 
gobs of ram in there as well.

The database's performance is important. There would be no use in 
storing the data if a query will take ages. Query's should be quite 
fast if possible.

And make sure you tune your queries.
--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/
---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [PERFORM] 7 hrs for a pg_restore?

2008-02-19 Thread Jeff


On Feb 19, 2008, at 1:22 PM, Tom Lane wrote:



maintenance_work_mem, to be more specific.  If that's too small it will
definitely cripple restore speed.  I'm not sure fsync would make much
difference, but checkpoint_segments would.  See
http://www.postgresql.org/docs/8.3/static/populate.html#POPULATE-PG-DUMP




I wonder if it would be worthwhile if pg_restore could emit a warning  
if maint_work_mem is "low" (start flamewar on what "low" is).


And as an addition to that - allow a cmd line arg to have pg_restore  
bump it before doing its work?  On several occasions I was moving a  
largish table and the COPY part went plenty fast, but when it hit  
index creation it slowed down to a crawl due to low maint_work_mem..
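In the meantime one way to get that effect is to bump the setting just for
the restore's session via libpq's PGOPTIONS (the value and dump file name
here are only examples):

    PGOPTIONS="-c maintenance_work_mem=1GB" pg_restore -d mydb mydump.Fc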


--
Jeff Trout <[EMAIL PROTECTED]>
www.dellsmartexitin.com
www.stuarthamm.net






---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Performance increase with elevator=deadline

2008-04-11 Thread Jeff


On Apr 11, 2008, at 7:22 AM, Albe Laurenz wrote:
After some time of trial and error we found that changing the I/O scheduling
algorithm to "deadline" improved I/O performance by a factor of 4 (!) for
this specific load test.

I was inspired once again to look into this - as I'm recently hitting  
some glass ceilings with my machines.


I have a little app I wrote called pgiosim (it's on pgfoundry - 
http://pgfoundry.org/projects/pgiosim) that basically just opens some files  
and does random seeks and 8kB reads, much like what our beloved PG does.


Using 4 of these with a dataset of about 30GB across a few files  
(Machine has 8GB mem) I went from around 100 io/sec to 330 changing to  
noop.   Quite an improvement.  If you have a decent controller CFQ is  
not what you want.   I tried deadline as well and it was a touch  
slower.  The controller is a 3ware 9550sx with 4 disks in a raid10.


I'll be trying this out on the big array later today.  I found it  
surprising this info wasn't more widespread (the use of CFQ on a good  
controller).


it also seems changing elevators on the fly works fine (echo  
schedulername > /sys/block/.../queue/scheduler).  I admit I sat there  
flipping back and forth going "disk go fast.. disk go slow.. disk go  
fast... " :)
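For anyone wanting to try it, e.g. for /dev/sda (pick your own device; the
currently active scheduler is shown in brackets):

    cat /sys/block/sda/queue/scheduler
    echo noop > /sys/block/sda/queue/scheduler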


--
Jeff Trout <[EMAIL PROTECTED]>
http://www.stuarthamm.net/
http://www.dellsmartexitin.com/




--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] RAID 10 Benchmark with different I/O schedulers (was: Performance increase with elevator=deadline)

2008-05-06 Thread Jeff


On May 5, 2008, at 7:33 PM, Craig James wrote:

I had the opportunity to do more testing on another new server to  
see whether the kernel's I/O scheduling makes any difference.   
Conclusion: On a battery-backed RAID 10 system, the kernel's I/O  
scheduling algorithm has no effect.  This makes sense, since a  
battery-backed cache will supersede any I/O rescheduling that the  
kernel tries to do.




This goes against my real-world experience here.


pgbench -i -s 20 -U test
pgbench -c 10 -t 5 -v -U test



You should use a sample size of 2x ram to get a more realistic number,  
or try out my pgiosim tool on pgfoundry which "sort of" simulates an  
index scan.  I posted numbers from that a month or two ago here.
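For reference, a pgbench scale unit works out to roughly 15MB of data, so on
a box with, say, 16GB of RAM something like this gets past the 2x-RAM mark
(numbers are illustrative):

    pgbench -i -s 2000 -U test          # roughly 30GB of data
    pgbench -c 10 -t 10000 -v -U test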



--
Jeff Trout <[EMAIL PROTECTED]>
http://www.stuarthamm.net/
http://www.dellsmartexitin.com/




--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] syslog performance when logging big statements

2008-07-08 Thread Jeff


On Jul 8, 2008, at 8:24 AM, Achilleas Mantzios wrote:


File sizes of about 3M result in actual logging output of ~ 10Mb.
In this case, the INSERT *needs* 20 minutes to return. This is  
because the logging through syslog seems to severely slow the system.
If instead, i use stderr, even with logging_collector=on, the same  
statement needs 15 seconds to return.




In syslog.conf is the destination for PG marked with a "-"? (ie  
-/var/log/pg.log) which tells syslog to not sync after each line logged.   
That could explain a large chunk of the difference in time.
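e.g., assuming PG is logging to the default local0 facility, the syslog.conf
line would look something like:

    local0.*    -/var/log/pg.log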


--
Jeff Trout <[EMAIL PROTECTED]>
http://www.stuarthamm.net/
http://www.dellsmartexitin.com/




--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] 3ware vs Areca

2008-07-11 Thread Jeff
I've got a couple boxes with some 3ware 9550 controllers, and I'm  
less than pleased with performance on them.. Sequential access is  
nice, but start seeking around and you kick it in the gut.  (I've  
found posts on the internets about others having similar issues).  My  
last box with a 3ware I simply had it in jbod mode and used sw raid  
and it smoked the hw.


Anyway, anybody have experience in 3ware vs Areca - I've heard plenty  
of good anecdotal things that Areca is much better, just wondering if  
anybody here has firsthand experience.  It'll be plugged into about  
8 10k rpm sata disks.


thanks
--
Jeff Trout <[EMAIL PROTECTED]>
http://www.stuarthamm.net/
http://www.dellsmartexitin.com/





Re: [PERFORM] 3ware vs Areca

2008-07-11 Thread Jeff


On Jul 11, 2008, at 3:39 PM, Jeffrey Baker wrote:


From my experience, the Areca controllers are difficult to operate.
Their firmware is, frankly, garbage.  In more than one instance we
have had the card panic when a disk fails, which is obviously counter
to the entire purpose of a RAID.  We finally removed the Areca
controllers from our database server and replaced them with HP P800s.



My main db has a p600 plugged into an msa70 which works well - does  
the HP junk work in non-hp boxes?


--
Jeff Trout <[EMAIL PROTECTED]>
http://www.stuarthamm.net/
http://www.dellsmartexitin.com/





Re: [PERFORM] 3ware vs Areca

2008-07-11 Thread Jeff


On Jul 11, 2008, at 3:21 PM, Greg Smith wrote:
My last box with a 3ware I simply had it in jbod mode and used sw  
raid and it smoked the hw.


That is often the case no matter which hardware controller you've  
got, particularly in more complicated RAID setups.  You might want  
to consider that a larger lesson rather than just a single data point.




Yeah, it'd be fun to run more benchmarks, but the beefy box, for some  
reason, is a prod box busy 24/7.  No time to nuke it and fidget :)


Check out the pages starting at  
http://www.tomshardware.com/reviews/SERIAL-RAID-CONTROLLERS-AMCC,1738-12.html  
for example, where the newer Areca 1680ML card just gets crushed at all  
kinds of workloads by the AMCC 3ware 9690SA.  I think the 3ware 9600  
series cards have achieved or exceeded what Areca's 1200 series was  
capable of, while Areca's latest generation has slipped a bit from the  
previous one.



It does look like the 9600 series fixed a lot of the 9550 issues.

(and for the record, yes, either card I get will have a bbu. tis  
silly to get a controller without one)


--
Jeff Trout <[EMAIL PROTECTED]>
http://www.stuarthamm.net/
http://www.dellsmartexitin.com/





Re: [PERFORM] Filesystem benchmarking for pg 8.3.3 server

2008-08-11 Thread Jeff


On Aug 11, 2008, at 5:17 AM, Henrik wrote:

OK, changed the SAS RAID 10 to RAID 5 and now my random writes are  
handling 112 MB/sec. So it is almost twice as fast as the RAID 10  
with the same disks. Any ideas why?


Is the iozone tests faulty?



Does IOzone disable the OS caches?
If not, you need to use a size of 2x RAM for true results.

Regardless - the test only took 10 seconds of wall time, which isn't  
very long at all. You'd probably want to run it longer anyway.





iozone -e -i0 -i1 -i2 -i8 -t1 -s 1000m -r 8k -+u -F /database/iotest

Children see throughput for 1 random writers = 112074.58 KB/sec
Parent sees throughput for 1 random writers  = 111962.80 KB/sec
Min throughput per process                   = 112074.58 KB/sec
Max throughput per process                   = 112074.58 KB/sec
Avg throughput per process                   = 112074.58 KB/sec
Min xfer                                     = 1024000.00 KB
CPU utilization: Wall time 9.137, CPU time 0.510, CPU utilization 5.58 %





--
Jeff Trout <[EMAIL PROTECTED]>
http://www.stuarthamm.net/
http://www.dellsmartexitin.com/




--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] understanding postgres issues/bottlenecks

2009-01-13 Thread Jeff


On Jan 11, 2009, at 9:43 PM, M. Edward (Ed) Borasky wrote:


Luke Lonergan wrote:
Not to mention the #1 cause of server faults in my experience: OS  
kernel bug causes a crash.  Battery backup doesn't help you much  
there.




Not that long ago (a month or so) we ran into a problem where hpacucli  
(the utility for configuring/inspecting/etc. HP SmartArray controllers)  
would tickle the cciss driver in such a way that it would cause a  
kernel panic. KABLAMMO (no data loss! whee!).   The box had run for  
a long time without crashes, but it seems that when we added more  
disks and the array started building the new logical drive, some  
magical things happened.


Bugs happen.  The [bad word] of it is catching the culprit with its  
fingers in the cookie jar.


--
Jeff Trout 
http://www.stuarthamm.net/
http://www.dellsmartexitin.com/




--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] [PERFORMANCE] Buying hardware

2009-01-26 Thread Jeff


On Jan 26, 2009, at 2:42 PM, David Rees wrote:


Lots of people have databases much, much, bigger - I'd hate to imagine
have to restore from backup from one of those monsters.



If you use PITR + rsync you can create a binary snapshot of the db, so  
restore time is simply how long it takes to untar / whatever it into  
place.  Our backup script basically does:


archive backup directory
pg_start_backup
rsync
pg_stop_backup

voila. I have 2 full copies of the db.  You could even expand it a bit  
and after the rsync & friends have it fire up the instance and run  
pg_dump against it for a pg_restore compatible dump "just in case".
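
For clarity, the database-side half of that sequence as SQL (a minimal sketch - the label is arbitrary, and the rsync of $PGDATA happens between the two calls):

select pg_start_backup('nightly');
-- rsync the data directory (plus archived WAL) to the backup area here
select pg_stop_backup();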


It takes a long time to restore a 300GB db, even if you cheat and  
parallelify some of it. 8.4 may get a  pg_restore that can load in  
parallel - which will help somewhat.


--
Jeff Trout 
http://www.stuarthamm.net/
http://www.dellsmartexitin.com/




--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] [PERFORMANCE] Buying hardware

2009-01-26 Thread Jeff


On Jan 26, 2009, at 3:00 PM, Joshua D. Drake wrote:


On Mon, 2009-01-26 at 14:58 -0500, Jeff wrote:

voila. I have 2 full copies of the db.  You could even expand it a bit
and after the rsync & friends have it fire up the instance and run
pg_dump against it for a pg_restore compatible dump "just in case".

It takes a long time to restore a 300GB db, even if you cheat and
parallelify some of it. 8.4 may get a  pg_restore that can load in
parallel - which will help somewhat.


Somewhat? Just to be clear, if you have the hardware for it, parallel
restore can take a 500GB restore in 2.5 hours (versus 15). IMO, that is
a *little* more than somewhat. Maybe, a bit? ;)



I'd say that qualifies more towards just a "smidge" faster ;)

I'm quite excited about the feature.  I'm still on 8.2 mostly because  
of the downtime of the dump & restore.  I wrote up some plans a while  
back on doing the poor-mans parallel restore, but I haven't had the  
time to actually do it.


Theoretically, wouldn't the parallel pg_restore be able to run against  
an 8.3 instance with a dump from 8.2?   I don't see why it wouldn't be  
able to (unless it uses some handy dandy new 8.4-only catalog).  Maybe  
if I get time (HAHAHA) I'll test that out..


--
Jeff Trout 
http://www.stuarthamm.net/
http://www.dellsmartexitin.com/




--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] SSD performance

2009-02-03 Thread Jeff
I somehow managed to convince the powers that be to let me get a  
couple X25-E's.
I tossed them in my macpro (8 cores), fired up Ubuntu 8.10 and did  
some testing.


Raw numbers are very impressive. I was able to get 3700 random seek+reads  
a second. In an R1 config it stayed at 3700, but if I added  
another process it went up to 7000, and eventually settled into the  
4000s.  If I added in some random writing with fsyncs, it  
settled at 2200 (to be specific, I had 3 instances going - 2 read-only  
and 1 doing 20% writes - to get that).  These numbers were obtained  
running a slightly modified version of pgiosim (which is on  
pgfoundry) - it randomly seeks to a "block" in a file and reads 8kB  
of data, optionally writing the block back out.


Now, moving into reality I compiled 8.3.latest and gave it a whirl.   
Running against a software R1 of the 2 x25-e's  I got the following  
pgbench results:
(note config tweaks: work_mem=>4mb, shared_buffers=>1gb, should  
probably have tweaked checkpoint_segs, as it was emitting lots of  
notices about that, but I didn't).


(multiple runs, avg tps)

Scalefactor 50, 10 clients: 1700tps

At that point I realized write caching on the drives was ON, so I  
turned it off:


Scalefactor 50, 10 clients: 900tps

At scalefactor 50 the dataset fits well within memory, so I scaled it  
up.


Scalefactor 1500: 10 clients: 420tps


While some of us have arrays that can smash those numbers, that is  
crazy impressive for a plain old mirror pair.   I also did not do much  
tweaking of PG itself.


While I'm in the testing mood, are there some other tests folks would  
like me to try out?


--
Jeff Trout 
http://www.stuarthamm.net/
http://www.dellsmartexitin.com/




--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] SSD performance

2009-02-04 Thread Jeff


On Feb 3, 2009, at 1:43 PM, Scott Carey wrote:

I don’t think write caching on the disks is a risk to data integrity  
if you are configured correctly.
Furthermore, these drives don’t use the RAM for write cache, they  
only use a bit of SRAM on the controller chip for that (and respect  
fsync), so write caching should be fine.


Confirm that NCQ is on (a quick check in dmesg),  I have seen  
degraded performance when the wrong SATA driver is in use on some  
linux configs, but your results indicate its probably fine.




As it turns out, there's a bug/problem/something between the controller  
in the macpro and the ubuntu driver where the controller goes into  
"works, but not as super as it could" mode, so NCQ is effectively  
disabled; I haven't seen a workaround yet. Not sure if this problem  
exists on other distros (I used ubuntu because I just wanted to try a  
live CD).  I read some stuff from Intel on NCQ, and in a lot of cases  
it won't make that much difference because the drive can respond so  
fast.




How much RAM is in that machine?



8GB


Some suggested tests if you are looking for more things to try :D
-- What affect does the following tuning have:

Turn the I/O scheduler to ‘noop’  ( echo noop > /sys/block// 
queue/scheduler)  I’m assuming the current was cfq, deadline may  
also be interesting, anticipatory would have comically horrible  
results.


I only tested noop; if you think about it, it is the most logical one,  
as an SSD really does not need an elevator at all. There is no  
rotational latency or arm movement that the elevator was designed  
to cope with.


But, here are the results:
scale 50, 100 clients, 10x txns: 1600tps (a noticeable improvement!)
scale 1500, 100 clients, 10x txns: 434tps

I'm going to try to get some results for raptors. There was  
another post earlier today that got higher (but not ridiculously  
higher) tps, but it required 14 15k disks instead of 2.




Tune upward the readahead value ( blockdev --setra  /dev/ )  -- try 16384 (8MB)  This probably won't help that much  
for a pgbench tune, its more for large sequential scans in other  
workload types, and more important for rotating media.
Generally speaking with SSD’s, tuning the above values does less  
than with hard drives.




Yeah, I don't think RA will help pgbench, and for my workloads it is  
rather useless as they tend to be tons of random IO.


I've got some Raptors here too I'll post numbers wed or thu.

--
Jeff Trout 
http://www.stuarthamm.net/
http://www.dellsmartexitin.com/




--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] random_page_cost vs ssd?

2009-03-11 Thread Jeff
I've got a couple x25-e's in production now and they are working like  
a champ.  (In fact, I've got another box being built with all x25s in  
it. its going to smoke!)


Anyway, I was just reading another thread on here and that made me  
wonder about random_page_cost in the world of an ssd where a seek is  
basically free.  I haven't tested this yet (I can do that next week),  
but logically, in this scenario wouldn't lowering random_page_cost be  
ideal or would it not really matter in the grand scheme of things?
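
For illustration only (the number below is a hypothetical guess, not something I've tested): something much closer to seq_page_cost than the default 4.0 would tell the planner that index scans are cheap, which matches an SSD's near-free seeks.

set random_page_cost = 1.5;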


--
Jeff Trout 
http://www.stuarthamm.net/
http://www.dellsmartexitin.com/




--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] current transaction in productive database

2009-03-20 Thread Jeff


On Mar 20, 2009, at 5:26 AM, m...@bortal.de wrote:


Hello List,

is there a way to find out, how many transactions my currenc  
productive database is doing?


I know know how much i an offer with my new database and hardware,  
but i would also like to know what i actually _need_ on my current  
productive system.


Is there a way to find this out?


Are you looking to see how many transactions per second, or rather how  
many transactions are running concurrently at a given time?


For the former you can use pgspy (its on pgfoundry) to get an idea of  
queries per second coming in.


For the latter, just select * from pg_stat_activity where  
current_query <> '';
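
Another rough way to get the per-second number without extra tools is to sample the commit/rollback counters and divide the delta by the elapsed time (just a sketch - run it twice, N seconds apart):

select sum(xact_commit + xact_rollback) from pg_stat_database;
-- (second value - first value) / N  ~=  transactions per second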


--
Jeff Trout 
http://www.stuarthamm.net/
http://www.dellsmartexitin.com/




--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] I have a fusion IO drive available for testing

2009-03-27 Thread Jeff


On Mar 26, 2009, at 8:47 AM, Dave Cramer wrote:

So far using dd I am seeing around 264MB/s on ext3, 335MB/s on ext2  
write speed. So the question becomes what is the best filesystem for  
this drive?


Anyone want me to run anything on it ?

Dave



I'd be more interested in the random io numbers.

You can do some tests with pgiosim (avail on pgfoundry) to sort-of  
simulate an index scan. It just seeks and reads. It can also randomly  
write and/or fsync.


I'd be interested in seeing numbers for 1 proc and 10 on the fusionIO.
You have to make some file(s) for it to use first (I usually use dd to  
do that, and make sure it is at least 2xRAM in size)




--
Jeff Trout 
http://www.stuarthamm.net/
http://www.dellsmartexitin.com/




--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] I have a fusion IO drive available for testing

2009-03-27 Thread Jeff


On Mar 27, 2009, at 1:30 PM, da...@lang.hm wrote:



for the WAL you definantly don't need the journal, for the data I'm  
not sure. I believe that postgres does appropriate fsync calls so is  
safe on a non-journaling filesystem. the fusionIO devices are small  
enough that a fsync on them does not take that long, so it may not  
be worth the overhead of the journaling.




The win for the journal on the heap is simply so you don't need to  
spend $longtime fsck'ing if you crash, etc.


--
Jeff Trout 
http://www.stuarthamm.net/
http://www.dellsmartexitin.com/




--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Best replication solution?

2009-04-08 Thread Jeff


On Apr 7, 2009, at 1:18 PM, Andrew Sullivan wrote:


I should have stated that differently.  First, you're right that if
you don't know where to look or what to look for, you can easily be
unaware of nodes being out of sync.  What's not a problem with Slony


_$cluster.sl_status on the origin is a very handy tool to see your  
slaves, how many syncs behind they are, and whatnot.  Maybe I'm lucky,  
but every funky state I've gotten into has set off the alarms I have  
watching sl_status.
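
Something along these lines is the sort of check such alarms can run (a sketch only, assuming the usual sl_status columns; replace _mycluster with your _<clustername> schema and pick a lag threshold that fits your setup):

select st_received, st_lag_num_events, st_lag_time
from _mycluster.sl_status
where st_lag_time > interval '5 minutes';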



Complexity seems to be the major evil here.


Yes.  Slony is massively complex.



Configuring slony by hand using slonik commands does suck horribly.
But the included altperl tools that come with it, along with  
slon_tools.conf removes a HUGE amount of that suck.


To add a table with a pk you edit slon_tools.conf and add something  
along the lines of:


"someset" => {
"set_id" => 5,
"table_id" => 5,
"pkeyedtables" => [ "tacos", "burritos", "gorditas" ]
}

then you just run

[create tables on slave(s)]
slonik_create_set someset;
slonik_subscribe_set 1 2;

there are other handy scripts in there as well for failing over,  
adding tables, merging, etc. that hide a lot of the suck.  Especially  
the suck of adding a node and creating the store paths.


I'm running slony on a rather write-intensive system and it works fine; just  
make sure you've got beefy IO.  One sucky thing, though, is that if a slave  
is down sl_log can grow very large (I've had it get over 30M rows when the  
slave was only down for hours), and this causes major cpu churn while  
the queries slon issues sift through tons of data.  But, to be fair,  
that'll hurt any replication system.


--
Jeff Trout 
http://www.stuarthamm.net/
http://www.dellsmartexitin.com/




--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Best replication solution?

2009-04-09 Thread Jeff


On Apr 8, 2009, at 4:46 PM, Dimitri Fontaine wrote:



$ londiste.py setup.ini provider add schema.table
$ londiste.py setup.ini subscriber add schema.table



That is nice.  One could probably do that for slony too.

I may try some tests out with londiste.. I'm always open to new  
(ideally, better) things.


This could happen in Londiste too, just set pgq_lazy_fetch to a  
reasonable value and Londiste will use a cursor to fetch the events,  
lowering the load. Events are just tuples in an INSERT only table,  
which when not used anymore is TRUNCATEd away. PGQ will use 3 tables  
where to store events and will rotate its choice of where to insert  
new events, allowing to use TRUNCATE rather than DELETE. And  
PostgreSQL is quite efficient to manage this :)

 
http://wiki.postgresql.org/wiki/Londiste_Tutorial#Londiste_is_eating_all_my_CPU_and_lag_is_raising



Well, Slony always uses a cursor to fetch; the problem is it may have  
to slog through millions of rows to find the new data. I've analyzed  
the queries and there isn't much it can do - lots of calls to the  
xxid_ functions to determine what's to be used and what's not.   
When all slaves have a sync event ack'd it is free to be removed by  
the cleanup routine, which is run every few minutes.




Oh and some people asked what Londiste with failover and DDL would  
look like. Here's what the API being cooked looks like at the moment:

$ londiste setup.ini execute myddl.script.sql

$ londiste conf/londiste_db3.ini change-provider --provider=rnode1
$ londiste conf/londiste_db1.ini switchover --target=rnode2



ok, so londiste can't do failover yet, or is it just somewhat  
convoluted at this point?


--
Jeff Trout 
http://www.stuarthamm.net/
http://www.dellsmartexitin.com/




--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] linux deadline i/o elevator tuning

2009-04-13 Thread Jeff


On Apr 10, 2009, at 2:47 AM, Albe Laurenz *EXTERN* wrote:


Grzegorz Jaskiewicz wrote:
acording to kernel folks, anticipatory scheduler is even better for  
dbs.
Oh well, it probably means everyone has to test it on their own at  
the

end of day.


In my test case, noop and deadline performed well, deadline being a  
little

better than noop.

Both anticipatory and CFQ sucked big time.



This is my experience as well, I posted about playing with the  
scheduler a while ago on -performance, but I can't seem to find it.


If you have a halfway OK raid controller, CFQ is useless. You can fire  
up something such as pgbench or pgiosim, fire up iostat, and then  
watch your iops jump when you flip to noop or deadline and  
plummet on cfq.  Try it, it's neat!


--
Jeff Trout 
http://www.stuarthamm.net/
http://www.dellsmartexitin.com/




--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] limiting performance impact of wal archiving.

2009-11-10 Thread Jeff


On Nov 10, 2009, at 10:53 AM, Laurent Laborde wrote:


On Tue, Nov 10, 2009 at 4:48 PM, Kevin Grittner
 wrote:

Laurent Laborde  wrote:


BTW, if you have any idea to improve IO performance, i'll happily
read it.  We're 100% IO bound.


At the risk of stating the obvious, you want to make sure you have
high quality RAID adapters with large battery backed cache configured
to write-back.


Not sure how "high quality" the 3ware is.
/c0 Driver Version = 2.26.08.004-2.6.18
/c0 Model = 9690SA-8I
/c0 Available Memory = 448MB


I'll note that I've had terrible experience with 3ware controllers and  
getting a high number of iops using hardware raid mode.  If you switch  
it to jbod and do softraid you'll get a large increase in iops - which  
is the key metric for a db.  I've posted previously about my problems  
with 3ware.


as for the ssd comment - I disagree.  I've been running ssd's for a  
while now (probably closing in on a year by now) with great success.   
A pair of intel x25-e's can get thousands of iops.  That being said  
the key is I'm running the intel ssds - there are plenty of absolutely  
miserable ssds floating around (I'm looking at you jmicron based disks!)


Have you gone through the normal process of checking your query plans  
to ensure they are sane? There is always a possibility a new index can  
vastly reduce IO.


--
Jeff Trout 
http://www.stuarthamm.net/
http://www.dellsmartexitin.com/




--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Linux I/O tuning: CFQ vs. deadline

2010-02-09 Thread Jeff


On Feb 8, 2010, at 11:35 PM, da...@lang.hm wrote:


And, yes, the whole I/O scheduling approach in Linux was just  
completely redesigned for a very recent kernel update.  So even  
what we think we know is already obsolete in some respects.




I'd done some testing a while ago on the schedulers and at the time  
deadline or noop smashed cfq.  Now, it is 100% possible since then  
that they've made vast improvements to cfq and or the VM to get better  
or similar performance.  I recall a vintage of 2.6 where they severely  
messed up the VM. Glad I didn't upgrade to that one :)


Here's the old post: 
http://archives.postgresql.org/pgsql-performance/2008-04/msg00155.php


--
Jeff Trout 
http://www.stuarthamm.net/
http://www.dellsmartexitin.com/




--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Linux I/O tuning: CFQ vs. deadline

2010-02-10 Thread Jeff


On Feb 10, 2010, at 1:37 AM, Greg Smith wrote:


Jeff wrote:
I'd done some testing a while ago on the schedulers and at the time  
deadline or noop smashed cfq.  Now, it is 100% possible since then  
that they've made vast improvements to cfq and or the VM to get  
better or similar performance.  I recall a vintage of 2.6 where  
they severely messed up the VM. Glad I didn't upgrade to that one :)


Here's the old post: 
http://archives.postgresql.org/pgsql-performance/2008-04/msg00155.php


pgiosim doesn't really mix writes into there though, does it?  The  
mixed read/write situations are the ones where the scheduler stuff  
gets messy.




It has the ability to rewrite blocks randomly as well - but I  
honestly don't remember if I did that during my cfq/deadline test.   
I'd wager I didn't.  Maybe I'll get some time to run some more tests  
on it in the next couple days



--
Greg Smith2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com  www.2ndQuadrant.com



--
Jeff Trout 
http://www.stuarthamm.net/
http://www.dellsmartexitin.com/




--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] bgwriter tunables vs pg_stat_bgwriter

2010-02-17 Thread Jeff

Since getting on 8.4 I've been monitoring things fairly closely.
I whipped up a quick script to monitor pg_stat_bgwriter and save  
deltas every minute so I can ensure my bgwriter is beating out the  
backends for writes (as it is supposed to do).
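
The sampling boils down to grabbing this view once a minute and diffing successive rows; roughly something like (a sketch using 8.4's pg_stat_bgwriter columns):

select buffers_clean, buffers_checkpoint, buffers_backend,
       maxwritten_clean, buffers_alloc
from pg_stat_bgwriter;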


Now, the odd thing I'm running into is this:

bgwriter_delay is 100ms (ie 10 times a second, give or take)
bgwriter_lru_maxpages is 500 (~5000 pages / second)
bgwriter_lru_multiplier is 4

Now, assuming I understand these values right the following is what  
should typically happen:


while (true)
{
    if buffers_written > bgwriter_lru_maxpages
       or buffers_written > anticipated_pages_needed * bgwriter_lru_multiplier
    {
        sleep(bgwriter_delay ms)
        continue;
    }
    ...
}

so I should not be able to have more than ~5000 bgwriter_clean pages  
per minute. (this assumes writing takes 0ms, which of course is  
inaccurate)


However, I see this in my stats (they are deltas), and I'm reasonably  
sure it is not a bug in the code:


(timestamp, buffers_clean, buffers_checkpoint, buffers_backend)
 2010-02-17 08:23:51.184018 |     1 |  1686 |    5
 2010-02-17 08:22:51.170863 | 15289 | 12676 |  207
 2010-02-17 08:21:51.155793 | 38467 |  8993 | 4277
 2010-02-17 08:20:51.139199 | 35582 |     0 | 9437
 2010-02-17 08:19:51.125025 |     8 |     0 |    3
 2010-02-17 08:18:51.84     |  1140 |  1464 |    6
 2010-02-17 08:17:51.098422 |     0 |  1682 |  228
 2010-02-17 08:16:51.082804 |    50 |     0 |    6
 2010-02-17 08:15:51.067886 |   789 |     0 |    1


Perhaps there is some stats buffering occurring, or some general  
misunderstanding of these tunables on my part?


--
Jeff Trout 
http://www.stuarthamm.net/
http://www.dellsmartexitin.com/




--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] bgwriter tunables vs pg_stat_bgwriter

2010-02-18 Thread Jeff


On Feb 17, 2010, at 6:23 PM, Greg Smith wrote:

With bgwriter_lru_maxpages=500 and bgwriter_delay=100ms, you can  
get up to 5000 pages/second which makes for 300,000 pages/minute.   
So none of your numbers look funny just via their scale.  This is  
why the defaults are so low--the maximum output of the background  
writer is quite big even before you adjust it upwards.




d'oh! that would be the reason.  Sorry folks, nothing to see here :)

There are however two bits of stats buffering involved.  Stats  
updates don't become visible instantly, they're buffered and only  
get their updates pushed out periodically to where clients can see  
them to reduce overhead.  Also, the checkpoint write update happens  
in one update at the end--not incrementally as the checkpoint  
progresses.  The idea is that you should be able to tell if a  
checkpoint happened or not during a period of monitoring time.  You  
look to be having checkpoints as often as once per minute right now,  
so something isn't right--probably checkpoint_segments is too low  
for your workload.




checkpoint_segments is currently 32. Maybe I'll bump it up - this db  
does a LOT of writes.


By the way, your monitoring code should be saving maxwritten_clean  
and buffers_allocated, too.  While you may not be doing something  
with them yet, the former will shed some light on what you're  
running into now, and the latter is useful later down the road  
you're walking.


It is, I just didn't include them in the mail.

--
Jeff Trout 
http://www.stuarthamm.net/
http://www.dellsmartexitin.com/




--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Some vacuum & tuning help

2003-08-05 Thread Jeff
On Tue, 5 Aug 2003, Shridhar Daithankar wrote:
>
> I would suggest autovacuum daemon which is in CVS contrib  works for 7.3.x as
> well.. Or schedule a vacuum analyze every 15 minutes or so..
> >

Good Call. I'll give that a whirl and let you know.

> I think vacuum full is required.
>
D'oh.  Would this be a regular thing? I suppose we could do it weekly.

As for the pg_dumping of it: I suppose it would work on this table, as it
is only a couple million rows and not terribly big data-wise.  The other
tables in this db are rather big and a load is not fast (it is about
8GB).

thanks

> You mean linux? I guess you need a kernel revision for a long time. How about
> 2.4.21?
>
Yeah, linux. We're planning on upgrading when we relocate datacenters at
the end of August.  This machine has actually been up for 486 days (we're
hoping to reach linux's uptime wraparound of 496 days :) and the only
reason it went down last time was because the power supply failed.  (That can
be read: pg 7.0.2 had over a year of uptime; let's hope 7.3 works as well :)


--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/



---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


[PERFORM] Some vacuum & tuning help

2003-08-14 Thread Jeff
I've been trying to search through the archives, but it hasn't been
successful.

We recently upgraded from pg7.0.2 to 7.3.4 and things were happy. I'm
trying to fine tune things to get it running a bit better and I'm trying
to figure out how vacuum output correlates to tuning parameters.

Here's the most recent vacuum for the "active" table.  It gets a few
hundred updates/inserts a minute constantly throughout the day.

INFO:  Pages 27781: Changed 0, Empty 0; Tup 2451648: Vac 0, Keep 0, UnUsed
1003361.
Total CPU 2.18s/0.61u sec elapsed 2.78 sec.

I see unused is quite high. This morning I bumped max_fsm_pages to 50.
If I'm thinking right you want unused and max_fsm to be closish, right?
(Yesterday it was down around.. oh.. 600k?)

I'm thinking vacuum fulls may be in order, which stinks because I was
hoping to do away with having the db essentially down for 10 minutes (that
includes all the dbs on that machine) while it vacuums.

The upside is: it is performing great.  During the vacuum analyze I do get
a few multi-second pauses while something occurs. I figured it was a
checkpoint, so I bumped checkpoint_timeout to 30 seconds and wal_buffers
to 128. (I'm just guessing on wal_buffers).

Machine is weenucks 2.2.17 on a dual p3 800, 2gb ram, 18gb drive (mirrored).
If you guys need other info (shared_buffers, etc) I'll be happy to furnish
them, but the issue isn't query slowness.. I just want to get this thing
oiled.

thanks

--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/



---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Perfomance Tuning

2003-08-14 Thread Jeff
On Tue, 12 Aug 2003, Christopher Browne wrote:

> Are you _certain_ that's still true?  Have you a metric that shows
> Informix being 10x faster on a modern system?  That would be quite
> surprising...
>

We were forced (for budget reasons) to switch from raw disk to cooked files
on our informix db. We took a huge hit - about 5-6x slower.  Granted, part
of that was because informix takes the number of spindles, etc. into account
when generating query plans, and because running UPDATE STATISTICS (think
vacuum analyze) on the version we run locks the table exclusively - and it
is unacceptable to have our "main table" unavailable for hours and hours
while the update runs. (For the record: it's an 8-cpu sun e4500 running
sol 2.6.  The raw disks were on a hitachi fibre array and the cooked files
were on a raid5 (scsi). I forget how many spindles were in the raid;
there were 20 raw disks.)

Informix, etc. have spent a lot of time and money working on it.
They also have the advantage of having many paid fulltime
developers who are doing this for a job, not as a weekend hobby
(Compared to the what? 2-3 full time PG developers).

The other advantage (which I hinted at above) with raw disks is being able
to optimize queries to take advantage of them.  Informix is multithreaded
and it will spawn off multiple "readers" to do, say, a seq scan (and merge
the results at the end).

So if you have a table across, say, 3 disks and you need to do a seq scan,
it will spawn three readers to do the read. Result: nice and fast. (It
may not always spawn the three readers, only when it thinks it will be a
good thing to do.)

I think for PG the effort would be much better spent on other features...
like replication and whatnot.

--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/



---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] Benchmark

2003-08-15 Thread Jeff
On Fri, 15 Aug 2003, Richard Huxton wrote:

> If you do find some flexible, scriptable web testing system that can read/fill
> out forms etc please post to the list - I've had no luck finding anything I
> like.
>
There was a tool created by altavista R&D called "WebL" that I used on a
project.  I think if you search around you'll be able to find it.

However, I think it has developed "bit rot" from not being touched in so
long. My old webl stuff will no longer work.

But what is it?  It is a web scraping language written in java. Fast it is
not. Easy to scrape and interact with pages: YES.  It has all sorts of
things for locating fields, locating table cells, etc.  (I used it for
writing a prototype that would scrape a competitors site and import the
data into our application :)

but if it doesn't work LWP isn't _that_ bad. You will end up in regex hell
though.

--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/



---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] What is the fastest way to get a resultset

2003-08-26 Thread Jeff
On Mon, 25 Aug 2003, Bupp Phillips wrote:

>
> I have a table that has 103,000 records in it (record size is about 953
> bytes) and when I do a select all (select * from ) it takes a
> whopping 30 secs for the data to return!!
>
> SQLServer on the other hand takes 6 secs, but you can also use what is
> called a firehose cursor, which will return the data in < 1 sec.
>
You probably want a cursor.
Typically what happens is postgres sends _all_ the data to the client -
which can be rather substantial.  A cursor allows you to say "get me the
first 1000 records. now the next 1000" - it should get you the speed you
want.
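
A minimal sketch of the cursor approach (the table name here is made up, and the batch size is arbitrary):

begin;
declare bigrows cursor for select * from mybigtable;
fetch 1000 from bigrows;   -- repeat until it returns no rows
close bigrows;
commit;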


--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/



---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


[PERFORM] Sun vs a P2. Interesting results.

2003-08-26 Thread Jeff
 Total runtime: 378.21 msec
(6 rows)

Sun:
 Limit  (cost=2521.19..2521.24 rows=20 width=67) (actual time=1041.14..1041.20 rows=20 loops=1)
   ->  Sort  (cost=2520.94..2521.39 rows=178 width=67) (actual time=1040.96..1041.08 rows=121 loops=1)
         Sort Key: dob
         ->  Seq Scan on userprofile  (cost=0.00..2514.28 rows=178 width=67) (actual time=0.37..1014.50 rows=1783 loops=1)
               Filter: (gender_v AND (gender = 'm'::character varying) AND country_v AND (country = 'br'::character varying) AND (dob_v = true) AND (dob >= '1974-08-26 08:21:52.158181-04'::timestamp with time zone) AND (dob <= '1985-08-26 08:21:52.158181-04'::timestamp with time zone))
 Total runtime: 1042.54 msec
(6 rows)

They are loaded with the exact same dataset - 53k rows, ~10MB.
Notice the estimates are roughly the same, but the execution time is
different.

I don't think it is the IO system, since 10MB will be cached by the OS and
iostat reports no activity on the disks (when running the query many
times over and over and in parallel).  It is a simple query..

Could it just be that the sun sucks? (And for the record - same schema,
nearly same query (modified for datetime syntax) on informix runs in 3
seconds).




--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/



---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Sun vs a P2. Interesting results.

2003-08-26 Thread Jeff
On Tue, 26 Aug 2003, Darcy Buskermolen wrote:

> Also, after having taken another look at this, you aren't preforming the same
> query on both datasets, so you can't expect them to generate the same
> results, or the same query plans, or even comparable times. Please retry your
> tests with identical queries , specify the dates, don;t use a function like
> now() to retrieve them.
>

Given what you said in the previous email and this one here's some new
information.  I redid the query to use a static starting time and I ran
19 beaters in parallel.  After I send this mail out I'll try it with 40.

New Query:

select userkey, dob, email, gender, country from userprofile
where gender_v and gender = 'm'
  and country_v and country = 'br'
  and dob_v
  and dob >= '2003-08-26'::timestamptz - '29 years'::interval
  and dob <= '2003-08-26'::timestamptz - '18 years'::interval
order by dob asc
limit 20
offset 100

Explain Analyze's: (basically the same)
Sun:
 Limit  (cost=2390.05..2390.10 rows=20 width=67) (actual time=1098.34..1098.39 rows=20 loops=1)
   ->  Sort  (cost=2389.80..2390.24 rows=178 width=67) (actual time=1098.16..1098.28 rows=121 loops=1)
         Sort Key: dob
         ->  Seq Scan on imuserprofile  (cost=0.00..2383.14 rows=178 width=67) (actual time=0.38..1068.94 rows=1783 loops=1)
               Filter: (gender_v AND (gender = 'm'::character varying) AND country_v AND (country = 'br'::character varying) AND dob_v AND (dob >= '1974-08-26 00:00:00-04'::timestamp with time zone) AND (dob <= '1985-08-26 00:00:00-04'::timestamp with time zone))
 Total runtime: 1099.93 msec
(6 rows)


p2

 Limit  (cost=2353.38..2353.43 rows=20 width=67) (actual time=371.75..371.83 rows=20 loops=1)
   ->  Sort  (cost=2353.13..2353.60 rows=186 width=67) (actual time=371.46..371.63 rows=121 loops=1)
         Sort Key: dob
         ->  Seq Scan on imuserprofile  (cost=0.00..2346.14 rows=186 width=67) (actual time=0.17..345.53 rows=1783 loops=1)
               Filter: (gender_v AND (gender = 'm'::character varying) AND country_v AND (country = 'br'::character varying) AND dob_v AND (dob >= '1974-08-26 00:00:00-04'::timestamp with time zone) AND (dob <= '1985-08-26 00:00:00-04'::timestamp with time zone))
 Total runtime: 372.63 msec
(6 rows)


I ran this query 100 times per beater (no prepared queries) and ran
19 beaters in parallel.

P2 Machine: 345 sec avg
Sun:        565 sec avg



I know solaris/sun isn't the preferred pg platform, and we have plenty of
capacity even with these numbers; I just find the speed difference a
little surprising.

--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/



---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [PERFORM] Sun vs a P2. Interesting results.

2003-08-26 Thread Jeff
On Tue, 26 Aug 2003, Darcy Buskermolen wrote:

> I'm still seeing differences in the planner estimates, have you run a VACUUM
> ANALYZE prior to running these tests?
>
I did. I shall retry that.. but the numbers (the cost estimates) are
pretty close on both; the actual times are very different.

> Also, are the disk subsystems in these 2 systems the same? You may be seeing
> some discrepancies in things spindle speed,  U160 vs U320, throughput on
> specific RAID controlers, different blocksize, ect.
>

As I said in my first email IO isn't the problem here - the data set is
small enough that it is all cached (~10MB).  iostat reports 0 activity on
the disks on both the sun and p2.

And I just ran the test again with 40 clients: 730s for the p2, 1100 for
the sun (0% idle on both of them, no IO).  I think the next thing I may
try is recompiling with a newer gcc.


--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/



---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] Sun vs a P2. Interesting results.

2003-08-28 Thread Jeff
Well, installing gcc 3.3.1 and using -mcpu=v9 didn't help; in fact it made
things worse.  Unless someone has something clever I'm just gonna stop
tinkering with it - my goal was met (it is several orders of magnitude
faster than informix) and the hardware is being replaced in a month or
two.

thanks for the ideas / comments.

--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/



---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Sun vs a P2. Interesting results.

2003-08-28 Thread Jeff
On Tue, 26 Aug 2003, Neil Conway wrote:
>
> Would it be possible to get a profile (e.g. gprof output) for a postgres
> backend executing the query on the Sun machine?
>
Heh. Never thought of doing a profile!

I attached the entire gprof output, but here's the top few functions.

I did the test, 1 beater, 100 searches: 148 seconds total.

 30.9  45.55  45.55                           nocachegetattr [16]
 16.0  69.20  23.65                           internal_mcount [22]
  6.9  79.37  10.17   5245902  0.00  0.00     heapgettup [21]
  6.0  88.28   8.91   3663201  0.00  0.00     ExecMakeFunctionResult [23]
  5.4  96.27   7.99  11431400  0.00  0.00     ExecEvalVar [25]
  3.0 100.73   4.46  18758201  0.00  0.00     ExecEvalExpr [24]
  3.0 105.17   4.44   5246005  0.00  0.00     AllocSetReset [29]
  2.5 108.89   3.72   5245700  0.00  0.00     HeapTupleSatisfiesSnapshot [30]
  2.0 111.78   2.89   5650632  0.00  0.00     LWLockRelease [32]
  1.6 114.10   2.32   5650632  0.00  0.00     LWLockAcquire [34]
  1.6 116.40   2.30   5245800  0.00  0.01     SeqNext [17]
  1.4 118.54   2.14   5438301  0.00  0.00     ExecStoreTuple [27]
  1.4 120.62   2.08   5245700  0.00  0.01     ExecQual [18]
  1.3 122.50   1.88   5379202  0.00  0.00     ReleaseAndReadBuffer [35]
  1.1 124.16   1.66    178400  0.01  0.40     ExecScan [15]
  1.1 125.80   1.64                           _mcount (6247)
  1.1 127.41   1.61   5245902  0.00  0.01     heap_getnext [20]


.. as it turns out the profile gzipped is still huge (100kb) so I put it
on my web server - snag it at

http://www.jefftrout.com/~threshar/postgres/postgres-7.3.4-sol8-gprof.txt.gz

I'll do a profile for the p2 and post that in an hour or two

-- 
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/



---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] The results of my PostgreSQL/filesystem performance

2003-08-28 Thread Jeff
On Tue, 26 Aug 2003, Bill Moran wrote:

>
> Intelligent feedback is welcome.
>
That's some good work there, Lou. You'll make sgt for that someday.

But I think the next step, before trying out other filesystems and options,
would be concurrency. Run a bunch of these beasts together and see what
happens (I don't think too many of us have only a single session running).
Perhaps even make them "interfere" with each other to create as much
"pain" as possible?

on a side note - I might be blind here - I didn't see what version of pg
you were using or any postgresql.conf tweaks - or did you just use
whatever came with each distro?

--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/



---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] bad estimates

2003-08-29 Thread Jeff
On Fri, 29 Aug 2003, Ken Geis wrote:

> Some good news here.  Doing the same as above on 7.4beta2 took 29
> minutes.  Now, the 7.3.3 was on reiser and 7.4 on ext2, so take that as
> you will.  7.4's index selectivity estimate seems much better; 7.3.3's
> anticipated rows was ten times the actual; 7.4's is one half of the actual.
>
Min() & Max() unfortunately suck on PG. It will be that way for a while;
perhaps at some point someone will make a "special" case and convince
-HACKERS it is a Good Thing(tm) (like select count(*) from table being
'cached' - a lot of people probably get bad first impressions because of
that).

Would it be possible to rewrite your queries, replacing min/max with a
"select stock_id from bigtable where blah = blorch order by stock_id
(desc|asc) limit 1"? That would enable PG to use an index and
magically "go fast". You may need a subselect..
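
A sketch of that rewrite, using the names from the sentence above:

select stock_id from bigtable where blah = 'blorch' order by stock_id desc limit 1;  -- in place of max(stock_id)
select stock_id from bigtable where blah = 'blorch' order by stock_id asc  limit 1;  -- in place of min(stock_id)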


--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/



---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Selecting random rows efficiently

2003-08-30 Thread Jeff
On Sat, 30 Aug 2003, Richard Jones wrote:

> Hi,
> i have a table of around 3 million rows from which i regularly (twice a second
> at the moment) need to select a random row from
>
> currently i'm doing "order by rand() limit 1" - but i suspect this is
> responsible for the large load on my db server - i guess that PG is doing far
> too much work just to pick one row.
>

If you have an int id (aka serial) column then it is simple - just pick a
random number between 1 and currval('id_seq')...

or offset rand() limit 1 perhaps?

Since you want random there is no need to bother with an order, and that'll
save a sort.
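
A rough sketch of the serial-id idea (table and sequence names are made up, and it assumes the id column has few gaps; the "order by ... limit 1" is satisfied by the primary key index, so no sort happens):

select * from items
where id >= (select (random() * last_value)::int from items_id_seq)
order by id
limit 1;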

--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/



---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Upgrade Woes

2003-09-11 Thread Jeff
On Thu, 11 Sep 2003, [EMAIL PROTECTED] wrote:

>
> The Vacuum full is performed once at the end of the whole job.
>
Have you also tried vacuum analyze periodically? It does not lock the
table and can help quite a bit.

It is still odd why it would be that much slower between those versions.

--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/



---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


[PERFORM] software vs hw hard on linux

2003-09-12 Thread Jeff
Due to various third party issues, and the fact PG rules, we're planning
on migrating our deplorable informix db to PG.  It is a rather large DB
with a rather high amount of activity (mostly updates).  So I'm going to
be acquiring a dual (or quad, if they'll give me the money) box. (In my testing
my glorious P2 with a 2 spindle raid0 is able to handle it fairly well.)

What I'm wondering about is what folks' experience with software raid vs
hardware raid on linux is.  A friend of mine ran a set of benchmarks at
work and found sw raid was running obscenely faster than the mylex and
(some other brand that isn't 3ware) raids..

On the pro-hw side you have ones with battery backed cache; chances are
they are less likely to fail..

On the pro-sw side you have lots of speed and less cost (unfortunately,
there is a pathetic budget so spending $15k on a raid card is out of the
question really).

any thoughts?

--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/



---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] Inconsistent performance

2003-09-16 Thread Jeff
On Tue, 16 Sep 2003, Joseph Bove wrote:

> I still think that 3 seconds is not acceptable. However, I reserve the
> right to be wrong. Does it sound unrealistic to expect PostgreSQL to be
> able to read 90,000 rows with 300 bytes per row in under a second?
>
First, check to see what your max throughput on your disk is, using a
benchmark such as Bonnie (making sure to use a size LARGER than physical
memory; 2x physical is very optimal).

Next, run your query again with a "vmstat 1" running in another term.

See how closely the vmstat "bi" numbers correspond to your max according to
bonnie.  You could have an IO bottleneck.  (I once went running around
trying to figure out a slowdown and then discovered the issue was IO.)



--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/



---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [PERFORM] LIKE query running slow

2003-09-24 Thread Jeff
On Tue, 23 Sep 2003, Garrett Bladow wrote:

> Recently we upgraded the RAM in our server. After the install a LIKE query that used 
> to take 5 seconds now takes 5 minutes. We have tried the usual suspects, VACUUM, 
> ANALYZE and Re-indexing.
>
> Any thoughts on what might have happened?
>
Did you reload the db? If you did perhaps you didn't use the "C" locale?
That can cause a huge slowdown.
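
If that is what happened, a couple of things to look at (sketches only - the table/index names are hypothetical, and text_pattern_ops is a 7.4-and-later operator class):

show lc_collate;   -- if this is not 'C', LIKE 'foo%' cannot use a plain btree index
create index foo_name_like_idx on foo (name text_pattern_ops);   -- lets LIKE use an index in a non-C locale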


--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/



---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Tuning/performance issue...

2003-10-01 Thread Jeff
On Tue, 30 Sep 2003, David Griffiths wrote:

>
> This is all part of a "migrate away from Oracle" project. We are looking at
> 3 databases -
> MySQL (InnoDB), Postgres and Matisse (object oriented). We have alot of
> queries like this
> or worse, and I'm worried that many of them would need to be re-written. The
> developers
> know SQL, but nothing about tuning, etc.
>

There's a movement at my company to ditch several commercial db's in favor
of a free one.  I'm currently the big pg fan around here and I've actually
written a rather lengthy presentation about pg features, why, tuning, etc.
but another part was some comparisons to other db's..

So that I wouldn't be blindly flaming mysql, I decided to give it a whirl and
loaded it up with the same dataset as pg.  The first thing I hit was the lack of
stored procedures, but I decided to code around that, giving mysql the
benefit of the doubt.  What I found was interesting.

For 1-2 concurrent 'beaters' it screamed - ultra-fast.  But if you increase
the concurrent beaters up to, say, 20, Mysql comes to a grinding halt: Mysql
and the machine itself become fairly unresponsive.  And if you do cache-unfriendly
queries it becomes even worse.  On PG - no problems at all; it scaled fine
and dandy, and with 40 concurrent beaters the machine was still
responsive.  (The numbers for 20 clients were 220 seconds (pg) and 650
seconds (mysql).)

So that is another test to try out - Given your configuration I expect you
have lots of concurrent activity.

--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/



---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] Tuning/performance issue...

2003-10-01 Thread Jeff
On Wed, 1 Oct 2003, Oleg Lebedev wrote:

> Jeff,
> I would really appreciate if you could send me that lengthy presentation
> that you've written on pg/other dbs comparison.
> Thanks.
>

After I give the presentation at work and collect comments from my
coworkers (and remove some information you folks don't need to know :) I
will be very willing to post it for people to see.


--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/



---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] count(*) slow on large tables

2003-10-03 Thread Jeff
On Thu, 2 Oct 2003, Christopher Browne wrote:

> I can't imagine why the raw number of tuples in a relation would be
> expected to necessarily be terribly useful.
>

We use stuff like that for reporting queries.

Example: on our message boards each post is a row.  The powers that be like
to know how many posts there are in total (in addition to 'today').
"select count(*) from posts" is how it has been done on our informix db.
With our port to PG I instead select reltuples from pg_class.
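
The approximate-count trick looks like this (table name from the example above; the number is only as fresh as the last vacuum/analyze):

select reltuples from pg_class where relname = 'posts';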

I know when I log in to a new db (or a db unknown to me) the first thing I do
is look at the tables and see what sort of data there is.. but in code I'd
rarely do that.

I know some monitoring things around here also do a select count(*) on
sometable to ensure it is growing, but like you said, this is easily done
with the number of pages as well.

yes. Informix caches this data. I believe Oracle does too.

Mysql with InnoDB does the same thing PG does. (MyISAM caches it)

--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/



---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Tuning/performance issue...

2003-10-04 Thread Jeff
On Fri, 3 Oct 2003, Bruce Momjian wrote:

>
> I have updated the FAQ to be:
>
>   In comparison to MySQL or leaner database systems, we are
>   faster for multiple users, complex queries, and a read/write query
>   load.  MySQL is faster for SELECT queries done by a few users.
>
> Is this accurate?  It seems so.
>
>

Another thing I noticed - if you use a dataset that can live in mysql's
query cache / os cache it screams, until it has to hit the disk. Then:
GRINDING HALT.

It would be nice if someone (I don't have the time now) did a comparison
of, say:
select value where name = XXX; [where XXX varies] with 1, 10, 20, 50
connections

then made progressively more complex queries. And occasionally pointed out
mysql's silly omissions:
select * from myview where id = 1234
[Oh wait! mysql doesn't have views. Ooopsy!]

Wrapping up - PG is not that slow for simple queries either.  It can be
rather zippy - and PREPARE can give HUGE gains - even for simple
statements.   I've often wondered if YACC, etc is a bottleneck (You can
only go as fast as your parser can go).
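
A tiny illustration of the PREPARE point (names are hypothetical):

prepare get_val (text) as
    select value from sometable where name = $1;
execute get_val('XXX');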

Hurray for PG!

And I'm giving my PG presentation Monday.  I hope to post it Tuesday after
I update it with comments I receive and remove confidential information.

--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/



---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [PERFORM] reindex/vacuum locking/performance?

2003-10-06 Thread Jeff
On Sun, 5 Oct 2003, Neil Conway wrote:

>
> > I don't know any portable way to do that :-(
>
> For the non-portable way of doing this, are you referring to O_DIRECT?
>
> Even if it isn't available everywhere, it might be worth considering
> this at least for the platforms on which it is supported.
>

I strongly agree here, but only if we can prove there is a benefit.
I think it would be silly of us to ignore it if some OS supported SnazzyFeatureC
that was able to speed up PG by a large percentage (hopefully in a rather
non-invasive way in the code).  But I do see the problem here with bloat
and PG behaving radically differently platform to platform.  I suppose we could
dictate that at least N OSes had to have it.. or perhaps supply it as
contrib/ patches.  Something to think about.

I'd be interested in tinkering with this, but I'm more interested at the
moment in why (with proof, not anecdotes) Solaris is so much slower than
Linux, and what we can do about it.  We're looking to move a rather large
Informix db to PG and ops has reservations about ditching Sun hardware.

--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/



---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


[PERFORM] locking/performance, Solaris performance discovery

2003-10-06 Thread Jeff
On Mon, 6 Oct 2003, Andrew Sullivan wrote:

> There's plenty of academic work which purports to show that LRU is
> far from the best choice.  Just in principle, it seems obvious that a
> single-case seqscan-type operation (such as vacuum does) is a good
> way to lose your cache for no real gain.
>

Logically bypassing caches for a seq scan also makes sense.

> Interestingly, we're contemplating ditching Solaris because of the
> terrible reliability we're getting from the hardware.
>

The reason ops likes solaris / sun is twofold: 1. we have a pile of big
sun machines around, and 2. Solaris / Sun is quite a bit more graceful in the
event of a hardware failure.  We've burned out our fair share of cpu's
etc and solaris has been rather graceful about it.

I've started profiling and running tests... currently it is leaning
towards the sysv semaphores. I see in src/backend/port/ that pg_sema.c is
linked to the sysv implementation.  So what I did was create a
semaphore set, and then fired off 5 copies of a program that attaches
to that semaphore and then locks/unlocks it 1M times.

2xP2-450, Linux 2.4.18: 1 process: 221680 / sec, 5 process: 98039 / sec
4xUltraSparc II-400Mhz, Solaris 2.6: 1 proc: 142857 / sec, 5 process:
23809

So I'm guessing that is where a LOT of the suck is coming from.

What I plan to do next is look to see if there are other interprocess
locking mechanisms on solaris (perhaps pthread_mutex with that
inter-process flag or something) to see if I can get those numbers a
little closer.


--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/



---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] locking/performance, Solaris performance discovery

2003-10-06 Thread Jeff
On Mon, 6 Oct 2003, Tom Lane wrote:

>
> Does Solaris have Posix semaphores?  You could try using those instead.
>
>   regards, tom lane

Yep. It does.

I learned them quick enough (using posix_sema.c as a guide)
and found out that at least on Sol 2.6 they are slower than sysv - with 5
processes it went to about 16k lock/unlock a second.

I'm going to try to find a box around here I can get sol(8|9) on that has
sufficient disk space and see.  I'm guessing sun has likely made
improvements...


Another odd thing I'm trying to work out is why my profiles come out so
radically different on the linux box and the sun box.

Sun:
 31.17  18.90   18.90                            internal_mcount
 19.10  30.48   11.58   8075381   0.00   0.00    _bt_checkkeys
  5.66  33.91    3.43  24375253   0.00   0.00    FunctionCall2
  4.82  36.83    2.92   8073010   0.00   0.00    _bt_step
  3.51  38.96    2.13     14198   0.15   0.15    _read
  2.77  40.64    1.68   8069040   0.00   0.00    varchareq
  2.59  42.21    1.57     28454   0.06   0.23    _bt_next
  2.29  43.60    1.39      1003   1.39   1.40    AtEOXact_Buffers
  1.86  44.73    1.13  16281197   0.00   0.00    pg_detoast_datum
  1.81  45.83    1.10                            _mcount
  1.68  46.85    1.02      2181   0.47   0.47    pglz_decompress


Linux:
 11.14   0.62    0.62      1879   0.00   0.00    pglz_decompress
  6.71   0.99    0.37      1004   0.00   0.00    AtEOXact_Buffers
  3.80   1.20    0.21   1103045   0.00   0.00    AllocSetAlloc
  3.23   1.38    0.18    174871   0.00   0.00    nocachegetattr
  2.92   1.54    0.16   1634957   0.00   0.00    AllocSetFreeIndex
  2.50   1.68    0.14     20303   0.00   0.00    heapgettup
  1.93   1.79    0.11      1003   0.00   0.00    AtEOXact_CatCache
  1.76   1.89    0.10    128442   0.00   0.00    hash_any
  1.72   1.98    0.10     90312   0.00   0.00    FunctionCall3
  1.69   2.08    0.09     50632   0.00   0.00    ExecTargetList
  1.60   2.17    0.09     51647   0.00   0.00    heap_formtuple
  1.55   2.25    0.09    406162   0.00   0.00    newNode
  1.46   2.33    0.08    133044   0.00   0.00    hash_search

It is the same query with slightly different data (the Sun probably has
20-40k more rows in the table the query hits).

I'll be digging up more info later today.

--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/



---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


[PERFORM] SOlaris updates

2003-10-06 Thread Jeff
Ran the test on another linux box - the one that generated the dump the
sun loaded (which should have similar data...) and I got a profile
similar to the Sun's, which makes me feel more comfortable.

Still interesting why that other box gave me the different profile.
Now off to the fun and exciting world of seeing what I can do about it.


--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/



---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [PERFORM] Postgres low end processing.

2003-10-07 Thread Jeff
On Tue, 7 Oct 2003, Stef wrote:

> The initial instance took up 8372K and this fluctuated
> between +- 8372K  and 10372K, plus +- 3500K for
> every connection.
>

Does that include/exclude the size of, say, shared code & libraries?
I know linux does copy-on-write forking.. so it may be less in reality...

--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/



---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


[PERFORM] Sun performance - Major discovery!

2003-10-08 Thread Jeff
Well, as you guys know I've been tinkering with sun-vs-linux postgres for
a while trying to come up with reasons for the HUGE performance
differences. We've all had our anecdotal thoughts (fork sucks, ipc sucks,
ufs sucks, etc) and I've had a breakthrough.

Knowing that GCC only produces good code on x86 (and powerpc with apple's
mods, but it is doubtful that is as good as ibm's power compiler) I
decided to try out Sunsoft CC.  I'd heard from more than one person/place
that gcc makes abysmal sparc code.  Given that the performance profiles
for both the linux and sun boxes showed the same functions taking up most
of the time I thought I'd see what a difference sunsoft could give me.

So - hardware -
Sun E450 4x400mhz ultrasparc IIi, 4GB ram, scsi something disk (not
raid), solaris 2.6

Linux - 2xP3 500mhz, 2GB, scsi disk of some flavor (not raid) linux 2.2.17
(old I know!)

So here are the results using my load tester (single connection per
beater; each beater repeats the query 1000 times with different input
each time, and we get ~20k rows back).  The query is a common query
around here.
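
One way to picture a beater (a hypothetical libpq sketch, not the actual
tester; the query, table, and conninfo are placeholders):

#include <stdio.h>
#include <libpq-fe.h>

int main(void)
{
    PGconn *conn = PQconnectdb("dbname=test");        /* assumed conninfo */
    char    sql[256];
    int     i;

    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "connect failed: %s", PQerrorMessage(conn));
        return 1;
    }

    for (i = 0; i < 1000; i++)
    {
        /* vary the input each iteration, as the real tester does */
        snprintf(sql, sizeof(sql),
                 "select * from posts where board_name = 'board%d'", i % 50);

        PGresult *res = PQexec(conn, sql);
        if (PQresultStatus(res) != PGRES_TUPLES_OK)
            fprintf(stderr, "query failed: %s", PQerrorMessage(conn));
        PQclear(res);
    }

    PQfinish(conn);
    return 0;
}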

I discounted the first run of the test as caches populated.

Linux - 1x - 35 seconds, 20x - 180 seconds

Sun - gcc - 1x 60 seconds  20x 245 seconds
Sun - sunsoft defaults - 1x 52 seconds 20x [similar to gcc most likely]
Sun - sunsoft -fast  - 1x 28 seconds  20x 164 seconds

As you math gurus can probably deduce - that is a rather large
improvement.  And by rather large I mean hugely significant.  With results
like this, I think it warrants mentioning in the FAQ_Solaris, and probably
the performance guide.

Connecting will always be a bit slower. But I think most people realize
that connecting to a db is not cheap.

I think update/etc will cause more locking, but I think IO will become the
bottleneck much sooner than lock/unlock will. (This is mostly anecdotal
given how fast solaris can lock/unlock a semaphore and how much IO I know
I have)

Oh yes, this was with 7.3.4 and sunsoft cc Sun WorkShop 6 update 1 C
5.2 2000/09/11 (which is old, perhaps newer ones make even better code?)

I'm not sure of PG's policy on non-gcc compilers in configure, but perhaps if
we detect sunsoft we toss in the -fast flag and maybe make it the
preferred one on sun? [btw, it compiled with no changes but it did spew
out tons of warnings]

comments?

--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/



---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] Sun performance - Major discovery!

2003-10-08 Thread Jeff
On Wed, 8 Oct 2003, Andrew Sullivan wrote:

> My worry about this test is that it gives us precious little
> knowledge about concurrent connection slowness, which is where I find
> the most significant problems.  When we tried a Sunsoft cc vs gcc 2.95
> on Sol 7 about 1 1/2 years ago, we found more or less no difference
> once we added more than 5 connections (and we always have more than 5
> connections).  It might be worth trying again, though, since we moved
> to Sol 8.
>

The 20x column shows the results when I fired up 20 beaters concurrently.


--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/



---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] Sun performance - Major discovery!

2003-10-08 Thread Jeff
On Wed, 8 Oct 2003, Neil Conway wrote:

> What is the query?
>

It retrieves an index listing for our boards. The boards are flat (not
threaded) and messages are numbered starting at 1 for each board.

If you pass in 0 for the start_from it assumes the latest 60.

And it should be noted - in some cases some boards have nearly 2M posts.
Index on board_name, number.

I cannot give out too too much stuff ;)

create or replace function get_index2(integer, varchar, varchar)
returns setof snippet
as '
DECLARE
    p_start  alias for $1;
    p_board  alias for $2;
    v_start  integer;
    v_num    integer;
    v_body   text;
    v_sender varchar(35);
    v_time   timestamptz;
    v_finish integer;
    v_row    record;
    v_ret    snippet;
BEGIN
    v_start := p_start;

    if v_start = 0 then
        select * into v_start from get_high_msg(p_board);
        v_start := v_start - 59;
    end if;

    v_finish := v_start + 60;

    for v_row in
        select number, substr(body, 0, 50) as snip, member_handle, timestamp
        from posts
        where board_name = p_board and
              number >= v_start and
              number < v_finish
        order by number desc
    LOOP
        return next v_row;
    END LOOP;

    return;
END;
' language 'plpgsql';


> Interesting (and surprising that the performance differential is that
> large, to me at least). Can you tell if the performance gain comes from
> an improvement in a particular subsystem? (i.e. could you get a profile
> of Sun/gcc and compare it with Sun/sunsoft).
>

I'll get these later today.

--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/



---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


[PERFORM] Presentation

2003-10-08 Thread Jeff
The boss cleared my de-company info-ified pg presentation.
It deals with PG features, crude comparison to other dbs, install, admin,
and most importantly - optimization & quirks.

It's available in powerpoint and (ugg) powerpoint-exported html.

Let me know if there are blatant errors, etc in there.
Maybe even slightly more subtle blatant errors :)

The people here thought it was good.

http://postgres.jefftrout.com/

--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/



---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] Sun performance - Major discovery!

2003-10-08 Thread Jeff
On Wed, 8 Oct 2003, Neil Conway wrote:

> Interesting (and surprising that the performance differential is that
> large, to me at least). Can you tell if the performance gain comes from
> an improvement in a particular subsystem? (i.e. could you get a profile
> of Sun/gcc and compare it with Sun/sunsoft).
>

Yeah - as I expected, it was able to generate much better code for
_bt_checkkeys, which was the #1 function in the gcc profiles on both sun & linux.

and as you can see, suncc was just able to generate much nicer code. I'd
look at the assembler output but that won't be useful since I am very
unfamiliar with the [ultra]sparc instruction set..


Here's the prof and gprof output for the latest run:
GCC:
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 31.52     19.44    19.44                             internal_mcount
 20.28     31.95    12.51  8199466     0.00     0.00  _bt_checkkeys
  5.61     35.41     3.46  8197422     0.00     0.00  _bt_step
  5.01     38.50     3.09 24738620     0.00     0.00  FunctionCall2
  3.00     40.35     1.85  8194186     0.00     0.00  varchareq
  2.61     41.96     1.61    24309     0.07     0.28  _bt_next
  2.42     43.45     1.49     1003     1.49     1.51  AtEOXact_Buffers
  2.37     44.91     1.46    12642     0.12     0.12  _read
  2.33     46.35     1.44 16517771     0.00     0.00  pg_detoast_datum
  2.08     47.63     1.28  8193186     0.00     0.00  int4lt
  1.35     48.46     0.83  8237204     0.00     0.00  BufferGetBlockNumber
  1.35     49.29     0.83  8193888     0.00     0.00  int4ge
  1.35     50.12     0.83                             _mcount


SunCC -pg -fast.
 %Time  Seconds  Cumsecs     #Calls  msec/call  Name
  23.2     4.27     4.27  108922056     0.       _mcount
  20.7     3.82     8.09    8304052     0.0005   _bt_checkkeys
  13.7     2.53    10.62   25054788     0.0001   FunctionCall2
   5.1     0.94    11.56      24002     0.0392   _bt_next
   4.4     0.81    12.37    8301867     0.0001   _bt_step
   3.4     0.63    13.00    8298219     0.0001   varchareq
   2.7     0.50    13.50   16726855     0.       pg_detoast_datum
   2.4     0.45    13.95    8342464     0.0001   BufferGetBlockNumber
   2.4     0.44    14.39    8297941     0.0001   int4ge
   2.2     0.41    14.80       1003     0.409    AtEOXact_Buffers
   2.0     0.37    15.17    4220349     0.0001   lc_collate_is_c
   2.0     0.37    15.54    8297219     0.       int4lt
   1.6     0.29    15.83      26537     0.0109   AllocSetContextCreate
   0.9     0.16    15.99       1887     0.085    pglz_decompress
   0.7     0.13    16.12     159966     0.0008   nocachegetattr
   0.7     0.13    16.25    4220349     0.       varstr_cmp
   0.6     0.11    16.36     937576     0.0001   MemoryContextAlloc
   0.5     0.09    16.45     150453     0.0006   hash_search





> -Neil
>
>
>
> ---(end of broadcast)---
> TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
>
>

--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/



---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] Presentation

2003-10-08 Thread Jeff
On Wed, 8 Oct 2003, Shridhar Daithankar wrote:


Thanks for the nitpicks :)

I've taken some into consideration.
I also signed onto the advocacy list so I can be in on discussions there.

Feel free to convert to whatever format you'd like. I originally started
working on it in OpenOffice, but I got mad at it. So I switched to
powerpoint and got mad at that too :)


--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/



---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Presentation

2003-10-08 Thread Jeff
On Wed, 8 Oct 2003, Shridhar Daithankar wrote:

> * Same slide. IIRC postgresql always compresses bytea/varchar. Not too much sure
> about which but there is something that is compressed by default..:-)

> * Tablespaces has a patch floating somewhere. IIRC Gavin Sherry is the one who
> is most ahead of it. For all goodness, they will feature in 7.5 and design is

For the sake of things, I didn't include any features a patch provides. I
did include things that may appear in contrib/.

> * Mysql transaction breaks down if tables from different table types are involved.
> * Mysql transactions do not feature constant time commit/rollback like
> postgresql. The time to rollback depends upon size of transaction
> * Mysql does not split large files in segments the way postgresql do. Try
> storing 60GB of data in single mysql table.

I didn't add these; the user can figure them out.
Perhaps when we/I expand this into multiple documents we can go into
more detail.

> * Slide on caching. Postgresql can use 7000MB of caching. Important part is it
> does not lock that memory in it's own process space. OS can move around buffer
> cache but not memory space of an application.

I'm guilty of this myself - when I first started with pg I was looking for
a way to make it use a zillion megs of memory like we have informix do.
Perhaps I'll reword that segment; the point was to show that PG relies on
the OS to do a lot of caching and that it doesn't do it all itself.

> * Using trigger for maintening a row count would generate as much dead rows as
> you wanted to avoid in first place..:-)

We all know this.. but it is a way to get a fast select count(*) from a
table.


> All of them are really minor. It's a very well done presentation but 45 slides
> could be bit too much at a time. I suggest splitting the presentation in 3.
> Intro and comparison, features, administration, programming and tuning. Wow..
> they are 5..:-)
>

Yeah. What I'd really love to do is de-powerpointify it and make it a nice
set of "real" web pages.


> Can you rip out informix migration? It could be a good guide by itself.
>

I agree. It would be good to rip out. I think we have the oracle guide
somewhere..


I've put the updated one up on the postgres.jefftrout.com site
along with an openoffice version.


--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/



---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] Sun performance - Major discovery!

2003-10-08 Thread Jeff
On Wed, 8 Oct 2003, Neil Conway wrote:

>
> What CFLAGS does configure pick for gcc? From
> src/backend/template/solaris, I'd guess it's not enabling any
> optimization. Is that the case? If so, some gcc numbers with -O and -O2
> would be useful.
>

I can't believe I didn't think of this before! heh.
Turns out gcc was getting nothing for flags.

I added -O2 to CFLAGS and my 60 seconds went down to 21.  A rather mild
improvement, huh?

I did a few more tests and suncc still beats it out - but not by too much
now (not enough to justify buying a license just for compiling pg).

I'll go run the regression test suite with my gcc -O2 pg and the suncc pg.
See if they pass the test.

If they do we should consider adding -O2 and -fast to the CFLAGS.

--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/



---(end of broadcast)---
TIP 8: explain analyze is your friend

