Re: [PERFORM] Has anyone run on the new G5 yet

2003-12-04 Thread Paul Tuckfield
(hope I'm posting this correctly)

You wrote:

>First question is do we gain anything by moving to the RH Enterprise
>version of Linux in terms of performance, mainly in the IO realm as we
>are not CPU bound at all? Second and more radical, has anyone run
>postgreSQL on the new Apple G5 with an XRaid system? This seems like a
>great value combination. Fast CPU, wide bus, Fibre Channel IO, 2.5TB
>all for ~17k.

Wow, funny coincidence:  I've  got a pair of dual xeons w. 8G + 14disk
fcal arrays, and an xserve with an XRaid that I've been screwing around
with.  If you have specific tests you'd like to see, let me know.

--- so, for the truly IO bound, here's my recent messin' around summary:

In the not-so-structured tests I've done, I've been disappointed with
Redhat AS 2.1 IO throughput.  I've had difficulty driving a lot of IO
thru my dual fcal channels:  I can only get one going at 60M/sec, and
when I drive IO to the second, I still see only about 60M/sec combined.
And when it does get that high, it uses about 30% CPU on a dual xeon
hyperthreaded box, all in sys (by vmstat).  Something is very wrong there,
and the only thing I can conclude is that I'm serializing in the driver
somehow (qla2200 driver), thus parallel channels do the same as one, and
interrupt madness drives the cpu up just to do this contentious IO.

This contrasts with the Redhat 9 I just installed on a similar box, that
got 170M/sec on 2 fcal channels, and the expected 5-6% cpu.

The above testing was dd straight from /dev/rawX devices, so no buffer
cache confusion there.  

Also had problems getting the Redhat AS to bind to my newer qla2300
adapters at all, whereas they bound fine under RH9.   

Redhat makes the claim of finer grained locks/semaphores in the qla and
AIC drivers in RH AS, but my tests seem to show that the 2 fcal ports
were serializing against each other in the kernel under RH AS, and not so
under RH9.  Maybe I'm using the wrong driver under AS, eh.

So, short story long, it seems like you're better off with RH9.  But again,
before you lay out serious coin for xserve or others, if you have
specific tests you want to see, I'll take a little time to contrast w.
xserve.  One of the xeons also has an aic7x scsi controller w 4 drives
so it might match your rig better.

I also did some token testing on the xserve I have which I believe may
only have one processor (how do you tell on osX?) and the xraid has 5
spindles in it.  I did a cursory build of postgres on it and also an IO
test (to the filesystem) and saw about 90M/sec.  Dunno if it has dual
paths (if you guys know how to tell, let me know)


Biggest problem I've had in the past w. linux in general is that it
seems to make poor VM choices under heavy filesystem IO.  I don't really
get exactly where it's going wrong, but I've had numerous experiences
on older systems where bursty IO would seem to cause paging on the box
(pageout of pieces of the oracle SGA shared memory), which is a
performance disaster.  It seems to happen even when the shared memory
was sized reasonably below the size of physical ram, presumably because
linux is too aggressive in allocating filesystem cache (?).  Anyway, it
seems to make decisions based on desire for zippy workstation
performance and gets burned on thruput on database servers.  I'm
guessing this may be an issue for you, when doing heavy IO.  Thing is,
it'll look like you're IO bound, kind of, because you're thrashing.


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Minimum hardware requirements for Postgresql db

2003-12-04 Thread Christopher Browne
[EMAIL PROTECTED] wrote:
> Sorry for my mistake on the 15000 recs per day.

It was useful for us to pick at that a bit; it was certainly looking a
mite suspicious.

> In fact, this server is planned as an OLTP database server for a retailer.
> Our intention is either to setup 1 or 2 Postgresql db in the server.
>
> The proper sizing info for the 1st Postgresql db should be:
>
> No. of item master : 200,000
> (This item master grows at 0.5% daily).
>
> No. of transactions from Point-of-Sales machines: 25,000

> Plus other tables, the total sizing that I estimated is 590,000
> records daily.

So that's more like 7 TPS (590,000 records over an 86,400-second day is
about 6.8/sec), with, more than likely, a peak load several times that.

> The 2nd Postgresql db will be used by end users on client machines linked
> via ODBC, doing manual data entry.
> This will house the item master, loyalty card master and other Historical
> data to be kept for at least 1.5 years.
>
> Therefore total sizing for this db is around 165,000,000 recs at any time.

FYI, it is useful to plan for purging the old data from the very
beginning; if you don't, things can get ugly :-(.

> In summary, the single machine must be able to take up around 100
> users connections via both socket and ODBC. And house the above
> number of records.

Based on multiplying the load by 40, we certainly move from
"pedestrian hardware where anything will do" to something requiring
more exotic hardware.   

- You _definitely_ want a disk array, with a bunch of SCSI disks.

- You _definitely_ will want some form of RAID controller with
  battery-backed cache.

- You probably want multiple CPUs.

- You almost certainly will want a second (and maybe third) complete
  redundant system that you replicate data to.

- The thing that will have _wild_ effects on whether this is enough,
  or whether you need to go for something even _more_ exotic
  (e.g. - moving to big iron UNIX(tm), whether that be Solaris,
  AIX, or HP/UX) is the issue of how heavily the main database gets
  hit by queries.

  If "all" it is doing is consolidating transactions, and there is
  little query load from the POS systems, that is a very different
  level of load from what happens if it is also servicing pricing
  queries.

  Performance will get _destroyed_, regardless of how heavy the iron
  is, if you hit the OLTP system with a lot of transaction reports.
  You'll want a secondary replicated system to draw that load off.

Evaluating whether it needs to be "big" hardware or "really enormous"
hardware is not realistic based on what you have said.  There are
_big_ variations possible based notably on:

 1. What kind of query load does the OLTP server have to serve up?

If the answer is "lots," then everything gets more expensive.

 2. How was the database schema and the usage of the clients designed?

How well it is done will have a _large_ impact on how many TPS the
system can cope with.

You'll surely need to do some prototyping, and be open to
possibilities such as that you'll need to consider alternative OSes.
On Intel/AMD hardware, it may be worth considering FreeBSD; it may
also be needful to consider "official UNIX(tm)" hardware.  It would be
unrealistic to pretend more certainty...
-- 
(reverse (concatenate 'string "ac.notelrac.teneerf" "@" "454aa"))
http://www.ntlug.org/~cbbrowne/nonrdbms.html
"Being really good at C++ is  like being really good at using rocks to
sharpen sticks."  -- Thant Tessman

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


[PERFORM] tuning questions

2003-12-04 Thread Jack Coates
Hi,

sorry for duplication, I asked this on pgsql-admin first before
realizing it wasn't the appropriate list.

I'm having trouble optimizing PostgreSQL for an admittedly heinous
worst-case scenario load.

testbed:
dual P3 1.3 GHz box with 2GB RAM
two IDE 120G drives on separate channels (DMA on), OS on one, DB on the
other, some swap on each (totalling 2.8G).
RH Linux 8.

I've installed PG 7.3.4 from source (./configure && make && make
install) and from PGDG RPMs and can switch back and forth. I also have
the 7.4 source but haven't done any testing with it yet aside from
starting it and importing some data.

The application is on another server, and does this torture test: it
builds a large table (~6 million rows in one test, ~18 million in
another). Rows are then pulled in chunks of 4 to 6 thousand, acted on,
and inserted back into another table (which will of course eventually
grow to the full size of the first).

The problem is that pulling the 4 to 6 thousand rows puts PostgreSQL
into a tail spin: postmaster hammers on CPU anywhere from 90 seconds to
five minutes before returning the data. During this time vmstat shows
that disk activity is up of course, but it doesn't appear to be page
swapping (according to free, top, and vmstat).

Another problem is that performance of the 6 million row job is decent
if I stop the job and run a vacuumdb --analyze before letting it
continue; is this something that 7.4 will help with? vacuumdb --analyze
doesn't seem to have much effect on the 18 million row job.

I've tweaked shared buffers to 8192, pushed sort memory to 2048, vacuum
memory to 8192, and effective cache size to 1.
/proc/sys/kernel/shmmax is set to 16 and /proc/sys/fs/file-max
is set to 65536. Ulimit -n 3192.

I've read several sites and postings on tuning PG and have tried a
number of different theories, but I'm still not getting the architecture
of how things work.

thanks,
-- 
Jack Coates, Lyris Technologies Applications Engineer
510-549-4350 x148, [EMAIL PROTECTED]
"Interoperability is the keyword, uniformity is a dead end."
--Olivier Fourdan



---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] autovacuum daemon stops doing work after about an hour

2003-12-04 Thread Vivek Khera
> "MTO" == Matthew T O'Connor <[EMAIL PROTECTED]> writes:

>> Then it just sits there.  I started it at 11:35am, and it is now
>> 3:30pm.

MTO> Weird. Alphabetically speaking, is vkmlm."public"."user_list" the
MTO> last table in the last schema in the last database?  You are running

conveniently, yes it is...

MTO> with -d4, so you would get a message about going to sleep shortly after
MTO> dealing with the last table, but you didn't get the sleep message, so I
MTO> don't think the problem is that pg_autovacuum is sleeping for an
MTO> inordinate amount time.

The only sleep logged was

[2003-12-03 04:47:13 PM] 1 All DBs checked in: 84996853 usec, will sleep for 469 secs.


Here's all it did on yesterday afternoon's "hour of work":

[2003-12-03 04:45:48 PM] Performing: ANALYZE "public"."url_track"
[2003-12-03 04:46:27 PM] Performing: ANALYZE "public"."msg_recipients"
[2003-12-03 04:46:55 PM] Performing: ANALYZE "public"."deliveries"
[2003-12-03 04:46:55 PM] Performing: ANALYZE "public"."user_list"
[2003-12-03 04:47:12 PM] Performing: ANALYZE "public"."sessions"
[2003-12-03 04:55:02 PM] Performing: ANALYZE "public"."url_track"
[2003-12-03 04:55:22 PM] Performing: VACUUM ANALYZE "public"."msg_recipients"
[2003-12-03 05:40:11 PM] Performing: VACUUM ANALYZE "public"."user_list"

then 18 minutes later, it reported:

[2003-12-03 05:58:25 PM] select relfilenode,reltuples,relpages from pg_class where relfilenode=18588202
[2003-12-03 05:58:25 PM]   table name: vkmlm."public"."user_list"
[2003-12-03 05:58:25 PM]  relfilenode: 18588202;   relisshared: 0
[2003-12-03 05:58:25 PM]  reltuples: 9;  relpages: 427920
[2003-12-03 05:58:25 PM]  curr_analyze_count:  2559236; cur_delete_count:   2475824
[2003-12-03 05:58:25 PM]  ins_at_last_analyze: 2559236; del_at_last_vacuum: 2475824
[2003-12-03 05:58:25 PM]  insert_threshold: 509; delete_threshold: 1001

and stopped doing anything.


MTO> when you kill it, do you get a core file?  Could you do a backtrace and
MTO> see where pg_autovacuum is hung up?

nope.  unfortunately my PG libs are without debugging, too.  I'll
rebuild pg_autovacuum with debugging and run it under gdb so I can see
where it gets stuck.

I'll report back when I find something.  I just wanted to check first
if anyone else ran into this.

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] tuning questions

2003-12-04 Thread Josh Berkus
Jack,

> The application is on another server, and does this torture test: it
> builds a large table (~6 million rows in one test, ~18 million in
> another). Rows are then pulled in chunks of 4 to 6 thousand, acted on,
> and inserted back into another table (which will of course eventually
> grow to the full size of the first).

> I've tweaked shared buffers to 8192, pushed sort memory to 2048, vacuum
> memory to 8192, and effective cache size to 1.
> /proc/sys/kernel/shmmax is set to 16 and /proc/sys/fs/file-max
> is set to 65536. Ulimit -n 3192.

Have you read this?
http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html

Actually, your situation is not "worst case".  For one thing, your process is 
effectively single-user; this allows you to throw all of your resources at 
one user.  The problem is that your settings have effectively throttled PG 
at a level appropriate to a many-user and/or multi-purpose system.  You need 
to "open them up".

For something involving massive updating/transformation like this, once you've 
done the basics (see that URL above) the main settings which will affect you 
are sort_mem and checkpoint_segments, both of which I'd advise jacking way up 
(test by increments).  Raising wal_buffers wouldn't hurt either.
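
For instance, as ballpark starting values only (these exact numbers are 
guesses to test against, not measured recommendations):

   sort_mem = 32768            # KB per in-memory sort
   checkpoint_segments = 16    # the default of 3 checkpoints very frequently
   wal_buffers = 64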

Also, give some thought to running VACUUM and/or ANALYZE between segments of 
your procedure.  Particularly if you do updates to many rows of a table and 
then query based on the changed data, it is vital to run an ANALYZE first, 
and usually a good idea to run a VACUUM if it was an UPDATE or DELETE and not 
an INSERT.
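
Something like this between steps, where "worktable" stands in for whatever 
table the step just modified:

   UPDATE worktable SET status = 'transformed';   -- the big batch step
   VACUUM ANALYZE worktable;   -- reclaim dead rows, refresh planner stats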

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] tuning questions

2003-12-04 Thread Jeff
On Thu, 04 Dec 2003 08:06:23 -0800
Jack Coates <[EMAIL PROTECTED]> wrote:

> testbed:
> dual P3 1.3 GHz box with 2GB RAM
> two IDE 120G drives on separate channels (DMA on), OS on one, DB on
> the other, some swap on each (totalling 2.8G).
> RH Linux 8.

Side Note: be sure to turn off write caching on those disks or you may
have data corruption in the event of a failure

> The problem is that pulling the 4 to 6 thousand rows puts PostgreSQL
> into a tail spin: postmaster hammers on CPU anywhere from 90 seconds
> to five minutes before returning the data. During this time vmstat
> shows that disk activity is up of course, but it doesn't appear to be
> with page swapping (free and top and vmstat).
> 
Have you tried modifying the app to retrieve the rows in smaller chunks
(use a cursor)? This way it only needs to allocate memory to hold, say,
100 rows at a time instead of 6000. 
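
Roughly like this (names made up; cursors have to live inside a
transaction):

   BEGIN;
   DECLARE c CURSOR FOR SELECT * FROM bigtable WHERE status = 'new';
   FETCH 100 FROM c;   -- repeat until it returns no rows
   CLOSE c;
   COMMIT;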

Also, have you explain analyze'd your queries to make sure PG is picking
a good plan to execute?

> I've tweaked shared buffers to 8192, pushed sort memory to 2048,
> vacuum memory to 8192, and effective cache size to 1.
> /proc/sys/kernel/shmmax is set to 16 and /proc/sys/fs/file-max
> is set to 65536. Ulimit -n 3192.

You should set effective cache size bigger, especially with 2GB of
memory. effective_cache_size tells PG 'about' how much data it can
expect the OS to cache.  

and.. I'm not sure about your query, but perhaps the sort of those 6000
rows is spilling to disk?  If you look in explain analyze you'll see in
the "Sort" step(s) it will tell you how many rows and how "wide" they
are.  If rows * width > sort_mem, it will have to spill the sort to
disk, which is slow.
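
For example (query and numbers made up; sort_mem is measured in KB):

   EXPLAIN ANALYZE SELECT * FROM bigtable ORDER BY somecol;
   -- if rows * width in the Sort step exceeds sort_mem, it spilled;
   -- you can test with a larger per-session value:
   SET sort_mem = 32768;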

If you post query info and explain analyze's we can help optimize the
query itself.


-- 
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] tuning questions

2003-12-04 Thread Rob Fielding

> I've tweaked shared buffers to 8192, pushed sort memory to 2048, vacuum
> memory to 8192, and effective cache size to 1.
> /proc/sys/kernel/shmmax is set to 16 and /proc/sys/fs/file-max
> is set to 65536. Ulimit -n 3192.

Your shared memory is too high, and not even being used effectively. Your 
other settings are too low.

Ball park guessing here, but I'd say first read (and understand) this:

http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html

Then make shared memory about 10-20% of available ram, and set:

((shmmax/1024) - ( 14.2 * max_connections ) - 250 ) / 8.2 = shared_buffers

decrease random_page_cost to 0.3 and whack up sort mem by 16 times, 
effective cache size to about 50% RAM (depending on your other settings) 
and try that for starters.

--

Rob Fielding
[EMAIL PROTECTED]
www.dsvr.co.uk  Development Designer Servers Ltd

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] Has anyone run on the new G5 yet

2003-12-04 Thread William Yu
Sean Shanny wrote:
> First question is do we gain anything by moving to the RH Enterprise 
> version of Linux in terms of performance, mainly in the IO realm as we 
> are not CPU bound at all?  Second and more radical, has anyone run 
> postgreSQL on the new Apple G5 with an XRaid system?  This seems like a 
> great value combination.  Fast CPU, wide bus, Fibre Channel IO, 2.5TB 
> all for ~17k.

Seems like a great value but until Apple produces a G5 that supports 
ECC, I'd pass on them.

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] tuning questions

2003-12-04 Thread scott.marlowe
On Thu, 4 Dec 2003, Jack Coates wrote:

> Another problem is that performance of the 6 million row job is decent
> if I stop the job and run a vacuumdb --analyze before letting it
> continue; is this something that 7.4 will help with? vacuumdb --analyze
> doesn't seem to have much effect on the 18 million row job.

Just to add to what the others have said here, you probably want to run 
the pg_autovacuum daemon in the background.  It comes with 7.4 but will 
work fine with 7.3.  



---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] tuning questions

2003-12-04 Thread Dror Matalon
On Thu, Dec 04, 2003 at 11:59:32AM -0500, Jeff wrote:
> On Thu, 04 Dec 2003 08:06:23 -0800
> Jack Coates <[EMAIL PROTECTED]> wrote:
> 
> > testbed:
> > dual P3 1.3 GHz box with 2GB RAM
> > two IDE 120G drives on separate channels (DMA on), OS on one, DB on
> > the other, some swap on each (totalling 2.8G).
> > RH Linux 8.
> 
> Side Note: be sure to turn off write caching on those disks or you may
> have data corruption in the event of a failure

I've seen this comment several times from different people.
Would someone care to explain how you would get data corruption? I
thought that the whole idea of the log is to provide a journal similar
to what you get in a journaling file system. 

In other words, the db writes a series of transactions to the log and marks 
that "log entry" (don't know the right nomenclature) as valid. When the db
crashes, it reads the log, and discards the last "log entry" if it wasn't
marked as valid, and "replays" any transactions that haven't been
committed to the db. The end result being that you might lose your last
transaction(s) if the db crashes, but nothing ever gets corrupted.

So what am I missing in this picture?

Regards,

Dror

-- 
Dror Matalon
Zapatec Inc 
1700 MLK Way
Berkeley, CA 94709
http://www.fastbuzz.com
http://www.zapatec.com

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] tuning questions

2003-12-04 Thread Jord Tanner
If I understand the problem correctly, the issue is that IDE drives
signal that data has been written to disk when they actually are holding
the data in the write cache. In the case of a power down (and I remember
someone showing some test results confirming this, check the list
archive) the data in the drive write cache is lost, resulting in
corrupted logs. 
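
(FWIW, on Linux, hdparm -W0 /dev/hda is the usual way to turn an IDE
drive's write cache off.)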

Anyone else have more details?

Jord Tanner

On Thu, 2003-12-04 at 09:57, Dror Matalon wrote:
> On Thu, Dec 04, 2003 at 11:59:32AM -0500, Jeff wrote:
> > On Thu, 04 Dec 2003 08:06:23 -0800
> > Jack Coates <[EMAIL PROTECTED]> wrote:
> > 
> > > testbed:
> > > dual P3 1.3 GHz box with 2GB RAM
> > > two IDE 120G drives on separate channels (DMA on), OS on one, DB on
> > > the other, some swap on each (totalling 2.8G).
> > > RH Linux 8.
> > 
> > Side Note: be sure to turn off write caching on those disks or you may
> > have data corruption in the event of a failure
> 
> I've seen this comment several times from different people.
> Would someone care to explain how you would get data corruption? I
> thought that the whole idea of the log is to provide a journal similar
> to what you get in a journaling file system. 
> 
> In other words, the db writes a series of transactions to the log and marks 
> that "log entry" (don't know the right nomenclature) as valid. When the db
> crashes, it reads the log, and discards the last "log entry" if it wasn't
> marked as valid, and "replays" any transactions that haven't been
> committed to the db. The end result being that you might lose your last
> transaction(s) if the db crashes, but nothing ever gets corrupted.
> 
> So what am I missing in this picture?
> 
> Regards,
> 
> Dror
-- 
Jord Tanner <[EMAIL PROTECTED]>


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] tuning questions

2003-12-04 Thread Andrew Sullivan
On Thu, Dec 04, 2003 at 09:57:38AM -0800, Dror Matalon wrote:
> 
> I've seen this comment several times from different people.
> Would someone care to explain how you would get data corruption? I
> thought that the whole idea of the log is to provide a journal similar
> to what you get in a journaling file system. 

> So what am I missing in this picture?

That a journalling file system can _also_ have file corruption if you
have write caching enabled and no battery back up.  If the drive
tells the OS, "Yep!  It's all on the disk!" but it is _not_ actually
scribed in the little bitty magnetic patterns -- and at that very
moment, the power goes away -- the data that was reported to have been
on the disk, but which was actually _not_ on the disk, is no longer
anywhere.  (Well, except in the past.  But time travel was disabled
some versions ago. ;-)

A

-- 

Andrew Sullivan 204-4141 Yonge Street
Afilias CanadaToronto, Ontario Canada
<[EMAIL PROTECTED]>  M2P 2A8
 +1 416 646 3304 x110


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] tuning questions

2003-12-04 Thread Josh Berkus
Scott,

> Just to add to what the others have said here, you probably want to run 
> the pg_autovacuum daemon in the background.  It comes with 7.4 but will 
> work fine with 7.3.  

I don't recommend using pg_autovacuum with a data transformation task.   pg_av 
is designed for "regular use" not huge batch tasks.

-- 
-Josh Berkus
 Aglio Database Solutions
 San Francisco


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [PERFORM] tuning questions

2003-12-04 Thread scott.marlowe
On Thu, 4 Dec 2003, Josh Berkus wrote:

> Scott,
> 
> > Just to add to what the others have said here, you probably want to run 
> > the pg_autovacuum daemon in the background.  It comes with 7.4 but will 
> > work fine with 7.3.  
> 
> I don't recommend using pg_autovacuum with a data transformation task.   pg_av 
> is designed for "regular use" not huge batch tasks.

What bad thing is likely to happen if it's used here?  Fire too often or 
use too much I/O bandwidth?  Would that be fixed by the patch being tested 
to introduce a delay every x pages of vacuuming?


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] autovacuum daemon stops doing work after about an hour

2003-12-04 Thread Gaetano Mendola
Vivek Khera wrote:

"MTO" == Matthew T O'Connor <[EMAIL PROTECTED]> writes:


Then it just sits there.  I started it at 11:35am, and it is now
3:30pm.


MTO> Weird. Alphabetically speaking, is vkmlm."public"."user_list" the
MTO> last table in the last schema in the last database?  You are running
conveniently, yes it is...

MTO> with -d4, so you would get a message about going to sleep shortly after
MTO> dealing with the last table, but you didn't get the sleep message, so I
MTO> don't think the problem is that pg_autovacuum is sleeping for an
MTO> inordinate amount time.
The only sleep logged was

[2003-12-03 04:47:13 PM] 1 All DBs checked in: 84996853 usec, will sleep for 469 secs.
What I've seen is:

# tail -f auto.log
[2003-12-04 07:10:18 PM] reltuples: 72;  relpages: 1
[2003-12-04 07:10:18 PM] curr_analyze_count:  72; cur_delete_count:   0
[2003-12-04 07:10:18 PM] ins_at_last_analyze: 72; del_at_last_vacuum: 0
[2003-12-04 07:10:18 PM] insert_threshold: 572; delete_threshold: 536
[2003-12-04 07:10:18 PM] table name: empdb."public"."contracts"
[2003-12-04 07:10:18 PM] relfilenode: 17784;   relisshared: 0
[2003-12-04 07:10:18 PM] reltuples: 347;  relpages: 5
[2003-12-04 07:10:18 PM] curr_analyze_count:  347; cur_delete_count:   0
[2003-12-04 07:10:18 PM] ins_at_last_analyze: 347; del_at_last_vacuum: 0
[2003-12-04 07:10:18 PM] insert_threshold: 847; delete_threshold: 673
[ 5 minutes of delay ]  <- LOOK AT THIS

[2003-12-04 07:10:18 PM] 503 All DBs checked in: 179396 usec, will sleep for 300 secs.
[2003-12-04 07:15:19 PM] 504 All DBs checked in: 98814 usec, will sleep for 300 secs.

I think it is a good idea to put an fflush after:

fprintf(LOGOUTPUT, "[%s] %s\n", timebuffer, logentry);

Regards
Gaetano Mendola


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


[PERFORM] Slow UPDATE, compared to INSERT

2003-12-04 Thread Ivar Zarans
Hello!

I am relative newcomer to SQL and PostgreSQL world, so please forgive me
if this question is stupid.

I am experiencing strange behaviour, where simple UPDATE of one field is
very slow, compared to INSERT into table with multiple indexes. I have
two tables - one with raw data records (about 24000), where one field
contains status information (varchar(10)). First table has no indexes,
only primary key (recid). Second table contains processed records - some
fields are same as first table, others are calculated during processing.
Records are processed by Python script, which uses PyPgSQL for PostgreSQL
access.

Processing is done by selecting all records from table1 where status
matches certain criteria (import). Each record is processed and results
are inserted into table2; after inserting, the status field of the same
record in table1 is updated with the new value (done). The update statement
itself is extremely simple: "update table1 set status = 'done' where recid = ..."

Most interesting is that the insert takes 0.004 seconds on average, but
the update takes 0.255 seconds on average. Processing of 24000 records took
around 1 hour 20 minutes.

Then I changed the processing logic not to update every record in table1
after processing. Instead I inserted the recid value into a temporary table
and updated the records in table1 after all records were processed and
inserted into table2:
UPDATE table1 SET Status = 'done' WHERE recid IN (SELECT recid FROM temptable)

This way I got the processing time of 24000 records down to about 16
minutes, of which the final UPDATE statement took about 13 minutes.
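
For reference, the pattern looks roughly like this (names simplified):

   CREATE TEMP TABLE temptable (recid BIGINT);
   -- ... one INSERT INTO temptable per processed record ...
   UPDATE table1 SET status = 'done'
  WHERE recid IN (SELECT recid FROM temptable);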

Why is UPDATE so slow compared to INSERT? I would expect more or less
similar performance, or slower on insert since table2 has four indexes
in addition to its primary key, while table1 has only the primary key,
which is used on update. Am I doing something wrong or is this normal?

I am using PostgreSQL 7.3.4, Debian/GNU Linux 3.0 (Woody),
kernel 2.4.21, Python 2.3.2, PyPgSQL 2.4

-- 
Ivar Zarans


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] tuning questions

2003-12-04 Thread Jack Coates
On Thu, 2003-12-04 at 09:12, Rob Fielding wrote:
> > 
> > I've tweaked shared buffers to 8192, pushed sort memory to 2048, vacuum
> > memory to 8192, and effective cache size to 1.
> > /proc/sys/kernel/shmmax is set to 16 and /proc/sys/fs/file-max
> > is set to 65536. Ulimit -n 3192.
> 
> Your shared memory is too high, and not even being used effectively. Your 
> other settings are too low.
> 
> Ball park guessing here, but I'd say first read (and understand) this:
> 
> http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html

I've read it many times, understanding is slower :-)

> 
> Then make shared memory about 10-20% of available ram, and set:
> 
> ((shmmax/1024) - ( 14.2 * max_connections ) - 250 ) / 8.2 = shared_buffers
> 
> decrease random_page_cost to 0.3 and whack up sort mem by 16 times, 
> effective cache size to about 50% RAM (depending on your other settings) 
> and try that for starters.

Following this, I've done:

   2 GB RAM                    =  2,000,000,000 bytes
   15% of that                 =    300,000,000 bytes
   divided by 1024             =        292,969 kbytes
   max_connections * 14.2      =            454 kbytes
   subtract that               =        292,514 kbytes
   subtract 250                =        292,264 kbytes
   divide by 8.2               =         35,642 shared_buffers

performance is unchanged for the 18M job -- pg continues to use ~
285-300M, system load and memory usage stay the same. I killed that,
deleted from the affected tables, inserted a 6M job, and started a
vacuumdb --analyze. It's been running for 20 minutes now...

getting the SQL query better optimized for PG is on my todo list, but
not something I can do right now -- this application is designed to be
cross-platform with MS-SQL, PG, and Oracle so tweaking SQL is a touchy
subject.

The pgavd conversation is intriguing, but I don't really understand the
role of vacuuming. Would this be a correct statement: "PG needs to
regularly re-evaluate the database in order to adjust itself?" I'm
imagining that it continues to treat the table as a small one until
vacuum informs it that the table is now large?
-- 
Jack Coates, Lyris Technologies Applications Engineer
510-549-4350 x148, [EMAIL PROTECTED]
"Interoperability is the keyword, uniformity is a dead end."
--Olivier Fourdan



---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] tuning questions

2003-12-04 Thread Eric Soroos
On Dec 4, 2003, at 10:11 AM, Andrew Sullivan wrote:

On Thu, Dec 04, 2003 at 09:57:38AM -0800, Dror Matalon wrote:
I've seen this comment several times from different people.
Would someone care to explain how you would get data corruption? I
thought that the whole idea of the log is to provide a journal similar
to what you get in a journaling file system.

So what am I missing in this picture?
That a journalling file system can _also_ have file corruption if you
have write caching enabled and no battery back up.  If the drive
tells the OS, "Yep!  It's all on the disk!" but it is _not_ actually
scribed in the little bitty magnetic patterns -- and at that very
moment, the power goes away -- the data that was reported to have been
on the disk, but which was actually _not_ on the disk, is no longer
anywhere.  (Well, except in the past.  But time travel was disabled
some versions ago. ;-)
It's not just a theoretical problem.  It's happened to me on a laptop 
drive in the last week or so.

I was testing out dbmail by hammering on it on a Panther laptop, hfs+ 
journaling enabled, PostgreSQL 7.4, latest and greatest.  I managed to hang 
the system hard, requiring a reboot. Postgres wouldn't start after the 
crash, complaining of a damaged relation and helpfully telling me that 
'you may need to restore from backup'.

No big deal on the data loss, since it was a test/hammering 
installation. It would have been nice to be able to drop that relation 
or prune the entire database, but I'm sure that would ultimately run 
into referential integrity problems.

eric



---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] Slow UPDATE, compared to INSERT

2003-12-04 Thread Jeff
On Thu, 4 Dec 2003 20:57:51 +0200
Ivar Zarans <[EMAIL PROTECTED]> wrote:

> table1 is updated with new value (done). Update statement itself is
> extremely simple: "update table1 set status = 'done' where recid =
> ..."
> 
> Most interesting is that the insert takes 0.004 seconds on average, but
> the update takes 0.255 seconds on average. Processing of 24000 records
> took around 1 hour 20 minutes.

Do you have an index on recid?

and did you vacuum analyze after you loaded up the data?

> 
> Then I changed the processing logic not to update every record in table1
> after processing. Instead I inserted the recid value into a temporary
> table and updated records in table1 after all records were processed
> and inserted into table2:
> UPDATE table1 SET Status = 'done' WHERE recid IN (SELECT recid FROM
> temptable)
> 

"IN" queries are terribly slow on versions before 7.4

> Why is UPDATE so slow compared to INSERT? I would expect more or less
> similar performance, or slower on insert since table2 has four indexes
> in addition to primary key, table1 has only primary key, which is used
> on update. Am I doing something wrong or is this normal?
> 

Remember, UPDATE has to do all the work of select and more. 

And if you have 4 indexes, those will also add to the time (since it has
to update/add them to the tree).

-- 
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] tuning questions

2003-12-04 Thread Josh Berkus
Jack,

> Following this, I've done:
> 2 GB RAM = 2,000,000,000 bytes
> ...

This calculation is fun, but I really don't know where you got it from.   It 
seems quite baroque.  What are you trying to set, exactly?

> getting the SQL query better optimized for PG is on my todo list, but
> not something I can do right now -- this application is designed to be
> cross-platform with MS-SQL, PG, and Oracle so tweaking SQL is a touchy
> subject.

Well, if your queries are screwed up, no amount of .conf optimization is 
going to help you much. You could criticize that PG is less adept than 
some other systems at re-writing "bad queries", and you would be correct.  
However, there's not much to do about that on existing systems.

How about posting some sample code?

> The pgavd conversation is intriguing, but I don't really understand the
> role of vacuuming. Would this be a correct statement: "PG needs to
> regularly re-evaluate the database in order to adjust itself?" I'm
> imagining that it continues to treat the table as a small one until
> vacuum informs it that the table is now large?

Not Vacuum, Analyze.  Otherwise correct.  Mind you, in "regular use" where 
only a small % of the table changes per hour, periodic ANALYZE is fine.  
However, in "batch data transform" analyze statements need to be keyed to the 
updates and/or imports.

BTW, I sent a couple of e-mails to the Lyris documentation maintainer about 
updating out-of-date information about setting up PostgreSQL.   I never got a 
response, and I don't think my changes were made.

-- 
-Josh Berkus
 Aglio Database Solutions
 San Francisco


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] autovacuum daemon stops doing work after about an hour

2003-12-04 Thread Vivek Khera
> "MTO" == Matthew T O'Connor <[EMAIL PROTECTED]> writes:

>> I'm running Postgres 7.4 release on FreeBSD 4.9-RELEASE.

MTO> I don't run FreeBSD, so I haven't tested with FreeBSD.  Recently Craig
MTO> Boston reported and submitted a patch for a crash on FreeBSD, but that
MTO> doesn't sound like your problem.  Could be some other type of platform
MTO> dependent problem. 

Oh lucky me.

I think I found it.  I compiled with -g -O and ran it under gdb, so
the output is line buffered.  The last thing it prints out now is
this:

[2003-12-04 02:11:17 PM] 3 All DBs checked in: -786419782 usec, will sleep for -1272 secs.

since sleep() takes an unsigned int as its parameter, we are actually
sleeping for 4294966024 seconds == 136 years.

I recall reading about the negative time to test the dbs
somewhere... I guess I'll get on debugging that.  The time keeper in
this box is pretty darned accurate otherwise (using ntpd).

-- 
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Vivek Khera, Ph.D.Khera Communications, Inc.
Internet: [EMAIL PROTECTED]   Rockville, MD   +1-240-453-8497
AIM: vivekkhera Y!: vivek_khera   http://www.khera.org/~vivek/

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] tuning questions

2003-12-04 Thread Jack Coates
On Thu, 2003-12-04 at 11:20, Josh Berkus wrote:
> Jack,
> 
> > Following this, I've done:
> > 2 GB RAM = 2,000,000,000 bytes
> > ...
> 
> This calculation is fun, but I really don't know where you got it from.   It 
> seems quite baroque.  What are you trying to set, exactly?
Message-ID:  <[EMAIL PROTECTED]>
Date: Thu, 04 Dec 2003 17:12:11 +
From: Rob Fielding <[EMAIL PROTECTED]>

I'm trying to set Postgres's shared memory usage in a fashion that
allows it to return requested results quickly. Unfortunately, none of
these changes allow PG to use more than a little under 300M RAM.
vacuumdb --analyze is now taking an inordinate amount of time as well
(40 minutes and counting), so that change needs to be rolled back.

> 
> > getting the SQL query better optimized for PG is on my todo list, but
> > not something I can do right now -- this application is designed to be
> > cross-platform with MS-SQL, PG, and Oracle so tweaking SQL is a touchy
> > subject.
> 
> Well, if your queries are screwed up, no amount of .conf optimization is 
> going to help you much. You could criticize that PG is less adept than 
> some other systems at re-writing "bad queries", and you would be correct.  
> However, there's not much to do about that on existing systems.
> 
> How about posting some sample code?

Tracking that down in CVS and translating from C++ is going to take a
while -- is there a way to get PG to log the queries it's receiving?

> 
> > The pgavd conversation is intriguing, but I don't really understand the
> > role of vacuuming. Would this be a correct statement: "PG needs to
> > regularly re-evaluate the database in order to adjust itself?" I'm
> > imagining that it continues to treat the table as a small one until
> > vacuum informs it that the table is now large?
> 
> Not Vacuum, Analyze.  Otherwise correct.  Mind you, in "regular use" where 
> only a small % of the table changes per hour, periodic ANALYZE is fine.  
> However, in "batch data transform" analyze statements need to be keyed to the 
> updates and/or imports.
> 
> BTW, I sent a couple of e-mails to the Lyris documentation maintainer about 
> updating out-of-date information about setting up PostgreSQL.   I never got a 
> response, and I don't think my changes were made.

She sits on the other side of the cube wall from me, and if I find a
decent config it's going into the manual -- consider this a golden
opportunity :-)

-- 
Jack Coates, Lyris Technologies Applications Engineer
510-549-4350 x148, [EMAIL PROTECTED]
"Interoperability is the keyword, uniformity is a dead end."
--Olivier Fourdan



---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Slow UPDATE, compared to INSERT

2003-12-04 Thread Ivar Zarans
On Thu, Dec 04, 2003 at 02:23:20PM -0500, Jeff wrote:

> > Most interesting is that the insert takes 0.004 seconds on average, but
> > the update takes 0.255 seconds on average. Processing of 24000 records
> > took around 1 hour 20 minutes.
> 
> Do you have an index on recid?

Yes, this is primary key of table1

> and did you vacuum analyze after you loaded up the data?

No, this is running as a nightly cronjob. All tests were done during one
day, so no vacuum was done.

> "IN" queries are terribly slow on versions before 7.4

OK, this is useful to know :)

> > Why is UPDATE so slow compared to INSERT? I would expect more or less
> > similar performance, or slower on insert since table2 has four indexes
> > in addition to primary key, table1 has only primary key, which is used
> > on update. Am I doing something wrong or is this normal?

> Remember, UPDATE has to do all the work of select and more. 
> 
> And if you have 4 indexes those will also add to the time (Since it has
> to update/add them to the tree)

My primary concern is the performance difference between INSERT and UPDATE
in my first tests. There I did a select from table1, fetched a record,
processed it, and inserted the result into table2, then updated the status
of the fetched record in table1, repeating the cycle as long as the fetch
returned a record. The average time for INSERT was 0.004 seconds; the
average time for UPDATE was 0.255 seconds. The update was done as "update
table1 set status = 'done' where recid = ". As far as I understand, this
type of simple update should be faster compared to an INSERT into a table
with four indexes, but in my case it is more than 60 times slower. Why??

My second tests were done with a temporary table and an update query of: 
"UPDATE table1 SET Status = 'done' WHERE recid IN (SELECT recid FROM
temptable)". It is still slower than INSERT, but more or less
acceptable. Compared to my first tests overall processing time dropped
from 1 hour and 20 minutes to 16 minutes.

So, my question remains - why is a simple update more than 60 times
slower compared to an INSERT?  Any ideas?

-- 
Ivar Zarans

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] autovacuum daemon stops doing work after about an hour

2003-12-04 Thread Vivek Khera
> "MTO" == Matthew T O'Connor <[EMAIL PROTECTED]> writes:

MTO> I don't run FreeBSD, so I haven't tested with FreeBSD.  Recently Craig
MTO> Boston reported and submitted a patch for a crash on FreeBSD, but that

some more debugging data:

(gdb) print now
$2 = {tv_sec = 1070565077, tv_usec = 216477}
(gdb) print then
$3 = {tv_sec = 1070561568, tv_usec = 668963}
(gdb) print diff
$4 = -5459981371352
(gdb) print sleep_secs
$5 = -1272

so for some reason, instead of calculating 3508547514 as the diff, it
got a hugely negative number.

I'll bet it has something to do with the compiler...  more debugging
to follow (without -O compilation...)



MTO> ---(end of broadcast)---
MTO> TIP 7: don't forget to increase your free space map settings

-- 
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Vivek Khera, Ph.D.Khera Communications, Inc.
Internet: [EMAIL PROTECTED]   Rockville, MD   +1-240-453-8497
AIM: vivekkhera Y!: vivek_khera   http://www.khera.org/~vivek/

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] tuning questions

2003-12-04 Thread Jack Coates
On Thu, 2003-12-04 at 12:27, Richard Huxton wrote:
> On Thursday 04 December 2003 19:50, Jack Coates wrote:
> >
> > I'm trying to set Postgres's shared memory usage in a fashion that
> > allows it to return requested results quickly. Unfortunately, none of
> > these changes allow PG to use more than a little under 300M RAM.
> > vacuumdb --analyze is now taking an inordinate amount of time as well
> > (40 minutes and counting), so that change needs to be rolled back.
> 
> You don't want PG to use all your RAM, it's designed to let the underlying OS 
> do a lot of caching for it. Probably worth having a look at vmstat/iostat and 
> see if it's saturating on I/O.

latest changes:
shared_buffers = 35642
max_fsm_relations = 1000
max_fsm_pages = 1
wal_buffers = 64
sort_mem = 32768
vacuum_mem = 32768
effective_cache_size = 1

/proc/sys/kernel/shmmax = 5

IO is active, but hardly saturated. CPU load is hefty though, load
average is at 4 now.

   procs                 memory      swap        io     system       cpu
 r  b  w   swpd   free   buff   cache  si  so   bi   bo   in   cs  us sy id
 0  2  1   2808  11436  39616 1902988   0   0  240  896  765  469   2 11 87
 0  2  1   2808  11432  39616 1902988   0   0  244  848  768  540   4  3 93
 0  2  1   2808  11432  39616 1902984   0   0  204  876  788  507   3  4 93
 0  2  1   2808  11432  39616 1902984   0   0  360  416  715  495   4  1 96
 0  2  1   2808  11432  39616 1902984   0   0  376  328  689  441   2  1 97
 0  2  0   2808  11428  39616 1902976   0   0  464  360  705  479   2  1 97
 0  2  1   2808  11428  39616 1902976   0   0  432  380  718  547   3  1 97
 0  2  1   2808  11428  39616 1902972   0   0  440  372  742  512   1  3 96
 0  2  1   2808  11428  39616 1902972   0   0  416  364  711  504   3  1 96
 0  2  1   2808  11424  39616 1902972   0   0  456  492  743  592   2  1 97
 0  2  1   2808  11424  39616 1902972   0   0  440  352  707  494   2  1 97
 0  2  1   2808  11424  39616 1902972   0   0  456  360  709  494   2  2 97
 0  2  1   2808  11436  39616 1902968   0   0  536  516  807  708   3  2 94

-- 
Jack Coates, Lyris Technologies Applications Engineer
510-549-4350 x148, [EMAIL PROTECTED]
"Interoperability is the keyword, uniformity is a dead end."
--Olivier Fourdan



---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Slow UPDATE, compared to INSERT

2003-12-04 Thread Ivar Zarans
On Thu, Dec 04, 2003 at 08:23:36PM +, Richard Huxton wrote:

> Ah - it's probably not the update but the IN. You can rewrite it using PG's 
> non-standard FROM:
> 
> UPDATE t1 SET status='done' FROM t_tmp WHERE t1.rec_id = t_tmp.rec_id;

Thanks for the hint. I'll try this.

> Now that doesn't explain why the update is taking so long. One fifth of a 
> second is extremely slow. Are you certain that the index is being used?

Explain shows the following output:

explain update table1 set status = 'PROC' where recid = '199901';

 Index Scan using table1_pkey on table1  (cost=0.00..6.01 rows=1 width=198)
   Index Cond: (recid = 199901::bigint)
(2 rows)



-- 
Ivar Zarans


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Slow UPDATE, compared to INSERT

2003-12-04 Thread William Yu
Ivar Zarans wrote:
> I am experiencing strange behaviour, where simple UPDATE of one field is
> very slow, compared to INSERT into table with multiple indexes. I have
> two tables - one with raw data records (about 24000), where one field ...

In Postgres and any other DB that uses MVCC (multi-version concurrency), 
UPDATES will always be slower than INSERTS. With MVCC, what the DB does 
is make a copy of the record, update that copy, and then invalidate the 
previous record. This maintains a consistent view for anybody who's 
reading the DB and also avoids the need for row locks.

If you have to use UPDATE, make sure (1) your UPDATE WHERE clause is 
properly indexed and (2) you are running ANALYZE/VACUUM periodically so 
the query planner can optimize for your UPDATE statements.
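
Concretely, with the table1/recid schema described earlier (recid is
already the primary key, so the index is there):

   EXPLAIN UPDATE table1 SET status = 'done' WHERE recid = 199901;
   -- should show an Index Scan using table1_pkey
   VACUUM ANALYZE table1;   -- clears out dead row versions left by UPDATEs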

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


Re: [PERFORM] tuning questions

2003-12-04 Thread Richard Huxton
On Thursday 04 December 2003 19:50, Jack Coates wrote:
>
> I'm trying to set Postgres's shared memory usage in a fashion that
> allows it to return requested results quickly. Unfortunately, none of
> these changes allow PG to use more than a little under 300M RAM.
> vacuumdb --analyze is now taking an inordinate amount of time as well
> (40 minutes and counting), so that change needs to be rolled back.

You don't want PG to use all your RAM, it's designed to let the underlying OS 
do a lot of caching for it. Probably worth having a look at vmstat/iostat and 
see if it's saturating on I/O.

-- 
  Richard Huxton
  Archonet Ltd

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [PERFORM] Slow UPDATE, compared to INSERT

2003-12-04 Thread Richard Huxton
On Thursday 04 December 2003 19:51, Ivar Zarans wrote:
>
> My second tests were done with temporary table and update query as:
> "UPDATE table1 SET Status = 'done' WHERE recid IN (SELECT recid FROM
> temptable)". It is still slower than INSERT, but more or less
> acceptable. Compared to my first tests overall processing time dropped
> from 1 hour and 20 minutes to 16 minutes.

Ah - it's probably not the update but the IN. You can rewrite it using PG's 
non-standard FROM:

UPDATE t1 SET status='done' FROM t_tmp WHERE t1.rec_id = t_tmp.rec_id;

Now that doesn't explain why the update is taking so long. One fifth of a 
second is extremely slow. Are you certain that the index is being used?

-- 
  Richard Huxton
  Archonet Ltd

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] autovacuum daemon stops doing work after about an hour

2003-12-04 Thread Matthew T. O'Connor
>> "MTO" == Matthew T O'Connor <[EMAIL PROTECTED]> writes:
>
> MTO> I don't run FreeBSD, so I haven't tested with FreeBSD.  Recently
> Craig MTO> Boston reported and submitted a patch for a crash on FreeBSD,
> but that
>
> some more debugging data:
>
> (gdb) print now
> $2 = {tv_sec = 1070565077, tv_usec = 216477}
> (gdb) print then
> $3 = {tv_sec = 1070561568, tv_usec = 668963}
> (gdb) print diff
> $4 = -5459981371352
> (gdb) print sleep_secs
> $5 = -1272
>
> so for some reason, instead of calculating 3508547514 as the diff, it
> got a hugely negative number.
>
> I'll bet it has something to do with the compiler...  more debugging to
> follow (without -O compilation...)

Could this be the recently reported bug where time goes backwards on
FreeBSD?  Can anyone who knows more about this problem chime in? I know it
was recently discussed on Hackers.

The simple fix is to just make sure it's a positive number.  If not, then
just sleep for some small positive amount.  I can make a patch for this,
probably sometime this weekend.

Thanks for tracking this down.



---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] autovacuum daemon stops doing work after about an hour

2003-12-04 Thread Vivek Khera
> "MTO" == Matthew T O'Connor <[EMAIL PROTECTED]> writes:

MTO> Could this be the recently reported bug where time goes backwards on
MTO> FreeBSD?  Can anyone who knows more about this problem chime in, I know it
MTO> was recently discussed on Hackers.


Time does not go backwards -- the now and then variables are properly
incrementing in time as you see from the debugging output.

The error appears to be with the computation of the "diff".  It is
either a C programming error, or a compiler error.  I'm not a C "cop"
so I can't tell you which it is.

Witness this program, below, compiled as "cc -g -o t t.c" and the
output here:

% ./t
seconds = 3509
seconds1 = 3509000000
useconds = -452486
stepped diff = 3508547514
seconds2 = -785967296
seconds3 = 3509000000
diff = -786419782
long long diff = 3508547514
%

Apparently, if you compute (now.tv_sec - then.tv_sec) * 1000000 all at
once, it overflows, since the RHS is all computed using longs rather
than long longs.  The fix is to cast at least one of the values to long
long on the RHS, as in the computation of seconds3 below.  Compare
that to the computation of seconds2 and you'll see that this is the
cause.

I'd be curious to see the output of this program on other platforms
and other compilers.  I'm using gcc 2.95.4 as shipped with FreeBSD
4.8+.

That all being said, you should never sleep less than the base time,
and never for more than a max amount, perhaps 1 hour?


--cut here--
#include <sys/time.h>
#include <stdio.h>
#include <stdlib.h>   /* for exit() */

int
main() 
{
  struct timeval now, then;
  long long diff = 0;
  long long seconds, seconds1, seconds2, seconds3, useconds;

  now.tv_sec = 1070565077L;
  now.tv_usec = 216477L;

  then.tv_sec = 1070561568L;
  then.tv_usec = 668963L;

  seconds = now.tv_sec - then.tv_sec;
  printf("seconds = %lld\n",seconds);
  seconds1 = seconds * 1000000;
  printf("seconds1 = %lld\n",seconds1);
  useconds = now.tv_usec - then.tv_usec;
  printf("useconds = %lld\n",useconds);

  diff = seconds1 + useconds;
  printf("stepped diff = %lld\n",diff);

  /* this appears to be the culprit... it should be same as seconds1 */
  seconds2 = (now.tv_sec - then.tv_sec) * 1000000;
  printf("seconds2 = %lld\n",seconds2);

  /* seems we need to cast long's to long long's for this computation */
  seconds3 = ((long long)now.tv_sec - (long long)then.tv_sec) * 1000000;
  printf("seconds3 = %lld\n",seconds3);


  diff = (now.tv_sec - then.tv_sec) * 1000000 + (now.tv_usec - then.tv_usec);
  printf ("diff = %lld\n",diff);

  diff = ((long long)now.tv_sec - (long long)then.tv_sec) * 1000000 + (now.tv_usec - then.tv_usec);
  printf ("long long diff = %lld\n",diff);

  exit(0);
}


--cut here--

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [PERFORM] autovacuum daemon stops doing work after about an hour

2003-12-04 Thread Vivek Khera
Actually, you can simplify the fix thusly:

  diff = (long long)(now.tv_sec - then.tv_sec) * 1000000 + (now.tv_usec - then.tv_usec);


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] tuning questions

2003-12-04 Thread Josh Berkus
Jack,

> latest changes:
> shared_buffers = 35642

This is fine; it's about 14% of available RAM (35642 buffers * 8K is about 
285 MB).  Though the way you calculated it still confuses me.  It's not 
complicated: it should be between 6% and 15% of available RAM, and since 
you're doing a data-transformation DB, yours should be toward the high end. 

> max_fsm_relations = 1000
> max_fsm_pages = 1

You want to raise this a whole lot if your data transformations involve large 
delete or update batches.  I'd suggest running "vacuum verbose analyze" 
between steps to see how many dead pages you're accumulating.
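
For example (note VERBOSE goes before ANALYZE in the grammar; message
wording is approximate):

   VACUUM VERBOSE ANALYZE;
   -- the per-table "Removed N tuples in N pages" lines show how much dead
   -- space each step leaves; size max_fsm_pages to cover it.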

> wal_buffers = 64
> sort_mem = 32768
> vacuum_mem = 32768
> effective_cache_size = 1

This is way the heck too low.  It's supposed to reflect the amount of RAM 
available for caching; I'd set it to 2GB*65% as a start.
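
Since effective_cache_size is counted in 8K disk pages, 2GB*65% works out 
to roughly:

   effective_cache_size = 170000   # ~1.3 GB / 8 KB pages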

> IO is active, but hardly saturated. CPU load is hefty though, load
> average is at 4 now.

Unless you're doing huge statistical aggregates (like radar charts), or heavy 
numerical calculations-by-query, high CPU and idle I/O usually indicates a 
really bad query, like badly mismatched data types on a join or unconstrained 
joins or overblown formatting-by-query.

-- 
-Josh Berkus
 Aglio Database Solutions
 San Francisco


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] autovacuum daemon stops doing work after about an hour

2003-12-04 Thread Larry Rosenman


--On Thursday, December 04, 2003 16:20:22 -0500 Vivek Khera 
<[EMAIL PROTECTED]> wrote:

"MTO" == Matthew T O'Connor <[EMAIL PROTECTED]> writes:
MTO> Could this be the recently reported bug where time goes backwards on
MTO> FreeBSD?  Can anyone who knows more about this problem chime in? I
MTO> know it was recently discussed on Hackers.
Time does not go backwards -- the now and then variables are properly
incrementing in time as you see from the debugging output.
The error appears to be with the computation of the "diff".  It is
either a C programming error, or a compiler error.  I'm not a C "cop"
so I can't tell you which it is.
Witness this program, below, compiled as "cc -g -o t t.c" and the
output here:
% ./t
seconds = 3509
seconds1 = 3509000000
useconds = -452486
stepped diff = 3508547514
seconds2 = -785967296
seconds3 = 3509000000
diff = -786419782
long long diff = 3508547514
%
Apparently, if you compute (now.tv_sec - then.tv_sec) * 1000000 all at
once, it overflows since the RHS is all computed using longs rather
than long longs.  Fix is to cast at least one of the values to long
long on the RHS, as in the computation of seconds3 below.  compare
that to the computation of seconds2 and you'll see that this is the
cause.
I'd be curious to see the output of this program on other platforms
and other compilers.  I'm using gcc 2.95.4 as shipped with FreeBSD
4.8+.
This is with the UnixWare compiler:
$ cc -O -o testvk testvk.c
$ ./testvk
seconds = 3509
seconds1 = 3509000000
useconds = -452486
stepped diff = 3508547514
seconds2 = -785967296
seconds3 = 3509000000
diff = -786419782
long long diff = 3508547514
$
I think this is a C bug.



That all being said, you should never sleep less than the base time,
and never for more than a max amount, perhaps 1 hour?
--cut here--
#include <stdio.h>     /* printf */
#include <sys/time.h>  /* struct timeval */
#include <stdlib.h>    /* exit */
int
main()
{
  struct timeval now, then;
  long long diff = 0;
  long long seconds, seconds1, seconds2, seconds3, useconds;
  now.tv_sec = 1070565077L;
  now.tv_usec = 216477L;
  then.tv_sec = 1070561568L;
  then.tv_usec = 668963L;
  seconds = now.tv_sec - then.tv_sec;
  printf("seconds = %lld\n",seconds);
  seconds1 = seconds * 1000000;
  printf("seconds1 = %lld\n",seconds1);
  useconds = now.tv_usec - then.tv_usec;
  printf("useconds = %lld\n",useconds);
  diff = seconds1 + useconds;
  printf("stepped diff = %lld\n",diff);
  /* this appears to be the culprit... it should be same as seconds1 */
  seconds2 = (now.tv_sec - then.tv_sec) * 1000000;
  printf("seconds2 = %lld\n",seconds2);
  /* seems we need to cast long's to long long's for this computation */
  seconds3 = ((long long)now.tv_sec - (long long)then.tv_sec) * 1000000;
  printf("seconds3 = %lld\n",seconds3);
  diff = (now.tv_sec - then.tv_sec) * 1000000 + (now.tv_usec - then.tv_usec);
  printf ("diff = %lld\n",diff);
  diff = ((long long)now.tv_sec - (long long)then.tv_sec) * 1000000 +
 (now.tv_usec - then.tv_usec);
  printf ("long long diff = %lld\n",diff);
  exit(0);
}
--cut here--

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 972-414-9812 E-Mail: [EMAIL PROTECTED]
US Mail: 1905 Steamboat Springs Drive, Garland, TX 75044-6749




Re: [PERFORM] tuning questions

2003-12-04 Thread scott.marlowe
On Thu, 4 Dec 2003, Jack Coates wrote:

> On Thu, 2003-12-04 at 12:27, Richard Huxton wrote:
> > On Thursday 04 December 2003 19:50, Jack Coates wrote:
> > >
> > > I'm trying to set Postgres's shared memory usage in a fashion that
> > > allows it to return requested results quickly. Unfortunately, none of
> > > these changes allow PG to use more than a little under 300M RAM.
> > > vacuumdb --analyze is now taking an inordinate amount of time as well
> > > (40 minutes and counting), so that change needs to be rolled back.
> > 
> > You don't want PG to use all your RAM, it's designed to let the underlying OS 
> > do a lot of caching for it. Probably worth having a look at vmstat/iostat and 
> > see if it's saturating on I/O.
> 
> latest changes:
> shared_buffers = 35642
> max_fsm_relations = 1000
> max_fsm_pages = 10000
> wal_buffers = 64
> sort_mem = 32768
> vacuum_mem = 32768
> effective_cache_size = 10000
> 
> /proc/sys/kernel/shmmax = 500000000
> 
> IO is active, but hardly saturated. CPU load is hefty though, load
> average is at 4 now.

Postgresql is busily managing a far too large shared buffer.  Let the 
kernel do that.  Postgresql's shared buffers should be big enough to hold 
as much of the current working set as it can, up to about 25% or so of the 
server's memory, or 512Meg, whichever comes first.  Unless a single query 
will actually use all of the buffer at once, you're not likely to see an 
improvement.
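
For reference, that cap works out like this on a 2 gig box (just arithmetic,
assuming 8k pages):

# 25% of 2GB = 512MB, which is also the cap; 512MB / 8KB pages = 65536
shared_buffers = 65536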

Also, your effective cache size is really small.  On a typical Postgresql 
server with 2 gigs of ram, you'll have about 1 to 1.5 gigs as kernel cache 
and buffer, and if it's dedicated to postgresql, then the effective cache 
setting for 1 gig would be 131072 (assuming 8k pages).

If you're updating a lot of tuples without vacuums, you'll likely want to 
up your fsm settings.

Note you can change things like sort_mem, effective_cache_size and 
random_page_cost on the fly (but not shared buffers or the fsm settings; 
those are only read at startup.)

so, if you're gonna have one huge honkin query that needs to sort a 
hundred megs at a time, but you'd rather not up your sort memory that high 
(sort mem is PER SORT, not per backend or per database, so it can get out 
of hand quickly) then you can just 

set sort_mem=128000;

before throwing out the big queries that need all the sort.
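
And you can put the old value back afterwards - RESET restores the
postgresql.conf setting:

set sort_mem=128000;
-- run the big sort-heavy query here
reset sort_mem;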


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] autovacuum daemon stops doing work after about an hour

2003-12-04 Thread Vivek Khera
> "LR" == Larry Rosenman <[EMAIL PROTECTED]> writes:

>> I'd be curious to see the output of this program on other platforms
>> and other compilers.  I'm using gcc 2.95.4 as shipped with FreeBSD
>> 4.8+.
LR> this is with the UnixWare compiler:
LR> $ cc -O -o testvk testvk.c
LR> $ ./testvk
LR> seconds = 3509
LR> seconds1 = 350900
LR> useconds = -452486
LR> stepped diff = 3508547514
LR> seconds2 = -785967296
LR> seconds3 = 350900
LR> diff = -786419782
LR> long long diff = 3508547514
LR> $

LR> I think this is a C bug.

Upon further reflection, I think so too.  The entire RHS is longs, so
the arithmetic is done in longs, then assigned to a long long when
done (after things have overflowed).  Forcing any one of the RHS
values to be long long causes the arithmetic to all be done using long
longs, and then you get the numbers you expect.

I think you only notice this in autovacuum when it takes a long time
to complete the work, like my example of about 3500 seconds.

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Slow UPADTE, compared to INSERT

2003-12-04 Thread Ivar Zarans
On Thu, Dec 04, 2003 at 08:23:36PM +, Richard Huxton wrote:

> Ah - it's probably not the update but the IN. You can rewrite it using PG's 
> non-standard FROM:
> 
> UPDATE t1 SET status='done' FROM t_tmp WHERE t1.rec_id = t_tmp.rec_id;

This was one *very useful* hint! Using this method I got my processing
time for 24000 records down to around 3 minutes 10 seconds. Compared
with the initial 1 hour 20 minutes, and then 16 minutes, this is an
impressive improvement!

> Now that doesn't explain why the update is taking so long. One fifth of a 
> second is extremely slow. Are you certain that the index is being used?

I posted results of "EXPLAIN" in my previous message. Meanwhile I tried
updating just one record, using "psql". Also tried out "EXPLAIN
ANALYZE". This way I did not see any big delay - total runtime for one
update was around 1 msec.

I am confused - does the slowness of UPDATE have something to do with Python
and PyPgSQL, since "psql" seems to have no delay whatsoever? Or is this
related to using two cursors, one for the select results and the other for
updates? Even if this is related to Python or cursors, how am I getting
such a big speed improvement just by using a different query? 

-- 
Ivar


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] Slow UPADTE, compared to INSERT

2003-12-04 Thread Richard Huxton
On Thursday 04 December 2003 19:59, William Yu wrote:
> Ivar Zarans wrote:
> > I am experiencing strange behaviour, where simple UPDATE of one field is
> > very slow, compared to INSERT into table with multiple indexes. I have
> > two tables - one with raw data records (about 24000), where one field
>
> In Postgres and any other DB that uses MVCC (multi-version concurrency),
> UPDATES will always be slower than INSERTS. With MVCC, what the DB does
> is make a copy of the record, update that copy and then invalidate
> the previous record. 
[snip]

Yes, but he's seeing 0.25 secs to update one row - something odd is going on.

-- 
  Richard Huxton
  Archonet Ltd

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] Slow UPADTE, compared to INSERT

2003-12-04 Thread Richard Huxton
On Thursday 04 December 2003 22:13, Ivar Zarans wrote:
> On Thu, Dec 04, 2003 at 08:23:36PM +, Richard Huxton wrote:
> > Ah - it's probably not the update but the IN. You can rewrite it using
> > PG's non-standard FROM:
> >
> > UPDATE t1 SET status='done' FROM t_tmp WHERE t1.rec_id = t_tmp.rec_id;
>
> This was one *very useful* hint! Using this method I got my processing
> time for 24000 records down to around 3 minutes 10 seconds. Compared
> with the initial 1 hour 20 minutes, and then 16 minutes, this is an
> impressive improvement!

Be aware, this is specific to PG - I'm not aware of this construction working 
on any other DB. Three minutes still doesn't sound brilliant, but that could 
be tuning issues.

> > Now that doesn't explain why the update is taking so long. One fifth of a
> > second is extremely slow. Are you certain that the index is being used?
>
> I posted results of "EXPLAIN" in my previous message. Meanwhile I tried
> updating just one record, using "psql". Also tried out "EXPLAIN
> ANALYZE". This way I did not see any big delay - total runtime for one
> update was around 1 msec.

Yep - the explain looked fine. If you run EXPLAIN ANALYSE it will give you 
timings too (the real query will run slightly faster than the reported 
timings, since it isn't paying the timing/reporting overhead).

> I am confused - does the slowness of UPDATE have something to do with Python
> and PyPgSQL, since "psql" seems to have no delay whatsoever? Or is this
> related to using two cursors, one for the select results and the other for
> updates? Even if this is related to Python or cursors, how am I getting
> such a big speed improvement just by using a different query?

Hmm - you didn't mention cursors. If this was a problem with PyPgSQL in 
general I suspect we'd know about it by now. It could however be some 
cursor-related issue. In general, you're probably better off trying to do 
updates/inserts as a single statement and letting PG manage things rather 
than processing one row at a time.
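
To make that concrete - a sketch using the table names from earlier in this
thread - the set-based form replaces the whole client-side loop:

-- instead of 24000 separate round-trips of the form:
--   UPDATE t1 SET status = 'done' WHERE rec_id = <one id>;
-- issue one statement and let PG manage the batch:
UPDATE t1 SET status = 'done' FROM t_tmp WHERE t1.rec_id = t_tmp.rec_id;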

If you've got the time, try putting together a small test-script with some 
dummy data and see if it's reproducible. I'm sure the other Python users 
would be interested in seeing where the problem is.

-- 
  Richard Huxton
  Archonet Ltd

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] tuning questions

2003-12-04 Thread Eric Soroos
IO is active, but hardly saturated. CPU load is hefty though, load
average is at 4 now.
   procs                      memory    swap          io     system         cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
 0  2  1   2808  11432  39616 1902984   0   0   204   876  788   507   3   4  93
You're getting a load average of 4 with 93% idle?

That's a reasonable number of context switches, and if the blocks 
you're reading/writing are discontinuous, I could see io saturation 
rearing its head.

This looks to me like you're starting and killing a lot of processes.

Is this thrashing psql connections, or is it one big query? What are 
your active processes?

Your effective cache size looks to be about 1900 megs (+- binary), 
assuming all of it is pg.
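
In 8k pages that's in the neighborhood of:

# ~1,900,000 KB of kernel cache / 8 KB per page
effective_cache_size = 237000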

eric
 

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] tuning questions

2003-12-04 Thread Jack Coates
On Thu, 2003-12-04 at 13:24, Josh Berkus wrote:
> Jack,
> 
> > latest changes:
> > shared_buffers = 35642
> 
> This is fine, it's about 14% of available RAM.  Though the way you calculated 
> it still confuses me.   It's not complicated; it should be between 6% and 15% 
> of available RAM; since you're doing a data-transformation DB, yours should 
> be toward the high end. 
> 
> > max_fsm_relations = 1000
> > max_fsm_pages = 10000
> 
> You want to raise this a whole lot if your data transformations involve large 
> delete or update batches.  I'd suggest running "vacuum analyze verbose" 
> between steps to see how many dead pages you're accumulating.

This looks really difficult to tune, and based on the load I'm giving
it, it looks really important. I've tried the verbose analyze and I've
looked at the rules of thumb, neither approach seems good for the
pattern of "hammer the system for a day or two, then leave it alone for
a week." I'm setting it to 500000 (half of the biggest table size
divided by a 6k page size), but I'll keep tweaking this.

> 
> > wal_buffers = 64
> > sort_mem = 32768
> > vacuum_mem = 32768
> > effective_cache_size = 10000
> 
> This is way the heck too low.  It's supposed to reflect the amount of RAM 
> available for caching; I'd set it to 2GB*65% as a start.

This makes a little bit of difference. I set it to 65% (15869 pages).
Now we have some real disk IO:
   procs                      memory    swap          io     system         cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
 0  3  1   2804  10740  40808 1899856   0   0 26624     0  941  4144  13  24  63
 1  2  1   2804  10808  40808 1899848   0   0 2174860  1143  3655   9  22  69

still high cpu (3-ish load) though, and there's no noticeable
improvement in query speed.

> 
> > IO is active, but hardly saturated. CPU load is hefty though, load
> > average is at 4 now.
> 
> Unless you're doing huge statistical aggregates (like radar charts), or heavy 
> numerical calculations-by-query, high CPU and idle I/O usually indicates a 
> really bad query, like badly mismatched data types on a join or unconstrained 
> joins or  overblown formatting-by-query.

Ran that by the programmer responsible for this area and watched the
statements go by with tcpdump -X. Looks like really simple stuff to me:
select a handful of values, then insert into one table and delete from
another.
-- 
Jack Coates, Lyris Technologies Applications Engineer
510-549-4350 x148, [EMAIL PROTECTED]
"Interoperability is the keyword, uniformity is a dead end."
--Olivier Fourdan



---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] tuning questions

2003-12-04 Thread Jack Coates
On Thu, 2003-12-04 at 14:59, Eric Soroos wrote:
> >
> > IO is active, but hardly saturated. CPU load is hefty though, load
> > average is at 4 now.
> >
> >    procs                      memory    swap          io     system         cpu
> >  r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
> 
> >  0  2  1   2808  11432  39616 1902984   0   0   204   876  788   507   3   4  93
> 
> You're getting a load average of 4 with 93% idle?
down a bit since my last set of tweaks, but yeah:
  3:18pm  up 2 days,  3:37,  3 users,  load average: 3.42, 3.31, 2.81
66 processes: 65 sleeping, 1 running, 0 zombie, 0 stopped
CPU0 states:  2.0% user,  3.4% system,  0.0% nice, 93.4% idle
CPU1 states:  1.3% user,  2.3% system,  0.0% nice, 95.2% idle
Mem:  2064656K av, 2053896K used,   10760K free,       0K shrd,   40388K buff
Swap: 2899716K av,    2800K used, 2896916K free                  1896232K cached

  PID USER PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
23103 root  15   0  1072 1072   840 R 1.3  0.0   0:01 top
23046 postgres  15   0 33364  32M 32220 S 0.5  1.6   0:12 postmaster
> 
> That's a reasonable number of context switches, and if the blocks 
> you're reading/writing are discontinous, I could see io saturation 
> rearing it's head.
> 
> This looks to me like you're starting and killing a lot of processes.

isn't that by design though? I've been looking at other postgres servers
around the company and they seem to act pretty similar under load (none
is being pounded to this level, though).

> 
> Is this thrashing psql connections, or is it one big query? What are 
> your active processes?

[EMAIL PROTECTED] root]# ps auxw | grep postgres
postgres 23042  0.0  0.4 308808 8628 pts/0   S14:46   0:00
/usr/bin/postmaster -p 5432
postgres 23043  0.0  0.4 309788 8596 pts/0   S14:46   0:00 postgres:
stats buffer process   
postgres 23044  0.0  0.4 308828 8620 pts/0   S14:46   0:00 postgres:
stats collector process   
postgres 23046  0.6  1.4 309952 29872 pts/0  R14:46   0:09 postgres:
lmuser lmdb 10.0.0.2 INSERT waiting
postgres 23047  1.4 14.7 310424 304240 pts/0 S14:46   0:21 postgres:
lmuser lmdb 10.0.0.2 idle
postgres 23048  0.4 14.7 310044 304368 pts/0 S14:46   0:07 postgres:
lmuser lmdb 10.0.0.2 idle
postgres 23049  0.0  0.5 309820 10352 pts/0  S14:46   0:00 postgres:
lmuser lmdb 10.0.0.2 idle
postgres 23050  0.0  0.6 310424 13352 pts/0  S14:46   0:00 postgres:
lmuser lmdb 10.0.0.2 idle
postgres 23051  0.0  0.6 309940 12992 pts/0  S14:46   0:00 postgres:
lmuser lmdb 10.0.0.2 idle
postgres 23052  0.0  0.5 309880 11916 pts/0  S14:46   0:00 postgres:
lmuser lmdb 10.0.0.2 idle
postgres 23053  0.0  0.6 309924 12872 pts/0  S14:46   0:00 postgres:
lmuser lmdb 10.0.0.2 idle
postgres 23054  0.0  0.6 310012 13460 pts/0  S14:46   0:00 postgres:
lmuser lmdb 10.0.0.2 idle
postgres 23055  0.0  0.5 309932 12284 pts/0  S14:46   0:00 postgres:
lmuser lmdb 10.0.0.2 idle
postgres 23056  2.0 14.7 309964 304072 pts/0 S14:46   0:30 postgres:
lmuser lmdb 10.0.0.2 idle
postgres 23057  2.4 14.7 309916 304104 pts/0 S14:46   0:37 postgres:
lmuser lmdb 10.0.0.2 idle
postgres 23058  0.0  0.6 310392 13168 pts/0  S14:46   0:00 postgres:
lmuser lmdb 10.0.0.2 idle
postgres 23059  0.5 14.7 310424 304072 pts/0 S14:46   0:09 postgres:
lmuser lmdb 10.0.0.2 idle
postgres 23060  0.0  0.6 309896 13212 pts/0  S14:46   0:00 postgres:
lmuser lmdb 10.0.0.2 idle
postgres 23061  0.5  1.4 309944 29832 pts/0  R14:46   0:09 postgres:
lmuser lmdb 10.0.0.2 INSERT
postgres 23062  0.6  1.4 309936 29832 pts/0  S14:46   0:09 postgres:
lmuser lmdb 10.0.0.2 INSERT waiting
postgres 23063  0.6  1.4 309944 30028 pts/0  S14:46   0:09 postgres:
lmuser lmdb 10.0.0.2 INSERT waiting
postgres 23064  0.6  1.4 309944 29976 pts/0  S14:46   0:09 postgres:
lmuser lmdb 10.0.0.2 INSERT waiting
postgres 23065  1.4 14.7 310412 304112 pts/0 S14:46   0:21 postgres:
lmuser lmdb 216.91.56.200 idle
postgres 23066  0.5  1.4 309944 29496 pts/0  S14:46   0:08 postgres:
lmuser lmdb 216.91.56.200 INSERT waiting
postgres 23067  0.5  1.4 310472 30040 pts/0  D14:46   0:09 postgres:
lmuser lmdb 216.91.56.200 idle
postgres 23068  0.6  1.4 309936 30104 pts/0  R14:46   0:09 postgres:
lmuser lmdb 216.91.56.200 INSERT waiting
postgres 23069  0.5  1.4 309936 29716 pts/0  S14:46   0:09 postgres:
lmuser lmdb 216.91.56.200 INSERT waiting
postgres 23070  0.6  1.4 309944 29744 pts/0  S14:46   0:09 postgres:
lmuser lmdb 10.0.0.2 INSERT waiting

Ten-ish stay idle all the time; the inserts go to UPDATE when the big
select is done and rows get moved from the active to the completed
table.

> Your effective cache size looks to be about 1900 megs (+- binary), 
> assuming all of it is pg.
> 
> eric
>   
-- 
Jack Coates, Lyris Technologies Applications Engineer
510-549-4350 x148, [EMAIL PROTECTED]
"Interoperability is the keyword, uniformity is a dead en

Re: [PERFORM] tuning questions

2003-12-04 Thread Richard Huxton
On Thursday 04 December 2003 23:16, Jack Coates wrote:
>
> > > effective_cache_size = 10000
> >
> > This is way the heck too low.  It's supposed to reflect the amount of RAM
> > available for caching; I'd set it to 2GB*65% as a start.
>
> This makes a little bit of difference. I set it to 65% (15869 pages).

That's still only about 127MB (15869 * 8KB).

> Now we have some real disk IO:
>    procs                      memory    swap          io     system         cpu
>  r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
>  0  3  1   2804  10740  40808 1899856   0   0 26624     0  941  4144

According to this your cache is currently 1,899,856 KB which in 8KB blocks is 
237,482 - be frugal and say effective_cache_size = 200000 (or even 150000 if 
the trace above isn't typical).

-- 
  Richard Huxton
  Archonet Ltd

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [PERFORM] tuning questions

2003-12-04 Thread Jack Coates
On Thu, 2003-12-04 at 15:47, Richard Huxton wrote:
> On Thursday 04 December 2003 23:16, Jack Coates wrote:
> >
> > > > effective_cache_size = 10000
> > >
> > > This is way the heck too low.  It's supposed to reflect the amount of RAM
> > > available for caching; I'd set it to 2GB*65% as a start.
> >
> > This makes a little bit of difference. I set it to 65% (15869 pages).
> 
> That's still only about 127MB (15869 * 8KB).

yeah, missed the final digit when I copied it into the postgresql.conf
:-( Just reloaded with 158691 pages.
> 
> > Now we have some real disk IO:
> >    procs                      memory    swap          io     system         cpu
> >  r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
> >  0  3  1   2804  10740  40808 1899856   0   0 26624     0  941  4144
> 
> According to this your cache is currently 1,899,856 KB which in 8KB blocks is 
> 237,482 - be frugal and say effective_cache_size = 200000 (or even 150000 if 
> the trace above isn't typical).

d'oh, just realized what you're telling me here. /me smacks forehead.
Let's try effective_cache of 183105... (75%). Starting both servers,
waiting for big fetch to start, and...

   procs                      memory    swap          io     system         cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
 0  0  0   2800  11920  40532 1906516   0   0     0     0  521     8   0   0 100
 0  1  0   2800  11920  40532 1906440   0   0  35652   611   113   1   3  97
 0  1  0   2800  11920  40532 1906424   0   0 20604     0  897   808   1  18  81
 0  1  0   2800  11920  40532 1906400   0   0 26112     0  927   820   1  13  87
 0  1  0   2800  11920  40532 1906384   0   0 26112     0  923   812   1  12  87
 0  1  0   2800  11920  40532 1906372   0   0 24592     0  921   805   1  13  87
 0  1  0   2800  11920  40532 1906368   0   0  324848   961  1209   0   4  96
 0  1  0   2800  11920  40532 1906368   0   0  2600     0  845  1631   0   2  98
 0  1  0   2800  11920  40532 1906364   0   0  2728     0  871  1714   0   2  98

better in vmstat... but the query doesn't work any better unfortunately.

The frustrating thing is, we also have a UP P3-500 with 512M RAM and two
IDE drives with the same PG install which is doing okay with this load
-- still half the speed of MS-SQL2K, but usable. I'm at a loss.
-- 
Jack Coates, Lyris Technologies Applications Engineer
510-549-4350 x148, [EMAIL PROTECTED]
"Interoperability is the keyword, uniformity is a dead end."
--Olivier Fourdan



---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] Slow UPADTE, compared to INSERT

2003-12-04 Thread Ivar Zarans
On Thu, Dec 04, 2003 at 10:45:21PM +, Richard Huxton wrote:

> If you've got the time, try putting together a small test-script with some 
> dummy data and see if it's reproducible. I'm sure the other Python users 
> would be interested in seeing where the problem is.

Tried with a test-script, but that functioned normally (Murphy's law!).
Then I tweaked postgresql.conf and switched on the debugging options. The
results show (in my opinion) that Python has nothing to do with the slow
UPDATE. Timing from PostgreSQL itself shows a duration of 0.29 sec.

===
postgres[21247]: [2707] DEBUG:  StartTransactionCommand
postgres[21247]: [2708-1] LOG:  query: 
postgres[21247]: [2708-2] UPDATE
postgres[21247]: [2708-3] imp_cdr_200311
postgres[21247]: [2708-4] SET
postgres[21247]: [2708-5] Status = 'SKIP'
postgres[21247]: [2708-6] WHERE
postgres[21247]: [2708-7] ImpRecID = '202425'
...
Skipped rewritten parse tree
...
postgres[21247]: [2710-1] LOG:  plan:
postgres[21247]: [2710-2]{ INDEXSCAN 
postgres[21247]: [2710-3]:startup_cost 0.00 
postgres[21247]: [2710-4]:total_cost 6.01 
postgres[21247]: [2710-5]:rows 1 
postgres[21247]: [2710-6]:width 199 
postgres[21247]: [2710-7]:qptargetlist (
...
Skipped target list
...
postgres[21247]: [2711] DEBUG:  CommitTransactionCommand
postgres[21247]: [2712] LOG:  duration: 0.292529 sec
===
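
(The debugging options switched on were along these lines - a sketch using
7.3-era option names in postgresql.conf:)

log_statement = true
debug_print_plan = true
log_duration = true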

Any suggestions for further investigation?

-- 
Ivar Zarans


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] Slow UPADTE, compared to INSERT

2003-12-04 Thread Ivar Zarans

I have played around with explain and explain analyze and noticed one
interesting oddity:

===
explain UPDATE table1 SET status = 'SKIP' WHERE recid = 196641;

 Seq Scan on table1 (cost=0.00..16709.97 rows=1 width=199)
 Filter: (recid = 196641)

=== 

explain UPDATE table1 SET status = 'SKIP' WHERE recid = '196641';
 
 Index Scan using table1_pkey on table1 (cost=0.00..6.01 rows=1 width=199)
 Index Cond: (recid = 196641::bigint)

===

explain UPDATE table1 SET status = 'SKIP' WHERE recid = 196641::bigint;
 
 Index Scan using table1_pkey on table1 (cost=0.00..6.01 rows=1 width=199)
 Index Cond: (recid = 196641::bigint)

===

Why does the first example, where recid is given as a numeric constant, use a
sequential scan, while the second example, where recid is given as a string
constant, works with an index scan, as expected? The third example shows that
the numeric constant must be typecast in order to function properly.

Is this normal behaviour for fields of bigint type?

-- 
Ivar Zarans


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] Slow UPADTE, compared to INSERT

2003-12-04 Thread Christopher Kings-Lynne
Why does the first example, where recid is given as a numeric constant, use a
sequential scan, while the second example, where recid is given as a string
constant, works with an index scan, as expected? The third example shows that
the numeric constant must be typecast in order to function properly.
Is this normal behaviour for fields of bigint type?
Yes, it's a known performance problem in PostgreSQL 7.4 and below.  I 
believe it's been fixed in 7.5 CVS already.
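
Until then the workaround is to quote or cast literals compared against
bigint columns, exactly as your own examples show:

UPDATE table1 SET status = 'SKIP' WHERE recid = '196641';
UPDATE table1 SET status = 'SKIP' WHERE recid = 196641::bigint;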

Chris



---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [PERFORM] tuning questions

2003-12-04 Thread Eric Soroos
d'oh, just realized what you're telling me here. /me smacks forehead.
Let's try effective_cache of 183105... (75%). Starting both servers,
waiting for big fetch to start, and...

   procs                      memory    swap          io     system         cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
 0  0  0   2800  11920  40532 1906516   0   0     0     0  521     8   0   0 100
 0  1  0   2800  11920  40532 1906440   0   0  35652   611   113   1   3  97
 0  1  0   2800  11920  40532 1906424   0   0 20604     0  897   808   1  18  81
 0  1  0   2800  11920  40532 1906400   0   0 26112     0  927   820   1  13  87
 0  1  0   2800  11920  40532 1906384   0   0 26112     0  923   812   1  12  87
 0  1  0   2800  11920  40532 1906372   0   0 24592     0  921   805   1  13  87
 0  1  0   2800  11920  40532 1906368   0   0  324848   961  1209   0   4  96
 0  1  0   2800  11920  40532 1906368   0   0  2600     0  845  1631   0   2  98
 0  1  0   2800  11920  40532 1906364   0   0  2728     0  871  1714   0   2  98

better in vmstat... but the query doesn't work any better unfortunately.
Your io now looks like you're getting a few seconds of continuous read, 
and then you're getting into maxing out random reads. These look about 
right for a single ide drive.

The frustrating thing is, we also have a UP P3-500 with 512M RAM and two
IDE drives with the same PG install which is doing okay with this load
-- still half the speed of MS-SQL2K, but usable. I'm at a loss.
I wonder if you're doing table scans. From the earlier trace, it looked 
like you have a few parallel select/process/insert processes going.

If that's the case, you might be getting a big sequential scan at 
first, then at some point you have enough selects going that it starts 
looking more like random access.

Can you run one of the selects from the psql console and see how fast 
it runs?  Do your inserts have any foreign key relations?

One thing you might try is to shut down the postmaster and move the 
pg_clog and pg_xlog directories to the other drive, and leave symlinks 
pointing back. That should help your insert performance by putting the 
wal on a separate drive from the table data. It will really help if you 
wind up having uncached read and write access at the same time. You 
also might gain by using software raid 0 (with large stripe size, 512k 
or so) across both drives, but if you don't have the appropriate 
partitions in there now it's going to be a bunch of work.

eric

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


Re: [PERFORM] [ADMIN] Index not used. WHY?

2003-12-04 Thread Stephan Szabo

On Thu, 4 Dec 2003, Andrei Bintintan wrote:

> Hi,
>
> I have the following table:
> CREATE TABLE public.rights (
> id int4 DEFAULT nextval('"rights_id_seq"'::text) NOT NULL,
> id_user int4 NOT NULL,
> id_modull int4 NOT NULL,
> CONSTRAINT rights_pkey PRIMARY KEY (id)
> )
>
> and I created the following indexes:
>
> CREATE INDEX right_id_modull_idx ON rights USING btree (id_modull);
> CREATE INDEX right_id_user_idx ON rights USING btree (id_user);
>
> Now the problem:
>
> EXPLAIN SELECT * FROM rights r WHERE r.id_modull =15
> returnes:
> Seq Scan on rights r (cost=0.00..12.30 rows=42 width=12)
> Filter: (id_modull = 15)
>
> EXPLAIN SELECT * FROM rights r WHERE r.id_user =15
> returnes:
> Index Scan using right_id_user_idx on rights r (cost=0.00..8.35 rows=11 width=12)
> Index Cond: (id_user = 15)
>
> Question: Why the right_id_modull_idx is NOT USED at the 1st query and
> the second query the right_id_user_idx index is used.

As a note, pgsql-performance is a better list for these questions.

So, standard questions:

How many rows are in the table, what does EXPLAIN ANALYZE show for the
queries, if you force index usage (set enable_seqscan=off) on the first
what does EXPLAIN ANALYZE show then, have you used ANALYZE/VACUUM ANALYZE
recently?
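
Roughly, in psql (substituting your real query):

VACUUM ANALYZE rights;
EXPLAIN ANALYZE SELECT * FROM rights r WHERE r.id_modull = 15;
SET enable_seqscan = off;
EXPLAIN ANALYZE SELECT * FROM rights r WHERE r.id_modull = 15;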


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


[PERFORM] Async Query Processing on Solaris

2003-12-04 Thread Passynkov, Vadim
I am using the Asynchronous Query Processing interface from the libpq library,
and I got some strange results on Solaris.

My test select query is 'SELECT * from pg_user;'
and I use the select() system call as a synchronous I/O multiplexer in C.

The first test sends 10000 select queries using 10 nonblocking connections
to the database ( PQsendQuery ).
The second test sends the same 10000 select queries using 1 connection (
PQexec ).

On FreeBSD there is a huge difference between the async and the sync tests:
the async test is much faster than the sync test.
On Solaris there is no speed difference between the async and sync tests;
in fact the async test is even slower than the sync test.

Q. Why ?

On FreeBSD:

/usr/bin/time ./PgAsyncManager async
async test start ... 10000  done
9.46 real 3.48 user 1.25 sys

/usr/bin/time ./PgAsyncManager sync
sync test start ... 10000  done
22.64 real 3.35 user 1.24 sys

On Solaris:

/usr/bin/time ./PgAsyncManager async
async test start ... 10000  done

real   20.6
user2.1
sys 0.4

/usr/bin/time ./PgAsyncManager sync
sync test start ... 10000  done

real   18.4
user1.1
sys 0.5


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings