Re: [PERFORM] Best COPY Performance

2006-10-31 Thread Teemu Torma
On Tuesday 31 October 2006 21:11, Worky Workerson wrote: > One thing which I never mentioned was that I > am using ext3 mounted with noatime,data=writeback. You might also want to try with data=ordered. I have noticed that nowadays it seems to be a bit faster, but not much. I don't know why, m

Re: [PERFORM] Best COPY Performance

2006-10-31 Thread Worky Workerson
1 0 345732 29304 770272 12946764 0 0 16 16428 1192 3105 12 2 85 1 1 0 345732 30840 770060 12945480 0 0 20 16456 1196 3151 12 2 84 1 1 0 345732 32760 769972 12943528 0 0 12 16460 1185 3103 11 2 86 1 >> >> iirc, he is running quad opteron 885 (8 cores), so if my math is >>

Re: [PERFORM] Best COPY Performance

2006-10-31 Thread Luke Lonergan
Checkpoints are not an issue here, the vmstat you included was on a 5 second interval, so the 'bursts' were bursting at a rate of 60MB/s. - Luke Msg is shrt cuz m on ma treo ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster

Re: [PERFORM] Best COPY Performance

2006-10-31 Thread Worky Workerson
Maybe it is just the PK *build* that slows it down, but I just tried some small scale experiments on my MacBook Pro laptop (which has the same disk performance as your server) and I get only a 10-15% slowdown from having a PK on an integer column. The 10-15% slowdown was on 8.1.5 MPP, so it used

Re: [PERFORM] Best COPY Performance

2006-10-31 Thread Worky Workerson
>I'm guessing the high bursts are checkpoints. Can you check your log > >files for pg and see if you are getting warnings about checkpoint > >frequency? You can get some mileage here by increasing wal files. > > Nope, nothing in the log. I have set: > wal_buffers=128 > checkpoint_segments=128

Re: [PERFORM] Best COPY Performance

2006-10-31 Thread Luke Lonergan
Worky (!), On 10/31/06 12:11 PM, "Worky Workerson" <[EMAIL PROTECTED]> wrote: > Any recommendations on what to > look at to find a fix? One thing which I never mentioned was that I > am using ext3 mounted with noatime,data=writeback. You can try setting the max readahead like this: /sbin/bloc

Re: [PERFORM] Best COPY Performance

2006-10-31 Thread Worky Workerson
> And here are the dd results for 16GB RAM, i.e. 4,000,000 8K blocks: So, if we divide 32,000 MB by the real time, we get: /data (data): 89 MB/s write 38 MB/s read ... snip ... The read speed on your /data volume is awful to the point where you should consider it broken and find a fix. A quick

Re: [PERFORM] Best COPY Performance

2006-10-30 Thread Stefan Kaltenbrunner
Luke Lonergan wrote: Stefan, On 10/30/06 8:57 AM, "Stefan Kaltenbrunner" <[EMAIL PROTECTED]> wrote: We've found that there is an ultimate bottleneck at about 12-14MB/s despite having sequential write to disk speeds of 100s of MB/s. I forget what the latest bottleneck was. I have personally m

Re: [PERFORM] Best COPY Performance

2006-10-30 Thread Luke Lonergan
Stefan, On 10/30/06 8:57 AM, "Stefan Kaltenbrunner" <[EMAIL PROTECTED]> wrote: >> We've found that there is an ultimate bottleneck at about 12-14MB/s despite >> having sequential write to disk speeds of 100s of MB/s. I forget what the >> latest bottleneck was. > > I have personally managed to l

Re: [PERFORM] Best COPY Performance

2006-10-30 Thread Stefan Kaltenbrunner
Luke Lonergan wrote: Greg, On 10/30/06 7:09 AM, "Spiegelberg, Greg" <[EMAIL PROTECTED]> wrote: I broke that file into 2 files each of 550K rows and performed 2 simultaneous COPY's after dropping the table, recreating, issuing a sync on the system to be sure, &c and nearly every time both COPY'

Re: [PERFORM] Best COPY Performance

2006-10-30 Thread Luke Lonergan
Greg, On 10/30/06 7:09 AM, "Spiegelberg, Greg" <[EMAIL PROTECTED]> wrote: > I broke that file into 2 files each of 550K rows and performed 2 > simultaneous COPY's after dropping the table, recreating, issuing a sync > on the system to be sure, &c and nearly every time both COPY's finish in > 12 s

Re: [PERFORM] Best COPY Performance

2006-10-30 Thread Spiegelberg, Greg
> -Original Message- > From: [EMAIL PROTECTED] > [mailto:[EMAIL PROTECTED] On Behalf Of > Luke Lonergan > Sent: Saturday, October 28, 2006 12:07 AM > To: Worky Workerson; Merlin Moncure > Cc: pgsql-performance@postgresql.org > Subject: Re: [PERFORM] Best COP

Re: [PERFORM] Best COPY Performance

2006-10-28 Thread Luke Lonergan
Michael (aka Worky), On 10/28/06 5:03 AM, "Michael Artz" <[EMAIL PROTECTED]> wrote: > PG 8.1.3 x86_64. I installed it via a RH rpm for their "Web Services > Beta", or something like that. I know I'm a bit behind the times, but > getting stuff in (and out) of my isolated lab is a bit of a pain.

Re: [PERFORM] Best COPY Performance

2006-10-28 Thread Michael Artz
> Are you saying that I should be able to issue multiple COPY commands > because my I/O wait is low? I was under the impression that I am I/O > bound, so multiple simeoultaneous loads would have a detrimental > effect ... The reason I asked how many CPUs was to make sense of the 12% usr CPU time

Re: [PERFORM] Best COPY Performance

2006-10-27 Thread Luke Lonergan
Worky, On 10/27/06 8:47 PM, "Worky Workerson" <[EMAIL PROTECTED]> wrote: 1 0 345732 29304 770272 12946764 0 0 16 16428 1192 3105 12 2 85 1 1 0 345732 30840 770060 12945480 0 0 20 16456 1196 3151 12 2 84 1 1 0 345732 32760 769972 12943528 0 0 12 16460 1185 3103 11 2 86 1

Re: [PERFORM] Best COPY Performance

2006-10-27 Thread Worky Workerson
On 10/27/06, Merlin Moncure <[EMAIL PROTECTED]> wrote: > > r b swpd free buffcache si so bibo in cs us sy id wa > > 1 0 345732 29328 770980 12947212 0 0 20 16552 1223 3677 12 2 85 1 > > 1 0 345732 29840 770520 12946924 0 0 20 29244 1283 2955 11 2 85 1 > > 1 0 345732 32144

Re: [PERFORM] Best COPY Performance

2006-10-27 Thread Worky Workerson
Worky (that your real name? :-) Nope, its Mike. worky.workerson is just the email that I use for "work" :) How many CPUs on the machine? Can you send the result of "cat /proc/cpuinfo"? Not at work at the moment, however I do have quad dual-core opterons, like Merlin mentioned. Is your "c

Re: [PERFORM] Best COPY Performance

2006-10-27 Thread Merlin Moncure
On 10/28/06, Luke Lonergan <[EMAIL PROTECTED]> wrote: Worky (that your real name? :-) On 10/27/06 12:08 PM, "Worky Workerson" <[EMAIL PROTECTED]> wrote: > Here it is, taken from a spot about halfway through a 'cat file | > psql' load, with the "Oracle-is-installed-and-running" caveat: > > r b

Re: [PERFORM] Best COPY Performance

2006-10-27 Thread Luke Lonergan
Worky (that your real name? :-) On 10/27/06 12:08 PM, "Worky Workerson" <[EMAIL PROTECTED]> wrote: > Here it is, taken from a spot about halfway through a 'cat file | > psql' load, with the "Oracle-is-installed-and-running" caveat: > > r b swpd free buffcache si so bibo in cs u

Re: [PERFORM] Best COPY Performance

2006-10-27 Thread Worky Workerson
The read speed on your /data volume is awful to the point where you should consider it broken and find a fix. A quick comparison: the same number on a 16 drive internal SATA array with 7200 RPM disks gets 950 MB/s read, about 25 times faster for about 1/4 the price. I'm hoping that the poor per

Re: [PERFORM] Best COPY Performance

2006-10-27 Thread Merlin Moncure
On 10/27/06, Worky Workerson <[EMAIL PROTECTED]> wrote: I'm hoping that the corporate Oracle machine won't shut down my pg projects. On total side note, if anyone knows how to best limit Oracle's impact on a system (i.e. memory usage, etc), I'd be interested. rm -rf /usr/local/oracle? merlin

Re: [PERFORM] Best COPY Performance

2006-10-27 Thread Worky Workerson
I do have a dirty little secret, one which I wasn't completely aware of until a little while ago. Apparently, someone decided to install Oracle on the server, and use the SAN as the primary tablespace, so that might have something to do with the poor performance of the SAN. At least, I'm hoping t

Re: [PERFORM] Best COPY Performance

2006-10-25 Thread Jim C. Nasby
On Wed, Oct 25, 2006 at 11:25:01AM -0400, Worky Workerson wrote: > >I'm guessing the high bursts are checkpoints. Can you check your log > >files for pg and see if you are getting warnings about checkpoint > >frequency? You can get some mileage here by increasing wal files. > > Nope, nothing in

Re: [PERFORM] Best COPY Performance

2006-10-25 Thread Luke Lonergan
Mr. Worky, On 10/25/06 11:26 AM, "Worky Workerson" <[EMAIL PROTECTED]> wrote: > And here are the dd results for 16GB RAM, i.e. 4,000,000 8K blocks: So, if we divide 32,000 MB by the real time, we get: /home (WAL): 53 MB/s write 84 MB/s read /data (data): 89 MB/s write 38 MB/s read The write a

Re: [PERFORM] Best COPY Performance

2006-10-25 Thread Markus Schaber
Hi, Craig, Craig A. James wrote: >> Would dblink() help in any way? > > It might if perl wasn't so damned good at this. ;-) You know that you can use Perl inside PostgreS via plperl? HTH, Markus ---(end of broadcast)--- TIP 3: Have you checked o

Re: [PERFORM] Best COPY Performance

2006-10-25 Thread Worky Workerson
Merlin/Luke: > in theory, with 10 10k disks in raid 10, you should be able to keep > your 2fc link saturated all the time unless your i/o is extremely > random. random i/o is the wild card here, ideally you should see at > least 2000 seeks in bonnie...lets see what comes up. I suspect the pr

Re: [PERFORM] Best COPY Performance

2006-10-25 Thread Craig A. James
Spiegelberg, Greg wrote: The data are on two different computers, and I do processing of the data as it passes through the application. Otherwise, the INSERT INTO ... SELECT is my first choice. Would dblink() help in any way? It might if perl wasn't so damned good at this. ;-) Craig

Re: [PERFORM] Best COPY Performance

2006-10-25 Thread Spiegelberg, Greg
> -Original Message- > From: [EMAIL PROTECTED] > [mailto:[EMAIL PROTECTED] On Behalf Of > Craig A. James > Sent: Wednesday, October 25, 2006 12:52 PM > To: Jim C. Nasby > Cc: Worky Workerson; Merlin Moncure; pgsql-performance@postgresql.org > Subject: Re: [PERFOR

Re: [PERFORM] Best COPY Performance

2006-10-25 Thread Craig A. James
Jim C. Nasby wrote: Wait... so you're using perl to copy data between two tables? And using a cursor to boot? I can't think of any way that could be more inefficient... What's wrong with a plain old INSERT INTO ... SELECT? Or if you really need to break it into multiple transaction blocks, at le

Re: [PERFORM] Best COPY Performance

2006-10-25 Thread Luke Lonergan
Merlin, On 10/25/06 8:38 AM, "Merlin Moncure" <[EMAIL PROTECTED]> wrote: > in theory, with 10 10k disks in raid 10, you should be able to keep > your 2fc link saturated all the time unless your i/o is extremely > random. random i/o is the wild card here, ideally you should see at > least 2000 se

Re: [PERFORM] Best COPY Performance

2006-10-25 Thread Worky Workerson
I'm guessing the high bursts are checkpoints. Can you check your log files for pg and see if you are getting warnings about checkpoint frequency? You can get some mileage here by increasing wal files. Nope, nothing in the log. I have set: wal_buffers=128 checkpoint_segments=128 checkpoint_ti

Re: [PERFORM] Best COPY Performance

2006-10-25 Thread Merlin Moncure
On 10/25/06, Worky Workerson <[EMAIL PROTECTED]> wrote: > I'm guessing the high bursts are checkpoints. Can you check your log > files for pg and see if you are getting warnings about checkpoint > frequency? You can get some mileage here by increasing wal files. Nope, nothing in the log. I h

Re: [PERFORM] Best COPY Performance

2006-10-25 Thread Jim C. Nasby
On Wed, Oct 25, 2006 at 08:03:38AM -0400, Worky Workerson wrote: > I'm just doing CSV style transformations (and calling a lot of > functions along the way), but the end result is a straight bulk load > of data into a blank database. And we've established that Postgres > can do *way* better than w

Re: [PERFORM] Best COPY Performance

2006-10-25 Thread Jim C. Nasby
On Tue, Oct 24, 2006 at 10:36:04PM -0700, Craig A. James wrote: > Jim C. Nasby wrote: > >Well, given that perl is using an entire CPU, it sounds like you should > >start looking either at ways to remove some of the overhead from perl, > >or to split that perl into multiple processes. > > I use Per

Re: [PERFORM] Best COPY Performance

2006-10-25 Thread Luke Lonergan
Mr. Worky Workerson, On 10/25/06 5:03 AM, "Worky Workerson" <[EMAIL PROTECTED]> wrote: > However, I am still curious as to the rather slow COPYs from psql to > local disks. Like I mentioned previously, I was only seeing about 5.7 > MB/s (1.8 GB / 330 seconds), where it seemed like others were do

Re: [PERFORM] Best COPY Performance

2006-10-25 Thread Merlin Moncure
On 10/23/06, Worky Workerson <[EMAIL PROTECTED]> wrote: The disk load is where I start to get a little fuzzy, as I haven't played with iostat to figure what is "normal". The local drives contain PG_DATA as well as all the log files, but there is a tablespace on the FibreChannel SAN that contains

Re: [PERFORM] Best COPY Performance

2006-10-25 Thread Markus Schaber
Hi, Worky, Worky Workerson wrote: > $ psql -c "COPY my_table TO STDOUT" > my_data > $ ls my_data > 2018792 edgescape_pg_load > $ time cat my_data | psql -c "COPY mytable FROM STDIN" > real5m43.194s > user0m35.412s > sys 0m9.567s That's via PSQL, and you get about 5 MB/Sec. >> On a t

Re: [PERFORM] Best COPY Performance

2006-10-25 Thread Worky Workerson
On 10/25/06, Craig A. James <[EMAIL PROTECTED]> wrote: Jim C. Nasby wrote: > Well, given that perl is using an entire CPU, it sounds like you should > start looking either at ways to remove some of the overhead from perl, > or to split that perl into multiple processes. I use Perl for big databa

Re: [PERFORM] Best COPY Performance

2006-10-25 Thread Alex Stapleton
On 23 Oct 2006, at 22:59, Jim C. Nasby wrote: http://stats.distributed.net used to use a perl script to do some transformations before loading data into the database. IIRC, when we switched to using C we saw 100x improvement in speed, so I suspect that if you want performance perl isn't the wa

Re: [PERFORM] Best COPY Performance

2006-10-24 Thread Craig A. James
Jim C. Nasby wrote: Well, given that perl is using an entire CPU, it sounds like you should start looking either at ways to remove some of the overhead from perl, or to split that perl into multiple processes. I use Perl for big database copies (usually with some processing/transformation alon

Re: [PERFORM] Best COPY Performance

2006-10-24 Thread Worky Workerson
http://stats.distributed.net used to use a perl script to do some transformations before loading data into the database. IIRC, when we switched to using C we saw 100x improvement in speed, so I suspect that if you want performance perl isn't the way to go. I think you can compile perl into C, so m

Re: [PERFORM] Best COPY Performance

2006-10-24 Thread Jim C. Nasby
On Tue, Oct 24, 2006 at 09:17:08AM -0400, Worky Workerson wrote: > >http://stats.distributed.net used to use a perl script to do some > >transformations before loading data into the database. IIRC, when we > >switched to using C we saw 100x improvement in speed, so I suspect that > >if you want per

Re: [PERFORM] Best COPY Performance

2006-10-24 Thread Jim C. Nasby
On Mon, Oct 23, 2006 at 03:37:47PM -0700, Craig A. James wrote: > Jim C. Nasby wrote: > >http://stats.distributed.net used to use a perl script to do some > >transformations before loading data into the database. IIRC, when we > >switched to using C we saw 100x improvement in speed, so I suspect th

Re: [PERFORM] Best COPY Performance

2006-10-24 Thread Worky Workerson
Markus, Could you COPY one of your tables out to disk via psql, and then COPY it back into the database, to reproduce this measurement with your real data? $ psql -c "COPY my_table TO STDOUT" > my_data $ ls my_data 2018792 edgescape_pg_load $ time cat my_data | psql -c "COPY mytable FROM STDIN

Re: [PERFORM] Best COPY Performance

2006-10-23 Thread Craig A. James
Jim C. Nasby wrote: http://stats.distributed.net used to use a perl script to do some transformations before loading data into the database. IIRC, when we switched to using C we saw 100x improvement in speed, so I suspect that if you want performance perl isn't the way to go. I think you can comp

Re: [PERFORM] Best COPY Performance

2006-10-23 Thread Joshua D. Drake
> Ultimately, you might be best of using triggers instead of rules for the > partitioning since then you could use copy. Or go to raw insert commands > that are wrapped in a transaction. My experience is that triggers are quite a bit faster than rules in any kind of partitioning that involves mor

Re: [PERFORM] Best COPY Performance

2006-10-23 Thread Jim C. Nasby
On Mon, Oct 23, 2006 at 11:10:19AM -0400, Worky Workerson wrote: > >> I am most interested in loading two tables, one with about 21 (small) > >> VARCHARs where each record is about 200 bytes, and another with 7 > >> INTEGERs, 3 TIMESTAMPs, and 1 BYTEA where each record is about 350 > >> bytes. > >

Re: [PERFORM] Best COPY Performance

2006-10-23 Thread Worky Workerson
> I am most interested in loading two tables, one with about 21 (small) > VARCHARs where each record is about 200 bytes, and another with 7 > INTEGERs, 3 TIMESTAMPs, and 1 BYTEA where each record is about 350 > bytes. indexes/keys? more memory for sorting during index creation can have a dramati

Re: [PERFORM] Best COPY Performance

2006-10-23 Thread Luke Lonergan
Markus, On 10/23/06 2:27 AM, "Markus Schaber" <[EMAIL PROTECTED]> wrote: > On a table with no indices, triggers and contstraints, we managed to > COPY about 7-8 megabytes/second with psql over our 100 MBit network, so > here the network was the bottleneck. We routinely get 10-12MB/s on I/O hardw

Re: [PERFORM] Best COPY Performance

2006-10-23 Thread Markus Schaber
Hi, Worky, Worky Workerson wrote: > I am currently getting > between 10K and 15K inserts/second. > I ran trivial little insert into a table with a single integer row and > came close to 250K inserts/second using psql's \copy, so I'm thinking > that my code could be optimized a bit more, but want

Re: [PERFORM] Best COPY Performance

2006-10-20 Thread Merlin Moncure
On 10/21/06, Worky Workerson <[EMAIL PROTECTED]> wrote: What is the best COPY performance that you have gotten on a "normal" table? I know that this is question is almost too general, but it might help me out a bit, or at least give me the right things to tweak. Perhaps the question can be rewr

[PERFORM] Best COPY Performance

2006-10-20 Thread Worky Workerson
What is the best COPY performance that you have gotten on a "normal" table? I know that this is question is almost too general, but it might help me out a bit, or at least give me the right things to tweak. Perhaps the question can be rewritten as "Where are the major bottlenecks in a COPY?" or