[PERFORM] Insert performance

2003-08-16 Thread Shridhar Daithankar
Hi all,

Couple of days ago, one of my colleague, Rahul Iyer posted a query regarding 
insert performance of 5M rows. A common suggestion was to use copy.

Unfortunately he can not use copy due to some constraints. I was helping him to 
get maximum out of it. We were playing with a data set of 500K rows on SunOS5.6 
and postgresql 7.3.3

Best we could get was 500K records in 355 sec. That's roughly 1400 inserts per 
sec. This was with default settings and around 10K inserts per transaction. 
Postgresql was hogging all the CPU during load.

Troubled by this, I set up a similar database at home. That is a Athlon 
XP2000+/512MB machine with a 40GB seagate disk. It is running slackware 9.0 
with 2.4.20(IIRC).

I have attached the results of my experiements and the code I used to 
benchmark. It was a simple table with an integer and a varchar(30) field.

I was really amazed to see the numbers. First of all, it beat the sunOS machine 
 left and right. Bruce posted some numbers of 9K inserts/sec. Here we see the 
same.

Secondly I also played with filesystems. Ext3 does not seem to be performing as 
good. Reiser and ext2 did fine. Unfortunately the kernel didn't support XFS/JFS 
so could not test them.

I have also attached the raw benchmark data in kspread format, for the curious. 
Didn't exported to anything else because kspread had troubles with exporting 
formula values.

I also noticed that reiser takes hell lot more CPU than ext2 and ext3. It 
nearly peaks out all CPU capacity. Not so with ext2.

Comments? One thing I can't help to notice is sunOs is not on same scale. The 
sunOS machine is a 1GB RAM machine. It has oracle and mysql running on it and 
have 300MB swap in use but I am sure it has SCSI disk and in all respect I 
would rather expect a RISC machine to perform better than an athlon XP machine, 
at least for an IO.

If you want me to post details of sparc machine, please let me know how do I 
find it. I have never worked with sparcs earlier and have no intention of doing 
this again..:-)

Bye
 Shridhar

--
Fourth Law of Applied Terror:   The night before the English History mid-term, 
your Biologyinstructor will assign 200 pages on planaria.Corollary: Every 
instructor assumes that you have nothing else to do except  study for that 
instructor's course.



insert.c
Description: Binary data


Results.zip
Description: Zip archive


pgbenchmark.zip
Description: Zip archive

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] Insert performance

2003-08-16 Thread Josh Berkus
Shridhar,

> Unfortunately he can not use copy due to some constraints. 

Why not use COPY to load the table, and then apply the constraints by query 
afterwords?  It might not be faster, but then again it might.

> I was really amazed to see the numbers. First of all, it beat the sunOS
> machine left and right. Bruce posted some numbers of 9K inserts/sec. Here
> we see the same.

> Comments? One thing I can't help to notice is sunOs is not on same scale.
> The sunOS machine is a 1GB RAM machine. It has oracle and mysql running on
> it and have 300MB swap in use but I am sure it has SCSI disk and in all
> respect I would rather expect a RISC machine to perform better than an
> athlon XP machine, at least for an IO.

It's been reported on this list several times that Solaris is the *worst* of 
the *nixes for PostgreSQL performance.   No analysis has been posted as to 
why; my own thoughts are:
- Solaris' multi-threaded architecture which imposes a hefty per-process 
overhead, about triple that of Linux, slowing new connections and large 
multi-user activity;
- Poor filesystem management; Sun simply hasn't kept up with IBM, Reiser, Red 
Hat and BSD in developing filesystems.
... but that's based on inadequate experimentation, just a few tests on 
Bonnie++ on a Netra running Solaris 8.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [PERFORM] Insert performance

2003-08-16 Thread Christopher Browne
Martha Stewart called it a Good Thing [EMAIL PROTECTED] (Josh Berkus)wrote:
> Shridhar,
>> Unfortunately he can not use copy due to some constraints. 

> Why not use COPY to load the table, and then apply the constraints
> by query afterwords?  It might not be faster, but then again it
> might.

If you can transform the information into COPYable form, that's
certainly a good thing.  Whether it will be workable or not is another
question.

> ... but that's based on inadequate experimentation, just a few tests
> on Bonnie++ on a Netra running Solaris 8.

As far as the filesystem issues are concerned, you're probably using
an "old, standard" version of UFS.  The "high performance" option on
Solaris involves using third-party Veritas software.

The persistence of this is somewhat surprising; I'd somewhat have
expected Sun to have bought out Veritas or some such thing, as it's a
pretty vital technology that _isn't_ totally under their control.
Actually, an entertaining option would be for them to buy out SGI, as
that would get them control of XFS and a number of other interesting
technologies.

My expectations of a Netra wouldn't be terribly high, either; they
seem to exist as a product so that people that need a cheap Sun box
have an option.  They are mostly running IDE disk, and the latest
IA-32 hardware is likely to have newer faster interface options.
-- 
(format nil "[EMAIL PROTECTED]" "cbbrowne" "ntlug.org")
http://cbbrowne.com/info/advocacy.html
"Windows NT was designed to be administered by an idiot and usually
is..."  -- Chris Adams" <[EMAIL PROTECTED]>

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] Benchmark

2003-08-16 Thread David Emery
On Saturday, August 16, 2003, at 01:21 AM, Josh Berkus wrote:

Mixo,

I need to is measure the perfomance of a ticketing system (written in  
perl)
which has web interface (html::mason, apache2) with Pg as a backend.  
Users
of the ticketing system can only connect to the backend via the web
interface
I'd suggest Perl LWP.  There's even a good article on how to use it in  
last
month's Linux Magazine (or the previous month, not sure).

The Perl module HTTP::WebTest should help you do what you want.

http://search.cpan.org/author/ILYAM/HTTP-WebTest-2.03/lib/HTTP/ 
WebTest.pm

It should save a ton of time over building one from scratch. Lots of  
docs too. There are other Perl web testing tools too if you do a search  
at:

http://search.cpan.org

HTH,
Dave
---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match