Re: [PERFORM] which dual-CPU hardware/OS is fastest for PostgreSQL?

2005-01-14 Thread Benjamin Wragg
Hi,

From what I've been reading on the list for the last few months, adaptec
isn't that good when it comes to RAID controllers, but LSI keeps popping up.
Is there any particual models that are recommended as I'm in the market for
two new servers both with RAID controllers. The server specs I'm thinking
are as follows:

Box 1 
Fedora 64bit core 3
4 GB RAM (2GB per CPU)
2 x Opteron CPU ???
Tyan K8S
LSI® 53C1030 U320 SCSI controller Dual-channel 

Box 2
Fedora 64bit core 3
2 GB RAM (1GB per CPU)
2 x Opteron CPU ???
Tyan K8S
LSI® 53C1030 U320 SCSI controller Dual-channel 

This motherboard has can "Connects to PCI-X Bridge A, LSI® ZCR (Zero Channel
RAID) support (SCSI Interface Steering Logic)". I believe this means I can
get a LSI MegaRAID 320-0 which a few have mentioned on the list
(http://www.lsilogic.com/products/megaraid/scsi_320_0.html). It supports
RAID 10 and supports battery backed cache. Anyone had any experience with
this? 

Any other particular controller that people recommend? From what I've been
reading RAID 10 and battery backed cache sound like things I need. :)

Thanks,

Benjamin Wragg

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Josh Berkus
Sent: Tuesday, 11 January 2005 5:35 PM
To: pgsql-performance@postgresql.org
Cc: Christopher Browne
Subject: Re: [PERFORM] which dual-CPU hardware/OS is fastest for PostgreSQL?

Chris,

> I don't know so much about FreeBSD's handling of this, but on Linux, 
> there's pretty strong indication that _SOFTWARE_ RAID is faster than 
> hardware RAID.

Certainly better than an Adaptec.  But not necessarily better than a
medium-end RAID card, like an LSI.  It really depends on the quality of the
controller.

Also, expected concurrent activity should influence you.   On  a dedicated 
database server, you'll seldom max out the CPU but will often max of the 
disk, so the CPU required by software RAID is "free".   However, if you have

a Web/PG/E-mail box which frequently hits 100% CPU, then even a lower-end
RAID card can be beneficial simply by taking load off the CPU.

--
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

--
No virus found in this incoming message.
Checked by AVG Anti-Virus.
Version: 7.0.300 / Virus Database: 265.6.10 - Release Date: 10/01/2005
 

-- 
No virus found in this outgoing message.
Checked by AVG Anti-Virus.
Version: 7.0.300 / Virus Database: 265.6.11 - Release Date: 12/01/2005
 


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] sum of all values

2005-01-14 Thread Richard Huxton
Madison Kelly wrote:
Hi all,
  Is there a fast(er) way to get the sum of all integer values for a 
certain condition over many thousands of rows? What I am currently doing 
is this (which takes ~5-10sec.):
OK, I'm assuming you've configured PG to your satisfaction and this is 
the only query giving you problems.

SELECT SUM (a.file_size) FROM file_info_1 a, file_set_1 b WHERE 
a.file_name=b.fs_name AND a.file_parent_dir=b.fs_parent_dir AND 
a.file_type=b.fs_type AND b.fs_backup='t';
You'll want to run EXPLAIN ANALYSE SELECT SUM... and post the output of 
that, although the query looks straightforward enough.

  Here are the schemas, in case they help:
tle-bu=> \d file_info_1Table "public.file_info_1"
 Column  |  Type   | Modifiers
-+-+
 file_acc_time   | bigint  | not null
 file_group_name | text| not null
 file_group_uid  | integer | not null
 file_mod_time   | bigint  | not null
 file_name   | text| not null
 file_parent_dir | text| not null
 file_perm   | text| not null
 file_size   | bigint  | not null
 file_type   | text| not null default 'f'::text
 file_user_name  | text| not null
 file_user_uid   | integer | not null
Indexes:
"file_info_1_display_idx" btree (file_parent_dir, file_name, file_type)
"file_info_1_search_idx" btree (file_parent_dir, file_name, file_type)
tle-bu=> \d file_set_1   Table "public.file_set_1"
Column |  Type   | Modifiers
---+-+
 fs_backup | boolean | not null default true
 fs_display| boolean | not null default false
 fs_name   | text| not null
 fs_parent_dir | text| not null
 fs_restore| boolean | not null default false
 fs_type   | text| not null default 'f'::text
Indexes:
"file_set_1_sync_idx" btree (fs_parent_dir, fs_name, fs_type)
1. WHERE ARE YOUR PRIMARY KEYS???
2. Why do you have two identical indexes on file_info_1
3. WHERE ARE YOUR PRIMARY KEYS???
4. Am I right in thinking that always, file_name==fs_name (i.e. they 
represent the same piece of information) and if so, why are you storing 
it twice? Same for _parent_dir too
5. file_type/fs_type are being held as unbounded text? Not an index into 
some lookup table or a varchar(N)?

Can you explain what you're trying to do here - it might be you want to 
alter your database design.
--
  Richard Huxton
  Archonet Ltd

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


[PERFORM] query optimization help

2005-01-14 Thread sarlav kumar
Hi All,
 
I have the following query to generate a report grouped by "states".
 
SELECT distinct upper(cd.state) as mystate, SUM(d.amount) as total_amount, SUM(COALESCE(d.fee,0) + COALESCE(mp.seller_fee, 0) + COALESCE(mp.buyer_fee,0)) as total_fee FROM data d left JOIN customerdata cd ON d.uid = cd.uid LEFT JOIN merchant_purchase mp ON d.id = mp.data_id WHERE d.what IN (26,0, 15) AND d.flags IN (1,9,10,12 ) AND lower(cd.country) = 'us' AND date_part('year',d.time)= 2004 GROUP BY mystate ORDER BY mystate;
 mystate | total_amount | total_fee -+--+--- | 3695 | 0 AR  | 3000 | 0 AZ  | 1399 | 0 CA  |   113100 |  6242 FL  |   121191 |  9796 GA  | 34826876 |    47 GEORGIA |    57990 |   &nbs
 p; 
 3500 IEIE    |   114000 |  4849 MD  |    2 |  1158 MI  |   906447 | 0 NY  | 8000 |   600 PA  | 6200 |   375 SC  |    25000 |   600 TN  |  1443681 |  1124
 |    13300 | 0(15 rows)
If you notice, my problem in this query is that the records for GA, GEORGIA appear separately. But what I want to do is  to have them combined to a single entry with their values summed up . Initially we had accepted both formats as input for the state field. Also, there are some invalid entries for the state field (like the "IEIE" and null values), which appear because the input for state was not validated initially. These entries have to be eliminated from the report.This query did not take a long time to complete, but did not meet the needs for the report. 
 
So, the query was rewritten to the following query which takes nearly 7-8 mins to complete on our test database:
 
SELECT (SELECT DISTINCT pc.state FROM postalcode pc WHERE UPPER(cd.state) IN (pc.state, pc.state_code)) as mystate, SUM(d.amount) as total_amount, SUM(COALESCE(d.fee,0) + COALESCE(mp.seller_fee, 0) + COALESCE(mp.buyer_fee,0)) as total_fee FROM data d JOIN customerdata cd ON d.uid = cd.uid LEFT JOIN merchant_purchase mp ON d.id = mp.data_id WHERE d.what IN (26,0, 15) AND d.flags IN (1,9,10,12 ) AND lower(cd.country) = 'us' AND date_part('year', d.time) = 2004 GROUP BY mystate ORDER BY mystate;    mystate | total_amount | total_fee +--+--- ARIZONA    | 1399 | 0 ARKANSAS   | 3000
 | 0 CALIFORNIA |   113100 |  6242 FLORIDA    |   121191 |  9796 GEORGIA    | 34884866 |    482388 MARYLAND   |    2 |  1158 MICHIGAN   |   906447 | 0 NEW YORK   | 8000 |   600 PENNSYLVANIA   | 6200
 |   375 SOUTH CAROLINA |    25000 |   600 TENNESSEE  |  1443681 |  1124    |   130995 |  4849
 
Here is the explain analyze of this query:
  QUERY PLAN   Aggregate  (cost=1226.57..1226.58 rows=1 width=38) (actual time=362355.58..362372.09 rows=12 loops=1)   ->  Group  (cost=1226.57..1226.57 rows=1 width=38) (actual
 time=362355.54..362367.73 rows=2197 loops=1) ->  Sort  (cost=1226.57..1226.57 rows=1 width=38) (actual time=362355.53..362356.96 rows=2197 loops=1)   Sort Key: (subplan)   ->  Nested Loop  (cost=0.00..1226.56 rows=1 width=38) (actual time=166.11..362321.46 rows=2197 loops=1) ->  Nested Loop  (cost=0.00..1220.53 rows=1 width=26) (actual time=1.68..361.32 rows=2115 loops=1)   ->  Seq Scan on customerdata cd  (cost=0.00..274.32 ro
 ws=31
 width=10) (actual time=0.04..29.87 rows=3303 loops=1) Filter: (lower((country)::text) = 'us'::text)   ->  Index Scan using data_uid_idx on data d  (cost=0.00..30.08 rows=1 width=16) (actual time=0.04..0.09 rows=1 loops=3303) Index Cond: (d.uid =
 "outer".uid) Filter: (((what = 26) OR (what = 0) OR (what = 15)) AND ((flags = 1) OR (flags = 9) OR (flags = 10) OR (flags = 12)) AND (date_part('year'::text, "time") = 2004::double precision)) ->  Index Scan using merchant_purchase_data_idx on merchant_purchase mp  (cost=0.00..6.01 rows=1 width=12) (actual time=0.05..0.05 rows=1 loops=2115)   Index Cond: ("outer".id =
 mp.data_id) SubPlan   ->  Unique  (cost=2237.12..2243.22 rows=122 width=13) (actual time=161.25..164.68 rows=1 loops=2197) ->  Sort  (cost=2237.12..2240.17 rows=1220 width=13) (actual time=161.21..161.88 rows=1033 loops=2197)   Sort Key:
 

Re: [PERFORM] query optimization help

2005-01-14 Thread Merlin Moncure
Please post in plaintext, not html where possible.
Your group by clause was 'myst'...was that supposed to be mystate?

Her is something you might try...use the original query form and create a 
function which resolves the state code from the input data...you are already 
doing that with upper.

So,

create function get_state_code(text) returns char(2) as 
$$
select case when len($1) = 2 
then upper($1)
else lookup_state_code($1)
end;
$$
language sql stable;

lookup_state_code is a similar function which is boils down to a select from a 
lookup table.  Or, you could make a giant cast statement (when GEORGIA then GA, 
etc).  and now your function becomes IMMUTABLE and should execute very fast.  
Just make sure all the states are spelled correct in the original table via 
domain constraint.

Merlin



-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of sarlav kumar
Sent: Friday, January 14, 2005 9:40 AM
To: pgsqlnovice; pgsqlperform
Subject: [PERFORM] query optimization help

Hi All,
 
I have the following query to generate a report grouped by "states".
 
SELECT distinct upper(cd.state) as mystate, SUM(d.amount) as total_amount, 
SUM(COALESCE(d.fee,0) + COALESCE(mp.seller_fee, 0) + COALESCE(mp.buyer_fee,0)) 
as total_fee FROM data d left JOIN customerdata cd ON d.uid = cd.uid LEFT JOIN 
merchant_purchase mp ON d.id = mp.data_id WHERE d.what IN (26,0, 15) AND 
d.flags IN (1,9,10,12 ) AND lower(cd.country) = 'us' AND 
date_part('year',d.time)= 2004 GROUP BY myst
ate ORDER BY mystate;


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] query optimization help

2005-01-14 Thread sarlav kumar
Hi,
 
Thanks for the help. I actually got around with it by doing the following.
I created a temporary table:
 
create table statesnew as select distinct state,state_code from postalcode where lower(country)='us';
 
And then changed the query to :
 
SELECT (SELECT sn.state FROM statesnew sn WHERE UPPER(cd.state) IN (sn.state, sn.state_code)) as mystate, SUM(d.amount) as total_amount, SUM(COALESCE(d.fee,0) + COALESCE(mp.seller_fee, 0) + COALESCE(mp.buyer_fee,0)) as total_fee FROM data d JOIN customerdata cd ON d.uid = cd.uid LEFT JOIN merchant_purchase mp ON d.id = mp.data_id WHERE d.what IN (26,0, 15) AND d.flags IN (1,9,10,12 ) AND lower(cd.country) = 'us' AND date_part('year', d.time) = 2004 GROUP BY mystate ORDER BY mystate;
 
This worked well, as it reduced the number of entries it had to search from.
 
I am not sure how to use the function you have written. Can you give me pointers on that?
 
Thanks,
Saranya
 
Merlin Moncure <[EMAIL PROTECTED]> wrote:

Please post in plaintext, not html where possible.Your group by clause was 'myst'...was that supposed to be mystate?
Yes, It is mystate. It continues on the next line:)
Her is something you might try...use the original query form and create a function which resolves the state code from the input data...you are already doing that with upper.So,create function get_state_code(text) returns char(2) as $$select case when len($1) = 2 then upper($1)else lookup_state_code($1)end;$$language sql stable;lookup_state_code is a similar function which is boils down to a select from a lookup table. Or, you could make a giant cast statement (when GEORGIA then GA, etc). and now your function becomes IMMUTABLE and should execute very fast. Just make sure all the states are spelled correct in the original table via domain constraint.Merlin__Do You Yahoo!?Tired of spam?  Yahoo! Mail has the best spam protection around http://mail.yahoo.com 

Re: [PERFORM] sum of all values

2005-01-14 Thread Madison Kelly
Richard Huxton wrote:
Madison Kelly wrote:
Hi all,
  Is there a fast(er) way to get the sum of all integer values for a 
certain condition over many thousands of rows? What I am currently 
doing is this (which takes ~5-10sec.):

OK, I'm assuming you've configured PG to your satisfaction and this is 
the only query giving you problems.
  This is a program for general consumption (hopefully... 
eventually...) so I want to leave the psql config alone. Once I am 
happier with the program I will try different tuning options and write a 
faq though I expect 9 out of 10 users won't read it.

SELECT SUM (a.file_size) FROM file_info_1 a, file_set_1 b WHERE 
a.file_name=b.fs_name AND a.file_parent_dir=b.fs_parent_dir AND 
a.file_type=b.fs_type AND b.fs_backup='t';

You'll want to run EXPLAIN ANALYSE SELECT SUM... and post the output of 
that, although the query looks straightforward enough.
tle-bu=> EXPLAIN ANALYZE SELECT SUM (a.file_size) FROM file_info_1 a, 
file_set_1 b WHERE a.file_name=b.fs_name AND 
a.file_parent_dir=b.fs_parent_dir AND a.file_type=b.fs_type AND 
b.fs_backup='t';

QUERY PLAN

 Aggregate  (cost=2202.54..2202.54 rows=1 width=8) (actual 
time=5078.744..5078.748 rows=1 loops=1)
   ->  Merge Join  (cost=724.94..2202.51 rows=11 width=8) (actual 
time=3281.677..4969.719 rows=12828 loops=1)
 Merge Cond: (("outer".file_parent_dir = "inner".fs_parent_dir) 
AND ("outer".file_name = "inner".fs_name) AND ("outer".file_type = 
"inner".fs_type))
 ->  Index Scan using file_info_1_search_idx on file_info_1 a 
(cost=0.00..1317.11 rows=12828 width=104) (actual time=0.042..116.825 
rows=12828 loops=1)
 ->  Sort  (cost=724.94..740.97 rows=6414 width=96) (actual 
time=3281.516..3350.640 rows=12828 loops=1)
   Sort Key: b.fs_parent_dir, b.fs_name, b.fs_type
   ->  Seq Scan on file_set_1 b  (cost=0.00..319.35 
rows=6414 width=96) (actual time=0.029..129.129 rows=12828 loops=1)
 Filter: (fs_backup = true)
 Total runtime: 5080.729 ms
(9 rows)

  Here are the schemas, in case they help:
tle-bu=> \d file_info_1Table "public.file_info_1"
 Column  |  Type   | Modifiers
-+-+
 file_acc_time   | bigint  | not null
 file_group_name | text| not null
 file_group_uid  | integer | not null
 file_mod_time   | bigint  | not null
 file_name   | text| not null
 file_parent_dir | text| not null
 file_perm   | text| not null
 file_size   | bigint  | not null
 file_type   | text| not null default 'f'::text
 file_user_name  | text| not null
 file_user_uid   | integer | not null
Indexes:
"file_info_1_display_idx" btree (file_parent_dir, file_name, 
file_type)
"file_info_1_search_idx" btree (file_parent_dir, file_name, 
file_type)

tle-bu=> \d file_set_1   Table "public.file_set_1"
Column |  Type   | Modifiers
---+-+
 fs_backup | boolean | not null default true
 fs_display| boolean | not null default false
 fs_name   | text| not null
 fs_parent_dir | text| not null
 fs_restore| boolean | not null default false
 fs_type   | text| not null default 'f'::text
Indexes:
"file_set_1_sync_idx" btree (fs_parent_dir, fs_name, fs_type)

1. WHERE ARE YOUR PRIMARY KEYS???
2. Why do you have two identical indexes on file_info_1
3. WHERE ARE YOUR PRIMARY KEYS???
4. Am I right in thinking that always, file_name==fs_name (i.e. they 
represent the same piece of information) and if so, why are you storing 
it twice? Same for _parent_dir too
5. file_type/fs_type are being held as unbounded text? Not an index into 
some lookup table or a varchar(N)?

Can you explain what you're trying to do here - it might be you want to 
alter your database design.
--
  Richard Huxton
  Archonet Ltd
  This is where I have to admit my novice level of knowledge. Until now 
I have been more concerned with "making it work". It is only now that I 
have finished (more or less) the program that I have started going back 
and trying to find ways to speed it up. I have not used postgres (or 
perl or anything) before this program. I hope my questions aren't too 
basic. ^.^;

  I keep hearing about Primary Keys but I can't say that I know what 
they are or how they are used. If I do understand, it is a way to 
reference another table's entry (using a foreign key)? The two matching 
indexes is a typo in my program that I hadn't noticed, I'll fix that asap.

  Here is what the database is used for:
  This is a backup program and I use the DB to store extended 
information on all selected files and directories on a partition. Each 
partition has it's own 'file_info_#' and 'file_set_#' tables where '#' 
matches the ID stored for that partition in the DB in another table.

  The 'file_info_#' table stored the data that 

Re: [PERFORM] Best filesystem for PostgreSQL Database Cluster under Linux

2005-01-14 Thread Alex Turner
On Wed, 12 Jan 2005 07:25:43 +1100, Pete de Zwart <[EMAIL PROTECTED]> wrote:
[snip]
> improvement on I/O compared to the improvement potential of moving to
> SCSI/FCAL, that and getting more memory.
> 

I would like to ask the question that continues to loom large over all
DBAs.  SCSI, FCAL and SATA, which works best.

Most FCAL loops have a speed limit of either 1Gbps or 2Gbps.  This is
only 100MB/sec or 200MB/sec.  U320 SCSI can handle 320MB/sec and the
AMCC (formerly 3Ware) SATA Raid cards show throughput over 400MB/sec
with good IOs/sec on PCI-X.

I am not prepared to stand by whilst someone makes a sideways claim
that SCSI or FCAL is implicitly going to give better performance than
anything else.  It will depend on your data set, and how you configure
your drives, and how good your controller is.  We have a Compaq Smart
Array controler with a 3 drive RAID 5 than can't break 10MB/sec write
on a Bonnie++ benchmark.  This is virtualy the slowest system in our
datacenter, but has a modern controler and 10k disks, whilst our PATA
systems manage much better throughput. (Yes I know that MB/sec is not
the only speed measure, it also does badly on IO/sec).

Alex Turner
NetEconomist

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] which dual-CPU hardware/OS is fastest for PostgreSQL?

2005-01-14 Thread Alex Turner
Without starting too much controvesy I hope, I would seriously
recommend you evaluate the AMCC Escalade 9500S SATA controller.  It
has many of the features of a SCSI controler, but works with cheaper
drives, and for half the price or many SCSI controlers (9500S-8MI goes
for abour $500).  See http://plexq.com/~aturner/3ware.pdf for their 4
way, 8 way and 12 way RAID benchmarks including RAID 0, RAID 5 and
RAID 10.  If others have similar data, I would be very interested to
see how it stacks up against other RAID controllers.

Alex Turner
NetEconomist


On Fri, 14 Jan 2005 16:52:42 +1100, Benjamin Wragg <[EMAIL PROTECTED]> wrote:
> Hi,
> 
> From what I've been reading on the list for the last few months, adaptec
> isn't that good when it comes to RAID controllers, but LSI keeps popping up.
> Is there any particual models that are recommended as I'm in the market for
> two new servers both with RAID controllers. The server specs I'm thinking
> are as follows:
> 
> Box 1
> Fedora 64bit core 3
> 4 GB RAM (2GB per CPU)
> 2 x Opteron CPU ???
> Tyan K8S
> LSI® 53C1030 U320 SCSI controller Dual-channel
> 
> Box 2
> Fedora 64bit core 3
> 2 GB RAM (1GB per CPU)
> 2 x Opteron CPU ???
> Tyan K8S
> LSI® 53C1030 U320 SCSI controller Dual-channel
> 
> This motherboard has can "Connects to PCI-X Bridge A, LSI® ZCR (Zero Channel
> RAID) support (SCSI Interface Steering Logic)". I believe this means I can
> get a LSI MegaRAID 320-0 which a few have mentioned on the list
> (http://www.lsilogic.com/products/megaraid/scsi_320_0.html). It supports
> RAID 10 and supports battery backed cache. Anyone had any experience with
> this?
> 
> Any other particular controller that people recommend? From what I've been
> reading RAID 10 and battery backed cache sound like things I need. :)
> 
> Thanks,
> 
> Benjamin Wragg
> 
> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Josh Berkus
> Sent: Tuesday, 11 January 2005 5:35 PM
> To: pgsql-performance@postgresql.org
> Cc: Christopher Browne
> Subject: Re: [PERFORM] which dual-CPU hardware/OS is fastest for PostgreSQL?
> 
> Chris,
> 
> > I don't know so much about FreeBSD's handling of this, but on Linux,
> > there's pretty strong indication that _SOFTWARE_ RAID is faster than
> > hardware RAID.
> 
> Certainly better than an Adaptec.  But not necessarily better than a
> medium-end RAID card, like an LSI.  It really depends on the quality of the
> controller.
> 
> Also, expected concurrent activity should influence you.   On  a dedicated
> database server, you'll seldom max out the CPU but will often max of the
> disk, so the CPU required by software RAID is "free".   However, if you have
> 
> a Web/PG/E-mail box which frequently hits 100% CPU, then even a lower-end
> RAID card can be beneficial simply by taking load off the CPU.
> 
> --
> Josh Berkus
> Aglio Database Solutions
> San Francisco
> 
> ---(end of broadcast)---
> TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
> 
> --
> No virus found in this incoming message.
> Checked by AVG Anti-Virus.
> Version: 7.0.300 / Virus Database: 265.6.10 - Release Date: 10/01/2005
> 
> --
> No virus found in this outgoing message.
> Checked by AVG Anti-Virus.
> Version: 7.0.300 / Virus Database: 265.6.11 - Release Date: 12/01/2005
> 
> 
> ---(end of broadcast)---
> TIP 7: don't forget to increase your free space map settings
>

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] which dual-CPU hardware/OS is fastest for PostgreSQL?

2005-01-14 Thread Greg Stark
Jan Dittmer <[EMAIL PROTECTED]> writes:

> You could always do raid 1 over raid 0, with newer kernels (2.6ish)
> there is even a dedicated raid10 driver.

Aren't you much better off doing raid 0 over raid 1? 

With raid 1 over raid 0 you're mirroring two stripe sets. That means if any
drive from the first stripe set goes you lose the whole side of the mirror. If
any drive of the second stripe set goes you lost your array. Even if they're
not the same position in the array.

If you do raid 0 over raid 1 then you're striping a series of mirrored drives.
So if any drive fails you only lose that drive from the stripe set. If another
drive fails then you're ok as long as it isn't the specific drive that was
paired with the first failed drive.

-- 
greg


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] which dual-CPU hardware/OS is fastest for PostgreSQL?

2005-01-14 Thread Merlin Moncure
Alex wrote:
> Without starting too much controvesy I hope, I would seriously
> recommend you evaluate the AMCC Escalade 9500S SATA controller.  It
> has many of the features of a SCSI controler, but works with cheaper
> drives, and for half the price or many SCSI controlers (9500S-8MI goes
> for abour $500).  See http://plexq.com/~aturner/3ware.pdf for their 4
> way, 8 way and 12 way RAID benchmarks including RAID 0, RAID 5 and
> RAID 10.  If others have similar data, I would be very interested to
> see how it stacks up against other RAID controllers.

At the risk of shaming myself with another 'me too' post, I'd like to
say that my experiences back this up 100%.  The Escalade controllers are
excellent and the Raptor drives are fast and reliable (so far).  With
the money saved from going SCSI, instead of a RAID 5 a 10 could be built
for roughly the same price and capacity, guess which array is going to
be faster?

I think the danger about SATA is that many SATA components are not
server quality, so you have to be more careful about what you buy.  For
example, you can't just assume your SATA backplane has hot swap lights
(got bit by this one myself, heh).  

Merlin


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] which dual-CPU hardware/OS is fastest for PostgreSQL?

2005-01-14 Thread Josh Berkus
Merlin,

> I think the danger about SATA is that many SATA components are not
> server quality, so you have to be more careful about what you buy.  For
> example, you can't just assume your SATA backplane has hot swap lights
> (got bit by this one myself, heh).

Yeah, that's my big problem with anything IDE.My personal experience of 
failure rates for IDE drives, for example, is about 1 out of 10 fails in 
service before it's a year old; SCSI has been more like 1 out of 50.  

Also, while I've seen benchmarks like Escalade's, my real-world experience has 
been that the full bi-directional r/w of SCSI means that it takes 2 SATA 
drives to equal one SCSI drive in a heavy r/w application.   However, ODSL is 
all SCSI so I don't have any numbers to back that up.

But one of my clients needs a new docs server, so maybe I can give an Escalade 
a spin.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


[PERFORM] Index on a function and SELECT DISTINCT

2005-01-14 Thread Adrian Holovaty
If I have this table, function and index in Postgres 7.3.6 ...

"""
CREATE TABLE news_stories (
id serial primary key NOT NULL,
pub_date timestamp with time zone NOT NULL,
...
)
CREATE OR REPLACE FUNCTION get_year_trunc(timestamp with time zone) returns 
timestamp with time zone AS 'SELECT date_trunc(\'year\',$1);' LANGUAGE 'SQL' 
IMMUTABLE;
CREATE INDEX news_stories_pub_date_year_trunc ON 
news_stories( get_year_trunc(pub_date) );
"""
 
...why does this query not use the index?
 
db=# EXPLAIN SELECT DISTINCT get_year_trunc(pub_date) FROM news_stories;
   QUERY PLAN
-
 Unique  (cost=59597.31..61311.13 rows=3768 width=8)
   ->  Sort  (cost=59597.31..60454.22 rows=342764 width=8)
 Sort Key: date_trunc('year'::text, pub_date)
 ->  Seq Scan on news_stories  (cost=0.00..23390.55 rows=342764 
width=8)
(4 rows)

The query is noticably slow (2 seconds) on a database with 150,000+ records. 
How can I speed it up?

Thanks,
Adrian

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] sum of all values

2005-01-14 Thread Richard Huxton
Madison Kelly wrote:
Richard Huxton wrote:
Madison Kelly wrote:
Hi all,
  Is there a fast(er) way to get the sum of all integer values for a 
certain condition over many thousands of rows? What I am currently 
doing is this (which takes ~5-10sec.):
OK, I'm assuming you've configured PG to your satisfaction and this is 
the only query giving you problems.
  This is a program for general consumption (hopefully... eventually...) 
so I want to leave the psql config alone. Once I am happier with the 
program I will try different tuning options and write a faq though I 
expect 9 out of 10 users won't read it.
PostgreSQL is not FireFox, and you can't expect it to work efficiently 
without doing at least some configuration. The settings to support 100 
simultaneous connections on a dual-Opteron with 8GB RAM are not the same 
as on a single-user laptop.
Take half an hour to read through the performance-tuning guide here:
  http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html

SELECT SUM (a.file_size) FROM file_info_1 a, file_set_1 b WHERE 
a.file_name=b.fs_name AND a.file_parent_dir=b.fs_parent_dir AND 
a.file_type=b.fs_type AND b.fs_backup='t';
You'll want to run EXPLAIN ANALYSE SELECT SUM... and post the output 
of that, although the query looks straightforward enough.
tle-bu=> EXPLAIN ANALYZE SELECT SUM (a.file_size) FROM file_info_1 a, 
file_set_1 b WHERE a.file_name=b.fs_name AND 
a.file_parent_dir=b.fs_parent_dir AND a.file_type=b.fs_type AND 
b.fs_backup='t';

QUERY PLAN

 Aggregate  (cost=2202.54..2202.54 rows=1 width=8) (actual 
time=5078.744..5078.748 rows=1 loops=1)
   ->  Merge Join  (cost=724.94..2202.51 rows=11 width=8) (actual 
time=3281.677..4969.719 rows=12828 loops=1)
 Merge Cond: (("outer".file_parent_dir = "inner".fs_parent_dir) 
AND ("outer".file_name = "inner".fs_name) AND ("outer".file_type = 
"inner".fs_type))
 ->  Index Scan using file_info_1_search_idx on file_info_1 a 
(cost=0.00..1317.11 rows=12828 width=104) (actual time=0.042..116.825 
rows=12828 loops=1)
 ->  Sort  (cost=724.94..740.97 rows=6414 width=96) (actual 
time=3281.516..3350.640 rows=12828 loops=1)
   Sort Key: b.fs_parent_dir, b.fs_name, b.fs_type
   ->  Seq Scan on file_set_1 b  (cost=0.00..319.35 
rows=6414 width=96) (actual time=0.029..129.129 rows=12828 loops=1)
 Filter: (fs_backup = true)
 Total runtime: 5080.729 ms
Well, it's slow, but that's probably your settings. Run VACUUM ANALYSE 
on the tables though, it looks like you've got default statistics (It's 
expecting exactly 1/2 the fs_backup values to be true - 6414 out of 12828).

  Here are the schemas, in case they help:
tle-bu=> \d file_info_1Table "public.file_info_1"
 Column  |  Type   | Modifiers
-+-+
 file_acc_time   | bigint  | not null
 file_group_name | text| not null
 file_group_uid  | integer | not null
 file_mod_time   | bigint  | not null
 file_name   | text| not null
 file_parent_dir | text| not null
 file_perm   | text| not null
 file_size   | bigint  | not null
 file_type   | text| not null default 'f'::text
 file_user_name  | text| not null
 file_user_uid   | integer | not null
Indexes:
"file_info_1_display_idx" btree (file_parent_dir, file_name, 
file_type)
"file_info_1_search_idx" btree (file_parent_dir, file_name, 
file_type)

tle-bu=> \d file_set_1   Table "public.file_set_1"
Column |  Type   | Modifiers
---+-+
 fs_backup | boolean | not null default true
 fs_display| boolean | not null default false
 fs_name   | text| not null
 fs_parent_dir | text| not null
 fs_restore| boolean | not null default false
 fs_type   | text| not null default 'f'::text
Indexes:
"file_set_1_sync_idx" btree (fs_parent_dir, fs_name, fs_type)

1. WHERE ARE YOUR PRIMARY KEYS???
2. Why do you have two identical indexes on file_info_1
3. WHERE ARE YOUR PRIMARY KEYS???
4. Am I right in thinking that always, file_name==fs_name (i.e. they 
represent the same piece of information) and if so, why are you 
storing it twice? Same for _parent_dir too
5. file_type/fs_type are being held as unbounded text? Not an index 
into some lookup table or a varchar(N)?

Can you explain what you're trying to do here - it might be you want 
to alter your database design.
--
  Richard Huxton
  Archonet Ltd
  This is where I have to admit my novice level of knowledge. Until now 
I have been more concerned with "making it work". It is only now that I 
have finished (more or less) the program that I have started going back 
and trying to find ways to speed it up. I have not used postgres (or 
perl or anything) before this program. I hope my questions aren't too 
basic. ^.^;
There's a rule of thumb about throwing the 

Re: [PERFORM] which dual-CPU hardware/OS is fastest for PostgreSQL?

2005-01-14 Thread Greg Stark
Josh Berkus  writes:

> Merlin,
> 
> > I think the danger about SATA is that many SATA components are not
> > server quality, so you have to be more careful about what you buy.  For
> > example, you can't just assume your SATA backplane has hot swap lights
> > (got bit by this one myself, heh).
> 
> Yeah, that's my big problem with anything IDE.My personal experience of 
> failure rates for IDE drives, for example, is about 1 out of 10 fails in 
> service before it's a year old; SCSI has been more like 1 out of 50.  

Um. I'm pretty sure the actual hardware is just the same stuff. It's just the
interface electronics that change.

> Also, while I've seen benchmarks like Escalade's, my real-world experience 
> has 
> been that the full bi-directional r/w of SCSI means that it takes 2 SATA 
> drives to equal one SCSI drive in a heavy r/w application.   However, ODSL is 
> all SCSI so I don't have any numbers to back that up.

Do we know that these SATA/IDE controllers and drives don't "lie" about fsync
the way most IDE drives do? Does the controller just automatically disable the
write caching entirely?

I don't recall, did someone have a program that tested the write latency of a
drive to test this?

-- 
greg


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] which dual-CPU hardware/OS is fastest for PostgreSQL?

2005-01-14 Thread Merlin Moncure
Greg wrote:
> Josh Berkus  writes:
> 
> > Merlin,
> >
> > > I think the danger about SATA is that many SATA components are not
> > > server quality, so you have to be more careful about what you buy.
> For
> > > example, you can't just assume your SATA backplane has hot swap
lights
> > > (got bit by this one myself, heh).
> >
> > Yeah, that's my big problem with anything IDE.My personal
experience
> of
> > failure rates for IDE drives, for example, is about 1 out of 10
fails in
> > service before it's a year old; SCSI has been more like 1 out of 50.
> 
> Um. I'm pretty sure the actual hardware is just the same stuff. It's
just
> the
> interface electronics that change.
> 
> > Also, while I've seen benchmarks like Escalade's, my real-world
> experience has
> > been that the full bi-directional r/w of SCSI means that it takes 2
SATA
> > drives to equal one SCSI drive in a heavy r/w application.
However,
> ODSL is
> > all SCSI so I don't have any numbers to back that up.
> 
> Do we know that these SATA/IDE controllers and drives don't "lie"
about
> fsync
> the way most IDE drives do? Does the controller just automatically
disable
> the
> write caching entirely?
> 
> I don't recall, did someone have a program that tested the write
latency
> of a
> drive to test this?
> 
> --
> greg

The Escalades, at least, work the way they are supposed to.  The raid
controller supports write back/write through.  Thus, you can leave fsync
on in pg with decent performance (not as good as fsync=off, though) and
count on the bbu to cover you in the event of a power failure.  Our
internal testing here confirmed the controller and the disks sync when
you tell them to (namely escalade/raptor).

Merlin

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] which dual-CPU hardware/OS is fastest for PostgreSQL?

2005-01-14 Thread Jan Dittmer
Greg Stark wrote:
> Jan Dittmer <[EMAIL PROTECTED]> writes:
> 
> 
>>You could always do raid 1 over raid 0, with newer kernels (2.6ish)
>>there is even a dedicated raid10 driver.
> 
> 
> Aren't you much better off doing raid 0 over raid 1? 
> 
> With raid 1 over raid 0 you're mirroring two stripe sets. That means if any
> drive from the first stripe set goes you lose the whole side of the mirror. If
> any drive of the second stripe set goes you lost your array. Even if they're
> not the same position in the array.
> 
> If you do raid 0 over raid 1 then you're striping a series of mirrored drives.
> So if any drive fails you only lose that drive from the stripe set. If another
> drive fails then you're ok as long as it isn't the specific drive that was
> paired with the first failed drive.


Ever heart of Murphy? :-) But of course you're right - I tend to mix up
the raid levels...

Jan

---(end of broadcast)---
TIP 8: explain analyze is your friend