Re: [PERFORM] Hardware suggestions for Linux/PGSQL server

2003-12-15 Thread Jeff Bohmer
In the last exciting episode, [EMAIL PROTECTED] ("Andrew G. Hammond") wrote:
 I don't know what your budget is, but there are now 10k RPM SATA 150
 drives on the market. Their price/performance is impressive. You may
 want to consider going with a bunch of these instead of SCSI disks
 (more spindles vs. faster spindles). 3ware makes a hardware raid
 card that can drive up to 12 SATA disks. I have been told by a few
 people who have used it that the linux driver is very solid.
We got a couple of those in for testing purposes; when opportunity
presents itself, I'll have to check to see if they are any more honest
about commits than traditional IDE drives.
If they still "lie" the same way IDE drives do, it is entirely
possible that they are NOT nearly as impressive as you presently
imagine.  It's not much good if they're "way fast" if you can't trust
them to actually store data when they claim it is stored...
We lost data because of this very problem when a UPS didn't signal 
the shut down before it ran out of juice.

Here's an excellent explanation of the problem:
http://archives.postgresql.org/pgsql-general/2003-10/msg01343.php
This post indicates that SATA drives still have problems, but a new 
ATA standard might fix things in the future:
http://archives.postgresql.org/pgsql-general/2003-10/msg01395.php

SATA RAID is a good option for a testing server, though.

- Jeff
--
Jeff Bohmer
VisionLink, Inc.
_
303.402.0170
www.visionlink.org
_
People. Tools. Change. Community.
---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] Index problem or function problem?

2003-12-15 Thread Tom Lane
LIANHE SHAO <[EMAIL PROTECTED]> writes:
> PGA=> explain select ei.expid, er.geneid,
> er.sampleid, ei.annotation, si.samplename,  
> ei.title as exp_name, aaa.chip,
> aaa.sequence_derived_from as accession_number,
> aaa.gene_symbol, aaa.title as gene_function,
> er.exprs, er.mas5exprs from expressiondata er,
> experimentinfo ei, sampleinfo si,
> affy_array_annotation aaa where exists (select
> distinct ei.expid from experimentinfo) and
> ei.annotation =  aaa.chip and (lower (aaa.title)
> like '%mif%' or lower(aaa.sequence_description) like
> '%mif%') and exists (select distinct ei.annotation
> from experimentinfo) and ei.expid = er.expid and
> er.expid = si.expid and er.sampleid = si.sampleid
> and er.geneid = aaa.probeset_id order by si.sampleid
> limit 20;

What is the purpose of the EXISTS() clauses?  They are almost surely not
doing what you intended, because AFAICS they are just an extremely
expensive means of producing a constant-TRUE result.  In
exists (select distinct ei.expid from experimentinfo)
"ei.expid" is an outer reference, which will necessarily be the same
value over all rows of the sub-select.  After computing this same value
for every row of experimentinfo, the system performs a DISTINCT
operation (sort + unique, not cheap) ... and then all it checks for is
whether at least one row was produced, which means the DISTINCT
operation was completely unnecessary.  The only way the EXISTS could
return false is if experimentinfo were empty, but if it were so then the
outer FROM would've produced no rows and we'd not have got to WHERE
anyway.

I'm not sure why you get a worse plan for the simpler variant of the
query; it would help to see EXPLAIN ANALYZE rather than EXPLAIN output.
But it's not worth trying to improve the performance until you are
calculating correct answers, and I suspect the above is not doing
what you are after at all.

regards, tom lane

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] TRUNCATE veeeery slow compared to DELETE in 7.4

2003-12-15 Thread Tom Lane
Hartmut Raschick <[EMAIL PROTECTED]> writes:
> [ TRUNCATE is much slower in 7.4 than in 7.3 ]

After looking into this, I think this is because when Rod Taylor
reimplemented TRUNCATE to make it transaction-safe, he essentially
turned it into a variant of CLUSTER.  It is slow because it is creating
and deleting dummy tables and indexes.  I think this is not really
necessary and it could be done better while still being
transaction-safe.  All we really need is to create a new empty table
file, update the table's pg_class row with the new relfilenode, mark
the old file for deletion, and then run REINDEX TABLE (which will
perform similar shenanigans with the indexes).

regards, tom lane

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


[PERFORM] Unsubscribe

2003-12-15 Thread Rhaoni Chiu Pereira






---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly