[PERFORM] Trying to track down weird query stalls

2009-03-30 Thread dan
I'm running a 64-bit build of Postgres 8.3.5 on AIX 5.3, and have a really
strange, annoying transient problem with one particular query stalling.

The symptom here is that when this query is made with X or more records in
a temp table involved in the join (where X is constant when the problem
manifests, but is different between manifestations) the query takes about
20 minutes. When the query is made with X-1 records it takes less than a
second. It's just this one query -- for everything else the system's nice
and snappy. The machine load's never above 9 (it's a 32 CPU box) and
hasn't had less than 60G of free system memory on it.

An EXPLAIN ANALYZE of the two queries (with X-1 and X records) is even
more bizarre. Not only are the two query plans identical (save trivial
differences because of the record count difference), but the EXPLAIN
ANALYZE total runtimes are near-identical -- the fast case showed 259ms,
the slow case 265ms.

When the slow query was actually run, though, it took 20 minutes. There
were no blocked back ends shown in pg_stat_activity, and the back end
itself was definitely moving. I trussed the back end stuck running the
slow query and it spent nearly all its time doing kread() and kwrite()
calls. The DB log didn't show anything interesting in it. I checked to
make sure the SQL statement that was EXPLAIN ANALYZEd was the one actually
executed, and I pulled the client code into the debugger and
single-stepped through just to make sure it was getting stuck on that one
SQL statement and it wasn't the client doing something unexpected.

Just to be even more annoying, the problem goes away. Right now I can't
trigger the problem. Last Friday it happened reliably feeding 58 records
into this query. The week before it was 38 records. Today? Nothing, the
system's full of snappy.

At the moment I'm at a loss as to what's going wrong, and the fact that I
can't duplicate it right now when I actually have time to look into the
problem's damned annoying. What I'm looking for are some hints as to where
to look the next time it does happen, or things I can flip on to catch
more in the log. (The logging parameters are all set at their defaults)
I'm going to try attaching with dbx to get a stack trace (I assume this is
relatively safe?) but past that I'm kind of stumped.
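
For what it's worth, this is the sort of logging setup I'm thinking of
turning on for next time -- just a sketch, with starting-point values I
made up:

  # log anything running longer than 5 seconds (value is in milliseconds)
  log_min_duration_statement = 5000
  # 8.3 can log when a backend waits on a lock longer than deadlock_timeout
  log_lock_waits = on
  deadlock_timeout = 1s
  # tag log lines with a timestamp and the backend PID
  log_line_prefix = '%t [%p] '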

Help?







Re: [PERFORM] Trying to track down weird query stalls

2009-03-30 Thread dan
> On Mon, Mar 30, 2009 at 1:50 PM,   wrote:
>> I'm running a 64-bit build of Postgres 8.3.5 on AIX 5.3, and have a
>> really
>> strange, annoying transient problem with one particular query stalling.
>>
>> The symptom here is that when this query is made with X or more records
>> in
>> a temp table involved in the join (where X is constant when the problem
>> manifests, but is different between manifestations) the query takes
>> about
>> 20 minutes. When the query is made with X-1 records it takes less than a
>> second. It's just this one query -- for everything else the system's
>> nice
>> and snappy. The machine load's never above 9 (it's a 32 CPU box) and
>> hasn't had less than 60G of free system memory on it.
>>
>> An EXPLAIN ANALYZE of the two queries (with X-1 and X records) is even
>> more bizarre. Not only are the two query plans identical (save trivial
>> differences because of the record count differences) but the explain
>> EXPLAIN ANALYZE total runtimes are near-identical -- the fast case
>> showed
>> 259ms, the slow case 265ms.
>
> log_min_duration_statement is a good place to start, but it sounds
> like the query plan you're getting when you test it by hand isn't the
> same one that's actually executing, so I'm not sure how far that's
> going to get you.  contrib/auto_explain sounds like it would be just
> the thing, but unfortunately that's an 8.4 feature which hasn't been
> released yet.  I'm not sure whether it could be built and run against
> 8.3.

I'm not executing any of the EXPLAINs by hand, because I didn't want to
have to worry about typos or filling in temp tables with test data. Inside
the app the SQL for the problematic query's stored in a variable -- when
the task runs with debugging enabled it first executes the query with
EXPLAIN ANALYZE prepended and dumps the output, then it executes the query
itself. It's possible something's going wrong in that, but the code's
pretty simple.

Arguably in this case the actual query should run faster than the EXPLAIN
ANALYZE version, since the cache is hot. (Though that'd only likely shave
a few dozen ms off the runtime)

-Dan



Re: [PERFORM] Trying to track down weird query stalls

2009-03-30 Thread dan
> On Mon, Mar 30, 2009 at 2:42 PM,   wrote:
>>> On Mon, Mar 30, 2009 at 1:50 PM,   wrote:
>> I'm not executing any of the EXPLAINs by hand, because I didn't want to
>> have to worry about typos or filling in temp tables with test data.
>> Inside
>> the app the SQL for the problematic query's stored in a variable -- when
>> the task runs with debugging enabled it first executes the query with
>> EXPLAIN ANALYZE prepended and dumps the output, then it executes the
>> query
>> itself. It's possible something's going wrong in that, but the code's
>> pretty simple.
>>
>> Arguably in this case the actual query should run faster than the
>> EXPLAIN
>> ANALYZE version, since the cache is hot. (Though that'd only likely
>> shave
>> a few dozen ms off the runtime)
>
> Well... yeah.  Also EXPLAIN ANALYZE has a non-trivial amount of
> overhead, so that is quite bizarre.  I have to suspect there is some
> subtle difference between the way the EXPLAIN ANALYZE is done and the
> way the actual query is done... like maybe one uses parameter
> substitution and the other doesn't or, well, I don't know.  But I
> don't see how turning on debugging (which is essentially what EXPLAIN
> ANALYZE is) can prevent the query from being slow.

Hence the query to the list. *Something* is going on, and beats me what.
I'm assuming I'm triggering some bug in the postgres back end, or there's
some completely bizarre edge case that this tickles. (The massive
kread/kwrite activity that truss showed me when I checked seemed rather
unusual, to say the least)

EXPLAIN ANALYZE is my normal means of diagnosing performance problems, but
that isn't helping as it shows perfectly sane results. That leaves
abnormal means, and outside of trussing the back end or attaching with dbx
to get a stack trace I just don't have any of those. I'm not even sure
what I should be looking for when I do get a stack trace.

-Dan



Re: [PERFORM] Trying to track down weird query stalls

2009-03-30 Thread dan
> On Mon, Mar 30, 2009 at 12:42 PM,   wrote:
>> Arguably in this case the actual query should run faster than the
>> EXPLAIN
>> ANALYZE version, since the cache is hot. (Though that'd only likely
>> shave
>> a few dozen ms off the runtime)
>
> Joining a lot of tables together?  Could be GEQO kicking in.

Only if I get different query plans for the query depending on whether
it's being EXPLAIN ANALYZEd or not. That seems unlikely...

-Dan



Re: [PERFORM] Trying to track down weird query stalls

2009-03-30 Thread dan
> On Mon, Mar 30, 2009 at 1:42 PM,   wrote:
>>> On Mon, Mar 30, 2009 at 12:42 PM,   wrote:
>>>> Arguably in this case the actual query should run faster than the
>>>> EXPLAIN
>>>> ANALYZE version, since the cache is hot. (Though that'd only likely
>>>> shave
>>>> a few dozen ms off the runtime)
>>>
>>> Joining a lot of tables together?  Could be GEQO kicking in.
>>
>> Only if I get different query plans for the query depending on whether
>> it's being EXPLAIN ANALYZEd or not. That seems unlikely...
>
> Yes, you can.  In fact you often will.  Not because it's being
> explained or not, just because that's how GEQO works.

Ouch. I did *not* know that was possible -- I assumed that the plan was
deterministic and independent of explain analyze. The query has seven
tables (one of them a temp table) and my geqo_threshold is set to 12. If
I'm reading the docs right GEQO shouldn't kick in.
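
Next time it happens I'll also double-check from the session itself --
nothing fancy, just something like:

  SHOW geqo;
  SHOW geqo_threshold;
  -- and, to rule it out completely for that session:
  SET geqo = off;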

-Dan



Re: [PERFORM] Trying to track down weird query stalls

2009-03-30 Thread dan
> On Mon, Mar 30, 2009 at 4:02 PM,   wrote:
>>> On Mon, Mar 30, 2009 at 1:42 PM,   wrote:
>>>>> On Mon, Mar 30, 2009 at 12:42 PM,   wrote:
>>>>>> Arguably in this case the actual query should run faster than the
>>>>>> EXPLAIN
>>>>>> ANALYZE version, since the cache is hot. (Though that'd only likely
>>>>>> shave
>>>>>> a few dozen ms off the runtime)
>>>>>
>>>>> Joining a lot of tables together?  Could be GEQO kicking in.
>>>>
>>>> Only if I get different query plans for the query depending on whether
>>>> it's being EXPLAIN ANALYZEd or not. That seems unlikely...
>>>
>>> Yes, you can.  In fact you often will.  Not because it's being
>>> explained or not, just because that's how GEQO works.
>>
>> Ouch. I did *not* know that was possible -- I assumed that the plan was
>> deterministic and independent of explain analyze. The query has seven
>> tables (one of them a temp table) and my geqo_threshold is set to 12. If
>> I'm reading the docs right GEQO shouldn't kick in.
>
> Any chance we could see the actual query?  Right now I think we are
> shooting in the dark.

The query is:

select distinct
   temp_symbol.entityid,
   temp_symbol.libname,
   temp_symbol.objid,
   temp_symbol.objname,
   temp_symbol.fromsymid,
   temp_symbol.fromsymtype,
   temp_symbol.objinstance,
   NULL,
   temp_symbol.csid,
   libinstance.entityid,
   NULL,
   libobject.objid,
   NULL,
   provide_symbol.symbolid,
   provide_symbol.symboltype,
   libobject.objinstance,
   libobject.libinstanceid,
   objectinstance.csid,
   NULL,
   provide_symbol.is_weak,
   NULL,
   provide_symbol.is_local,
   NULL,
   provide_symbol.is_template,
   NULL,
   provide_symbol.is_common
  from libinstance,
       library,
       libobject,
       provide_symbol,
       temp_symbol,
       objectinstance,
       attributes
 where libinstance.libdate <= 1238445044
   and libinstance.enddate > 1238445044
   and libinstance.libinstanceid = libobject.libinstanceid
   and libinstance.architecture = ?
   and attributes.entityid = libinstance.entityid
   and attributes.branchid = libinstance.branchid
   and attributes.architecture = libinstance.architecture
   and library.libid = libinstance.libid
   and not secondary
   and attribute in ('notoffline', 'notoffline')
   and (provide_symbol.symboltype = 'T')
   and libobject.objinstance = provide_symbol.objinstance
   and libinstance.branchid = ?
   and provide_symbol.symbolid = temp_symbol.symbolid
   and objectinstance.objinstance = libobject.objinstance
   and libinstance.istemp = 0

The explain analyze for the query's attached in a (possibly hopeless)
attempt to keep it from being word-wrapped into unreadability.

-Dan
Unique  (cost=25323267.82..25350114.59 rows=397730 width=192) (actual time=277.963..278.054 rows=35 loops=1)
  ->  Sort  (cost=25323267.82..25324262.14 rows=397730 width=192) (actual time=277.960..277.967 rows=35 loops=1)
        Sort Key: temp_symbol.entityid, temp_symbol.libname, temp_symbol.objid, temp_symbol.objname, temp_symbol.fromsymid, temp_symbol.fromsymtype, temp_symbol.objinstance, temp_symbol.csid, libinstance.entityid, libobject.objid, provide_symbol.symbolid, libobject.objinstance, libobject.libinstanceid, objectinstance.csid, provide_symbol.is_weak, provide_symbol.is_local, provide_symbol.is_template, provide_symbol.is_common
        Sort Method:  quicksort  Memory: 27kB
        ->  Merge Join  (cost=25234932.60..25286276.09 rows=397730 width=192) (actual time=276.217..277.397 rows=35 loops=1)
              Merge Cond: (libinstance.libinstanceid = libobject.libinstanceid)
              ->  Sort  (cost=7950.07..7958.88 rows=3523 width=8) (actual time=256.190..256.655 rows=1838 loops=1)
                    Sort Key: libinstance.libinstanceid
                    Sort Method:  quicksort  Memory: 135kB
                    ->  Merge Join  (cost=7604.97..7742.52 rows=3523 width=8) (actual time=231.880..252.275 rows=1838 loops=1)
                          Merge Cond: (libinstance.libid = library.libid)
                          ->  Sort  (cost=5951.50..5960.32 rows=3529 width=12) (actual time=156.866..157.638 rows=1849 loops=1)
                                Sort Key: libinstance.libid
                                Sort Method:  quicksort  Memory: 135kB
                                ->  Merge Join  (cost=5657.33..5743.55 rows=3529 width=12) (actual time=141.311..152.022 rows=1849 loops=1)
                                      Merge Cond: (libinstance.entityid = attributes.entityid)
                                      ->  Sort  (cost=414

Re: [PERFORM] Trying to track down weird query stalls

2009-03-30 Thread dan
> d...@sidhe.org wrote:
>
>> where libinstance.libdate <= 1238445044
>>and libinstance.enddate > 1238445044
>>and libinstance.libinstanceid = libobject.libinstanceid
>>and libinstance.architecture = ?
>
> How are you generating the explain?  My bet is that you're just
> substituting a literal in the architecture condition, but if the driver
> is smart then maybe it's preparing the query beforehand.  You'll get a
> different plan in that case.

I don't think so. Perl's DBI is involved, but the statement's in a
variable. The code in question is:

   if ($db->{debug}) {
     $db->debug("SQL is: $sql\n");
     my $rows = $db->{dbh}->selectall_arrayref("explain analyze $sql",
                                               undef, $db->{arch},
                                               $db->{basebranch});
     foreach my $row (@$rows) {
       $db->debug(join(" ", @$row) . "\n");
     }
     $db->debug_stamp("Initial query done\n");
   }

   $rows = $db->{dbh}->selectall_arrayref($sql,
                                          undef, $db->{arch},
                                          $db->{basebranch});

There's no transform of the sql variable between the two statements, just
a quick loop over the returned rows from the explain analyze to print them
out. (I did try to make sure that the debugging bits were done the same
way as the mainline code, but I may have bobbled something)

-Dan





Re: [PERFORM] Trying to track down weird query stalls

2009-03-30 Thread dan
> d...@sidhe.org wrote:
>> > d...@sidhe.org wrote:
>> >
>> >> where libinstance.libdate <= 1238445044
>> >>and libinstance.enddate > 1238445044
>> >>and libinstance.libinstanceid = libobject.libinstanceid
>> >>and libinstance.architecture = ?
>> >
>> > How are you generating the explain?  My bet is that you're just
>> > substituting a literal in the architecture condition, but if the
>> driver
>> > is smart then maybe it's preparing the query beforehand.  You'll get
>> a
>> > different plan in that case.
>>
>> I don't think so. Perl's DBI is involved, but the statement's in a
>> variable.
>
> So what's the "?" in the query you pasted earlier?

The first ? (for architecture) is 1, the second ? (for branchid) is 0.
They both should get passed to Postgres as $1 and $2, respectively,
assuming DBD::Pg does its substitution right. (They're both supposed to go
in as placeholders)

-Dan



Re: [PERFORM] Trying to track down weird query stalls

2009-03-30 Thread dan
> d...@sidhe.org wrote:
>
>> > So what's the "?" in the query you pasted earlier?
>>
>> The first ? (for architecture) is 1, the second ? (for branchid) is 0.
>> They both should get passed to Postgres as $1 and $2, respectively,
>> assuming DBD::Pg does its substitution right. (They're both supposed to
>> go
>> in as placeholders)
>
> Right, so how about you reread what I wrote above?
>
> Oh, hmm, so to be more clear: I don't think DBD::Pg is actually sending
> EXECUTE PREPARE.  You need to do this over psql.

Fair enough. (And sorry about the mis-read) Next time this occurs I'll try
and duplicate this in psql. FWIW, a quick read of the C underlying the
DBD::Pg module shows it using PQexecPrepared, so I'm pretty sure it is
using prepared statements with placeholders, but double-checking seems
prudent.
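
As a sketch of what I mean to try in psql (with the two placeholder
values filled in by hand; "stall_test" is just a throwaway name):

  PREPARE stall_test(int, int) AS
    SELECT ...;   -- the full query from earlier, with $1 for architecture
                  -- and $2 for branchid
  EXPLAIN ANALYZE EXECUTE stall_test(1, 0);
  DEALLOCATE stall_test;

That should show the plan the backend builds for the prepared statement,
rather than the plan you get with literals substituted in.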

-Dan



[PERFORM] Query tuning help

2005-05-08 Thread Dan Harris
Sorry to bother everyone with yet another "my query isn't using an 
index" problem, but I am in over my head on this one.  I am open to ways 
of restructuring this query to perform better.

I have a table, 'ea', with 22 million rows in it.  VACUUM ANALYZE has 
been just run on the table.

This is the result of:
explain analyze
select distinct
em.incidentid,
ea.recordtext as retdata,
eg.long,
eg.lat
from
ea, em, eg
where
em.incidentid = ea.incidentid and
em.incidentid = eg.incidentid and
em.entrydate >= '2005-1-1 00:00' and
em.entrydate <= '2005-5-9 00:00'
and ea.incidentid in (
select
incidentid
from
ea
where
recordtext like '%RED%'
)
	and ea.incidentid in (
		select
			incidentid
		from
			ea
		where
			recordtext like '%CORVETTE%'
	)
	and (  recordtext like '%RED%' or recordtext like '%CORVETTE%'  )  
order by em.entrydate

-
ANALYZE RESULTS
-
 Unique  (cost=774693.72..774693.76 rows=1 width=159) (actual time=446787.056..446787.342 rows=72 loops=1)
   ->  Sort  (cost=774693.72..774693.72 rows=1 width=159) (actual time=446787.053..446787.075 rows=72 loops=1)
         Sort Key: em.incidentid, public.ea.recordtext, eg.long, eg.lat
         ->  Nested Loop  (cost=771835.10..774693.71 rows=1 width=159) (actual time=444378.655..446786.746 rows=72 loops=1)
               ->  Nested Loop  (cost=771835.10..774688.81 rows=1 width=148) (actual time=444378.532..446768.381 rows=72 loops=1)
                     ->  Nested Loop IN Join  (cost=771835.10..774678.88 rows=2 width=81) (actual time=444367.080..446191.864 rows=701 loops=1)
                           ->  Nested Loop  (cost=771835.10..774572.05 rows=42 width=64) (actual time=444366.859..445463.232 rows=1011 loops=1)
                                 ->  HashAggregate  (cost=771835.10..771835.10 rows=1 width=17) (actual time=444366.702..444368.583 rows=473 loops=1)
                                       ->  Seq Scan on ea  (cost=0.00..771834.26 rows=335 width=17) (actual time=259.746..444358.837 rows=592 loops=1)
                                             Filter: ((recordtext)::text ~~ '%CORVETTE%'::text)
                                 ->  Index Scan using ea1 on ea  (cost=0.00..2736.43 rows=42 width=47) (actual time=2.085..2.309 rows=2 loops=473)
                                       Index Cond: ((ea.incidentid)::text = ("outer".incidentid)::text)
                                       Filter: (((recordtext)::text ~~ '%RED%'::text) OR ((recordtext)::text ~~ '%CORVETTE%'::text))
                           ->  Index Scan using ea1 on ea  (cost=0.00..2733.81 rows=42 width=17) (actual time=0.703..0.703 rows=1 loops=1011)
                                 Index Cond: (("outer".incidentid)::text = (ea.incidentid)::text)
                                 Filter: ((recordtext)::text ~~ '%RED%'::text)
                     ->  Index Scan using em_incidentid_idx on em  (cost=0.00..4.95 rows=1 width=67) (actual time=0.820..0.821 rows=0 loops=701)
                           Index Cond: (("outer".incidentid)::text = (em.incidentid)::text)
                           Filter: ((entrydate >= '2005-01-01 00:00:00'::timestamp without time zone) AND (entrydate <= '2005-05-09 00:00:00'::timestamp without time zone))
               ->  Index Scan using eg_incidentid_idx on eg  (cost=0.00..4.89 rows=1 width=79) (actual time=0.245..0.246 rows=1 loops=72)
                     Index Cond: (("outer".incidentid)::text = (eg.incidentid)::text)
 Total runtime: 446871.880 ms
(22 rows)

-
EXPLANATION
-
The reason for the redundant LIKE clause is that first, I only want 
those "incidentid"s that contain the words 'RED' and 'CORVETTE'.  BUT, 
those two words may exist across multiple records with the same 
incidentid.  Then, I only want to actually work with the rows that 
contain one of the words.  This query will repeat the same logic for 
however many keywords are entered by the user.  I have investigated 
text searching options and have not found them to be congruous with my 
application.

Why is it choosing a sequential scan for one part of the query when 
searching for the words, yet using an index scan for another part of 
it?  Is there a better way to structure the query to give it better 
hints?

I'm using 8.0.1 on a 4-way Opteron with beefy RAID-10 and 12GB of RAM.
Thank you for any advice.
-Dan



Re: [PERFORM] Query tuning help

2005-05-08 Thread Dan Harris
On May 8, 2005, at 6:51 PM, Russell Smith wrote:
On Mon, 9 May 2005 09:20 am, Dan Harris wrote:
You cannot use an index for %CORVETTE%, or %RED%.  There is no way
for the index to know if a row had that in the middle without scanning  
the whole
index.  So it's much cheaper to do a sequence scan.

While I believe you, I'm confused by this line in my original EXPLAIN  
ANALYZE:

->  Index Scan using ea1 on ea  (cost=0.00..2736.43 rows=42 width=47)  
(actual time=2.085..2.309 rows=2 loops=473)
Index Cond:  
((ea.incidentid)::text = ("outer".incidentid)::text)
Filter: (((recordtext)::text  
~~ '%RED%'::text) OR ((recordtext)::text ~~ '%CORVETTE%'::text))
Doesn't that mean it was using an index to filter?  Along those lines,  
before I created index 'ea1', the query was much much slower.  So, it  
seemed like creating this index made a difference.

One possible way to make the query faster is to limit based on date,  
as you will only get about 700 rows.
And then don't use subselects, as they are doing full sequence scans.   
I think this query does what you do
above, and I think it will be faster, but I don't know.

I REALLY like this idea! If I could just filter by date first and then  
sequential scan through those, it should be very manageable.  Hopefully  
I can keep this goal while still accommodating the requirement listed  
in my next paragraph.

select distinct em.incidentid, ea.recordtext as retdata, eg.long,   
eg.lat
FROM em JOIN ea ON (em.incidentid = ea.incidentid AND em.entrydate >=  
'2005-1-1 00:00'
AND em.entrydate <= '2005-5-9 00:00' AND ea.recordtext like '%RED%'  
AND ea.recordtext like '%CORVETTE%')
JOIN eg ON em.incidentid = eg.incidentid WHERE (recordtext like  
'%RED%' or recordtext like '%CORVETTE%'  );

I have run this, and while it is very fast, I'm concerned it's not  
doing what I need.  Here's the situation:

Due to the format of the systems with which I integrate ( I have no  
control over these formats ), we will get these 'recordtext' values one  
line at a time, accumulating over time.  The only way I can find to  
make this work is to insert a new record for each line.  The problem  
is, that when someone wants to search multiple keywords, they expect  
these words to be matched across multiple records with a given incident  
number.

 For a very simple example:
IncidentID   Date               Recordtext
----------   ----------------   ----------------------------
1            2005-05-01 14:21   blah blah blah RED blah blah
             2005-05-01 14:23   not what we are looking for
1            2005-05-02 02:05   blah CORVETTE blah blah

So, doing a search with an 'and' condition, e.g. WHERE RECORDTEXT LIKE  
'%RED%' AND RECORDTEXT LIKE '%CORVETTE%' , will not match because the  
condition will only be applied to a single row of recordtext at a time,  
not a whole group with the same incident number.

If I were to use tsearch2 for full-text indexing, would I need to  
create another table that merges all of my recordtext rows into a  
single 'text' field type?  If so, this is where I run into problems, as  
my logic also needs to match multiple words in their original order.  I  
may also receive additional updates to the previous data.  In that  
case, I need to replace the original record with the latest version of  
it.  If I have already concatenated these rows into a single field, the  
logic to in-line replace only the old text that has changed is very  
very difficult at best.  So, that's the reason I had to do two  
subqueries in my example.  Please tell me if I misunderstood your logic  
and it really will match given my condition above, but it didn't seem  
like it would.

Thanks again for the quick responses!  This list has been a great  
resource for me.

-Dan


Re: [PERFORM] Query tuning help

2005-05-08 Thread Dan Harris
On May 8, 2005, at 8:06 PM, Josh Berkus wrote:

If I were to use tsearch2 for full-text indexing, would I need to
create another table that merges all of my recordtext rows into a
single 'text' field type?
No.   Read the OpenFTS docs, they are fairly clear on how to set up a 
simple
FTS index. (TSearch2 ~~ OpenFTS)

If so, this is where I run into problems, as
my logic also needs to match multiple words in their original order.
I have been reading the Tsearch2 docs and either I don't understand 
something or I'm not communicating my situation clearly enough.  It 
seems that Tsearch2 has a concept of "document".  And, in everything I 
am reading, they expect your "document" to be all contained in a single 
row.  Since my words can be spread across multiple rows, I don't see 
that Tsearch2 will combine all 'recordtext' row values with the same 
"incidentid" into a single vector.  Am I overlooking something in the 
docs?

I'm doing something fairly similar on one of my projects and it works 
very
well.

I'd be curious what similarities they have?  Is it the searching across 
multiple rows or the order of words?

The limitations on TSearch2 indexes are:
1) they are expensive to update, so your data loads would be noticeably 
slower.
2) they are only fast when cached in RAM (and when cached, are *very* 
fast).
So if you have a variety of other processes that tend to fill up RAM 
between
searches, you may find them less useful.
3) You have to create a materialized index column next to recordtext, 
which
will increase the size of the table.
Duly noted.  If this method can search across rows, I'm willing to 
accept this overhead for the speed it would add.

In the meantime, is there any way I can reach my goal without Tsearch2 
by just restructuring my query to narrow down the results by date 
first, then seq scan for the 'likes'?
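
Something along these lines is the shape I have in mind -- only a sketch,
untested, and the planner is of course free to pick its own order; the IN
subselect reuses the per-incident keyword test from my earlier note:

  SELECT DISTINCT em.incidentid, ea.recordtext AS retdata, eg.long, eg.lat
    FROM em
    JOIN ea ON ea.incidentid = em.incidentid
    JOIN eg ON eg.incidentid = em.incidentid
   WHERE em.entrydate >= '2005-01-01 00:00'
     AND em.entrydate <= '2005-05-09 00:00'
     AND (ea.recordtext LIKE '%RED%' OR ea.recordtext LIKE '%CORVETTE%')
     AND em.incidentid IN (
           SELECT incidentid
             FROM ea
            GROUP BY incidentid
           HAVING sum(CASE WHEN recordtext LIKE '%RED%'      THEN 1 ELSE 0 END) > 0
              AND sum(CASE WHEN recordtext LIKE '%CORVETTE%' THEN 1 ELSE 0 END) > 0);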

-Dan


Re: [PERFORM] Query tuning help

2005-05-08 Thread Dan Harris
On May 8, 2005, at 8:32 PM, Russell Smith wrote:
I have run this, and while it is very fast, I'm concerned it's not
doing what I need.
How fast is very fast?
It took 35 seconds to complete versus ~450 my old way.
select distinct em.incidentid, ea.recordtext as retdata, eg.long, 
eg.lat
FROM em JOIN ea ON (em.incidentid = ea.incidentid AND em.entrydate >=  
'2005-1-1 00:00'
AND em.entrydate <= '2005-5-9 00:00' AND (ea.recordtext like '%RED%'  
OR ea.recordtext like '%CORVETTE%'))
JOIN eg ON em.incidentid = eg.incidentid WHERE
em.incidentid IN
(select distinct em.incidentid, ea.recordtext as retdata, eg.long, 
eg.lat
FROM em JOIN ea ON (em.incidentid = ea.incidentid AND em.entrydate >=  
'2005-1-1 00:00'
AND em.entrydate <= '2005-5-9 00:00' AND ea.recordtext like 
'%CORVETTE%'))
JOIN eg ON em.incidentid = eg.incidentid)  AND
em.incidentid IN
(select distinct em.incidentid, ea.recordtext as retdata, eg.long, 
eg.lat
FROM em JOIN ea ON (em.incidentid = ea.incidentid AND em.entrydate >=  
'2005-1-1 00:00'
AND em.entrydate <= '2005-5-9 00:00' AND ea.recordtext like '%RED%'))
JOIN eg ON em.incidentid = eg.incidentid)

Yes, it is nasty, but so was my previous query :)  So long as this is 
faster, I'm ok with that.  I'll see if i can make this work.  Thank you 
very much.

-Dan


[PERFORM] investigating slow queries through pg_stat_activity

2005-06-20 Thread Dan Harris
I've got some queries generated by my application that will, for some  
reason, run forever until I kill the pid.  Yet, when I run the  
queries manually to check them out, they usually work fine.  To get  
more information about these queries, I'm writing a utility to take  
snapshots of pg_stat_activity every 5 minutes.  If it finds a query  
that runs for longer than 15 minutes, it will trap the query so I can  
run 'explain analyze' on it and see where the weakness is.
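
The snapshot query itself is nothing fancy -- roughly the following,
assuming pg_stat_activity on this version exposes query_start and that
stats_command_string is on so current_query is populated:

  SELECT procpid,
         now() - query_start AS runtime,
         current_query
    FROM pg_stat_activity
   WHERE now() - query_start > interval '15 minutes'
     AND current_query <> '<IDLE>';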


However, the problem I have is that pg_stat_activity only returns the  
first n (255?) characters of the SQL as "current_query", so it gets  
chopped off at the end.  I would very much like to find out how I can  
get the *entire* query that is active.  Is this possible?


Also, I'm sure some people will respond with "turn on query  
logging".. I've explored that option and the formatting of the log  
file and the fact that EVERY query is logged is not what I'm after  
for this project.  The "infinite-running" queries are unpredictable  
and may only happen once a week.  Logging 24/7 in anticipation of one  
of these occurrences is not something I'd like to do.


Thanks,

Dan Harris



[PERFORM] Quad Opteron stuck in the mud

2005-07-13 Thread Dan Harris

Gurus,

A table in one of my databases has just crossed the 30 million row  
mark and has begun to feel very sluggish for just about anything I do  
with it.  I keep the entire database vacuumed regularly.  And, as  
long as I'm not doing a sequential scan, things seem reasonably quick  
most of the time.  I'm now thinking that my problem is IO because  
anything that involves heavy IO ( like a seq scan ) seems to slow to  
a crawl.  Even if I am using indexed fields to grab a few thousand  
rows, once it moves on to sequential scans it gets very, very slow.


I have also had the occurrence where queries will not finish for days  
( I eventually have to kill them ).  I was hoping to provide an  
explain analyze for them, but if they never finish... even the  
explain never finishes when I try that.


For example, as I'm writing this, I am running an UPDATE statement  
that will affect a small part of the table, and is querying on an  
indexed boolean field.


I have been waiting for over an hour and a half as I write this and  
it still hasn't finished.  I'm thinking "I bet Tom, Simon or Josh  
wouldn't put up with this kind of wait time..", so I thought I would  
see if anyone here had some pointers.  Maybe I have a really stupid  
setting in my conf file that is causing this.  I really can't believe  
I am at the limits of this hardware, however.



The query:
update eventactivity set ftindex = false where ftindex = true;  
( added the where clause because I don't want to alter where ftindex  
is null )




The table:
    Column    |            Type             | Modifiers
--------------+-----------------------------+-----------
 entrydate    | timestamp without time zone |
 incidentid   | character varying(40)       |
 statustype   | character varying(20)       |
 unitid       | character varying(20)       |
 recordtext   | character varying(255)      |
 recordtext2  | character varying(255)      |
 insertdate   | timestamp without time zone |
 ftindex      | boolean                     |
Indexes: eventactivity1 btree (incidentid),
         eventactivity_entrydate_idx btree (entrydate),
         eventactivity_ftindex_idx btree (ftindex),
         eventactivity_oid_idx btree (oid)




The hardware:

4 x 2.2GHz Opterons
12 GB of RAM
4x10k 73GB Ultra320 SCSI drives in RAID 0+1
1GB hardware cache memory on the RAID controller

The OS:
Fedora, kernel 2.6.6-1.435.2.3smp ( redhat stock kernel )
filesystem is mounted as ext2

#

vmstat output ( as I am waiting for this to finish ):
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free    buff   cache    si   so    bi    bo    in    cs us sy id wa
 0  1   5436 2823908  26140 9183704     0    1  2211   540   694   336  9  2 76 13


#

iostat output ( as I am waiting for this to finish ):
avg-cpu:  %user   %nice    %sys %iowait   %idle
           9.19    0.00    2.19   13.08   75.53

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
cciss/c0d0      329.26     17686.03      4317.57  161788630   39496378


#
This is a dedicated postgresql server, so maybe some of these  
settings are more liberal than they should be?


relevant ( I hope ) postgresql.conf options are:

shared_buffers = 5
effective_cache_size = 1348000
random_page_cost = 3
work_mem = 512000
max_fsm_pages = 8
log_min_duration_statement = 6
fsync = true ( not sure if I'm daring enough to run without this )
wal_buffers = 1000
checkpoint_segments = 64
checkpoint_timeout = 3000


# FOR PG_AUTOVACUUM --#
stats_command_string = true
stats_row_level = true

Thanks in advance,
Dan










Re: [PERFORM] Quad Opteron stuck in the mud

2005-07-13 Thread Dan Harris

So sorry, I forgot to mention I'm running version 8.0.1

Thanks




Re: [PERFORM] Quad Opteron stuck in the mud

2005-07-13 Thread Dan Harris


On Jul 13, 2005, at 1:11 PM, John A Meinel wrote:



I might be wrong, but there may be something much more substantially
wrong than slow i/o.
John



Yes, I'm afraid of that too.  I just don't know what tools I should  
use to figure that out.  I have some 20 other databases on this  
system, same schema but varying sizes, and the small ones perform  
very well.  It feels like there is an O(n) increase in wait time that  
has recently become very noticeable on the largest of them.


-Dan



Re: [PERFORM] Quad Opteron stuck in the mud

2005-07-13 Thread Dan Harris


On Jul 13, 2005, at 2:17 PM, Stephen Frost wrote:


Could you come up w/ a test case that others could reproduce where
explain isn't returning?


This was simply due to my n00bness :)  I had always been doing  
explain analyze, instead of just explain.  Next time one of these  
queries comes up, I will be sure to do the explain without analyze.


FYI that update query I mentioned in the initial thread just finished  
after updating 8.3 million rows.


-Dan




Re: [PERFORM] Quad Opteron stuck in the mud

2005-07-13 Thread Dan Harris


On Jul 14, 2005, at 12:12 AM, Greg Stark wrote:


Dan Harris <[EMAIL PROTECTED]> writes:



I keep the entire database vacuumed regularly.



How often is "regularly"?
Well, once every day, but there aren't a ton of inserts or updates  
going on a daily basis.  Maybe 1,000 total inserts?


Also, if you've done occasional massive batch updates like you  
describe here
you may need a VACUUM FULL or alternatively a CLUSTER command to  
compact the
table -- vacuum identifies the free space but if you've doubled the  
size of
your table with a large update that's a lot more free space than  
you want

hanging around waiting to be used.

I have a feeling I'm going to need to do a cluster soon.  I have done  
several mass deletes and reloads on it.




For example, as I'm writing this, I am running an UPDATE  
statement  that will
affect a small part of the table, and is querying on an  indexed  
boolean field.



...

update eventactivity set ftindex = false where ftindex = true;   
( added the

where clause because I don't want to alter where ftindex  is null )



It's definitely worthwhile doing an "EXPLAIN UPDATE..." to see if  
this even

used the index. It sounds like it did a sequential scan.



I tried that, and indeed it was using an index, although after  
reading Simon's post, I realize that was kind of dumb to have an  
index on a bool. I have since removed it.


Sequential scans during updates are especially painful. If there  
isn't free
space lying around in the page where the updated record lies then  
another page
has to be used or a new page added. If you're doing a massive  
update you can
exhaust the free space available making the update have to go back  
and forth
between the page being read and the end of the table where pages  
are being

written.


This is great info, thanks.





#

vmstat output ( as I am waiting for this to finish ):
procs ---memory-- ---swap-- -io --system--
cpu
r  b   swpd   freebuff   cache   si   sobibo   in 
cs  us sy id wa
0  1   5436 2823908  26140 918370401  2211   540  694
336   9  2 76 13




[I assume you ran "vmstat 10" or some other interval and then  
waited for at

least the second line? The first line outputted from vmstat is mostly
meaningless]


Yeah, this was at least 10 or so down the list ( the last one before  
ctrl-c )




Um. That's a pretty meager i/o rate. Just over 2MB/s. The cpu is  
76% idle
which sounds fine but that could be one processor pegged at 100%  
while the
others are idle. If this query is the only one running on the  
system then it

would behave just like that.
Well, none of my processors had ever reached 100% until I changed to  
ext2 today ( read below for more info )


Is it possible you have some foreign keys referencing these records  
that
you're updating? In which case every record being updated might be  
causing a
full table scan on another table (or multiple other tables). If  
those tables
are entirely in cache then it could cause these high cpu low i/o  
symptoms.




No foreign keys or triggers.


Ok, so I remounted this drive as ext2 shortly before sending my first  
email today.  That wasn't enough time for me to notice the ABSOLUTELY  
HUGE difference in performance.  Ext3 must really be crappy  
for postgres, or at least is on this box.  Now that it's ext2, this  
thing is flying like never before.   My CPU utilization has  
skyrocketed, telling me that the disk IO was constraining it immensely.


I always knew that it might be a little faster, but the box feels  
like it can "breathe" again and things that used to be IO intensive  
and run for an hour or more are now running in < 5 minutes.  I'm a  
little worried about not having a journalized file system, but that  
performance difference will keep me from switching back ( at least to  
ext3! ).  Maybe someday I will try XFS.


I would be surprised if everyone who ran ext3 had this kind of  
problem, maybe it's specific to my kernel, raid controller, I don't  
know.  But, this is amazing.  It's like I have a new server.


Thanks to everyone for their valuable input and a big thanks to all  
the dedicated pg developers on here who make this possible!


-Dan




[PERFORM] slow joining very large table to smaller ones

2005-07-14 Thread Dan Harris

I'm trying to improve the speed of this query:

explain select recordtext from eventactivity inner join ( select  
incidentid from k_r where id = 94 ) a using ( incidentid ) inner join  
( select incidentid from k_b where id = 107 ) b using ( incidentid );

  QUERY PLAN
 
--

Merge Join  (cost=2747.29..4249364.96 rows=11968693 width=35)
   Merge Cond: (("outer".incidentid)::text = "inner"."?column2?")
   ->  Merge Join  (cost=1349.56..4230052.73 rows=4413563 width=117)
 Merge Cond: (("outer".incidentid)::text = "inner"."?column2?")
 ->  Index Scan using eventactivity1 on eventactivity   
(cost=0.00..4051200.28 rows=44519781 width=49)

 ->  Sort  (cost=1349.56..1350.85 rows=517 width=68)
   Sort Key: (k_b.incidentid)::text
   ->  Index Scan using k_b_idx on k_b   
(cost=0.00..1326.26 rows=517 width=68)

 Index Cond: (id = 107)
   ->  Sort  (cost=1397.73..1399.09 rows=542 width=68)
 Sort Key: (k_r.incidentid)::text
 ->  Index Scan using k_r_idx on k_r  (cost=0.00..1373.12  
rows=542 width=68)

   Index Cond: (id = 94)
(13 rows)


There are many millions of rows in eventactivity.  There are a few  
tens of thousands of rows in k_r and k_b.  There is an index on 'incidentid'  
in all three tables.  There should only be less than 100 rows matched  
in k_r and k_b total.  That part on its own is very very fast.  But,  
it should have those 100 or so incidentids extracted in under a  
second and then go into eventactivity AFTER doing that.  At least,  
that's my intention to make this fast.


Right now, it looks like pg is trying to sort the entire  
eventactivity table for the merge join which is taking several  
minutes to do.  Can I rephrase this so that it does the searching  
through k_r and k_b FIRST and then go into eventactivity using the  
index on incidentid?  It seems like that shouldn't be too hard to  
make fast but my SQL query skills are only average.


Thanks
-Dan



Re: [PERFORM] Quad Opteron stuck in the mud

2005-07-14 Thread Dan Harris


On Jul 14, 2005, at 9:47 AM, Alvaro Herrera wrote:


On Thu, Jul 14, 2005 at 12:28:05AM -0600, Dan Harris wrote:


.  Ext3 must really be crappy
for postgres, or at least is on this box.


Were you using the default journal settings for ext3?


Yes, I was.  Next time I get a chance to reboot this box, I will try  
writeback and compare the benchmarks to my previous config.  Thanks  
for the tip.
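
For my own notes, the remount I plan to test looks something like this
(device and mount point here are placeholders for my actual layout, and
Postgres' own fsync setting stays on regardless):

  # /etc/fstab -- ext3 with metadata-only journaling for the data partition
  /dev/cciss/c0d0p2   /var/lib/pgsql   ext3   noatime,data=writeback   1 2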





Re: [PERFORM] slow joining very large table to smaller ones

2005-07-14 Thread Dan Harris


On Jul 14, 2005, at 9:42 AM, John A Meinel wrote:



You might try giving it a little bit more freedom with:

EXPLAIN ANALYZE
SELECT recordtext FROM eventactivity, k_r, k_b
 WHERE eventactivity.incidentid = k_r.incidentid
   AND eventactivity.incidentid = k_b.incidentid
   AND k_r.id = 94
   AND k_b.id = 107
-- AND k_r.incidentid = k_b.incidentid
;

I'm pretty sure that would give identical results, just let the  
planner

have a little bit more freedom about how it does it.
Also the last line is commented out, because I think it is redundant.



Ok, I tried this one.  My ssh keeps getting cut off by a router  
somewhere between me and the server due to inactivity timeouts, so  
all I know is that both the select and explain analyze are taking  
over an hour to run.  Here's the explain select for that one, since  
that's the best I can get.


explain select recordtext from eventactivity,k_r,k_b where  
eventactivity.incidentid = k_r.incidentid and  
eventactivity.incidentid = k_b.incidentid and k_r.id = 94 and k_b.id  
= 107;

  QUERY PLAN
 
--

Merge Join  (cost=9624.61..4679590.52 rows=151009549 width=35)
   Merge Cond: (("outer".incidentid)::text = "inner"."?column2?")
   ->  Merge Join  (cost=4766.92..4547684.26 rows=16072733 width=117)
 Merge Cond: (("outer".incidentid)::text = "inner"."?column2?")
 ->  Index Scan using eventactivity1 on eventactivity   
(cost=0.00..4186753.16 rows=46029271 width=49)

 ->  Sort  (cost=4766.92..4771.47 rows=1821 width=68)
   Sort Key: (k_b.incidentid)::text
   ->  Index Scan using k_b_idx on k_b   
(cost=0.00..4668.31 rows=1821 width=68)

 Index Cond: (id = 107)
   ->  Sort  (cost=4857.69..4862.39 rows=1879 width=68)
 Sort Key: (k_r.incidentid)::text
 ->  Index Scan using k_r_idx on k_r  (cost=0.00..4755.52  
rows=1879 width=68)

   Index Cond: (id = 94)
(13 rows)




You might also try:
EXPLAIN ANALYZE
SELECT recordtext
  FROM eventactivity JOIN k_r USING (incidentid)
  JOIN k_b USING (incidentid)
 WHERE k_r.id = 94
   AND k_b.id = 107
;



Similar results here.  The query is taking at least an hour to finish.

explain select recordtext from eventactivity join k_r using  
( incidentid ) join k_b using (incidentid ) where k_r.id = 94 and  
k_b.id = 107;


 QUERY PLAN
 
--

Merge Join  (cost=9542.77..4672831.12 rows=148391132 width=35)
   Merge Cond: (("outer".incidentid)::text = "inner"."?column2?")
   ->  Merge Join  (cost=4726.61..4542825.87 rows=15930238 width=117)
 Merge Cond: (("outer".incidentid)::text = "inner"."?column2?")
 ->  Index Scan using eventactivity1 on eventactivity   
(cost=0.00..4184145.43 rows=46000104 width=49)

 ->  Sort  (cost=4726.61..4731.13 rows=1806 width=68)
   Sort Key: (k_b.incidentid)::text
   ->  Index Scan using k_b_idx on k_b   
(cost=0.00..4628.92 rows=1806 width=68)

 Index Cond: (id = 107)
   ->  Sort  (cost=4816.16..4820.82 rows=1863 width=68)
 Sort Key: (k_r.incidentid)::text
 ->  Index Scan using k_r_idx on k_r  (cost=0.00..4714.97  
rows=1863 width=68)

   Index Cond: (id = 94)
(13 rows)




Also, if possible give us the EXPLAIN ANALYZE so that we know if the
planner is making accurate estimates. (You might send an EXPLAIN while
waiting for the EXPLAIN ANALYZE to finish)

You can also try disabling merge joins, and see how that changes  
things.




Are there any negative side effects of doing this?








Well, postgres is estimating around 500 rows each, is that way off?  
Try

just doing:
EXPLAIN ANALYZE SELECT incidentid FROM k_b WHERE id = 107;
EXPLAIN ANALYZE SELECT incidentid FROM k_r WHERE id = 94;

And see if postgres estimates the number of rows properly.

I assume you have recently VACUUM ANALYZEd, which means you might need
to update the statistics target (ALTER TABLE k_b ALTER COLUMN
incidientid SET STATISTICS 100) default is IIRC 10, ranges from  
1-1000,

higher is more accurate, but makes ANALYZE slower.




Right now, it looks like pg is trying to sort the entire   
eventactivity
table for the merge join which is taking several  minutes to do.   
Can I
rephrase this so that it does the searching  through k_r and k_b  
FIRST
and then go into eventactivity using the  index on incidentid?  It  
seems

like that shouldn't be too hard to  make fast but my SQL query skills
are only average.



To me, it looks like it is doing an index scan (on k_b.id) through k_b
first, sorting the results by incidentid, then merge joining that with
eventactivity.

I'm guessing you actually want it to merge k_b and k_r to get extra
selec

Re: [PERFORM] slow joining very large table to smaller ones

2005-07-14 Thread Dan Harris


On Jul 14, 2005, at 5:12 PM, John A Meinel wrote:


Dan Harris wrote:




Well, postgres is estimating around 500 rows each, is that way  
off?  Try

just doing:
EXPLAIN ANALYZE SELECT incidentid FROM k_b WHERE id = 107;
EXPLAIN ANALYZE SELECT incidentid FROM k_r WHERE id = 94;



Once again, do this and post the results. We might just need to tweak
your settings so that it estimates the number of rows correctly,  
and we

don't need to do anything else.



Ok, sorry I missed these the first time through:

explain analyze select incidentid from k_b where id = 107;
   QUERY PLAN
 

Index Scan using k_b_idx on k_b  (cost=0.00..1926.03 rows=675  
width=14) (actual time=0.042..298.394 rows=2493 loops=1)

   Index Cond: (id = 107)
Total runtime: 299.103 ms

select count(*) from k_b;
count

698350

( sorry! I think I said this one only had tens of thousands in it )


explain analyze select incidentid from k_r where id = 94;
   QUERY PLAN
 
-
Index Scan using k_r_idx on k_r  (cost=0.00..2137.61 rows=757  
width=14) (actual time=0.092..212.187 rows=10893 loops=1)

   Index Cond: (id = 94)
Total runtime: 216.498 ms
(3 rows)


select count(*) from k_r;
count

671670


That one is quite a bit slower, yet it's the same table structure and  
the same index as k_b; it also has fewer records.


I did run VACUUM ANALYZE immediately before running these queries.   
It seems a lot better with join_collapse_limit set.




\
Well, the planner is powerful enough to flatten nested selects. To  
make

it less "intelligent" you can do:
SET join_collapse_limit = 1;
or
SET join_collapse_limit = 0;
Which should tell postgres to not try and get tricky with your query.
Again, *usually* the planner knows better than you do. So again  
just do

it to see what you get.



Ok, when join_collapse_limit = 1 I get this now:

explain analyze select recordtext from eventactivity join ( select  
incidentid from k_r join k_b using (incidentid) where k_r.id = 94 and  
k_b.id = 107 ) a  using (incidentid );
   
QUERY PLAN
 
---
Nested Loop  (cost=0.00..156509.08 rows=2948 width=35) (actual  
time=1.555..340.625 rows=24825 loops=1)
   ->  Nested Loop  (cost=0.00..5361.89 rows=6 width=28) (actual  
time=1.234..142.078 rows=366 loops=1)
 ->  Index Scan using k_b_idx on k_b  (cost=0.00..1943.09  
rows=681 width=14) (actual time=0.423..56.974 rows=2521 loops=1)

   Index Cond: (id = 107)
 ->  Index Scan using k_r_idx on k_r  (cost=0.00..5.01  
rows=1 width=14) (actual time=0.031..0.031 rows=0 loops=2521)
   Index Cond: ((k_r.id = 94) AND  
((k_r.incidentid)::text = ("outer".incidentid)::text))
   ->  Index Scan using eventactivity1 on eventactivity   
(cost=0.00..25079.55 rows=8932 width=49) (actual time=0.107..0.481  
rows=68 loops=366)
 Index Cond: ((eventactivity.incidentid)::text =  
("outer".incidentid)::text)

Total runtime: 347.975 ms

MUCH better!  Maybe you can help me understand what I did and if I  
need to make something permanent to get this behavior from now on?
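
If it turns out I do want this permanently, I'm assuming something along
these lines would stick (database and role names here are made up):

  -- per-database default, picked up by new connections
  ALTER DATABASE mydb SET join_collapse_limit = 1;
  -- or per-role
  ALTER USER myapp SET join_collapse_limit = 1;
  -- or globally, in postgresql.conf:
  --   join_collapse_limit = 1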








If you have analyzed recently can you do:
SELECT relname, reltuples FROM pg_class WHERE relname='eventactivity';

It is a cheaper form than "SELECT count(*) FROM eventactivity" to  
get an

approximate estimate of the number of rows. But if it isn't too
expensive, please also give the value from SELECT count(*) FROM
eventactivity.

Again, that helps us know if your tables are up-to-date.



Sure:

select relname, reltuples from pg_class where relname='eventactivity';
relname|  reltuples
---+-
eventactivity | 3.16882e+07

select count(*) from eventactivity;
  count
--
31871142













I don't know how selective your keys are, but one of these queries
should probably structure it better for the planner. It depends  
a  lot on

how selective your query is.




eventactivity currently has around 36 million rows in it. There   
should
only be maybe 200-300 incidentids at most that will be matched   
with the
combination of k_b and k_r.  That's why I was thinking I  could  
somehow

get a list of just the incidentids that matched the id  = 94 and id =
107 in k_b and k_r first. Then, I would only need to  grab a few  
hundred

out of 36 million rows from eventactivity.




Well, you can also try:
SELECT count(*) FROM k_b JOIN k_r USING (i

Re: [PERFORM] slow joining very large table to smaller ones

2005-07-14 Thread Dan Harris


On Jul 14, 2005, at 7:15 PM, John A Meinel wrote:



Is the distribution of your rows uneven? Meaning do you have more rows
with a later id than an earlier one?



There are definitely some id's that will have many times more than  
the others.  If I group and count them, the top 10 are fairly  
dominant in the table.




Hmm.. How to do it permanently? Well you could always issue "set
join_collapse_limit = 1; select * from "
But obviously that isn't what you prefer. :)

I think there are things you can do to make merge join more expensive
than a nested loop, but I'm not sure what they are.


Maybe someone else has some ideas to encourage this behavior for  
future work?  Setting it on a per-connection basis is doable, but  
would add some burden to us in code.




What I really don't understand is that the estimates dropped as well.
The actual number of estimate rows drops to 3k instead of > 1M.
The real question is why does the planner think it will be so  
expensive?




select count(*) from k_b join k_r using (incidentid) where k_b.id=107
and k_r.id=94;
count
---
   373




Well, this says that they are indeed much more selective.
Each one has > 1k rows, but together you end up with only 400.



Is this a bad thing?  Is this not "selective enough" to make it much  
faster?


Overall, I'm much happier now after seeing the new plan come about,  
if I can find a way to make that join_collapse behavior permanent, I  
can certainly live with these numbers.


Thanks again for your continued efforts.

-Dan



Re: [PERFORM] slow joining very large table to smaller ones

2005-07-15 Thread Dan Harris


On Jul 14, 2005, at 10:12 PM, John A Meinel wrote:


My biggest question is why the planner thinks the Nested Loop would be
so expensive.
Have you tuned any of the parameters? It seems like something is  
out of

whack. (cpu_tuple_cost, random_page_cost, etc...)



here's some of my postgresql.conf.  Feel free to blast me if I did  
something idiotic here.


shared_buffers = 5
effective_cache_size = 1348000
random_page_cost = 3
work_mem = 512000
max_fsm_pages = 8
log_min_duration_statement = 6
fsync = true ( not sure if I'm daring enough to run without this )
wal_buffers = 1000
checkpoint_segments = 64
checkpoint_timeout = 3000


# FOR PG_AUTOVACUUM --#
stats_command_string = true
stats_row_level = true




Re: [PERFORM] slow joining very large table to smaller ones

2005-07-15 Thread Dan Harris


On Jul 15, 2005, at 9:09 AM, Dan Harris wrote:



On Jul 14, 2005, at 10:12 PM, John A Meinel wrote:



My biggest question is why the planner thinks the Nested Loop  
would be

so expensive.
Have you tuned any of the parameters? It seems like something is  
out of

whack. (cpu_tuple_cost, random_page_cost, etc...)




here's some of my postgresql.conf.  Feel free to blast me if I did  
something idiotic here.


shared_buffers = 5
effective_cache_size = 1348000
random_page_cost = 3
work_mem = 512000
max_fsm_pages = 8
log_min_duration_statement = 6
fsync = true ( not sure if I'm daring enough to run without this )
wal_buffers = 1000
checkpoint_segments = 64
checkpoint_timeout = 3000


# FOR PG_AUTOVACUUM --#
stats_command_string = true
stats_row_level = true



Sorry, I forgot to re-post my hardware specs.

HP DL585
4 x 2.2 GHz Opteron
12GB RAM
SmartArray RAID controller, 1GB hardware cache, 4x73GB 10k SCSI in  
RAID 0+1

ext2 filesystem

Also, there are 30 databases on the machine, 27 of them are identical  
schemas. 
 




Re: [PERFORM] Really bad diskio

2005-07-15 Thread Dan Harris


On Jul 15, 2005, at 2:39 PM, Ron Wills wrote:


Hello all

  I'm running a postgres 7.4.5, on a dual 2.4Ghz Athlon, 1Gig RAM and
an 3Ware SATA raid.


Operating System?  Which file system are you using?  I was having a  
similar problem just a few days ago and learned that ext3 was the  
culprit.


-Dan



[PERFORM] Coraid/AoE device experience?

2005-07-25 Thread Dan Harris
Lately, I've been reading a lot about these new Coraid AoE RAID  
devices ( http://www.coraid.com ).  They tout it as being fast and  
cheap and better than iSCSI due to the lack of TCP/IP over the wire.   
Is it likely that a 15-drive RAID 10 Linux software RAID would  
outperform a 4-drive 10k SCSI RAID 0+1 for a heavily loaded database?   
If not software RAID, how about their dedicated RAID controller blade?


I'm definitely IO bound right now and starving for spindles.  Does  
this make sense or is it too good to be true?


Thanks
-Dan



[PERFORM] faster INSERT with possible pre-existing row?

2005-07-26 Thread Dan Harris
I am working on a process that will be inserting tens of millions of rows  
and need this to be as quick as possible.


The catch is that for each row I could potentially insert, I need to  
look and see if the relationship is already there  to prevent  
multiple entries.  Currently I am doing a SELECT before doing the  
INSERT, but I recognize the speed penalty in doing two operations.  I  
wonder if there is some way I can say "insert this record, only if it  
doesn't exist already".  To see if it exists, I would need to compare  
3 fields instead of just enforcing a primary key.


Even if this could be a small increase per record, even a few percent  
faster compounded over the whole load could be a significant reduction.
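
The shape I'm considering is roughly this (table and column names are
made up, and it isn't safe against two loaders racing on the same row
unless a unique constraint backs it up):

  INSERT INTO relationship (a_id, b_id, rel_type)
  SELECT 123, 456, 'member'
   WHERE NOT EXISTS (SELECT 1
                       FROM relationship
                      WHERE a_id = 123
                        AND b_id = 456
                        AND rel_type = 'member');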


Thanks for any ideas you might have.

-Dan



Re: [PERFORM] Fwd: Help with view performance problem

2005-07-28 Thread Dan Harris


On Jul 28, 2005, at 8:38 AM, Chris Hoover wrote:



I did some more testing, and ran the explain analyze on the problem.
In my session I did a set enable_hashjoin = false and then ran the
analyze.  This caused it to use the indexes as I have been expecting
it to do.

Now, how can I get it to use the indexes w/o manipulating the
environment?  What make postgresql want to sequentially scan and use a
hash join?

thanks,

Chris

explain analyze with enable_hashjoin=false;
prob_db=#explain analyze select * from clm_com;




I had something similar to this happen recently.  The planner was  
choosing a merge join and seq scan because my 'random_page_cost' was  
set too high.  I had it at 3 , and ended up settling at 1.8 to get it  
to correctly use my indices.  Once that change was in place, the  
planner did the 'right' thing for me.


Not sure if this will help you, but it sounds similar.

-Dan

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[PERFORM] Table locking problems?

2005-08-09 Thread Dan Harris
I thought I would send this to pg-performance since so many people  
helped me with my speed issues recently.  I was definitely IO- 
bottlenecked.


Since then, I have installed 2 RAID arrays with 7 15k drives in them  
in RAID 0+1 as well as add a new controller card with 512MB of cache  
on it.  I also created this new partition on the RAID as XFS instead  
of ext3.


These changes have definitely improved performance, but I am now  
finding some trouble with UPDATE or DELETE queries "hanging" and  
never releasing their locks.  As this happens, other statements queue  
up behind it.  It seems to occur at times of very high load on the  
box.  Is my only option to kill the query ( which usually takes down  
the whole postmaster with it! ouch )?


Could these locking issues be related to the other changes I made?  
I'm really scared that this is related to choosing XFS, but I sure  
hope not.   How should I go about troubleshooting the "problem"  
queries?  They don't seem to be specific to a single table or single  
database.


I'm running 8.0.1 on kernel 2.6.12-3 on 64-bit Opterons if that  
matters..



-Dan

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] Table locking problems?

2005-08-09 Thread Dan Harris


On Aug 9, 2005, at 1:08 PM, Tom Lane wrote:


"Joshua D. Drake" <[EMAIL PROTECTED]> writes:


My experience is that when this type of thing happens it is typically
specific queries that cause the problem. If you turn on statement
logging you can get the exact queries and debug from there.





Here are some things to look for:





Is it a large table (and thus large indexes) that it is updating?
Is the query using indexes?
Is the query modifying ALOT of rows?



Another thing to look at is foreign keys.  Dan could be running into
problems with an update on one side of an FK being blocked by locks
on the associated rows on the other side.

regards, tom lane



Tom, Steve, Josh:

Thank you for your ideas.  The updates are only on a single table, no  
joins.  I had stats collection turned off.  I have turned that on  
again so that I can try and catch one while the problem is  
occurring.  The last table it did this on was about 3 million  
records.  4 single-column indexes on it.


The problem I had with statement logging is that if the query never  
finishes, it doesn't get logged as far as I can tell.  So everything  
that did get logged was normal and would run with no issues in psql by  
copying and pasting it.  The rows updated will certainly vary by query.   
I really need to "catch it in the act" with stats collection on so I  
can get the query from pg_stat_activity.  Once I get it, I will play  
with explains and see if I can reproduce it outside the wild.
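
Once stats collection is back on, something like the following is a reasonable
starting point for catching the blocker in the act -- a sketch only, using the
catalog columns as they exist in 8.0 (procpid, current_query), and it needs
stats_command_string = on to show the query text:

  SELECT l.pid, c.relname, l.mode, l.granted, a.current_query
    FROM pg_locks l
    LEFT JOIN pg_class c ON c.oid = l.relation
    LEFT JOIN pg_stat_activity a ON a.procpid = l.pid
   ORDER BY l.granted, l.pid;

Rows with granted = false are the waiters; the granted rows on the same relation
point at the backend everything is queued up behind.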


Thanks again for your help.

-Dan


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Table locking problems?

2005-08-09 Thread Dan Harris


On Aug 10, 2005, at 12:49 AM, Steve Poe wrote:


Dan,

Do you mean you did RAID 1 + 0 (RAID 10) or RAID 0 + 1? Just a
clarification, since RAID 0 is still a single-point of failure even if
RAID1 is on top of RAID0.


Well, you tell me if I stated incorrectly.  There are two raid  
enclosures with 7 drives in each.  Each is on its own bus on a dual- 
channel controller.  Each box has a stripe across its drives and the  
enclosures are mirrors of each other.  I understand the controller  
could be a single point of failure, but I'm not sure I understand  
your concern about the RAID structure itself.




How many users are connected when your update / delete queries are
hanging? Have you done an analyze verbose on those queries?


Most of the traffic is from programs we run to do analysis of the  
data and managing changes.  At the time I noticed it this morning,  
there were 10 connections open to the database.  That rarely goes  
above 20 concurrent.  As I said in my other response, I believe that  
the log will only contain the query at the point the query finishes,  
so if it never finishes...




Have you made changes to the postgresql.conf? kernel.vm settings? IO
scheduler?


I set shmmax appropriately for my shared_buffers setting, but that's  
the only kernel tweak.




If you're not doing so already, you may consider running sar  
(iostat) to

monitor when the hanging occurs if their is a memory / IO bottleneck
somewhere.



I will try that.  Thanks



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Table locking problems?

2005-08-09 Thread Dan Harris


On Aug 9, 2005, at 3:51 PM, John A Meinel wrote:


Dan Harris wrote:


On Aug 10, 2005, at 12:49 AM, Steve Poe wrote:


Dan,

Do you mean you did RAID 1 + 0 (RAID 10) or RAID 0 + 1? Just a
clarification, since RAID 0 is still a single-point of failure  
even if

RAID1 is on top of RAID0.

Well, you tell me if I stated incorrectly.  There are two raid   
enclosures with 7 drives in each.  Each is on its own bus on a  
dual- channel controller.  Each box has a stripe across its drives  
and the  enclosures are mirrors of each other.  I understand the  
controller  could be a single point of failure, but I'm not sure I  
understand  your concern about the RAID structure itself.




In this configuration, if you have a drive fail on both  
controllers, the entire RAID dies. Lets label them A1-7, B1-7,  
because you stripe within a set, if a single one of A dies, and a  
single one of B dies, you have lost your entire mirror.


The correct way of doing it, is to have A1 be a mirror of B1, and  
then stripe above that. Since you are using 2 7-disk enclosures,  
I'm not sure how you can do it well, since it is not an even number  
of disks. Though if you are using software RAID, there should be no  
problem.


The difference is that in this scenario, *all* of the A drives can  
die, and you haven't lost any data. The only thing you can't lose  
is a matched pair (eg losing both A1 and B1 will cause complete  
data loss)


I believe the correct notation for this last form is RAID 1 + 0  
(RAID10) since you have a set of RAID1 drives, with a RAID0 on-top  
of them.




I have read up on the difference now. I don't understand why it's a  
"single point of failure".  Technically any array could be a "single  
point" depending on your level of abstraction.   In retrospect, I  
probably should have gone with 8 drives in each and used RAID 10 instead  
for the better fault tolerance,  but it's online now and will require  
some planning to see if I want to reconfigure that in the future.  I  
wish HP's engineer had promoted that method instead of 0+1..


-Dan


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


[PERFORM] Speedier count(*)

2005-08-10 Thread Dan Harris
I have a web page for my customers that shows them counts of records  
and some min/max date ranges in each table of a database, as this is  
how we bill them for service.  They can log in and check the counts  
at any time.  I'd like the counts to be as fresh as possible by  
keeping this dynamic, but I will use a periodic 'snapshot'/cron job  
if that is the only option to speed this up.   I have thought about  
using the table statistics, but the estimate error is probably  
unacceptable for billing purposes.


For some reason, the SQL Server we migrated the app from can return  
count(*) in a split second on multi-million-row tables, even though  
it is a MUCH slower box hardware-wise, while in Postgres it's now taking many  
seconds to run. I have read in the archives about the problems MVCC brings  
into the count(*) dilemma, forcing Pg to run a seq scan to get  
counts.  Does SQL Server not use MVCC, or have they found another  
approach for arriving at this number?  Compounding all the min/max  
and counts from other tables, all those queries take about a  
minute to run. The tables will contain anywhere from 1 million to 40  
million rows.


Also, I am using "select ... group by ... order by .. limit 1" to get  
the min/max since I have already been bit by the issue of min() max()  
being slower.
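
The usual workaround, when exact numbers are required, is to maintain the counts
yourself so the billing page reads one tiny row per table instead of scanning
millions. A minimal sketch, with a hypothetical summary table "rowcounts" and
billed table "some_billed_table" (the summary table would need to be seeded once
with the current count(*)):

  CREATE TABLE rowcounts (
      table_name  text PRIMARY KEY,
      n           bigint NOT NULL
  );

  -- Trigger function keeps the counter in sync on insert/delete.
  CREATE OR REPLACE FUNCTION track_count() RETURNS trigger AS $$
  BEGIN
      IF TG_OP = 'INSERT' THEN
          UPDATE rowcounts SET n = n + 1 WHERE table_name = TG_RELNAME;
      ELSE  -- DELETE
          UPDATE rowcounts SET n = n - 1 WHERE table_name = TG_RELNAME;
      END IF;
      RETURN NULL;  -- AFTER trigger; return value is ignored
  END;
  $$ LANGUAGE plpgsql;

  CREATE TRIGGER billing_count AFTER INSERT OR DELETE ON some_billed_table
      FOR EACH ROW EXECUTE PROCEDURE track_count();

  -- The web page then runs:
  -- SELECT n FROM rowcounts WHERE table_name = 'some_billed_table';

The trade-off is a little extra write overhead and contention on the counter
row, which may or may not be acceptable at high insert rates.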



-Dan



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] Speedier count(*)

2005-08-11 Thread Dan Harris

Thanks for all the great ideas.  I have more options to evaluate now.

-Dan


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] Query plan looks OK, but slow I/O - settings advice?

2005-08-19 Thread Dan Harris


On Aug 19, 2005, at 12:55 AM, Jeffrey W. Baker wrote:


On Tue, 2005-08-16 at 10:46 -0700, Roger Hand wrote:
  Have you considered booting your
machine with elevator=deadline?


Although I'm not the OP for this problem, I thought I'd try it out.   
WOW.. this should be in a Pg tuning guide somewhere. I added this to  
my server tonight just for kicks and saw a pronounced improvement in  
IO performance.  Thank you very much for mentioning this on the list.


I didn't have a long enough maintenance window to do solid  
benchmarking, but I can say for certain that the change was  
noticeable, especially in VACUUM operations.


Specs for the server:

PG 8.0.1
Linux 2.6.12-3 kernel
4xOpteron 2.2
12GB RAM
16-drive RAID 10
XFS mounted with noatime
pg_xlog on separate RAID controller

-Dan


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] extremly low memory usage

2005-08-20 Thread Dan Harris


On Aug 19, 2005, at 3:01 PM, Jeremiah Jahn wrote:

Rebuild in progress with just ext3 on the raid array...will see if  
this

helps the access times.


From my recent experiences, I can say ext3 is probably not a great  
choice for Pg databases.  If you check the archives you'll see  
there's a lot of discussion about various journalling filesystems and  
ext3 usually(always?) comes up on the bottom as far as performance  
goes.  If you insist on using it, I would at least recommend the  
noatime option in fstab and using data=writeback to get the faster of  
the journal modes.


XFS seems to be a trusted choice, followed by Reiser and JFS both  
with the occasional controversy when the comparisons pop up.


-Dan


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Poor performance on HP Package Cluster

2005-09-01 Thread Dan Harris
Do you have any sources for that information?  I am running dual  
SmartArray 6402's in my DL585 and haven't noticed anything poor about  
their performance.


On Sep 1, 2005, at 2:24 PM, Luke Lonergan wrote:

Are you using the built-in HP SmartArray RAID/SCSI controllers?  If  
so, that

could be your problem, they are known to have terrible and variable
performance with Linux.


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Monitoring Postgresql performance

2005-09-28 Thread Dan Harris


On Sep 28, 2005, at 8:32 AM, Arnau wrote:


Hi all,

  I have been "googling" a bit searching info about a way to  
monitor postgresql (CPU & Memory, num processes, ... )


You didn't mention your platform, but I have an xterm open pretty  
much continuously for my DB server that runs plain old top.  I have  
customized my settings enough that I can pretty much see anything I  
need to from there.
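
For the database-side half of that (how many backends, what they're running),
pg_stat_activity is handy alongside top -- a sketch using the 8.0-era column
names, and it needs stats_command_string = on to populate current_query:

  SELECT datname, procpid, usename, query_start, current_query
    FROM pg_stat_activity
   ORDER BY query_start;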


-Dan


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] Ultra-cheap NVRAM device

2005-10-03 Thread Dan Harris


On Oct 3, 2005, at 5:02 AM, Steinar H. Gunderson wrote:

I thought this might be interesting, not the least due to the  
extremely low

price ($150 + the price of regular DIMMs):





This has been posted before, and the main reason nobody got very  
excited is that:


a) it only uses the PCI bus to provide power to the device, not for I/O
b) It is limited to SATA bandwidth
c) The benchmarks did not prove it to be noticeably faster than a  
good single SATA drive


A few of us were really excited at first too, until seeing the  
benchmarks..


-Dan



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] Ultra-cheap NVRAM device

2005-10-03 Thread Dan Harris


On Oct 3, 2005, at 5:02 AM, Steinar H. Gunderson wrote:

I thought this might be interesting, not the least due to the  
extremely low

price ($150 + the price of regular DIMMs):




Replying before my other post came through.. It looks like their  
benchmarks are markedly improved since the last article I read on  
this.  There may be more interest now..


-Dan


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] quad or dual core Intel CPUs

2007-02-13 Thread Dan Harris

Arjen van der Meijden wrote:



But be aware that there can be substantial and unexpected differences on 
this relatively new platform due to simply changing the OS, like we saw 
when going from linux 2.6.15 to 2.6.18, as you can see here:

http://tweakers.net/reviews/657/2



Having upgraded to 2.6.18 fairly recently, I am *very* interested in 
what caused the throughput to drop in 2.6.18.  I haven't done any 
benchmarking on my system to know if it affected my usage pattern 
negatively, but I am curious whether anyone knows why this happened.


-Dan

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[PERFORM] General advice on user functions

2007-02-21 Thread Dan Harris
I have a new task of automating the export of a very complex Crystal 
Report.  One thing I have learned in the last 36 hours is that the 
export process to PDF is really, really, slooww..


Anyway, that is none of your concern.  But, I am thinking that I can 
somehow utilize some of PG's strengths to work around the bottleneck in 
Crystal.  The main problem seems to be that tens of thousands of rows of 
data must be summarized in the report and calculations made.  Based on 
my recent experience, I'd say that this task would be better suited to 
PG than relying on Crystal Reports to do the summarizing.


The difficulty I'm having is that the data needed is from about 50 
different "snapshots" of counts over time.  The queries are very simple, 
however I believe I am going to need to combine all of these queries 
into a single function that runs all 50 and then returns just the 
count(*) of each as a separate "column" in a single row.


I have been Googling for hours and reading about PL/pgSQL functions in 
the PG docs and I have yet to find examples that return multiple items 
in a single row.  I have seen cases that return "sets of", but that 
appears to be returning multiple rows, not columns.  Maybe I'm 
barking up the wrong tree?


Here's the gist of what I need to do:

1) query the count of rows that occurred between 14 months ago and 12 months 
ago for a given criterion, then count the rows that occurred between 2 
months ago and now.  Repeat for 50 different where clauses.


2) return each count(*) as a "column" so that in the end I can say:

select count_everything( ending_date );

and have it return to me:

 count_a_lastyear   count_a_last60   count_b_lastyear   count_b_last60
 ----------------   --------------   ----------------   --------------
              100              150                200              250

I'm not even sure if a function is what I'm after, maybe this can be 
done in a view?  I am embarrassed to ask something that seems like it 
should be easy, but some key piece of knowledge is escaping me on this.


I don't expect someone to write this for me, I just need a nudge in the 
right direction and maybe a URL or two to get me started.
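
In the spirit of a nudge rather than a finished answer: one way to get each
count back as a column is a composite return type plus one scalar subquery per
count. A minimal sketch with only two of the fifty counts, against a
hypothetical table "events" with columns "etype" and "occurred":

  CREATE TYPE count_summary AS (
      count_a_lastyear  bigint,
      count_a_last60    bigint
  );

  CREATE OR REPLACE FUNCTION count_everything(ending_date timestamp)
  RETURNS count_summary AS $$
      SELECT
          (SELECT count(*) FROM events
            WHERE etype = 'a'
              AND occurred >= $1 - interval '14 months'
              AND occurred <  $1 - interval '12 months'),
          (SELECT count(*) FROM events
            WHERE etype = 'a'
              AND occurred >= $1 - interval '2 months'
              AND occurred <  $1);
  $$ LANGUAGE sql;

  -- select * from count_everything('2007-02-21');

Each additional WHERE clause becomes another field in the type and another
scalar subquery; the same SELECT also works on its own as a view if the ending
date doesn't need to be a parameter.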


Thank you for reading this far.

-Dan

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] General advice on user functions

2007-02-21 Thread Dan Harris

Thank you all for your ideas.  I appreciate the quick response.

-Dan

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[PERFORM] Determining server load from client

2007-03-20 Thread Dan Harris
I've found that it would be helpful to be able to tell how busy my 
dedicated PG server is ( Linux 2.6 kernel, v8.0.3 currently ) before 
pounding it with some OLAP-type queries.  Specifically, I have a 
multi-threaded client program that needs to run several thousand 
sequential queries.  I broke it into threads to take advantage of the 
multi-core architecture of the server hardware.  It would be very nice 
if I could check the load of the server at certain intervals to throttle 
the number of concurrent queries and mitigate load problems when other 
processes might be already inducing a significant load.


I have seen some other nice back-end things exposed through PG functions 
( e.g. database size on disk ) and wondered if there was anything 
applicable to this.  Even if it can't return the load average proper, is 
there anything else in the pg_* tables that might give me a clue how 
"busy" the server is for a period of time?


I've thought about allowing an ssh login without a passphrase to log in 
and capture it.  But the client process is running as the apache user. 
Giving the apache user a shell login to the DB box does not seem like a 
smart idea for obvious security reasons...


So far, that's all I can come up with, other than a dedicated socket 
server daemon on the DB machine to do it.


Any creative ideas are welcomed :)

Thanks

-Dan

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Determining server load from client

2007-03-20 Thread Dan Harris

Dan Harris wrote:
I've found that it would be helpful to be able to tell how busy my 
dedicated PG server is ( Linux 2.6 kernel, v8.0.3 currently ) before 
pounding it with some OLAP-type queries. 

..snip

Thank you all for your great ideas!  I'm going to try the perl function 
as that seems like a very elegant way of doing it.
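
A sketch of the sort of thing that perl function can look like, assuming the
untrusted plperlu language is installed and the server is Linux (it reads the
1-minute load average from /proc/loadavg):

  CREATE OR REPLACE FUNCTION server_loadavg() RETURNS float8 AS $$
      # First field of /proc/loadavg is the 1-minute load average.
      open(my $fh, '<', '/proc/loadavg') or return undef;
      my $line = <$fh>;
      close($fh);
      my ($one_min) = split /\s+/, $line;
      return $one_min;
  $$ LANGUAGE plperlu;

  -- Each worker thread can then poll:
  -- SELECT server_loadavg();

Because it reads a server-side file it has to be plperlu (untrusted), so only a
superuser can create it.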


-Dan

---(end of broadcast)---
TIP 6: explain analyze is your friend


[PERFORM] Planner doing seqscan before indexed join

2007-03-28 Thread Dan Harris

8.0.3 - Linux 2.6.18..
Freshly vacuumed and analyzed

This database has been humming along fine for a while now, but I've got one of 
those sticky queries that is taking
much too long to finish.

After some digging, I've found that the planner is choosing to apply a 
necessary seq scan to the table.  Unfortunately,
it's scanning the whole table, when it seems that it could have joined it to a 
smaller table first and reduced the
number of rows it would have to scan dramatically ( 70 million to about 5,000 ).

The table "eventactivity" has about 70million rows in it, index on "incidentid"
The table "keyword_incidents" is a temporary table and has incidentid as its 
primary key.  It contains
5125 rows. eventmain and eventgeo both have 2.1 million. My hope is that I can 
convince the planner to do the
 join to keyword_incidents *first* and then do the seq scan for the LIKE condition.  Instead, it seems that it's seqscanning the 
whole 70 million rows first and then doing the join, which takes a lot longer than I'd like to wait for it.  Or, maybe I'm 
misreading the explain output?

Thanks again

-Dan
-
Here's the query:

explain analyze 

select 
  *
from 

  keyword_incidents, 

  eventactivity, 

  eventmain, 

  eventgeo 

 where 

  eventmain.incidentid = keyword_incidents.incidentid and 

  eventgeo.incidentid = keyword_incidents.incidentid and 

  eventactivity.incidentid = keyword_incidents.incidentid 


  and (  recordtext like '%JOSE CHAVEZ%'   )
order by eventmain.entrydate limit 1;

---
 Limit  (cost=2388918.07..2388918.08 rows=1 width=455) (actual 
time=81771.186..81771.292 rows=26 loops=1)
   ->  Sort  (cost=2388918.07..2388918.08 rows=1 width=455) (actual 
time=81771.180..81771.215 rows=26 loops=1)
 Sort Key: eventmain.entrydate
 ->  Nested Loop  (cost=0.00..2388918.06 rows=1 width=455) (actual 
time=357.389..81770.982 rows=26 loops=1)
   ->  Nested Loop  (cost=0.00..2388913.27 rows=1 width=230) 
(actual time=357.292..81767.385 rows=26 loops=1)
 ->  Nested Loop  (cost=0.00..2388909.33 rows=1 width=122) 
(actual time=357.226..81764.501 rows=26 loops=1)
   ->  Seq Scan on eventactivity  (cost=0.00..2388874.46 rows=7 width=84) (actual time=357.147..81762.582 
rows=27 loops=1)

 Filter: ((recordtext)::text ~~ '%JOSE 
CHAVEZ%'::text)
   ->  Index Scan using keyword_incidentid_pkey on keyword_incidents  (cost=0.00..4.97 rows=1 width=38) 
(actual time=0.034..0.036 rows=1 loops=27)

 Index Cond: (("outer".incidentid)::text = 
(keyword_incidents.incidentid)::text)
 ->  Index Scan using eventgeo_incidentid_idx on eventgeo  (cost=0.00..3.93 rows=1 width=108) (actual 
time=0.076..0.081 rows=1 loops=26)

   Index Cond: (("outer".incidentid)::text = 
(eventgeo.incidentid)::text)
   ->  Index Scan using eventmain_incidentid_idx on eventmain  (cost=0.00..4.78 rows=1 width=225) (actual 
time=0.069..0.075 rows=1 loops=26)

 Index Cond: (("outer".incidentid)::text = 
(eventmain.incidentid)::text)
 Total runtime: 81771.529 ms
(15 rows)
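
One experiment that sometimes helps here (a sketch, not a guaranteed fix): write
the joins explicitly and drop join_collapse_limit to 1 for the session, which
makes the planner follow the written join order, so keyword_incidents is joined
through the incidentid index before the LIKE filter is applied to the surviving
eventactivity rows:

  SET join_collapse_limit = 1;

  select *
    from keyword_incidents
    join eventactivity using (incidentid)
    join eventmain     using (incidentid)
    join eventgeo      using (incidentid)
   where eventactivity.recordtext like '%JOSE CHAVEZ%'
   order by eventmain.entrydate limit 1;

The LIKE still has to look at recordtext, but only for rows belonging to the
~5,000 incidents instead of all 70 million.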

---(end of broadcast)---
TIP 6: explain analyze is your friend


[PERFORM] Finding bloated indexes?

2007-04-13 Thread Dan Harris
Is there a pg_stat_* table or the like that will show how bloated an index is? 
I am trying to squeeze out some disk space and want to track down where the worst 
offenders are before performing a global REINDEX on all tables, as the database 
is roughly 400GB on disk and this takes a very long time to run.


I have been able to do this with tables, using a helpful view posted to this 
list a few months back, but I'm not sure if I can get the same results on indexes.
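
Not a true bloat estimator, but as a crude first pass the catalogs will at least
rank the indexes by size so the worst REINDEX candidates stand out (relpages is
only as fresh as the last VACUUM/ANALYZE):

  SELECT t.relname  AS table_name,
         c.relname  AS index_name,
         c.relpages AS index_pages,
         t.relpages AS table_pages
    FROM pg_index i
    JOIN pg_class c ON c.oid = i.indexrelid
    JOIN pg_class t ON t.oid = i.indrelid
   ORDER BY c.relpages DESC
   LIMIT 20;

An index whose page count rivals (or exceeds) its table's is usually a good
candidate for a closer look.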


Thanks

-Dan

---(end of broadcast)---
TIP 6: explain analyze is your friend


[PERFORM] View is not using a table index

2007-04-24 Thread Dan Shea
We have a table which we want to normalize and use the same SQL to
perform selects using a view.  
The old table had 3 columns in its index
(region_id, wx_element, valid_time). 
The new table meteocode_elmts has a similar index but the region_id is a
reference to another table region_lookup and wx_element to table
meteocode_elmts_lookup.  This will make our index and table
significantly smaller.  
As stated above, we want to use the same SQL query to check the view.  
The problem is we have not been able to set up the view so that it
references the "rev" index.  It just uses the region_id but ignores the
wx_element, therefore the valid_time is also ignored.  The rev index now
consists of region_id(reference to  region_lookup
table),wx_element(reference to meteocode_elmts_lookup) and valid_time.

We are using Postgresql 7.4.0.  Below are the relevant views and tables,
plus an explain analyze of the query against the old table and the view.

Old table forecastelement 
phoenix=# \d forecastelement 
  Table "public.forecastelement" 
 Column |Type | Modifiers 
+-+---
 origin | character varying(10)   | not null 
 timezone   | character varying(99)   | not null 
 region_id  | character varying(20)   | not null 
 wx_element | character varying(99)   | not null 
 value  | character varying(99)   | not null 
 flag   | character(3)| not null 
 units  | character varying(99)   | not null 
 valid_time | timestamp without time zone | not null 
 issue_time | timestamp without time zone | not null 
 next_forecast  | timestamp without time zone | not null 
 reception_time | timestamp without time zone | not null
Indexes: 
"forecastelement_rwv_idx" btree (region_id, wx_element, valid_time) 

The new view and tables are 
phoenix=# \d fcstelmt_view 
   View "public.fcstelmt_view" 
 Column |Type | Modifiers 
+-+---
 origin | character varying(10)   | 
 timezone   | character varying(10)   | 
 region_id  | character varying(99)   | 
 wx_element | character varying(99)   | 
 value  | character varying(99)   | 
 flag   | character(3)| 
 unit   | character varying   | 
 valid_time | timestamp without time zone | 
 issue_time | timestamp without time zone | 
 next_forecast  | timestamp without time zone | 
 reception_time | timestamp without time zone | 

View definition: 
 SELECT meteocode_bltns.origin, meteocode_bltns.timezone,
region_lookup.region_id, meteocode_elmts_lookup.wx_element,
meteocode_elmts.value, meteocode_bltns.flag, ( SELECT
meteocode_units_lookup.unit FROM meteocode_units_lookup WHERE
meteocode_units_lookup.id = meteocode_elmts.unit_id) AS unit,
meteocode_elmts.valid_time, meteocode_bltns.issue_time,
meteocode_bltns.next_forecast, meteocode_bltns.reception_time FROM
meteocode_bltns, meteocode_elmts, region_lookup, meteocode_elmts_lookup
WHERE meteocode_bltns.meteocode_id = meteocode_elmts.meteocode AND
region_lookup.id = meteocode_elmts.reg_id AND meteocode_elmts_lookup.id
= meteocode_elmts.wx_element_id;

phoenix=# \d meteocode_elmts 
 Table "public.meteocode_elmts" 
Column |Type | Modifiers 
---+-+---
 meteocode | integer | 
 value | character varying(99)   | not null 
 unit_id   | integer | 
 valid_time| timestamp without time zone | not null 
 lcleffect | integer | 
 reg_id| integer | 
 wx_element_id | integer | 
Indexes: 
"rev" btree (reg_id, wx_element_id, valid_time) phoenix=# \d
meteocode_bltns 
  Table "public.meteocode_bltns" 
 Column |Type | Modifiers 
+-+---
 meteocode_id   | integer | not null default
nextval('"meteocode_bltns_idseq"'::text) 
 origin | character varying(10)   | not null 
 header | character varying(20)   | not null 
 timezone   | character varying(10)   | not null 
 flag   | character(3)| not null 
 initial| character varying(40)   | not null 
 issue_time | timestamp without time zone | not null 
 next_forecast  | timestamp without time zone | not null 
 reception_time | timestamp without time zone | not null
Indexes: 
"meteocode_bltns_meteocode_id_idx" btree (meteocode_id) 

phoenix=# \d region_lookup 
 Table "public.region_lookup

Re: [PERFORM] View is not using a table index

2007-04-24 Thread Dan Shea
Version is  PWFPM_DEV=# select version();
version


 PostgreSQL 7.4 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.2.3
20030502 (Red Hat Linux 3.2.3-20)
(1 row)

We used the rpm source from postgresql-7.4-0.5PGDG.

You make it sound so easy.  Our database size is at 308 GB.  We actually
have 8.2.3 running and would like to transfer in the future.  We have to
investigate the best way to do it.

Dan.


-Original Message-
From: Richard Huxton [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, April 24, 2007 11:42 AM
To: Shea,Dan [NCR]
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] View is not using a table index

Dan Shea wrote:
> We have a table which we want to normalize and use the same SQL to 
> perform selects using a view.
> The old table had 3 columns in it's index 
> (region_id,wx_element,valid_time).
> The new table meteocode_elmts has a similar index but the region_id is

> a reference to another table region_lookup and wx_element to table 
> meteocode_elmts_lookup.  This will make our index and table 
> significantly smaller.
> As stated ablove we want to use the same SQL query to check the view.

> The problem is we have not been able to set up the view so that it 
> references the "rev" index.  It just uses the region_id but ignores 
> the wx_element, therefore the valid_time is also ignored.  The rev 
> index now consists of region_id(reference to  region_lookup 
> table),wx_element(reference to meteocode_elmts_lookup) and valid_time.
> 
> We are using Postgresql 7.4.0.  Below is the relevant views and tables

> plus an explain analyze of the query to the old table and the view.

Please say it's not really 7.4.0 - you're running 7.4.xx actually,
aren't you, where xx is quite a high number?

> phoenix=# \d region_lookup 
>  Table "public.region_lookup" 
>   Column   | Type  | Modifiers 
> ---+---+---
>  id| integer   | not null 
>  region_id | character varying(99) |
> Indexes: 
> "region_lookup_pkey" primary key, btree (id)
> 
> phoenix=# \d meteocode_elmts_lookup 
>  Table "public.meteocode_elmts_lookup" 
>Column   | Type  | Modifiers 
> +---+---
>  id | integer   | not null 
>  wx_element | character varying(99) | not null
> Indexes: 
> "meteocode_elmts_lookup_pkey" primary key, btree (id) 
> "wx_element_idx" btree (wx_element)

Anyway, you're joining to these tables and testing against the text
values without any index useful to the join.

Try indexes on (wx_element, id) and (region_id,id) etc. Re-analyse the
tables and see what that does for you.

Oh - I'd expect an index over the timestamps might help too.
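
A minimal sketch of those suggestions (the index names are made up):

  CREATE INDEX meteocode_elmts_lookup_wx_id_idx ON meteocode_elmts_lookup (wx_element, id);
  CREATE INDEX region_lookup_regid_id_idx       ON region_lookup (region_id, id);
  CREATE INDEX meteocode_elmts_valid_time_idx   ON meteocode_elmts (valid_time);

  ANALYZE meteocode_elmts_lookup;
  ANALYZE region_lookup;
  ANALYZE meteocode_elmts;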

Then, if you've got time try setting up an 8.2 installation, do some
basic configuration and transfer the data. I'd be surprised if you
didn't get some noticeable improvements just from the version number
increase.
-- 
   Richard Huxton
   Archonet Ltd

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Feature Request --- was: PostgreSQL Performance Tuning

2007-04-27 Thread Dan Harris

Michael Stone wrote:

On Fri, Apr 27, 2007 at 09:27:49AM -0400, Carlos Moreno wrote:
Notice that the second part of my suggestion covers this --- have 
additional

switches to initdb


If the person knows all that, why wouldn't they know to just change the 
config parameters?




Exactly..  What I think would be much more productive is to use the great amount 
of information that PG tracks internally and auto-tune the parameters based on 
it.  For instance:


Why does the user need to manually track max_fsm_pages and max_fsm_relations?  I 
bet there are many users who have never taken the time to understand what this 
means and wondering why performance still stinks after vacuuming their database 
( spoken from my own experience )


How about work_mem?  shared_buffers?  column statistics sizes? random_page_cost?

Couldn't some fairly simple regression tests akin to a VACUUM process spot 
potential problems?  "Hey, it looks like you need more fsm_relations.. I bumped 
that up automatically for you".  Or "These indexes look bloated, shall I 
automatically reindex them for you?"


I'm sure there are many more examples, that with some creative thinking, could 
be auto-adjusted to match the usage patterns of the database. PG does an 
excellent job of exposing the variables to the users, but mostly avoids telling 
the user what to do or doing it for them.  Instead, it is up to the user to know 
where to look, what to look for, and how to react to things to improve 
performance.  This is not all bad, but it is assuming that all users are hackers 
( which used to be true ), but certainly doesn't help when the average SQLServer 
admin tries out Postgres and then is surprised at the things they are now 
responsible for managing.  PG is certainly *not* the only database to suffer 
from this syndrome, I know..


I like to think of my systems as good employees.  I don't want to have to 
micromanage everything they do.  I want to tell them "here's what I want done", 
and assuming I made a good hiring choice, they will do it and take some liberty 
to adjust parameters where needed to achieve the spirit of the goal, rather than 
 blindly do something inefficiently because I failed to explain to them the 
absolute most efficient way to accomplish the task.


Granted, there are some people who don't like the developers making any 
assumptions about their workload.  But this doesn't have to be an either/or 
proposition.  I don't think any control needs to be abandoned.  But 
self-adjusting defaults seem like an achievable goal ( I know, I know, "show us 
the patch" ).  I just don't know if this feeling has resonated well between new 
users and long-term developers.  I know it must be grating to have to answer the 
same questions over and over and over "have you analyzed?  Did you leave 
postgresql.conf at the defaults??".  Seems like a win-win for both sides, IMHO.


In closing, I am not bashing PG!  I love it and swear by it.  These comments are 
purely from an advocacy perspective.  I'd love to see PG user base continue to grow.


My .02

-Dan



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Feature Request --- was: PostgreSQL Performance Tuning

2007-04-27 Thread Dan Harris

Bill Moran wrote:

In response to Dan Harris <[EMAIL PROTECTED]>:


Why does the user need to manually track max_fsm_pages and max_fsm_relations?  I 
bet there are many users who have never taken the time to understand what this 
means and wondering why performance still stinks after vacuuming their database 
( spoken from my own experience )


But there are two distinct routes that can be taken if there's not enough
fsm space: add fsm space or vacuum more frequently.  I don't want the system
to eat up a bunch of memory for fsm entries if my workload indicates that
I can easily vacuum more frequently.


There's no magic bullet here, but heuristics should be able to tell us whether you can 
"easily vacuum more frequently".  And again, I said these things would be 
*optional*.  Like an item in postgresql.conf: 
"i_have_read_the_manual_and_know_what_this_all_means = false #default false". 
If you change it to true, you have all the control you're used to and nothing 
will get in your way.





How about work_mem?  shared_buffers?  column statistics sizes? random_page_cost?


The only one that seems practical (to me) is random_page_cost.  The others are
all configuration options that I (as a DBA) want to be able to decide for
myself.  For example, I have some dedicated PG servers that I pretty much
max those values out at, to let PG know that it can use everything on the
system -- but I also have some shared use machines with PG, where I carefully
constrain those values so that PG doesn't muscle other daemons out of their
share of the RAM (work_mem is probably the best example)



Just because you carefully constrain it does not preclude the ability for 
program logic to maintain statistics to do what I suggested.



It would be nice to have some kind of utility that could tell me what
random_page_cost should be, as I've never felt comfortable tweaking it.
Like some utility to run that would say "based on the seek tests I just
ran, you should set random_page_cost to x".  Of course, if such a thing
existed, it could just fill in the value for you.  But I haven't figured
out how to pick a good value for that setting, so I have no idea how to
suggest to have it automatically set.


Me either, but I thought if there's a reason it's user-settable, there must be 
some demonstrable method for deciding what is best.




Couldn't some fairly simple regression tests akin to a VACUUM process spot 
potential problems?  "Hey, it looks like you need more fsm_relations.. I bumped 
that up automatically for you".  Or "These indexes look bloated, shall I 
automatically reindex them for you?"


A lot of that stuff does happen.  A vacuum verbose will tell you what it
thinks you should do, but I don't _want_ it to do it automatically.  What
if I create huge temporary tables once a week for some sort of analysis that
overload the fsm space?  And if I'm dropping those tables when the analysis
is done, do I want the fsm space constantly adjusting?


I understand *you* don't want it done automatically.  But my suspicion is that 
there are a lot more newbie pg admins who would rather let the system do 
something sensible as a default.  Again, you sound defensive that somehow my 
ideas would take power away from you.  I'm not sure why that is, but certainly 
I'm not suggesting that.  An auto-pilot mode is not a bad idea just because a 
few pilots don't want to use it.




Plus, some is just impossible.  shared_buffers requires a restart.  Do you
want your DB server spontaneously restarting because it thought more
buffers might be nice?


Well, maybe look at the bigger picture and see if it can be fixed to *not* 
require a program restart?  Or.. take effect on the next pid that gets created? 
 This is a current limitation, but doesn't need to be one for eternity does it?




I'm sure there are many more examples, that with some creative thinking, could 
be auto-adjusted to match the usage patterns of the database. PG does an 
excellent job of exposing the variables to the users, but mostly avoids telling 
the user what to do or doing it for them.  Instead, it is up to the user to know 
where to look, what to look for, and how to react to things to improve 
performance.  This is not all bad, but it is assuming that all users are hackers 
( which used to be true ), but certainly doesn't help when the average SQLServer 
admin tries out Postgres and then is surprised at the things they are now 
responsible for managing.  PG is certainly *not* the only database to suffer 
from this syndrome, I know..


I expect the suffering is a result of the fact that databases are non-trivial
pieces of software, and there's no universally simple way to set them up
and make them run well.


Speaking as a former SQL Server admin ( from day 1 of the Sybase fork up to 
version 2000 ), I can say there *is* a way to

Re: [PERFORM]

2007-05-08 Thread Dan Harris

Orhan Aglagul wrote:

Hi Everybody,

I was trying to see how many inserts per second my application could 
handle on various machines.



I read that postgres does have issues with MP Xeon (costly context 
switching). But I still think that with fsync=on 65 seconds is ridiculous.


CPU is unlikely your bottleneck..  You failed to mention anything about your I/O 
setup.  More details in this regard will net you better responses.  However, an 
archive search for insert performance will probably be worthwhile, since this 
type of question is repeated about once a month.





---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Background vacuum

2007-05-09 Thread Dan Harris

Daniel Haensse wrote:

Dear list,

I'm running postgres on a tomcat server. The vacuum is run every hour
(cronjob) which leads to a performance drop of the tomcat applications.
I played around with renice command and I think it is possible to reduce
this effect which a renice. The problem is how can I figure out the PID
of the postmaster performing the vacuum(automated)? Has anybody a nice
solution to change process priority? A shell script, maybe even for java?



While this may technically work, I think it misses a key point.  'nice' ( at 
least the versions I'm familiar with ) does not adjust I/O priority.  VACUUM is 
bogging things down because of the extra strain on I/O.  CPU usage shouldn't 
really be much of a factor.


Instead, I would recommend looking at vacuum_cost_delay and the related settings 
to make vacuum lower priority than the queries you care about.  This should be a 
cleaner solution for you.
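
A minimal sketch of what the hourly job could run instead of a plain VACUUM (the
numbers are illustrative, not recommendations; both settings can also be set
globally in postgresql.conf):

  SET vacuum_cost_delay = 20;   -- sleep 20ms each time the cost limit is hit
  SET vacuum_cost_limit = 200;  -- the default; lower it to make vacuum sleep more often
  VACUUM ANALYZE;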


-Dan

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] Seq Scan

2007-06-01 Thread Dan Harris

Tyler Durden wrote:

Hi,
I'm having some performance problems with a simple select count(id) from table_name.
I have 700 000 records in the table, and when I do:

# explain select (id) from table_name;
-[ RECORD 1 
]
QUERY PLAN | Seq Scan on table_name  (cost=0.00..8601.30 rows=266730 
width=4)


I had created an index for id(btree), but still shows "Seq Scan".
What I'm doing wrong?

Thanks,
Tyler

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


select count(*) will *always* do a sequential scan, due to the MVCC 
architecture.  See archives for much discussion about this.


-Dan

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] [ADMIN] reclaiming disk space after major updates

2007-06-08 Thread Dan Harris

Andrew Sullivan wrote:

On Thu, Jun 07, 2007 at 03:26:56PM -0600, Dan Harris wrote:
They don't always have to be in a single transaction, that's a good idea to 
break it up and vacuum in between, I'll consider that.  Thanks


If you can do it this way, it helps _a lot_.  I've had to do this
sort of thing, and breaking into groups of a couple thousand or so
really made the difference.

A
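
A minimal sketch of that batching approach, with a hypothetical table "big",
primary key "id", and a hypothetical purge_me condition marking the rows to
remove -- repeat until the DELETE reports 0 rows:

  DELETE FROM big
   WHERE id IN (SELECT id FROM big WHERE purge_me LIMIT 5000);
  VACUUM big;

Each batch's dead tuples are recorded in the free space map before the next
batch runs, so the table can reuse that space and stops growing without needing
a VACUUM FULL at the end.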



One more point from my original post.. For my own education, why does VACUUM FULL 
prevent reads from a table while it is running (I'm sure there's a good reason)?  I can 
certainly understand blocking writes, but if I could still read from it, I'd 
have no problems at all!


-Dan

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


[PERFORM] PITR Backups

2007-06-21 Thread Dan Gorman

Hi -
  I'm looking at ways to do clean PITR backups. Currently we're  
pg_dumping our data, which in some cases is about 100GB when compressed.  
Needless to say it's slow and IO-intensive on both the host and the  
backup server.


  All of our databases are on NetApp storage and I have been looking  
at SnapMirror (PITR RO copy ) and FlexClone (near instant RW volume  
replica) for backing up our databases. The problem is because there  
is no write-suspend or even a 'hot backup mode' for postgres it's  
very plausible that the database has data in RAM that hasn't been  
written and will corrupt the data. NetApp suggested that if we do a  
SnapMirror, we do a couple in succession ( < 1s) so should one be  
corrupt, we try the next one. They said oracle does something similar.


  Is there a better way to quiesce the database without shutting it  
down? Some of our databases are doing about 250,000 commits/min.


Best Regards,
Dan Gorman


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] PITR Backups

2007-06-22 Thread Dan Gorman
] 2007-06-21  
00:40:39 PDTWARNING:  page 26713 of relation 1663/16384/76719 was  
uninitialized
Jun 21 00:40:39 sfmedstorageha001 postgres[3757]: [32-1] 2007-06-21  
00:40:39 PDTWARNING:  page 17306 of relation 1663/16384/76710 was  
uninitialized
Jun 21 00:40:39 sfmedstorageha001 postgres[3757]: [33-1] 2007-06-21  
00:40:39 PDTWARNING:  page 26706 of relation 1663/16384/76719 was  
uninitialized
Jun 21 00:40:39 sfmedstorageha001 postgres[3757]: [34-1] 2007-06-21  
00:40:39 PDTWARNING:  page 800226 of relation 1663/16384/33204 was  
uninitialized
Jun 21 00:40:39 sfmedstorageha001 postgres[3757]: [35-1] 2007-06-21  
00:40:39 PDTWARNING:  page 28915 of relation 1663/16384/76718 was  
uninitialized
Jun 21 00:40:39 sfmedstorageha001 postgres[3757]: [36-1] 2007-06-21  
00:40:39 PDTWARNING:  page 26710 of relation 1663/16384/76719 was  
uninitialized
Jun 21 00:40:39 sfmedstorageha001 postgres[3757]: [37-1] 2007-06-21  
00:40:39 PDTWARNING:  page 28903 of relation 1663/16384/76718 was  
uninitialized
Jun 21 00:40:39 sfmedstorageha001 postgres[3757]: [38-1] 2007-06-21  
00:40:39 PDTWARNING:  page 28902 of relation 1663/16384/76718 was  
uninitialized
Jun 21 00:40:39 sfmedstorageha001 postgres[3757]: [39-1] 2007-06-21  
00:40:39 PDTWARNING:  page 28910 of relation 1663/16384/76718 was  
uninitialized
Jun 21 00:40:39 sfmedstorageha001 postgres[3757]: [40-1] 2007-06-21  
00:40:39 PDTPANIC:  WAL contains references to invalid pages
Jun 21 00:40:39 sfmedstorageha001 postgres[3755]: [1-1] 2007-06-21  
00:40:39 PDTLOG:  startup process (PID 3757) was terminated by signal 6
Jun 21 00:40:39 sfmedstorageha001 postgres[3755]: [2-1] 2007-06-21  
00:40:39 PDTLOG:  aborting startup due to startup process failure
Jun 21 00:40:39 sfmedstorageha001 postgres[3756]: [1-1] 2007-06-21  
00:40:39 PDTLOG:  logger shutting down



On Jun 22, 2007, at 12:30 AM, Toru SHIMOGAKI wrote:



Steve Atkins wrote:

- When we take a PITR base backup with hardware level snapshot  
operation
  (not filesystem level) which a lot of storage vender provide,  
the backup data
  can be corrupted as Dan said. During recovery we can't even  
read it,

  especially if meta-data was corrupted.

I can't see any explanation for how this could happen, other
than your hardware vendor is lying about snapshot ability.


All of the hardware vendors I asked always said:

 "The hardware-level snapshot has nothing to do with the filesystem  
condition, or with what data has been written from the  
operating system cache to the hard disk platter. It just copies the  
bytes on storage to the other volume.


So, if any data is written while the snapshot is being taken, we can't  
*strictly* guarantee data correctness.


In Oracle, no table data is written between BEGIN BACKUP and END  
BACKUP, and it is not a problem that REDO is written..."


I'd like to know the correct information if the explanation has any  
mistakes, or a good way to avoid the problem.


I think there are users who want to migrate from Oracle to PostgreSQL  
but can't because of the problem described above.



Best regards,

--
Toru SHIMOGAKI<[EMAIL PROTECTED]>
NTT Open Source Software Center


---(end of  
broadcast)---

TIP 2: Don't 'kill -9' the postmaster




---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] PITR Backups

2007-06-22 Thread Dan Gorman
This snapshot is done at the LUN (filer) level; postgres is unaware  
we're creating a backup, so I'm not sure how pg_start_backup() plays  
into this ...


Regards,
Dan Gorman

On Jun 22, 2007, at 3:55 AM, Simon Riggs wrote:


On Fri, 2007-06-22 at 11:30 +0900, Toru SHIMOGAKI wrote:

Tom Lane wrote:

Dan Gorman <[EMAIL PROTECTED]> writes:
   All of our databases are on NetApp storage and I have been  
looking

at SnapMirror (PITR RO copy ) and FlexClone (near instant RW volume
replica) for backing up our databases. The problem is because there
is no write-suspend or even a 'hot backup mode' for postgres it's
very plausible that the database has data in RAM that hasn't been
written and will corrupt the data.



Alternatively, you can use a PITR base backup as suggested here:
http://www.postgresql.org/docs/8.2/static/continuous-archiving.html


I think Dan's problem is important if we use PostgreSQL to a large  
size database:


- When we take a PITR base backup with hardware level snapshot  
operation
  (not filesystem level) which a lot of storage vender provide,  
the backup data
  can be corrupted as Dan said. During recovery we can't even read  
it,

  especially if meta-data was corrupted.

- If we don't use a hardware-level snapshot operation, it takes a long time to take
  a large backup, and a lot of full-page-write WAL files are made.


So, I think users need a new feature that avoids writing out heap pages while taking a
backup.


Your worries are unwarranted, IMHO. It appears Dan was taking a  
snapshot

without having read the procedure as clearly outlined in the manual.

pg_start_backup() flushes all currently dirty blocks to disk as  
part of

a checkpoint. If you snapshot after that point, then you will have all
the data blocks required from which to correctly roll forward. On its
own, the snapshot is an inconsistent backup and will give errors as  
Dan
shows. It is only when the snapshot is used as the base backup in a  
full

continuous recovery that the inconsistencies are removed and the
database is fully and correctly restored.

pg_start_backup() is the direct analogue of Oracle's ALTER DATABASE
BEGIN BACKUP. Snapshots work with Oracle too, in much the same way.

After reviewing the manual, if you honestly think there is a problem,
please let me know and I'll work with you to investigate.

--
  Simon Riggs
  EnterpriseDB   http://www.enterprisedb.com



---(end of  
broadcast)---

TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match




---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] PITR Backups

2007-06-22 Thread Dan Gorman
Ah okay. I understand now. So how can I signal postgres I'm about to  
take a backup ? (read doc from previous email ? )


Regards,
Dan Gorman

On Jun 22, 2007, at 4:38 AM, Simon Riggs wrote:


On Fri, 2007-06-22 at 04:10 -0700, Dan Gorman wrote:

This snapshot is done at the LUN (filer) level, postgres is un-aware
we're creating a backup, so I'm not sure how pg_start_backup() plays
into this ...


Postgres *is* completely unaware that you intend to take a backup,  
that

is *exactly* why you must tell the server you intend to make a backup,
using pg_start_backup() and pg_stop_backup(). That way Postgres will
flush its buffers, so that they are present on storage when you  
make the

backup.
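
Concretely, the documented sequence looks roughly like this (assuming WAL
archiving via archive_command is already set up, since the snapshot is only
consistent once the archived WAL is replayed over it during recovery):

  SELECT pg_start_backup('netapp-snapshot');
  -- ...trigger the LUN-level snapshot on the filer here...
  SELECT pg_stop_backup();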

Is the procedure for Oracle or any other transactional RDBMS any
different?

--
  Simon Riggs
  EnterpriseDB   http://www.enterprisedb.com



---(end of  
broadcast)---

TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match




---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] PITR Backups

2007-06-25 Thread Dan Gorman
It's the latter, a snapshot of the durable state of the storage  
system (i.e. it will never be corrupted).


Regards,
Dan Gorman

On Jun 22, 2007, at 11:02 AM, Tom Lane wrote:


"Simon Riggs" <[EMAIL PROTECTED]> writes:

On Fri, 2007-06-22 at 13:12 -0400, Tom Lane wrote:

If you saw a problem I'd be inclined to question whether
there is some upstream component (OS or disk controller) that's
reordering writes.


Given that's exactly what they do, constantly, I don't think it's  
safe to
say that it works, since we cannot verify whether that has happened or
not.


If he's trying to snapshot at a level of hardware that's behind a
write-caching disk controller, I agree that that's untrustworthy.

If not, ie if he's snapshotting the actual durable state of the  
storage

system, then any problems in the snapshot indicate a problem with the
database's ability to recover from a crash.  So I don't think you  
should

tell him to not worry.

regards, tom lane




---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] PITR Backups

2007-06-25 Thread Dan Gorman
I took several snapshots. In all cases the FS was fine. In one case,  
on recovery the db apparently thought there were outstanding  
pages to be written to disk, as seen below, and the db wouldn't start.


Jun 21 00:39:43 sfmedstorageha001 postgres[3506]: [9-1] 2007-06-21  
00:39:43 PDTLOG:  redo done at 71/99870670
Jun 21 00:39:43 sfmedstorageha001 postgres[3506]: [10-1] 2007-06-21  
00:39:43 PDTWARNING:  page 28905 of relation 1663/16384/76718 was  
uninitialized
Jun 21 00:39:43 sfmedstorageha001 postgres[3506]: [11-1] 2007-06-21  
00:39:43 PDTWARNING:  page 13626 of relation 1663/16384/76716 did not  
exist
Jun 21 00:39:43 sfmedstorageha001 postgres[3506]: [12-1] 2007-06-21  
00:39:43 PDTWARNING:  page 28904 of relation 1663/16384/76718 was  
uninitialized
Jun 21 00:39:43 sfmedstorageha001 postgres[3506]: [13-1] 2007-06-21  
00:39:43 PDTWARNING:  page 26711 of relation 1663/16384/76719 was  
uninitialized
Jun 21 00:39:43 sfmedstorageha001 postgres[3506]: [14-1] 2007-06-21  
00:39:43 PDTWARNING:  page 28900 of relation 1663/16384/76718 was  
uninitialized
Jun 21 00:39:43 sfmedstorageha001 postgres[3506]: [15-1] 2007-06-21  
00:39:43 PDTWARNING:  page 3535208 of relation 1663/16384/33190 did  
not exist
Jun 21 00:39:43 sfmedstorageha001 postgres[3506]: [16-1] 2007-06-21  
00:39:43 PDTWARNING:  page 28917 of relation 1663/16384/76718 was  
uninitialized
Jun 21 00:39:43 sfmedstorageha001 postgres[3506]: [17-1] 2007-06-21  
00:39:43 PDTWARNING:  page 3535207 of relation 1663/16384/33190 was  
uninitialized
Jun 21 00:39:43 sfmedstorageha001 postgres[3506]: [18-1] 2007-06-21  
00:39:43 PDTWARNING:  page 28916 of relation 1663/16384/76718 was  
uninitialized
Jun 21 00:39:43 sfmedstorageha001 postgres[3506]: [19-1] 2007-06-21  
00:39:43 PDTWARNING:  page 28911 of relation 1663/16384/76718 was  
uninitialized
Jun 21 00:39:43 sfmedstorageha001 postgres[3506]: [20-1] 2007-06-21  
00:39:43 PDTWARNING:  page 26708 of relation 1663/16384/76719 was  
uninitialized
Jun 21 00:39:43 sfmedstorageha001 postgres[3506]: [21-1] 2007-06-21  
00:39:43 PDTWARNING:  page 28914 of relation 1663/16384/76718 was  
uninitialized
Jun 21 00:39:43 sfmedstorageha001 postgres[3506]: [22-1] 2007-06-21  
00:39:43 PDTWARNING:  page 28909 of relation 1663/16384/76718 was  
uninitialized
Jun 21 00:39:43 sfmedstorageha001 postgres[3506]: [23-1] 2007-06-21  
00:39:43 PDTWARNING:  page 28908 of relation 1663/16384/76718 was  
uninitialized
Jun 21 00:39:43 sfmedstorageha001 postgres[3506]: [24-1] 2007-06-21  
00:39:43 PDTWARNING:  page 28913 of relation 1663/16384/76718 was  
uninitialized
Jun 21 00:39:43 sfmedstorageha001 postgres[3506]: [25-1] 2007-06-21  
00:39:43 PDTWARNING:  page 26712 of relation 1663/16384/76719 was  
uninitialized
Jun 21 00:39:43 sfmedstorageha001 postgres[3506]: [26-1] 2007-06-21  
00:39:43 PDTWARNING:  page 28918 of relation 1663/16384/76718 was  
uninitialized
Jun 21 00:39:43 sfmedstorageha001 postgres[3506]: [27-1] 2007-06-21  
00:39:43 PDTWARNING:  page 28912 of relation 1663/16384/76718 was  
uninitialized
Jun 21 00:39:43 sfmedstorageha001 postgres[3506]: [28-1] 2007-06-21  
00:39:43 PDTWARNING:  page 3535209 of relation 1663/16384/33190 did  
not exist
Jun 21 00:39:43 sfmedstorageha001 postgres[3506]: [29-1] 2007-06-21  
00:39:43 PDTWARNING:  page 28907 of relation 1663/16384/76718 was  
uninitialized
Jun 21 00:39:43 sfmedstorageha001 postgres[3506]: [30-1] 2007-06-21  
00:39:43 PDTWARNING:  page 28906 of relation 1663/16384/76718 was  
uninitialized
Jun 21 00:39:43 sfmedstorageha001 postgres[3506]: [31-1] 2007-06-21  
00:39:43 PDTWARNING:  page 26713 of relation 1663/16384/76719 was  
uninitialized
Jun 21 00:39:43 sfmedstorageha001 postgres[3506]: [32-1] 2007-06-21  
00:39:43 PDTWARNING:  page 17306 of relation 1663/16384/76710 did not  
exist
Jun 21 00:39:43 sfmedstorageha001 postgres[3506]: [33-1] 2007-06-21  
00:39:43 PDTWARNING:  page 26706 of relation 1663/16384/76719 was  
uninitialized
Jun 21 00:39:43 sfmedstorageha001 postgres[3506]: [34-1] 2007-06-21  
00:39:43 PDTWARNING:  page 800226 of relation 1663/16384/33204 did  
not exist
Jun 21 00:39:43 sfmedstorageha001 postgres[3506]: [35-1] 2007-06-21  
00:39:43 PDTWARNING:  page 28915 of relation 1663/16384/76718 was  
uninitialized
Jun 21 00:39:43 sfmedstorageha001 postgres[3506]: [36-1] 2007-06-21  
00:39:43 PDTWARNING:  page 26710 of relation 1663/16384/76719 was  
uninitialized
Jun 21 00:39:43 sfmedstorageha001 postgres[3506]: [37-1] 2007-06-21  
00:39:43 PDTWARNING:  page 28903 of relation 1663/16384/76718 was  
uninitialized
Jun 21 00:39:43 sfmedstorageha001 postgres[3506]: [38-1] 2007-06-21  
00:39:43 PDTWARNING:  page 28902 of relation 1663/16384/76718 was  
uninitialized
Jun 21 00:39:43 sfmedstorageha001 postgres[3506]: [39-1] 2007-06-21  
00:39:43 PDTWARNING:  page 28910 of relation 1663/16384/76718 was  
uninitialized
Jun 21 00:39:43 sfmedstorageha001 postgres[3506]: [40-1] 2007-06-21  
00:39:43 PDTPANIC:  WAL contains references to invalid pages

Re: [PERFORM] PITR Backups

2007-06-25 Thread Dan Gorman

Greg,

PG 8.2.4

Regards,
Dan Gorman

On Jun 25, 2007, at 9:02 AM, Gregory Stark wrote:


"Dan Gorman" <[EMAIL PROTECTED]> writes:

I took several snapshots. In all cases the FS was fine. In one case the db
looked like on recovery it thought there were outstanding pages to be written
to disk as seen below and the db wouldn't start.

Jun 21 00:39:43 sfmedstorageha001 postgres[3506]: [9-1] 2007-06-21  
00:39:43

PDTLOG:  redo done at 71/99870670
Jun 21 00:39:43 sfmedstorageha001 postgres[3506]: [10-1]  
2007-06-21 00:39:43
PDTWARNING:  page 28905 of relation 1663/16384/76718 was   
uninitialized


What version of Postgres did you say this was?

--
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com





---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] PITR Backups

2007-06-25 Thread Dan Gorman

Thanks for the pointers to a) make it readable and b) log min messages

I didn't however keep the snapshots around. I could try and re-set  
this scenario up. I was in the middle of doing some data migration  
with Netapp and wanted to just 'test' it to make sure it was sane.


If you guys would like me to try to 'break' it again and keep the db  
around for further testing let me know.


Regards,
Dan Gorman


On Jun 25, 2007, at 9:34 AM, Tom Lane wrote:


Dan Gorman <[EMAIL PROTECTED]> writes:

Jun 21 00:39:43 sfmedstorageha001 postgres[3506]: [9-1] 2007-06-21
00:39:43 PDTLOG:  redo done at 71/99870670
Jun 21 00:39:43 sfmedstorageha001 postgres[3506]: [10-1] 2007-06-21
00:39:43 PDTWARNING:  page 28905 of relation 1663/16384/76718 was
uninitialized
... lots of these ...
Jun 21 00:39:43 sfmedstorageha001 postgres[3506]: [40-1] 2007-06-21
00:39:43 PDTPANIC:  WAL contains references to invalid pages


(BTW, you'll find putting a space at the end of log_line_prefix
does wonders for log readability.)

Reformatting and sorting, we have

WARNING: page 3535207 of relation 1663/16384/33190 was uninitialized
WARNING: page 3535208 of relation 1663/16384/33190 did not exist
WARNING: page 3535209 of relation 1663/16384/33190 did not exist

WARNING: page 800226 of relation 1663/16384/33204 did not exist

WARNING: page 17306 of relation 1663/16384/76710 did not exist

WARNING: page 13626 of relation 1663/16384/76716 did not exist

WARNING: page 28900 of relation 1663/16384/76718 was uninitialized
WARNING: page 28902 of relation 1663/16384/76718 was uninitialized
WARNING: page 28903 of relation 1663/16384/76718 was uninitialized
WARNING: page 28904 of relation 1663/16384/76718 was uninitialized
WARNING: page 28905 of relation 1663/16384/76718 was uninitialized
WARNING: page 28906 of relation 1663/16384/76718 was uninitialized
WARNING: page 28907 of relation 1663/16384/76718 was uninitialized
WARNING: page 28908 of relation 1663/16384/76718 was uninitialized
WARNING: page 28909 of relation 1663/16384/76718 was uninitialized
WARNING: page 28910 of relation 1663/16384/76718 was uninitialized
WARNING: page 28911 of relation 1663/16384/76718 was uninitialized
WARNING: page 28912 of relation 1663/16384/76718 was uninitialized
WARNING: page 28913 of relation 1663/16384/76718 was uninitialized
WARNING: page 28914 of relation 1663/16384/76718 was uninitialized
WARNING: page 28915 of relation 1663/16384/76718 was uninitialized
WARNING: page 28916 of relation 1663/16384/76718 was uninitialized
WARNING: page 28917 of relation 1663/16384/76718 was uninitialized
WARNING: page 28918 of relation 1663/16384/76718 was uninitialized

WARNING: page 26706 of relation 1663/16384/76719 was uninitialized
WARNING: page 26708 of relation 1663/16384/76719 was uninitialized
WARNING: page 26710 of relation 1663/16384/76719 was uninitialized
WARNING: page 26711 of relation 1663/16384/76719 was uninitialized
WARNING: page 26712 of relation 1663/16384/76719 was uninitialized
WARNING: page 26713 of relation 1663/16384/76719 was uninitialized

So the problems were pretty localized, probably at the ends of these
files.  Can you go back to the source database and check which
tables these are --- match the last number cited in each line
against pg_class.relfilenode?  Are they tables or indexes, and
about how big are they?
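
(A lookup along these lines should answer that; it's only a sketch, pg_relation_size() assumes 8.1 or later, and the relfilenode values are the ones cited in the warnings above:)

  SELECT relname, relkind, relpages, pg_relation_size(oid) AS bytes
  FROM pg_class
  WHERE relfilenode IN (33190, 33204, 76710, 76716, 76718, 76719);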

A possible explanation is we stopped scanning WAL before reaching
records that truncated or dropped these tables.  But it's not clear  
why.

Could we see the last few log lines before the "redo done" one?

regards, tom lane

---(end of  
broadcast)---

TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq




---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] PITR Backups

2007-06-25 Thread Dan Gorman
No, however, I will attach the postgresql.conf so everyone can look at
other settings just in case.


postgresql.conf
Description: Binary data


Regards,
Dan Gorman

On Jun 25, 2007, at 10:07 AM, Gregory Stark wrote:


"Simon Riggs" <[EMAIL PROTECTED]> writes:


WARNING: page 28900 of relation 1663/16384/76718 was uninitialized
WARNING: page 28902 of relation 1663/16384/76718 was uninitialized



WARNING: page 26706 of relation 1663/16384/76719 was uninitialized
WARNING: page 26708 of relation 1663/16384/76719 was uninitialized


Those two are interesting because we appear to have two valid pages in
the middle of some uninitialized ones. That implies we're not looking at
an unapplied truncation.


You don't have fsync off do you? That could explain missing pages at the end
of a file like this too. And it would explain how you could have two written
in the midst of others that are missing.

--
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com




---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] best use of an EMC SAN

2007-07-11 Thread Dan Gorman
We do something similar here. We use Netapp and I carve one aggregate  
per data volume. I generally keep the pg_xlog on the same "data" LUN,  
but I don't mix other databases on the same aggregate.


In the NetApp world because they use RAID DP (dual parity) you have a  
higher wastage of drives, however, you are guaranteed that an  
erroneous query won't clobber the IO of another database.


In my experience, NetApp has utilities that set "IO priority" but  
it's not granular enough as it's more like using "renice" in unix. It  
doesn't really make that big of a difference.


My recommendation, each database gets it's own aggregate unless the  
IO footprint is very low.


Let me know if you need more details.

Regards,
Dan Gorman

On Jul 11, 2007, at 6:03 AM, Dave Cramer wrote:

Assuming we have 24 73G drives is it better to make one big metalun  
and carve it up and let the SAN manage the where everything is, or  
is it better to specify which spindles are where.


Currently we would require 3 separate disk arrays.

one for the main database, second one for WAL logs, third one we  
use for the most active table.


Problem with dedicating the spindles to each array is that we end  
up wasting space. Are the SAN's smart enough to do a better job if  
I create one large metalun and cut it up ?


Dave

---(end of  
broadcast)---

TIP 4: Have you searched our list archives?

  http://archives.postgresql.org




---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] importance of fast disks with pg

2007-07-17 Thread Dan Harris

Thomas Finneid wrote:

Hi

During some tests I did, I noticed that it does not necessarily seem to be 
true that one needs the fastest disks to have a pg system that is fast.


It seems to me that its more important to:
- choose the correct methods to use for the operation
- tune the pg memory settings
- tune/disable pg xlog/wal etc

It also seems to me that fast disks are more important for db systems of 
the OLTP type of applications, with real concurrency of both readers and 
writers across many, possibly larger, tables etc.


Are the above statements close to having any truth in them?

regards

thomas


I'd say that "it depends".  We run an OLAP workload on 350+ gigs of database on 
a system with 64GB of RAM.  I can tell you for certain that fetching non-cached 
data is very sensitive to disk throughput!


Different types of workloads will find different bottlenecks in the system..

-Dan

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[PERFORM] Simple query showing 270 hours of CPU time

2007-07-20 Thread Dan Harris
Today, I looked at 'top' on my PG server and saw a pid that reported 270 hours 
of CPU time.  Considering this is a very simple query, I was surprised to say 
the least.  I was about to just kill the pid, but I figured I'd try and see 
exactly what it was stuck doing for so long.


Here's the strace summary as run for a few second sample:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 97.25    0.671629          92      7272           semop
  1.76    0.012171         406        30           recvfrom
  0.57    0.003960          66        60           gettimeofday
  0.36    0.002512          28        90           sendto
  0.05    0.000317          10        32           lseek
  0.01    0.000049           1        48           select
------ ----------- ----------- --------- --------- ----------------
100.00    0.690638                    7532           total

Here's the query:

select id from eventkeywords where word = '3322'

If I run the query manually, it completes in about 500ms, which is very 
reasonable.

There are 408563 rows in this table.  I just noticed there is no index on word ( 
there should be! ).  Would this have caused the problem?
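
(For what it's worth, a minimal sketch of adding that index; the index name is invented, and note that on 8.0 a plain CREATE INDEX blocks writes to the table while it builds:)

  CREATE INDEX eventkeywords_word_idx ON eventkeywords (word);
  ANALYZE eventkeywords;
  EXPLAIN ANALYZE SELECT id FROM eventkeywords WHERE word = '3322';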


This is 8.0.12

Linux sunrise 2.6.15-26-amd64-server #1 SMP Fri Sep 8 20:33:15 UTC 2006 x86_64 
GNU/Linux


Any idea what might have set it into this loop?

-Dan

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] Simple query showing 270 hours of CPU time

2007-07-20 Thread Dan Harris

Tom Lane wrote:

Dan Harris <[EMAIL PROTECTED]> writes:

Here's the strace summary as run for a few second sample:



% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 97.25    0.671629          92      7272           semop
  1.76    0.012171         406        30           recvfrom
  0.57    0.003960          66        60           gettimeofday
  0.36    0.002512          28        90           sendto
  0.05    0.000317          10        32           lseek
  0.01    0.000049           1        48           select
------ ----------- ----------- --------- --------- ----------------
100.00    0.690638                    7532           total



Here's the query:
select id from eventkeywords where word = '3322'


How sure are you that (a) that's really what it's doing and (b) you are
not observing multiple executions of the query?  There are no recvfrom
calls in the inner loops of the backend AFAIR, so this looks to me like
the execution of 30 different queries.  The number of semops is
distressingly high, but that's a contention issue not an
amount-of-runtime issue.  


You were absolutely right.  This is one connection that is doing a whole lot of 
( slow ) queries.  I jumped the gun on this and assumed it was a single query 
taking this long.  Sorry to waste time and bandwidth.


Since you mentioned the number of semops is distressingly high, does this 
indicate a tuning problem?  The machine has 64GB of RAM and as far as I can tell 
about 63GB is all cache.  I wonder if this is a clue to an undervalued 
memory-related setting somewhere?


-Dan

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Performance problems with large telemetric datasets on 7.4.2

2007-08-03 Thread Dan Langille
On 3 Aug 2007 at 6:52, Sven Clement wrote:

> Hello everybody,
> 
> as I'm new to this list I hope that it is the right place to post this
> and also the right format, so if I'm committing an error, I apologize
> in advance.
> 
> First the background of my request:
> 
> I'm currently employed by an enterprise which has approx. 250 systems
> distributed worldwide which are sending telemetric data to the main
> PostgreSQL. The remote systems are generating about 10 events per
> second per system, which accumulates to about 2500 tps. The data is
> stored for about a month before it is exported and finally deleted
> from the database. On the PostgreSQL server two databases are running:
> one with little traffic (about 750K per day) and the telemetric
> database with heavy write operations all around the day (over 20
> million per day). We already found that the VACUUM process takes
> excessively long and as a consequence the database is being vacuumed
> permanently.
> 
> The hardware is a IBM X306m Server, 3.2 GHz HT (Pentium IV), 1 GB RAM
> and 2x 250 GB HDD (SATA-II) with ext3 fs, one of the HDD is dedicated
> to database. OS is Debian 3.1 Sarge with PostgreSQL 7.4.7
> (7.4.7-6sarge1) with the libpq frontend library.
> 
> Now the problem:
> 
> The problem we are experiencing is that our queries are slowing down
> continuously even if we are performing queries on the index which is
> the timestamp of the event, a simple SELECT query with only a simple
> WHERE clause (< or >) takes very long to complete. So the database
> becomes unusable for production use as the data has to be retrieved
> very quickly if we want to act based on the telemetric data.

Have you confirmed via explain (or explain analyse) that the index is 
being used?
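
Something along these lines would show it (a sketch only -- the table and column names are invented, since the schema wasn't posted):

  EXPLAIN ANALYZE
  SELECT * FROM telemetric_events            -- hypothetical table
  WHERE event_time >= '2007-08-01 00:00'     -- hypothetical timestamp column
    AND event_time <  '2007-08-02 00:00';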

> So I'm asking me if it is useful to update to the actual 8.2 version
> and if we could experience performance improvement only by updating.

There are other benefits from upgrading, but you may be able to solve 
this problem without upgrading.

-- 
Dan Langille - http://www.langille.org/
Available for hire: http://www.freebsddiary.org/dan_langille.php



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Performance problem with table containing a lot of text (blog)

2007-08-28 Thread Dan Harris

Kari Lavikka wrote:

Hello!

Some background info.. We have a blog table that contains about eight 
million blog entries. Average length of an entry is 1200 letters. 
Because each 8k page can accommodate only a few entries, every query 
that involves several entries causes several random seeks to disk.  We 
are having problems with queries like:


1) give me a list of months when I have written someting
2) give me id's of entries I have written on month X year X
3) give me the number of blog entries my friends have written since last
   time


I didn't see your schema, but couldn't these problems be solved by storing the 
article id, owner id, and blog date in a separate table?  It seems that if you 
don't actually need the content of the blogs, all of those questions could be 
answered by querying a very simple table with minimal I/O overhead.
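
For illustration, a rough, untested sketch of that idea (all names invented):

  CREATE TABLE blog_entry_index (
      entry_id   integer PRIMARY KEY,   -- points at the full blog entry
      owner_id   integer NOT NULL,
      entry_date date    NOT NULL
  );
  CREATE INDEX blog_entry_index_owner_date ON blog_entry_index (owner_id, entry_date);

  -- 1) months in which a given user has written something
  SELECT DISTINCT date_trunc('month', entry_date)
  FROM blog_entry_index WHERE owner_id = 42;

  -- 2) ids of entries written by that user in a given month
  SELECT entry_id FROM blog_entry_index
  WHERE owner_id = 42
    AND entry_date >= '2007-08-01' AND entry_date < '2007-09-01';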




---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


[PERFORM] pg_dump blocking create database?

2007-09-12 Thread Dan Harris
My PG server came to a screeching halt yesterday.  Looking at top saw a very 
large number of "startup waiting" tasks.   A pg_dump was running and one of my 
scripts had issued a CREATE DATABASE command.  It looks like the CREATE DATABASE 
was exclusive but was having to wait for the pg_dump to finish, causing a 
massive traffic jam of locks behind it.
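
Next time it happens, a quick look at who is waiting on whom should confirm it (a sketch; 8.0's pg_stat_activity uses procpid rather than pid):

  SELECT l.pid, l.mode, l.granted, a.current_query
  FROM pg_locks l
  JOIN pg_stat_activity a ON a.procpid = l.pid
  ORDER BY l.granted, l.pid;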


Once I killed the pg_dump process, things returned to normal.

Version 8.0.12.  Is this a bug? It also concerns me because my workload is 
likely to expose this problem again.


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] Newbie question about degraded performance on delete statement.

2007-10-02 Thread Dan Langille
On 2 Oct 2007 at 23:55, Giulio Cesare Solaroli wrote:

> What I have observed are impossibly high time on delete statements on
> some tables.
> 
> The delete statement is very simple:
> delete from table where pk = ?
> 
> The explain query report a single index scan on the primary key index,
> as expected.
> 
> I have run vacuum using the pgAdmin tool, but to no avail.
> 
> I have also dropped and recreated the indexes, again without any benefit.
> 
> I have later created a copy of the table using the "create table
> table_copy as select * from table" syntax.
> 
> Matching the configuration of the original table also on the copy
> (indexes and constraints), I was able to delete the raws from the new
> table with regular performances, from 20 to 100 times faster than
> deleting from the original table.

There may be more to that original table.  What about triggers?  
rules?  Perhaps there are other things going on in the background.
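
A quick way to check for those (a sketch; substitute the real table name for 'mytable' -- note that foreign keys will also show up here as RI_ConstraintTrigger entries):

  SELECT tgname FROM pg_trigger WHERE tgrelid = 'mytable'::regclass;
  SELECT rulename FROM pg_rewrite WHERE ev_class = 'mytable'::regclass;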

-- 
Dan Langille - http://www.langille.org/
Available for hire: http://www.freebsddiary.org/dan_langille.php



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Got that new server, now it's time for config!

2010-03-22 Thread Dan Harris

On 3/22/10 4:36 PM, Carlo Stonebanks wrote:

Here we go again!

Can anyone see any obvious faults?

Carlo

maintenance_work_mem = 256MB
I'm not sure how large your individual tables are, but you might want to 
bump this value up to get faster vacuums.

max_fsm_relations = 1000

I think this will definitely need to be increased

work_mem = 64MB
Most data warehousing loads I can think of will need more work_mem, but 
this depends on how large of data sets you are planning to sort.



--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] [HACKERS] MIT benchmarks pgsql multicore (up to 48)performance

2010-10-05 Thread Dan Ports
I wasn't involved in this work but I do know a bit about it. Sadly, the
work on Postgres performance was cut down to under a page, complete
with the amazing offhand mention of "rewriting PostgreSQL's lock
manager". Here are a few more details...

The benchmarks in this paper are all about stressing the kernel. The
database is entirely in memory -- it's stored on tmpfs rather than on
disk, and it fits within shared_buffers. The workload consists of index
lookups and inserts on a single table. You can fill in all the caveats
about what conclusions can and cannot be drawn from this workload.

The big takeaway for -hackers, I think, is that lock manager
performance is going to be an issue for large multicore systems, and
the uncontended cases need to be lock-free. That includes cases where
multiple threads are trying to acquire the same lock in compatible
modes.

Currently even acquiring a shared heavyweight lock requires taking out
an exclusive LWLock on the partition, and acquiring shared LWLocks
requires acquiring a spinlock. All of this gets more expensive on
multicores, where even acquiring spinlocks can take longer than the
work being done in the critical section.

Their modifications to Postgres should be available in the code that
was published last night. As I understand it, the approach is to
implement LWLocks with atomic operations on a counter that contains
both the exclusive and shared lock count. Heavyweight locks do
something similar but with counters for each lock mode packed into a
word.

Note that their implementation of the lock manager omits some features
for simplicity, like deadlock detection, 2PC, and probably any
semblance of portability. (These are the sort of things we're allowed
to do in the research world! :-)

The other major bottleneck they ran into was a kernel one: reading from
the heap file requires a couple lseek operations, and Linux acquires a
mutex on the inode to do that. The proper place to fix this is
certainly in the kernel but it may be possible to work around in
Postgres.

Dan

-- 
Dan R. K. Ports  MIT CSAILhttp://drkp.net/

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] large dataset with write vs read clients

2010-10-07 Thread Dan Harris

 On 10/7/10 11:47 AM, Aaron Turner wrote:



Basically, each connection is taking about 100MB resident.  As we need
to increase the number of threads to be able to query all the devices
in the 5 minute window, we're running out of memory.
I think the first thing to do is look into using a connection pooler 
like pgpool to reduce your connection memory overhead.


-Dan

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Slow count(*) again...

2010-10-12 Thread Dan Harris

 On 10/11/10 8:02 PM, Scott Carey wrote:

would give you a 1MB read-ahead.  Also, consider XFS and its built-in 
defragmentation.  I have found that a longer lived postgres DB will get extreme
file fragmentation over time and sequential scans end up mostly random.  
On-line file defrag helps tremendously.

We just had a corrupt table caused by an XFS online defrag.  I'm scared 
to use this again while the db is live.  Has anyone else found this to 
be safe?  But, I can vouch for the fragmentation issue, it happens very 
quickly in our system.


-Dan

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Slow count(*) again...

2010-10-12 Thread Dan Harris

 On 10/12/10 10:44 AM, Scott Carey wrote:

On Oct 12, 2010, at 8:39 AM, Dan Harris wrote:


  On 10/11/10 8:02 PM, Scott Carey wrote:

would give you a 1MB read-ahead.  Also, consider XFS and its built-in 
defragmentation.  I have found that a longer lived postgres DB will get extreme
file fragmentation over time and sequential scans end up mostly random.  
On-line file defrag helps tremendously.


We just had a corrupt table caused by an XFS online defrag.  I'm scared
to use this again while the db is live.  Has anyone else found this to
be safe?  But, I can vouch for the fragmentation issue, it happens very
quickly in our system.


What version?  I'm using the latest CentoOS extras build.

We've been doing online defrag for a while now on a very busy database with >
8TB of data.  Not that that means there are no bugs...

It is a relatively simple thing in xfs -- it writes a new file to temp in a way 
that allocates contiguous space if available, then if the file has not been 
modified since it was re-written it is essentially moved on top of the other 
one.  This should be safe provided the journaling and storage is safe, etc.

I'm not sure how to figure out what version of XFS we're on.. but it's 
Linux kernel 2.6.24-24 x86_64 on Ubuntu Server 8.04.3.  Postgres version 8.3


We're due for an upgrade on that server soon so we'll do some more 
testing once we upgrade.  Right now we are just living with the 
fragmentation.  I'm glad to hear the regular on-line defrag is working 
successfully, at least that gives me hope we can rely on it in the future.


-Dan

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Slow count(*) again...

2010-10-12 Thread Dan Harris

 On 10/12/10 4:33 PM, Neil Whelchel wrote:

On Tuesday 12 October 2010 08:39:19 Dan Harris wrote:

   On 10/11/10 8:02 PM, Scott Carey wrote:

would give you a 1MB read-ahead.  Also, consider XFS and its built-in
defragmentation.  I have found that a longer lived postgres DB will get
extreme file fragmentation over time and sequential scans end up mostly
random.  On-line file defrag helps tremendously.

We just had a corrupt table caused by an XFS online defrag.  I'm scared
to use this again while the db is live.  Has anyone else found this to
be safe?  But, I can vouch for the fragmentation issue, it happens very
quickly in our system.

-Dan

I would like to know the details of what was going on that caused your
problem. I have been using XFS for over 9 years, and it has never caused any
trouble at all in a production environment. Sure, I had many problems with it
on the test bench, but in most cases the issues were very clear and easy to
avoid in production. There were some (older) XFS tools that caused some
problems, but that is in the past, and as time goes on, it seems take less and
less planning to make it work properly.
-Neil-

There were roughly 50 transactions/sec going on at the time I ran it.  
xfs_db reported 99% fragmentation before it ran ( we haven't been 
running it via cron ).  The operation completed in about 15 minutes ( 
360GB of used data on the file system ) with no errors.  Everything 
seemed fine until the next morning when a user went to query a table we 
got a message about a "missing" file inside the pg cluster.  We were 
unable to query the table at all via psql.  It was a bit of a panic 
situation so we restored that table from backup immediately and the 
problem was solved without doing more research.


This database has been running for years with no problem ( and none 
since ), that was the first time I tried to do an on-line defrag and 
that was the only unusual variable introduced into the system at that 
time so it was a strong enough correlation for me to believe that caused 
it.  Hopefully this was just a corner case..


-Dan


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Help with bulk read performance

2010-11-08 Thread Dan Schaffer

Hello

We have an application that needs to do bulk reads of ENTIRE Postgres tables very quickly (i.e. select * from table).  We have 
observed that such sequential scans run two orders of magnitude slower than observed raw disk reads (5 MB/s versus 100 MB/s).  Part 
of this is due to the storage overhead we have observed in Postgres.  In the example below, it takes 1 GB to store 350 MB of nominal 
data.  However that suggests we would expect to get 35 MB/s bulk read rates.


Observations using iostat and top during these bulk reads suggest that the queries are CPU bound, not I/O bound.  In fact, repeating 
the queries yields similar response times.  Presumably if it were an I/O issue the response times would be much shorter the second 
time through with the benefit of caching.


We have tried these simple queries using psql, JDBC, pl/java stored procedures, and libpq.  In all cases the client code ran on the 
same box as the server.

We have experimented with Postgres 8.1, 8.3 and 9.0.

We also tried playing around with some of the server tuning parameters such as 
shared_buffers to no avail.

Here is uname -a for a machine we have tested on:

Linux nevs-bdb1.fsl.noaa.gov 2.6.18-194.17.1.el5 #1 SMP Mon Sep 20 07:12:06 EDT 
2010 x86_64 x86_64 x86_64 GNU/Linux

A sample dataset that reproduces these results looks like the following (there 
are no indexes):

Table "bulk_performance.counts"
 Column |  Type   | Modifiers
+-+---
 i1 | integer |
 i2 | integer |
 i3 | integer |
 i4 | integer |

There are 22 million rows in this case.

We HAVE observed that summation queries run considerably faster.  In this case,

select sum(i1), sum(i2), sum(i3), sum(i4) from bulk_performance.counts

runs at 35 MB/s.
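
(As another point of comparison, raw COPY output tends to be a reasonable upper bound on how fast rows can be streamed out without row-by-row result processing; a sketch, noting that the COPY (SELECT ...) form needs 8.2 or later:)

  COPY bulk_performance.counts TO STDOUT;
  -- or, restricted to particular columns/rows:
  COPY (SELECT i1, i2, i3, i4 FROM bulk_performance.counts) TO STDOUT;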


Our business logic does operations on the resulting data such that the output is several orders of magnitude smaller than the input. 
 So we had hoped that by putting our business logic into stored procedures (and thus drastically reducing the amount of data 
flowing to the client) our throughput would go way up.  This did not happen.


So our questions are as follows:

Is there any way using stored procedures (maybe C code that calls SPI directly) or some other approach to get close to the expected 
35 MB/s doing these bulk reads?  Or is this the price we have to pay for using SQL instead of some NoSQL solution.  (We actually 
tried Tokyo Cabinet and found it to perform quite well. However it does not measure up to Postgres in terms of replication, data 
interrogation, community support, acceptance, etc).


Thanks

Dan Schaffer
Paul Hamer
Nick Matheson
-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Help with bulk read performance

2010-12-15 Thread Dan Schaffer

Hi,
My name is Dan and I'm a co-worker of Nick Matheson, who initially submitted this question (because the mail group had me blacklisted 
for a while for some reason).



Thank you for all of the suggestions.  We were able to improve our bulk read performance from 3 MB/s to 60 MB/s (assuming the data 
are NOT in cache in both cases) by doing the following:


1. Storing the data in a "bytea" column instead of an "array" column.
2. Retrieving the data via the Postgres 9 CopyManager#copyOut(String sql, 
OutputStream stream)  method

The key to the dramatic improvement appears to be the reduction in packing and unpacking time on the server and client, 
respectively.  The server packing occurs when the retrieved data are packed into a bytestream for sending across the network. 
Storing the data as a simple byte array reduces this time substantially.  The client-side unpacking time is spent generating a 
ResultSet object. By unpacking the bytestream into the desired arrays of floats by hand instead, this time became close to negligible.


The only downside of storing the data in byte arrays is the loss of transparency.  That is, a simple "select *" of a few rows shows 
bytes instead of floats.  We hope to mitigate this by writing a simple stored procedure that unpacks the bytes into floats.


A couple of other results:

If the data are stored as a byte array but retrieved into a ResultSet, the unpacking time goes up by an order of magnitude and the 
observed total throughput is 25 MB/s.  If the data are stored in a Postgres float array and unpacked into a byte stream, the 
observed throughput is 20 MB/s.


Dan (and Nick)

Andy Colson wrote:

On 12/14/2010 9:41 AM, Jim Nasby wrote:

On Dec 14, 2010, at 9:27 AM, Andy Colson wrote:

Is this the same thing Nick is working on?  How'd he get along?

http://archives.postgresql.org/message-id/4cd1853f.2010...@noaa.gov


So it is. The one I replied to stood out because no one had replied to 
it; I didn't see the earlier email.

--
Jim C. Nasby, Database Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net





Oh.. I didn't even notice the date... I thought it was a new post.

But still... (and I'll cc Nick on this)  I'd love to hear an update on 
how this worked out.


Did you get it to go fast?  What'd you use?  Did the project go over 
budget and did you all get fired?  COME ON MAN!  We need to know! :-)


-Andy

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Which RAID Controllers to pick/avoid?

2011-02-02 Thread Dan Birken
I'm setting up a dedicated linux postgres box with 2x300GB 15k SAS drives in
a RAID 1, though if future load dictates we would like to be able to upgrade
to RAID 10.  The hosting provider offers the following options for a RAID
controller (all are the same price):

 ADAPTEC 3405 RAID Controller
 ADAPTEC 4800 RAID Controller
 LSI MegaRaid 8308 RAID Controller
 ADAPTEC 5405 RAID Controller
 ADAPTEC 5405Z RAID Controller
 ADAPTEC 5805 RAID Controller
 ADAPTEC 5805Z RAID Controller

However, they can't guarantee that any particular RAID controller would be
in stock when they are building the machine, so basically I would like to
know if any of these cards are sufficiently better or worse than the others
that I should either a) wait for a particular card or b) avoid a card.

Also, I am planning on replicating this box to a standby machine with
streaming replication.  Given this, is there any reason not to use the write
cache on the RAID controller assuming it has battery backup?  My thought
being even in a worst-case scenario if the BBU fails and the master DB gets
corrupted, the standby (not using any write cache) would still be ok sans a
few seconds of data (assuming the replication was keeping up, which would be
monitored).

-Dan


Re: [PERFORM] Which RAID Controllers to pick/avoid?

2011-02-02 Thread Dan Birken
Thank you everybody for the detailed answers, the help is well appreciated.

A couple of follow-up questions:
- Is the supercap + flash memory considered superior to the BBU in practice?
 Is that type of system well tested?
- Is the linux support of the LSI and Adaptec cards comparable?

-Dan

On Wed, Feb 2, 2011 at 10:15 PM, Greg Smith  wrote:

>  Scott Marlowe wrote:
>
> On Wed, Feb 2, 2011 at 10:46 PM, Greg Smith  
>  wrote:
>
>
>  example.  Scott Marlowe was griping recently about a similar issue in some
> of the LSI models, too.  I suspect it's a problem impacting several of the
> larger RAID cards that use the big Intel IOP processors for their RAID
> computations, given that's the part with the heatsink on it.
>
>
>  Specifically the LSI  in a case that gave very low amounts of air
> flow over the RAID card.  The case right above the card was quite hot,
> and the multilane cable was warm enough to almost burn my fingers whe
>
>
> Interesting...that shoots down my theory.  Now that I check, the LSI 
> uses their SAS1078 controller, which is based on a PowerPC 440
> processor--it's not one of the Intel IOP processors at all.  The 8308 Dan
> has as an option is using the very popular Intel IOP333 instead, which is
> also used in some Areca 1200 series cards (1220/1230/1260).
>
> The Adaptec 5405 and 5805 cards both use the Intel IOP348, as does the
> Areca 1680.  Areca puts a fan right on it; Adaptec does not.I suspect
> the only reason the 5805 cards have gotten more reports of overheating than
> the 5405 ones is just because having more drives typically connected
> increases their actual workload.  I don't think there's actually any
> difference between the cooling situation between the two otherwise.
>
>
> --
> Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
> PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
> "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books
>
>


Re: [PERFORM] Linux I/O schedulers - CFQ & random seeks

2011-03-04 Thread Dan Harris

On 3/4/11 11:03 AM, Wayne Conrad wrote:

On 03/04/11 10:34, Glyn Astill wrote:
> I'm wondering (and this may be a can of worms) what peoples opinions 
are on these schedulers?


When testing our new DB box just last month, we saw a big improvement 
in bonnie++ random I/O rates when using the noop scheduler instead of 
cfq (or any other).  We've got RAID 10/12 on a 3ware card w/ 
battery-backed cache; 7200rpm drives.  Our file system is XFS with 
noatime,nobarrier,logbufs=8,logbsize=256k.  How much is "big?"  I 
can't find my notes for it, but I recall that the difference was large 
enough to surprise us.  We're running with noop in production right 
now.  No complaints.


Just another anecdote, I found that the deadline scheduler performed the 
best for me.  I don't have the benchmarks anymore but deadline vs cfq 
was dramatically faster for my tests.  I posted this to the list years 
ago and others announced similar experiences.  Noop was a close 2nd to 
deadline.


XFS (noatime,nodiratime,nobarrier,logbufs=8)
391GB db cluster directory
BBU Caching RAID10 12-disk SAS
128GB RAM
Constant insert stream
OLAP-ish query patterns
Heavy random I/O


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] big joins not converging

2011-03-10 Thread Dan Ancona

Hi postgressers -

As part of my work with voter file data, I pretty regularly have to  
join one large-ish (over 500k rows) table to another. Sometimes this  
is via a text field (countyname) + integer (voter id). I've noticed  
sometimes this converges and sometimes it doesn't, seemingly  
regardless of how I index things. So I'm looking for general thoughts  
on the joining of large tables, but also running into a specific issue  
with the following slightly different query:


This one is between two tables that are a 754k row list of voters and  
a 445k row list of property owners. (I'm trying to find records where  
the owner name matches the voter name at the same address.) I have  
btree single column indices built on all the relevant fields, and  
multicolumn indices built across all the columns I'm matching. The  
full schemas of both tables are below. The machine is an older-ish (3  
years ago) dual-core pentium w/ 4GB RAM running FreeBSD, more details  
below.


This is the query I've come up with so far:

explain analyze
update vanalameda set ownerflag = 'exact'
  from aralameda where
  vanalameda.streetno ~~ aralameda.streetnum and
  vanalameda.streetname ~~ aralameda.streetname and
  vanalameda.lastname ~~ aralameda.ownername and
  vanalameda.firstname ~~ aralameda.ownername;

If I include the analyze, this didn't complete after running  
overnight. If I drop the analyze and just explain, I get this:


"Nested Loop  (cost=46690.74..15384448712.74 rows=204 width=204)"
"  Join Filter: (((vanalameda.streetno)::text ~~  
(aralameda.streetnum)::text) AND ((vanalameda.streetname)::text ~~  
(aralameda.streetname)::text) AND ((vanalameda.lastname)::text ~~  
(aralameda.ownername)::text) AND ((vanalameda.firstname)::text ~~  
(aralameda.ownername)::text))"
"  ->  Seq Scan on vanalameda  (cost=0.00..26597.80 rows=734780  
width=204)"

"  ->  Materialize  (cost=46690.74..58735.87 rows=444613 width=113)"
"->  Seq Scan on aralameda  (cost=0.00..38647.13 rows=444613  
width=113)"


One general question: does the width of the tables (i.e. the numbers  
of columns not being joined and the size of those fields) matter? The  
tables do have a lot of extra columns that I could slice out.


Thanks so much!

Dan

System:
client: pgadmin III, Mac OS

server:
select version();
PostgreSQL 8.3.7 on i386-portbld-freebsd7.2, compiled by GCC cc (GCC)  
4.2.1 20070719  [FreeBSD]

(installed from freebsd package system, default configuration)

%sysctl -a | egrep -i 'hw.machine|hw.model|hw.ncpu'
hw.machine: i386
hw.model: Genuine Intel(R) CPU2160  @ 1.80GHz
hw.ncpu: 2
hw.machine_arch: i386

w/ 4GB RAM, 1 1GB disk, no RAID.

Here's the tables...

  Table "public.aralameda"
 Column  | Type  | Modifiers
-+---+---
 dt000o039001010 | character varying(13) |
 o3901010| character varying(15) |
 dt17| character varying(2)  |
 dt046   | character varying(3)  |
 streetnum   | character varying(10) |
 streetname  | character varying(50) |
 unitnum | character varying(10) |
 city| character varying(30) |
 zip | character varying(5)  |
 unk3| character varying(1)  |
 crap1   | character varying(12) |
 crap2   | character varying(12) |
 crap3   | character varying(12) |
 crap4   | character varying(12) |
 crap5   | character varying(12) |
 crap6   | character varying(12) |
 crap7   | character varying(12) |
 crap8   | character varying(12) |
 crap9   | character varying(12) |
 crap10  | character varying(12) |
 dt2009  | character varying(4)  |
 dt066114| character varying(6)  |
 crap11  | character varying(8)  |
 crap12  | character varying(8)  |
 ownername   | character varying(50) |
 careofname  | character varying(50) |
 unk4| character varying(1)  |
 maddr1  | character varying(60) |
 munitnum| character varying(10) |
 mcitystate  | character varying(30) |
 mzip| character varying(5)  |
 mplus4  | character varying(4)  |
 dt40| character varying(2)  |
 dt4 | character varying(1)  |
 crap13  | character varying(8)  |
 d   | character varying(1)  |
 dt0500  | character varying(4)  |
 unk6| character varying(1)  |
 crap14  | character varying(8)  |
 unk7| character varying(1)  |
Indexes:
"arall" btree (streetnum, streetname, ownername)
"aroname" btree (ownername)
"arstreetname" btree (streetname)
"arstreetnum" btree (streetnum)

 Table "public.vanalameda"
Column | Type  | Mo

Re: [PERFORM] big joins not converging

2011-03-10 Thread Dan Ancona

On Mar 10, 2011, at 3:48 PM, fork wrote:
[much thoughtfulness]


Steve Atkins  blighty.com> writes:
[also much thoughtfulness]


Steve and fork -- thank you, this is super helpful. I meant to tweak  
that exact search before sending this around, sorry if that was  
confusing. That was meant to be a place holder for [some set of  
matches that works]. And yes, "not converging" was incorrect, I did  
mean "not finishing." But together from your answers it sounds pretty  
clear that there's no particularly obvious easy solution that I'm  
missing; this really is kind of tricky. This is a choice between  
developing some in-house capacity for this and sending people to  
various vendors so we'll probably lean on the vendors for now, at  
least while we work on it. I've gotten my head partway around PL/PGSQL  
functions, I may give that another try.


And you're right fork, Record Linkage is in fact an entire academic  
discipline! I had no idea, this is fascinating and helpful:


http://en.wikipedia.org/wiki/Record_linkage

Thanks so much!

Dan



--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Reliability recommendations

2006-02-24 Thread Dan Gorman

All,

Were those sequential reads? If so, yeah, you'll get 110MB/s. How big  
was the datafile? 8MB? Yeah, you'll get 110MB/s. 2GB? No, they  
can't sustain that. There are so many details missing from this test  
that it's hard to have any context around it :)


I was getting about 40-50MB/s on a PV with 14 disks on a RAID10 in  
real world usage. (random IO and fully saturating a Dell 1850 with 4  
concurrent threads (to peg the cpu on selects) and raw data files)


Best Regards,
Dan Gorman

On Feb 24, 2006, at 4:29 PM, Mark Kirkwood wrote:


Luke Lonergan wrote:

I'd be more shocked if this weren't also true of nearly all SCSI  
HW RAID
adapters of this era.  If you had ordered an HP DL380 server you'd  
get about

the same performance.
BTW - I don't think there's anything reasonable about 50-55 MB/s  
from 6

disks, I'd put the minimum for this era machine at 5 x 30 = 150MB/s.


He was quoting for 6 disk RAID 10 - I'm thinking 3 x 30MB/s = 90MB/s  
is probably more correct? Having said that, your point is still  
completely correct - the performance @55MB/s is poor (e.g. my *ata*  
system with 2 disk RAID0 does reads @110MB/s).


cheers

Mark

---(end of  
broadcast)---

TIP 4: Have you searched our list archives?

  http://archives.postgresql.org



---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


[PERFORM] Help optimizing a slow index scan

2006-03-16 Thread Dan Harris

explain analyze
select distinct eventmain.incidentid, eventmain.entrydate, 
eventgeo.long, eventgeo.lat, eventgeo.geox, eventgeo.geoy

from eventmain, eventgeo
where
   eventmain.incidentid = eventgeo.incidentid and
   ( long > -104.998027962962 and long < -104.985957781349 ) and
   ( lat > 39.7075542720006 and lat < 39.7186195832938 ) and
   eventmain.entrydate > '2006-1-1 00:00' and
   eventmain.entrydate <= '2006-3-17 00:00'
order by
   eventmain.entrydate;

                                  QUERY PLAN
--------------------------------------------------------------------------------
Unique  (cost=121313.81..121330.72 rows=451 width=178) (actual 
time=723719.761..723726.875 rows=1408 loops=1)
  ->  Sort  (cost=121313.81..121314.94 rows=451 width=178) (actual 
time=723719.755..723721.807 rows=1408 loops=1)
Sort Key: eventmain.entrydate, eventmain.disposition, 
eventmain.incidentid, eventgeo.reportingarea, eventgeo.beatid, 
eventmain.finaltype, eventmain.casenumber, eventgeo.eventlocation, 
eventmain.insertdate, eventmain.priority, eventgeo.long, eventgeo.lat, 
eventgeo.geox, eventgeo.geoy
->  Nested Loop  (cost=0.00..121293.93 rows=451 width=178) 
(actual time=1916.230..723712.900 rows=1408 loops=1)
  ->  Index Scan using eventgeo_lat_idx on eventgeo  
(cost=0.00..85488.05 rows=10149 width=76) (actual time=0.402..393376.129 
rows=22937 loops=1)
Index Cond: ((lat > 39.7075542720006::double 
precision) AND (lat < 39.7186195832938::double precision))
Filter: ((long > -104.998027962962::double 
precision) AND (long < -104.985957781349::double precision))
  ->  Index Scan using eventmain_incidentid_idx on 
eventmain  (cost=0.00..3.52 rows=1 width=119) (actual 
time=14.384..14.392 rows=0 loops=22937)
Index Cond: ((eventmain.incidentid)::text = 
("outer".incidentid)::text)
Filter: ((entrydate > '2006-01-01 
00:00:00'::timestamp without time zone) AND (entrydate <= '2006-03-17 
00:00:00'::timestamp without time zone))


Total runtime:  >>> 723729.238 ms(!) <<<



I'm trying to figure out why it's consuming so much time on the index 
scan for eventgeo_lat_idx.  Also, I have an index on "long" that the 
planner does not appear to find helpful.


There are 3.3 million records in eventmain and eventgeo.  The server has 
a reasonably fast RAID10 setup with 16x 15k RPM drives and 12GB of RAM ( 
11GB listed as "cache" by vmstat ).  Running version 8.0.2 on linux 
kernel 2.6.12.


I have just vacuum analyze'd both tables, rebuilt the eventgeo_lat_idx 
index and reran the query multiple times to see if caching helped ( it 
didn't help much ).   The server seems to be fine utilizing other fields 
from this table but using "long" and "lat" seem to drag it down 
significantly.


 Is it because there's such slight differences between the records, 
since they are all within a few hundredths of a degree from each other?


Thanks for your time and ideas.

-Dan

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] Help optimizing a slow index scan

2006-03-16 Thread Dan Harris

Dan Harris wrote:
explain analyze 
 doh.. sorry to reply to my own post.  But I messed up copying some 
of the fields into the select statement that you'll see in the "Sort 
Key" section of the analyze results.  The mistake was mine.  Everything 
else is "normal" between the query and the plan.


-Dan

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Help optimizing a slow index scan

2006-03-17 Thread Dan Harris

Markus Bertheau wrote:

Have you tried using a GIST index on lat & long? These things are
meant for two-dimensional data, whereas btree doesn't handle
two-dimensional data that well. How many rows satisfy either of the
long / lat condition?

  

According to the analyze, less than 500 rows matched.  I'll look into 
GIST indexes, thanks for the feedback.


-Dan

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Help optimizing a slow index scan

2006-03-17 Thread Dan Harris

Dan Harris wrote:

Markus Bertheau wrote:

Have you tried using a GIST index on lat & long? These things are
meant for two-dimensional data, whereas btree doesn't handle
two-dimensional data that well. How many rows satisfy either of the
long / lat condition?

 

According to the analyze, less than 500 rows matched.  I'll look into 
GIST indexes, thanks for the feedback.


-Dan

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


When I try to create a GIST index, I get the following error:

create index eventgeo_lat_idx on eventgeo using GIST (lat);

ERROR:  data type double precision has no default operator class for 
access method "gist"
HINT:  You must specify an operator class for the index or define a 
default operator class for the data type.


I'm not sure what a "default operator class" is, exactly..
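
For reference, a rough sketch of the GiST route, assuming a much newer PostgreSQL where the point type has a default GiST operator class (it does not in 8.0, which is why the index creation above fails; older releases need contrib modules for this):

  CREATE INDEX eventgeo_coords_gist ON eventgeo USING gist (point(long, lat));

  SELECT incidentid FROM eventgeo
  WHERE point(long, lat) <@ box(point(-104.998027962962, 39.7075542720006),
                                point(-104.985957781349, 39.7186195832938));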

-Dan

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Help optimizing a slow index scan

2006-03-17 Thread Dan Harris

Merlin Moncure wrote:


As others will probably mention, effective queries on lot/long which
is a spatial problem will require r-tree or gist.  I don't have a lot
of experience with exotic indexes but this may be the way to go.

One easy optimization to consider making is to make an index on either
(incidentid, entrydate) or (incident_id,long) which ever is more
selective.

This is 'yet another query' that would be fun to try out and tweak
using the 8.2 upcoming row-wise comparison.

merlin
  
Thanks to everyone for your suggestions.  One problem I ran into is that 
apparently my version doesn't support the GIST index that was 
mentioned ("function 'box' doesn't exist").  So I'm guessing that both 
this as well as the Earth Distance contrib require me to add on some 
more pieces that aren't there.


Furthermore, by doing so, I am tying my queries directly to 
"postgres-isms".  One of the long term goals of this project is to be 
able to fairly transparently support any ANSI SQL-compliant back end 
with the same code base.  If I had full control over the query designs, 
I could make stored procedures to abstract this.  However, I have to 
deal with a "gray box" third-party reporting library that isn't so 
flexible.  I'll certainly consider going with something 
postgre-specific, but only as a last resort.


I tried the multi-column index as mentioned above but didn't see any 
noticeable improvement in elapsed time, although the planner did use the 
new index.


What is the real reason for the index not being very effective on these 
columns?  Although the numbers are in a very limited range, it seems 
that the records would be very selective as it's not terribly common for 
multiple rows to share the same coords.


Is the "8.2. upcoming row-wise comparison" something that would be 
likely to help me?


Thanks again for your input

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


[PERFORM] vacuum full seems to hang on very small table

2006-04-04 Thread Dan Harris
I have a table with 1 live row that I found has 115000 dead rows in it ( 
from a testing run ).  I'm trying to VACUUM FULL the table and it has 
run for over 18 hours without completion.  Considering the hardware on 
this box and the fact that performance seems reasonable in all other 
aspects, I'm confused as to why this would happen.  The database other 
than this table is quite large ( 70 gigs on disk ) and I would expect to 
take days to complete but I just did 'vacuum full table_stats'.  That 
should only do that table, correct?  I'm running 8.0.3.


 Table "public.table_stats"
  Column|Type | Modifiers
-+-+---
count_cfs   | integer |
count_ncfs  | integer |
count_unitactivity  | integer |
count_eventactivity | integer |
min_eventmain   | timestamp without time zone |
max_eventmain   | timestamp without time zone |
min_eventactivity   | timestamp without time zone |
max_eventactivity   | timestamp without time zone |
geocoding_hitrate   | double precision|
recent_load | timestamp without time zone |
count_eventmain | integer |


This is the table structure.

Any ideas where to begin troubleshooting this?
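
One place to start: check whether the VACUUM FULL is actually doing work or just sitting waiting for the exclusive lock it needs on the table.  A sketch against the 8.0-era catalogs:

  SELECT l.pid, c.relname, l.mode, l.granted
  FROM pg_locks l
  JOIN pg_class c ON c.oid = l.relation
  WHERE c.relname = 'table_stats';

If the lock shows granted = false, something else is holding a conflicting lock; if it is granted, the time is going into the vacuum itself.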

Thanks.


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


[PERFORM] Encouraging multi-table join order

2006-04-10 Thread Dan Harris
I have a query that is intended to select from multiple "small tables" 
to get a limited subset of "incidentid" and then join with a "very 
large" table.  One of the operations will require a sequential scan, but 
the planner is doing the scan on the very large table before joining the 
small ones, resulting in a huge amount of disk I/O.  How would I make 
this query join the large table only after narrowing down the possible 
selections from the smaller tables?  This is running on version 8.0.3.


Thanks for any ideas.

-Dan


QUERY

explain analyze
select distinct
    eventmain.incidentid,
    eventmain.entrydate,
    eventgeo.eventlocation,
    recordtext as retdata
from
    eventactivity
    join (
        select incidentid
        from k_h
        where
            id = 33396 and
            k_h.entrydate >= '2006-1-1 00:00' and
            k_h.entrydate < '2006-4-8 00:00'
    ) id_keywords using ( incidentid ),
    eventmain,
    eventgeo
where
    eventmain.incidentid = eventactivity.incidentid and
    eventmain.incidentid = eventgeo.incidentid and
    ( ' ' || recordtext || ' ' like '%HAL%' ) and
    eventactivity.entrydate >= '2006-1-1 00:00' and
    eventactivity.entrydate < '2006-4-8 00:00'
order by
    eventmain.entrydate limit 1;



EXPLAIN ANALYZE OUTPUT

Limit  (cost=2521191.65..2521191.90 rows=6 width=187) (actual 
time=1360935.787..1361072.277 rows=1400 loops=1)
  ->  Unique  (cost=2521191.65..2521191.90 rows=6 width=187) (actual 
time=1360935.779..1361067.853 rows=1400 loops=1)
->  Sort  (cost=2521191.65..2521191.66 rows=6 width=187) 
(actual time=1360935.765..1360958.258 rows=16211 loops=1)
  Sort Key: eventmain.entrydate, eventmain.incidentid, 
eventactivity.recordtext, eventgeo.eventlocation
  ->  Nested Loop  (cost=219.39..2521191.57 rows=6 
width=187) (actual time=1123.115..1360579.798 rows=16211 loops=1)
->  Nested Loop  (cost=219.39..2521173.23 rows=6 
width=154) (actual time=1105.773..1325907.716 rows=16211 loops=1)
  ->  Hash Join  (cost=219.39..2521153.37 
rows=6 width=66) (actual time=1069.476..1289608.261 rows=16211 loops=1)
Hash Cond: (("outer".incidentid)::text 
= ("inner".incidentid)::text)
->  Seq Scan on eventactivity  
(cost=0.00..2518092.06 rows=1532 width=52) (actual 
time=57.205..1288514.530 rows=2621 loops=1)
  Filter: ((((' '::text || (recordtext)::text) || ' '::text) ~~ '%HAL%'::text) AND 
(entrydate >= '2006-01-01 00:00:00'::timestamp without time zone) AND 
(entrydate < '2006-04-08 00:00:00'::timestamp without time zone))
->  Hash  (cost=217.53..217.53 rows=741 
width=14) (actual time=899.128..899.128 rows=0 loops=1)
  ->  Index Scan using k_h_id_idx 
on k_h  (cost=0.00..217.53 rows=741 width=14) (actual 
time=55.097..893.883 rows=1162 loops=1)

Index Cond: (id = 33396)
Filter: ((entrydate >= 
'2006-01-01 00:00:00'::timestamp without time zone) AND (entrydate < 
'2006-04-08 00:00:00'::timestamp without time zone))
  ->  Index Scan using eventmain_incidentid_idx 
on eventmain  (cost=0.00..3.30 rows=1 width=88) (actual 
time=1.866..2.227 rows=1 loops=16211)
Index Cond: 
((eventmain.incidentid)::text = ("outer".incidentid)::text)
->  Index Scan using eventgeo_incidentid_idx on 
eventgeo  (cost=0.00..3.04 rows=1 width=75) (actual time=1.770..2.126 
rows=1 loops=16211)
  Index Cond: ((eventgeo.incidentid)::text = 
("outer".incidentid)::text)

Total runtime: 1361080.787 ms
(19 rows)


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] Encouraging multi-table join order

2006-04-10 Thread Dan Harris

Tom Lane wrote:

That's very strange --- the estimated cost of the seqscan is high enough
that the planner should have chosen a nestloop with inner indexscan on
the big table.  I'm not sure about the join-order point, but the hash
plan for the first join seems wrong in any case.

Um, you do have an index on eventactivity.incidentid, right?  What's the
datatype(s) of the incidentid columns?  What happens to the plan if you
turn off enable_hashjoin and enable_mergejoin?

regards, tom lane
  
Yes, eventactivity.incidentid is indexed.  The datatype is varchar(40).  
Although, by checking this, I noticed that k_h.incidentid was 
varchar(100).  Perhaps the difference in length between the keys caused 
the planner to not use the fastest method?  I have no defense as to why 
those aren't the same.. I will make them so and check.
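
(For the record, lining the two up is a single statement on 8.0 and later, though on these versions it takes an exclusive lock and rewrites the table, and it will fail if any existing value is longer than 40 characters:)

  ALTER TABLE k_h ALTER COLUMN incidentid TYPE varchar(40);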


Here's the EXPLAIN analyze with enable_hashjoin = off and 
enable_mergejoin = off :


Limit  (cost=4226535.73..4226544.46 rows=698 width=82) (actual time=74339.016..74356.521 rows=888 loops=1)
  ->  Unique  (cost=4226535.73..4226544.46 rows=698 width=82) (actual time=74339.011..74354.073 rows=888 loops=1)
        ->  Sort  (cost=4226535.73..4226537.48 rows=698 width=82) (actual time=74339.003..74344.031 rows=3599 loops=1)
              Sort Key: eventmain.entrydate, eventmain.incidentid, eventgeo.eventlocation, eventactivity.recordtext
              ->  Nested Loop  (cost=0.00..4226502.76 rows=698 width=82) (actual time=921.325..74314.959 rows=3599 loops=1)
                    ->  Nested Loop  (cost=0.00..4935.61 rows=731 width=72) (actual time=166.354..14638.308 rows=1162 loops=1)
                          ->  Nested Loop  (cost=0.00..2482.47 rows=741 width=50) (actual time=150.396..7348.013 rows=1162 loops=1)
                                ->  Index Scan using k_h_id_idx on k_h  (cost=0.00..217.55 rows=741 width=14) (actual time=129.540..1022.243 rows=1162 loops=1)
                                      Index Cond: (id = 33396)
                                      Filter: ((entrydate >= '2006-01-01 00:00:00'::timestamp without time zone) AND (entrydate < '2006-04-08 00:00:00'::timestamp without time zone))
                                ->  Index Scan using eventgeo_incidentid_idx on eventgeo  (cost=0.00..3.04 rows=1 width=36) (actual time=5.260..5.429 rows=1 loops=1162)
                                      Index Cond: ((eventgeo.incidentid)::text = ("outer".incidentid)::text)
                          ->  Index Scan using eventmain_incidentid_idx on eventmain  (cost=0.00..3.30 rows=1 width=22) (actual time=5.976..6.259 rows=1 loops=1162)
                                Index Cond: ((eventmain.incidentid)::text = ("outer".incidentid)::text)
                    ->  Index Scan using eventactivity1 on eventactivity  (cost=0.00..5774.81 rows=20 width=52) (actual time=29.768..51.334 rows=3 loops=1162)
                          Index Cond: (("outer".incidentid)::text = (eventactivity.incidentid)::text)
                          Filter: ((((' '::text || (recordtext)::text) || ' '::text) ~~ '%HAL%'::text) AND (entrydate >= '2006-01-01 00:00:00'::timestamp without time zone) AND (entrydate < '2006-04-08 00:00:00'::timestamp without time zone))
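
[Editor's sketch, not from the original mail: the planner toggles above
are session-local settings, typically flipped around the test query
roughly like this and then restored.]

SET enable_hashjoin = off;
SET enable_mergejoin = off;
-- run EXPLAIN ANALYZE on the query under test here
RESET enable_hashjoin;
RESET enable_mergejoin;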






Re: [PERFORM] Encouraging multi-table join order

2006-04-11 Thread Dan Harris

Tom Lane wrote:


So it's estimating 5775 cost units per probe into eventactivity, which
is pretty high --- it must think that a lot of rows will be retrieved by
the index (way more than the 20 or so it thinks will get past the filter
condition). 



 What does the pg_stats entry for eventactivity.incidentid
contain?
select * from pg_stats where tablename = 'eventactivity' and 
attname='incidentid';
schemaname        | public
tablename         | eventactivity
attname           | incidentid
null_frac         | 0
avg_width         | 14
n_distinct        | 8157
most_common_vals  | {P043190299,P051560740,P052581036,P052830218,P053100679,P053190889,P060370845,P042070391,P042690319,P043290117}
most_common_freqs | {0.0017,0.0017,0.0017,0.0017,0.0017,0.0017,0.0017,0.0013,0.0013,0.0013}
histogram_bounds  | {P022140319,P030471058,P033090308,P041961082,P042910689,P050311006,P051350254,P052261148,P053270945,P060240316,P061000287}
correlation       | 0.241737



  It might be worth increasing the statistics target for that
column to try to get a better estimate.
  
How high should I set this?  I read the default is 10, but I'm not sure
whether doubling that would make a difference or whether I should use a
much larger number.  There are approximately 45 million rows in the
table, if that matters.
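
[Editor's sketch, not from the original mail: the per-column knob being
discussed, with an illustrative target value; it only takes effect at
the next ANALYZE.]

ALTER TABLE eventactivity ALTER COLUMN incidentid SET STATISTICS 100;
ANALYZE eventactivity;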



Thanks again,
Dan



Re: [PERFORM] Encouraging multi-table join order

2006-04-11 Thread Dan Harris

Tom Lane wrote:

What the stats entry is saying is that the most common entries occur
about 75000 times apiece (0.0017 * 45e6), which is what's scaring
the planner here ;-).  I think those frequencies are artificially high
though.  The default statistics sample size is 3000 rows (300 *
statistics target, actually), so those numbers correspond to 5 or 4
rows in the sample, which is probably just random chance.

Try increasing the stats targets for this table to 100, then re-ANALYZE
and see what you get.  The most_common_freqs entries might drop as much
as a factor of 10.

regards, tom lane
  


Tom:

I believe this was the problem.  I upped the statistics to 100, for a 
sample size of 30k and now the planner does the correct nested 
loop/index scan and takes only 30 seconds!  This is a HUGE performance 
increase.


I wonder why the estimates were so far off the first time?  This table 
has been ANALYZED regularly ever since creation.


Once again, thank you and all of the developers for your hard work on 
PostgreSQL.  This is by far the most pleasant management experience of 
any database I've worked on.


-Dan




Re: [PERFORM] Running on an NFS Mounted Directory

2006-04-26 Thread Dan Gorman

We have gotten very good performance from NetApp and Postgres 7.4.11.

I was able to push about 100MB/s over GigE, but that was limited by
our NetApp.


DAS will generally be faster, but if, for example, you have 2 local
disks vs. 100 NFS-mounted disks, NFS will be faster.


NFS is very reliable, and I would stay away from iSCSI.



Regards,
Dan Gorman

On Apr 26, 2006, at 7:35 PM, Steve Wampler wrote:


On Wed, Apr 26, 2006 at 10:06:58PM -0400, Ketema Harris wrote:
I was wondering if there were any performance issues with having a data
directory that is an NFS-mounted drive, say a SAN or NAS device?  Has
anyone done this before?


My understanding is that NFS is pretty poor in performance in general,
so I would expect it to be particularly bad for a DB.  You might run
some (non-DB) performance tests to get a feel for how bad it might be.
(Someone once told me that NFS topped out at around 12MB/s, but I don't
know if that's really true [they were trying to sell a competitive
networked filesystem].)

In any event, you're at least limited by ethernet speeds, if not more.

--
Steve Wampler -- [EMAIL PROTECTED]
The gods that smiled on your birth are now laughing out loud.







Re: [PERFORM] Running on an NFS Mounted Directory

2006-04-27 Thread Dan Gorman

So do NASes.

Dan

On Apr 27, 2006, at 6:42 AM, Ketema Harris wrote:


The SAN has the snapshot capability.


On 4/27/06 9:31 AM, "Bruno Wolff III" <[EMAIL PROTECTED]> wrote:


On Thu, Apr 27, 2006 at 09:06:48 -0400,
  Ketema Harris <[EMAIL PROTECTED]> wrote:
Yes, you're right, I meant not having to do the backups from the db
server itself.  I can do that within the storage device now, by
allocating space for it and letting the device copy the data files on
some periodic basis.


Only if the database server isn't running, or if your SAN provides a way
to take a snapshot of the data at a particular instant in time.









