Re: [PERFORM] size of pg_dump files containing bytea values

2006-07-14 Thread Florian Weimer
* Greg Stark:

> Didn't byteas used to get printed as hex?

No, they didn't.  It would be useful to support hexadecimal BYTEA
literals, though.  Unfortunately, X'DEADBEEF' has already been taken
by bit strings.

-- 
Florian Weimer<[EMAIL PROTECTED]>
BFK edv-consulting GmbH   http://www.bfk.de/
Durlacher Allee 47tel: +49-721-96201-1
D-76131 Karlsruhe fax: +49-721-96201-99

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] Kill a session

2006-07-14 Thread Markus Schaber
Hi, Tom,

Tom Lane wrote:
> Our expectation is that all or at least most queries should respond to
> SIGINT or SIGTERM interrupts pretty rapidly, say on a less-than-a-second
> timescale.  However there are various loops in the backend that fail to
> execute CHECK_FOR_INTERRUPTS sufficiently often :-(. 

The same is true for user-defined C funtions.

The PostGIS GEOS geometry functions come to mind, for complex
geometries, they can need hours to complete. And as GEOS is a 3rd-Party
library, I don't see an easy way to make them CHECK_FOR_INTERRUPTS.

Does anybody know how this is for plpgsql, pljava and plpython?


HTH,
Markus
-- 
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf. | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] Kill a session

2006-07-14 Thread Craig A. James

Tom Lane wrote:

"Craig A. James" <[EMAIL PROTECTED]> writes:

Bottom line is that I was expecting "instant death" with SIGTERM, but
instead got an agonizing, drawn out -- but safe -- death of the query.


What was the query exactly?

Our expectation is that all or at least most queries should respond to
SIGINT or SIGTERM interrupts pretty rapidly, say on a less-than-a-second
timescale.  However there are various loops in the backend that fail to
execute CHECK_FOR_INTERRUPTS sufficiently often :-(.  We've been
gradually finding and fixing these, and will be glad to fix your case
if you provide enough details to pin it down.  You might be interested
in this current thread about a similar problem:

http://archives.postgresql.org/pgsql-patches/2006-07/msg00039.php


Thanks, this is good information.  The qsort is a distinct possibility.  The 
query is a big

  insert into some_hitlist (select id from another_hitlist join data_table on 
(...))

where the hitlists are unindexed.  So it may be using a merge-join with qsort.  
When I have a few minutes, I'll turn on logging in the app and find the exact 
SQL, and run an EXPLAIN ANALYZE and see what's really happening.

It's also possible that the INSERT itself is the problem, or adds to the 
problem.  The SIGINT may come after a few million rows have been inserted, so 
it would have to clean that up, right?

Thanks,
Craig


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Kill a session

2006-07-14 Thread Tom Lane
"Craig A. James" <[EMAIL PROTECTED]> writes:
> It's also possible that the INSERT itself is the problem, or adds to the 
> problem.  The SIGINT may come after a few million rows have been inserted, so 
> it would have to clean that up, right?

No, because we don't use UNDO.  The next VACUUM would have a bit of a
mess to clean up, but you wouldn't pay for it at the time of the abort.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[PERFORM] Self-join query and index usage

2006-07-14 Thread worky . workerson
I'm doing a self join of some shipping data and wanted to get the best
query possible.  The interesting table is the event table, and it has
the following structure:

  startnode int,
  endnode int,
  weight int,
  starttime timestamp,
  endtime timestamp

and the query that I would like to run is:

SELECT e1.endnode, count(*), sum(e1.weight) AS weight1, sum(e2.weight)
AS weight2
FROM event e1, event e2
WHERE e1.endnode = e2.startnode AND e1.starttime < e2.starttime AND
e2.starttime < e1.endtime
GROUP BY e1.endnode

Assuming that I have indexes on all the columns, should this query be
able to make use of the indexes on starttime and endtime?

The "best" plan that I could see is a merge join between a sorted
sequential scan on e2.startnode and an index scan on e1.endnode, which
I figure takes care of the "e1.endnode = e2.startnode".  The join
filter is then "e1.starttime < e2.starttime AND e2.starttime <
e1.endtime" ... does this use an index?  Can the planner to use a
bitmap index scan to use the indexes on the start/endtimes in the join?

Table is about 3GB.


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Self-join query and index usage

2006-07-14 Thread Tom Lane
[EMAIL PROTECTED] writes:
> and the query that I would like to run is:

> SELECT e1.endnode, count(*), sum(e1.weight) AS weight1, sum(e2.weight)
> AS weight2
> FROM event e1, event e2
> WHERE e1.endnode = e2.startnode AND e1.starttime < e2.starttime AND
> e2.starttime < e1.endtime
> GROUP BY e1.endnode

> Assuming that I have indexes on all the columns, should this query be
> able to make use of the indexes on starttime and endtime?

This is just really poorly suited for btree indexes.  What you're
looking for is an interval overlap test, which is something that can be
handled by rtree or gist indexes, but you'll need to change the form of
the query ...

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend