Re: [PERFORM] How to improve db performance with $7K?

2005-03-29 Thread PFC

With hardware tuning, I am sure we can do better than 35 MB per sec. Also
WTF ?
My Laptop does 19 MB/s (reading <10 KB files, reiser4) !
A recent desktop 7200rpm IDE drive
# hdparm -t /dev/hdc1
/dev/hdc1:
 Timing buffered disk reads:  148 MB in  3.02 seconds =  49.01 MB/sec
# ll "DragonBall 001.avi"
-r--r--r--  1 peufeu  users  218M mar  9 20:07 DragonBall 001.avi
# time cat "DragonBall 001.avi" >/dev/null
real    0m4.162s
user    0m0.020s
sys     0m0.510s
(the file was not in the cache)
=> about 52 MB/s (reiser3.6)
So, you have a problem with your hardware...


Re: [PERFORM] Delete query takes exorbitant amount of time

2005-03-29 Thread Simon Riggs
On Tue, 2005-03-29 at 01:48 -0700, Karim A Nassar wrote:
> > Each value has 1/13th of the table, which is too many rows per value to
> > make an IndexScan an efficient way of deleting rows from the table.
> 
> But, the original question was that the delete that was taking a long time
> was on a different table. I tried to delete 150 rows from a table with 750
> rows, which is FK referenced from this large table. If I understand
> correctly, Tom suggested that the length of time was due to a sequential
> scan being done on the large table for each value being deleted from the
> small one.

> For this FK check, there only need be one referring id to invalidate the
> delete. ISTM that for any delete with a FK reference, the index could
> always be used to search for a single value in the referring table
> (excepting very small tables). Why then must a sequential scan be
> performed in this case, and/or in general? 

My understanding was that you were doing a DELETE on the smaller table
and that this was doing a DELETE on the measurement table because you
had the FK defined as ON DELETE CASCADE. You are right - only a single
row is sufficient to RESTRICT the DELETE. But for an ON UPDATE/DELETE
action of CASCADE then you will want to touch all rows referenced, so a
SeqScan is a perfectly valid consequence of such actions.
I think now that you are using the default action, rather than
specifically requesting CASCADE?

Stephan, Tom: 
The SQL generated for RI checking by the RI triggers currently applies a
limit at execution time, not at prepare time. i.e. there is no LIMIT
clause in the SQL. 

We know whether the check will be limit 1 or limit 0 at prepare time, so
why not add a LIMIT clause to the SQL so it changes the plan, not just
the number of rows returned when the check query executes?
(I note that PREPARE does allow you to supply a LIMIT 1 clause).
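
For illustration (fktable/fkcol are invented names; the real text is built
by ri_triggers.c), the check as currently issued looks roughly like

  SELECT 1 FROM ONLY fktable x WHERE x.fkcol = $1 FOR UPDATE OF x;

and the suggestion is for the trigger to emit

  SELECT 1 FROM ONLY fktable x WHERE x.fkcol = $1 LIMIT 1 FOR UPDATE OF x;

so that the planner knows at prepare time that a single row is enough.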

That *ought* to have some effect on the plan used by the RI check
queries. In costsize.c:cost_index we would have tuples_fetched==1 and it
would be hard (but not impossible) for the index cost to ever be more
than the cost of a SeqScan. 

...but, I see no way for OidFunctionCall8 to ever return an answer of
"always just 1 row, no matter how big the relation"...so tuples_fetched
is always proportional to the size of the relation. Are unique indexes
treated just as very-low-selectivity indexes? - they're a very similar
situation in terms of forcing an absolute, not relative, number of rows
returned.

Best Regards, Simon Riggs





Re: [PERFORM] How to improve db performance with $7K?

2005-03-29 Thread Dave Cramer
Yeah, 35 MB per sec is slow for a RAID controller; the 3ware mirrored is
about 50 MB/sec, and striped is about 100.
Dave
PFC wrote:

With hardware tuning, I am sure we can do better than 35 MB per sec. Also

WTF ?
My Laptop does 19 MB/s (reading <10 KB files, reiser4) !
A recent desktop 7200rpm IDE drive
# hdparm -t /dev/hdc1
/dev/hdc1:
 Timing buffered disk reads:  148 MB in  3.02 seconds =  49.01 MB/sec
# ll "DragonBall 001.avi"
-r--r--r--  1 peufeu  users  218M mar  9 20:07 DragonBall 001.avi

# time cat "DragonBall 001.avi" >/dev/null
real    0m4.162s
user    0m0.020s
sys     0m0.510s
(the file was not in the cache)
=> about 52 MB/s (reiser3.6)
So, you have a problem with your hardware...


--
Dave Cramer
http://www.postgresintl.com
519 939 0336
ICQ#14675561


Re: [PERFORM] Delete query takes exorbitant amount of time

2005-03-29 Thread Stephan Szabo
On Tue, 29 Mar 2005, Simon Riggs wrote:

> On Tue, 2005-03-29 at 01:48 -0700, Karim A Nassar wrote:
> > > Each value has 1/13th of the table, which is too many rows per value to
> > > make an IndexScan an efficient way of deleting rows from the table.
> >
> > But, the original question was that the delete that was taking a long time
> > was on a different table. I tried to delete 150 rows from a table with 750
> > rows, which is FK referenced from this large table. If I understand
> > correctly, Tom suggested that the length of time was due to a sequential
> > scan being done on the large table for each value being deleted from the
> > small one.
>
> > For this FK check, there only need be one referring id to invalidate the
> > delete. ISTM that for any delete with a FK reference, the index could
> > always be used to search for a single value in the referring table
> > (excepting very small tables). Why then must a sequential scan be
> > performed in this case, and/or in general?
>
> My understanding was that you were doing a DELETE on the smaller table
> and that this was doing a DELETE on the measurement table because you
> had the FK defined as ON DELETE CASCADE. You are right - only a single
> row is sufficient to RESTRICT the DELETE. But for an ON UPDATE/DELETE
> action of CASCADE then you will want to touch all rows referenced, so a
> SeqScan is a perfectly valid consequence of such actions.
> I think now that you are using the default action, rather than
> specifically requesting CASCADE?
>
> Stephan, Tom:
> The SQL generated for RI checking by the RI triggers currently applies a
> limit at execution time, not at prepare time. i.e. there is no LIMIT
> clause in the SQL.
>
> We know whether the check will be limit 1 or limit 0 at prepare time, so
> why not add a LIMIT clause to the SQL so it changes the plan, not just
> the number of rows returned when the check query executes?

Because IIRC, FOR UPDATE and LIMIT at least historically didn't play
nicely together, so you could sometimes get a result where if the first
row was locked, the FOR UPDATE would wait on it, but if it was deleted by
the other transaction you could get 0 rows back in the trigger.
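
As a sketch of that failure mode (invented names; two concurrent sessions):

  -- Session A:
  BEGIN;
  DELETE FROM fktable WHERE fkcol = 42;  -- locks one matching row

  -- Session B: the RI check with a LIMIT added
  SELECT 1 FROM fktable WHERE fkcol = 42 LIMIT 1 FOR UPDATE;
  -- B fetches a single candidate row (the one A is deleting), blocks on
  -- A's lock, and once A commits that row is gone, so B can report zero
  -- rows even though other rows with fkcol = 42 still exist.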




Re: [PERFORM] Delete query takes exorbitant amount of time

2005-03-29 Thread Stephan Szabo
On Tue, 29 Mar 2005, Stephan Szabo wrote:

> On Tue, 29 Mar 2005, Simon Riggs wrote:
>
> > On Tue, 2005-03-29 at 01:48 -0700, Karim A Nassar wrote:
> > > > Each value has 1/13th of the table, which is too many rows per value to
> > > > make an IndexScan an efficient way of deleting rows from the table.
> > >
> > > But, the original question was that the delete that was taking a long time
> > > was on a different table. I tried to delete 150 rows from a table with 750
> > > rows, which is FK referenced from this large table. If I understand
> > > correctly, Tom suggested that the length of time was due to a sequential
> > > scan being done on the large table for each value being deleted from the
> > > small one.
> >
> > > For this FK check, there only need be one referring id to invalidate the
> > > delete. ISTM that for any delete with a FK reference, the index could
> > > always be used to search for a single value in the referring table
> > > (excepting very small tables). Why then must a sequential scan be
> > > performed in this case, and/or in general?
> >
> > My understanding was that you were doing a DELETE on the smaller table
> > and that this was doing a DELETE on the measurement table because you
> > had the FK defined as ON DELETE CASCADE. You are right - only a single
> > row is sufficient to RESTRICT the DELETE. But for an ON UPDATE/DELETE
> > action of CASCADE then you will want to touch all rows referenced, so a
> > SeqScan is a perfectly valid consequence of such actions.
> > I think now that you are using the default action, rather than
> > specifically requesting CASCADE?
> >
> > Stephan, Tom:
> > The SQL generated for RI checking by the RI triggers currently applies a
> > limit at execution time, not at prepare time. i.e. there is no LIMIT
> > clause in the SQL.
> >
> > We know whether the check will be limit 1 or limit 0 at prepare time, so
> > why not add a LIMIT clause to the SQL so it changes the plan, not just
> > the number of rows returned when the check query executes?
>
> Because IIRC, FOR UPDATE and LIMIT at least historically didn't play
> nicely together, so you could sometimes get a result where if the first
> row was locked, the FOR UPDATE would wait on it, but if it was deleted by
> the other transaction you could get 0 rows back in the trigger.

If there were some way to pass a "limit" into SPI_prepare that was treated
similarly to a LIMIT clause for planning purposes but didn't actually
change the output plan to only return that number of rows, we could use
that.



Re: [PERFORM] Delete query takes exorbitant amount of time

2005-03-29 Thread Tom Lane
Stephan Szabo <[EMAIL PROTECTED]> writes:
> On Tue, 29 Mar 2005, Simon Riggs wrote:
>> The SQL generated for RI checking by the RI triggers currently applies a
>> limit at execution time, not at prepare time. i.e. there is no LIMIT
>> clause in the SQL.
>> 
>> We know whether the check will be limit 1 or limit 0 at prepare time, so
>> why not add a LIMIT clause to the SQL so it changes the plan, not just
>> the number of rows returned when the check query executes?

> Because IIRC, FOR UPDATE and LIMIT at least historically didn't play
> nicely together, so you could sometimes get a result where if the first
> row was locked, the FOR UPDATE would wait on it, but if it was deleted by
> the other transaction you could get 0 rows back in the trigger.

Yeah, this is still true.  It would probably be a good idea to change it
but I haven't looked into exactly what would be involved.  The basic
problem is that the FOR UPDATE filter needs to execute before LIMIT
instead of after, so presumably the FOR UPDATE shenanigans in execMain.c
would need to be pushed into a separate plan node that could go
underneath the LIMIT node.

Originally this would have led to even more broken behavior --- locks
taken on rows that weren't returned --- because the original coding of
the LIMIT node tended to pull one more row from the lower plan than it
would actually return.  But we fixed that.

I think having such a node might allow us to support FOR UPDATE in
subqueries, as well, but I haven't looked at the details.  (Whether that
is a good idea is another question --- the problem of pulling rows that
aren't nominally necessary, and thereby locking them, would apply in
spades.)

regards, tom lane



Re: [PERFORM] Delete query takes exorbitant amount of time

2005-03-29 Thread Tom Lane
Simon Riggs <[EMAIL PROTECTED]> writes:
> ...but, I see no way for OidFunctionCall8 to ever return an answer of
> "always just 1 row, no matter how big the relation"...so tuples_fetched
> is always proportional to the size of the relation. Are unique indexes
> treated just as very-low-selectivity indexes?

Yeah.  It is not the job of amcostestimate to estimate the number of
rows, only the index access cost.  (IIRC there is someplace in the
planner that explicitly considers unique indexes as a part of developing
selectivity estimates ... but it's not that part.)

regards, tom lane



Re: [PERFORM] How to improve db performance with $7K?

2005-03-29 Thread Cott Lang
On Mon, 2005-03-28 at 17:36 +, Steve Poe wrote:

> I agree with you. Unfortunately, I am not the developer of the
> application. The vendor uses ProIV, which connects via ODBC. The vendor
> could certainly do some tuning and create more indexes where applicable. I
> am encouraging the vendor to take a more active role so we can work
> together on this.

I've done a lot of browsing through pg_stat_activity, looking for queries
that either hang around for a while or show up very often, and using
EXPLAIN to find out whether they could use some assistance.
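
Something along these lines (column names as of 8.0; stats_command_string
must be on for current_query to be populated):

  SELECT procpid,
         now() - query_start AS running_for,
         current_query
    FROM pg_stat_activity
   WHERE current_query <> '<IDLE>'
   ORDER BY running_for DESC;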

You may also find that a dump and restore with a reconfiguration to
mirrored drives speeds you up a lot - just from the dump and restore.

> With hardware tuning, I am sure we can do better than 35 MB per sec. Also 
> moving the top 3 or 5 tables and indexes to their own slice of a RAID10 
> and moving pg_xlog to its own drive will help too.

If your database activity involves a lot of random I/O, 35 MB per second
wouldn't be too bad.

While conventional wisdom is that pg_xlog on its own drives (I know you
meant plural :) ) is a big boost, in my particular case I could never
get a measurable boost that way. Obviously, YMMV.






Re: [PERFORM] Delete query takes exorbitant amount of time

2005-03-29 Thread Tom Lane
Stephan Szabo <[EMAIL PROTECTED]> writes:
> If there were some way to pass a "limit" into SPI_prepare that was treated
> similarly to a LIMIT clause for planning purposes but didn't actually
> change the output plan to only return that number of rows, we could use
> that.

Hmm ... the planner does have the ability to do that sort of thing (we
use it for cursors).  SPI_prepare doesn't expose the capability.
Perhaps adding a SPI_prepare variant that does expose it would be the
quickest route to a solution.

I get a headache every time I look at the RI triggers ;-).  Do they
always know at the time of preparing a plan which way it will be used?

regards, tom lane



Re: [PERFORM] Delete query takes exorbitant amount of time

2005-03-29 Thread Simon Riggs
On Tue, 2005-03-29 at 05:50 -0800, Stephan Szabo wrote:
> On Tue, 29 Mar 2005, Simon Riggs wrote:
> 
> > On Tue, 2005-03-29 at 01:48 -0700, Karim A Nassar wrote:
> > > > Each value has 1/13th of the table, which is too many rows per value to
> > > > make an IndexScan an efficient way of deleting rows from the table.
> > >
> > > But, the original question was that the delete that was taking a long time
> > > was on a different table. I tried to delete 150 rows from a table with 750
> > > rows, which is FK referenced from this large table. If I understand
> > > correctly, Tom suggested that the length of time was due to a sequential
> > > scan being done on the large table for each value being deleted from the
> > > small one.
> >
> > > For this FK check, there only need be one referring id to invalidate the
> > > delete. ISTM that for any delete with a FK reference, the index could
> > > always be used to search for a single value in the referring table
> > > (excepting very small tables). Why then must a sequential scan be
> > > performed in this case, and/or in general?
> >
> > My understanding was that you were doing a DELETE on the smaller table
> > and that this was doing a DELETE on the measurement table because you
> > had the FK defined as ON DELETE CASCADE. You are right - only a single
> > row is sufficient to RESTRICT the DELETE. But for an ON UPDATE/DELETE
> > action of CASCADE then you will want to touch all rows referenced, so a
> > SeqScan is a perfectly valid consequence of such actions.
> > I think now that you are using the default action, rather than
> > specifically requesting CASCADE?
> >
> > Stephan, Tom:
> > The SQL generated for RI checking by the RI triggers currently applies a
> > limit at execution time, not at prepare time. i.e. there is no LIMIT
> > clause in the SQL.
> >
> > We know whether the check will be limit 1 or limit 0 at prepare time, so
> > why not add a LIMIT clause to the SQL so it changes the plan, not just
> > the number of rows returned when the check query executes?
> 
> Because IIRC, FOR UPDATE and LIMIT at least historically didn't play
> nicely together, so you could sometimes get a result where if the first
> row was locked, the FOR UPDATE would wait on it, but if it was deleted by
> the other transaction you could get 0 rows back in the trigger.
> 

Well, sorry to ask more...

...but surely we only need the FOR UPDATE clause if we are performing a
CASCADE action, whereas we only want the LIMIT 1 clause if we are NOT
performing a CASCADE action? That way the two clauses are mutually
exclusive, and the problem you outline should never (need to) occur.

The current code doesn't seem to vary the check query according to the
requested FK action...

Best Regards, Simon Riggs




Re: [PERFORM] Delete query takes exorbitant amount of time

2005-03-29 Thread Tom Lane
Simon Riggs <[EMAIL PROTECTED]> writes:
> If action is NO ACTION or RESTRICT then 
>   we need to SELECT at most 1 row that matches the criteria
>   which means we can use LIMIT 1

> If action is CASCADE, SET NULL, SET DEFAULT then
>   we need to UPDATE or DELETE all rows that match the criteria
>   which means we mustn't use LIMIT and need to use FOR UPDATE

Huh?  UPDATE/DELETE don't use FOR UPDATE.  I think you have failed
to break down the cases sufficiently.  In particular it matters which
side of the RI constraint you are working from ...

regards, tom lane



Re: [PERFORM] Delete query takes exorbitant amount of time

2005-03-29 Thread Simon Riggs
On Tue, 2005-03-29 at 09:56 -0500, Tom Lane wrote:
> Stephan Szabo <[EMAIL PROTECTED]> writes:
> > If there were some way to pass a "limit" into SPI_prepare that was treated
> > similarly to a LIMIT clause for planning purposes but didn't actually
> > change the output plan to only return that number of rows, we could use
> > that.
> 
> Hmm ... the planner does have the ability to do that sort of thing (we
> use it for cursors).  SPI_prepare doesn't expose the capability.
> Perhaps adding a SPI_prepare variant that does expose it would be the
> quickest route to a solution.
> 
> I get a headache every time I look at the RI triggers ;-).  Do they
> always know at the time of preparing a plan which way it will be used?

If action is NO ACTION or RESTRICT then 
we need to SELECT at most 1 row that matches the criteria
which means we can use LIMIT 1

If action is CASCADE, SET NULL, SET DEFAULT then
	we need to UPDATE or DELETE all rows that match the criteria
	which means we mustn't use LIMIT and need to use FOR UPDATE

We know that at CONSTRAINT creation time, which always occurs before
plan preparation time.

Best Regards, Simon Riggs




Re: [PERFORM] Delete query takes exorbitant amount of time

2005-03-29 Thread Simon Riggs
On Tue, 2005-03-29 at 10:31 -0500, Tom Lane wrote:
> Simon Riggs <[EMAIL PROTECTED]> writes:
> > If action is NO ACTION or RESTRICT then 
> > we need to SELECT at most 1 row that matches the criteria
> > which means we can use LIMIT 1
> 
> > If action is CASCADE, SET NULL, SET DEFAULT then
> > we need to UPDATE or DELETE all rows that match the criteria
> > which means we mustn't use LIMIT and need to use FOR UPDATE
> 
> Huh?  UPDATE/DELETE don't use FOR UPDATE.  I think you have failed
> to break down the cases sufficiently.  In particular it matters which
> side of the RI constraint you are working from ...

OK... too quick, sorry. I'll hand over to Stephan for a better and more
exhaustive explanation/analysis... but AFAICS we *can* always know the
correct formulation of the query at prepare time, whether or not we do
currently.

Best Regards, Simon Riggs






Re: [PERFORM] Delete query takes exorbitant amount of time

2005-03-29 Thread Stephan Szabo

On Tue, 29 Mar 2005, Simon Riggs wrote:

> On Tue, 2005-03-29 at 10:31 -0500, Tom Lane wrote:
> > Simon Riggs <[EMAIL PROTECTED]> writes:
> > > If action is NO ACTION or RESTRICT then
> > >   we need to SELECT at most 1 row that matches the criteria
> > >   which means we can use LIMIT 1
> >
> > > If action is CASCADE, SET NULL, SET DEFAULT then
> > >   we need to UPDATE or DELETE all rows that match the criteria
> > >   which means we mustn't use LIMIT and need to use FOR UPDATE
> >
> > Huh?  UPDATE/DELETE don't use FOR UPDATE.  I think you have failed
> > to break down the cases sufficiently.  In particular it matters which
> > side of the RI constraint you are working from ...
>
> OK... too quick, sorry. I'll hand over to Stephan for a better and more
> exhaustive explanation/analysis... but AFAICS we *can* always know the
> correct formulation of the query at prepare time, whether or not we do
> currently.

We currently use FOR UPDATE on the NO ACTION check, because otherwise we
might get back a row that's already marked for deletion by a concurrent
transaction.  I think that's intended to wait and succeed, not fail.




Re: [PERFORM] Delete query takes exorbitant amount of time

2005-03-29 Thread Stephan Szabo
On Tue, 29 Mar 2005, Tom Lane wrote:

> Stephan Szabo <[EMAIL PROTECTED]> writes:
> > If there were some way to pass a "limit" into SPI_prepare that was treated
> > similarly to a LIMIT clause for planning purposes but didn't actually
> > change the output plan to only return that number of rows, we could use
> > that.
>
> Hmm ... the planner does have the ability to do that sort of thing (we
> use it for cursors).  SPI_prepare doesn't expose the capability.
> Perhaps adding a SPI_prepare variant that does expose it would be the
> quickest route to a solution.
>
> I get a headache every time I look at the RI triggers ;-).  Do they

Me too, honestly.

> always know at the time of preparing a plan which way it will be used?

I believe so.  I think each saved plan pretty much lives for a single
trigger type/argument set and is basically used in only one place.



Re: [PERFORM] Delete query takes exorbitant amount of time

2005-03-29 Thread Simon Riggs
On Tue, 2005-03-29 at 09:40 -0500, Tom Lane wrote:
> Simon Riggs <[EMAIL PROTECTED]> writes:
> > ...but, I see no way for OidFunctionCall8 to ever return an answer of
> > "always just 1 row, no matter how big the relation"...so tuples_fetched
> > is always proportional to the size of the relation. Are unique indexes
> > treated just as very-low-selectivity indexes?
> 
> Yeah.  It is not the job of amcostestimate to estimate the number of
> rows, only the index access cost.  (IIRC there is someplace in the
> planner that explicitly considers unique indexes as a part of developing
> selectivity estimates ... but it's not that part.)

Well, I mention this because costsize.c:cost_index *does* calculate the
number of rows returned. If unique indexes are handled elsewhere then
this would not cause problems for them...but for LIMIT queries..?

cost_index gets the selectivity, then multiplies that by the number of tuples
in the relation to calculate tuples_fetched, so it can use that in the
Mackert & Lohman formula. There is no consideration of the query's LIMIT.

That implies to me that LIMIT queries are not considered correctly in
the M&L formula, and thus we are more likely to calculate a too-high cost
for using an index in those circumstances... and thus more likely to
SeqScan for medium-sized relations?

Best Regards, Simon Riggs




Re: [PERFORM] Delete query takes exorbitant amount of time

2005-03-29 Thread Tom Lane
Simon Riggs <[EMAIL PROTECTED]> writes:
> That implies to me that LIMIT queries are not considered correctly in
> the M&L formula, and thus we are more likely to calculate a too-high cost
> for using an index in those circumstances... and thus more likely to
> SeqScan for medium-sized relations?

You misunderstand how LIMIT is handled.  The plan structure is

    LIMIT ...
        regular plan ...

and so the strategy is to plan and cost the regular plan as though it
would be carried out in full, and then take an appropriate fraction
of that at the LIMIT stage.
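
For example, the plan shape for a hypothetical query (column name invented)
would be

  EXPLAIN SELECT * FROM measurement WHERE fkcol = 42 LIMIT 1;

   Limit
     ->  Seq Scan on measurement
           Filter: (fkcol = 42)

where the Seq Scan is costed as though it ran to completion and the Limit
node then takes the appropriate fraction of that cost.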

regards, tom lane



Re: [PERFORM] How to improve db performance with $7K?

2005-03-29 Thread Greg Stark

Dave Cramer <[EMAIL PROTECTED]> writes:

> PFC wrote:
> >
> > My Laptop does 19 MB/s (reading <10 KB files, reiser4) !
>
> Yeah, 35 MB per sec is slow for a RAID controller; the 3ware mirrored is
> about 50 MB/sec, and striped is about 100.

Well, you're comparing apples and oranges here. A modern 7200 rpm drive should
be capable of doing 40-50 MB/s depending on the location of the data on the
disk. 

But that's only doing sequential access of data using something like dd,
without other processes intervening and causing seeks. In practice it seems
busy databases see a random_page_cost of about 4, which for a drive with a
10 ms seek time translates to only about 3.2 MB/s.
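
Back-of-envelope, with the per-access transfer size as a pure assumption:

  -- ~10 ms per random access  =>  ~100 random accesses per second
  SELECT 1000 / 10                  AS accesses_per_sec,  -- 100
         (1000 / 10) * 8 / 1024.0   AS mb_per_sec;        -- ~0.8 at one 8 kB page per access
  -- Even allowing for readahead pulling several pages per access, random
  -- I/O lands in the low single-digit MB/s range, nowhere near the
  -- 40-50 MB/s sequential figure.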

I think the first order of business is getting pg_xlog onto its own device.
That alone should remove a lot of the seeking. If it's an ext3 device I would
also consider moving the journal to a dedicated drive as well. (Or, if they're
SCSI drives or you're sure the RAID controller is safe from write caching,
just switch file systems to something that doesn't journal data.)


-- 
greg




[PERFORM] Million of rows

2005-03-29 Thread Vinicius Bernardi
Hi everybody...

I'm new here, and I will try to explain my problem; maybe I can get
some help...

I have been writing a piece of software for 3 years, and this software
tracks the GPS positions of vehicles, along with other information.

My problem starts when I have to store all the data about the vehicles
for about 1 or 2 months.

Currently I have a table called DADO_LIDO, where I write all the
information; the primary key is DATA (GPS day+hour) plus the VEHICLE
identifier.

Each vehicle transmits one position every 30 seconds, so I have
something like 2000 rows per vehicle per day. I already have two
clients, one with 4000 vehicles and the other with 500 vehicles.

My application was written in Delphi using ZEOS, which lets me test
against both MySQL and PostgreSQL.

I already have both databases.

But now the problem starts when I have to select historical data for these
vehicles (I store only 2 months): something like 40 or 50 million rows
for 500 vehicles.

Using the keys VEHICLE_ID and GPS_TIME, the performance is very low...

I need some ideas for better performance on this table

using selects by:
PERIOD / VEHICLE
PERIOD / VEHICLES
PERIOD / VEHICLE / (a bit test on 3 integer columns using logical operators)

Thanks for any help

Vinicius Marques De Bernardi



Re: [PERFORM] Delete query takes exorbitant amount of time

2005-03-29 Thread Simon Riggs
On Tue, 2005-03-29 at 12:31 -0500, Tom Lane wrote:
> Simon Riggs <[EMAIL PROTECTED]> writes:
> > That implies to me that LIMIT queries are not considered correctly in
> > the M&L formula, and thus we are more likely to calculate a too-high cost
> > for using an index in those circumstances... and thus more likely to
> > SeqScan for medium-sized relations?
> 
> You misunderstand how LIMIT is handled.  

Huh? Well, not this time. (Though my error rate is admittedly high.)

> The plan structure is
>
>     LIMIT ...
>         regular plan ...
> 
> and so the strategy is to plan and cost the regular plan as though it
> would be carried out in full, and then take an appropriate fraction
> of that at the LIMIT stage.

To cost it as if it would be carried out in full and then not execute in
full is the same thing as saying it overestimates the actual execution
cost. Which can lead to selection of SeqScan plan when the IndexScan
would have been cheaper, all things considered.

...it could work like this

LIMIT 
regular plan (plan chosen knowing that LIMIT follows)

so that the LIMIT would be considered in the M&L formula.

Not that I am driven by how other systems work, but both DB2 and Oracle
allow this form of optimization.

There's not a huge benefit in sending LIMIT 1 through on the FK check
queries unless they'd be taken into account in the planning.

Anyway, I'm not saying I know how to do this yet/ever, just to say it is
possible to use the information available to better effect.

This looks like a TODO item to me? Thoughts?

Best Regards, Simon Riggs




Re: [PERFORM] Million of rows

2005-03-29 Thread Michael Fuhr
On Tue, Mar 29, 2005 at 03:33:24PM -0300, Vinicius Bernardi wrote:
>
> But now the problem starts when I have to select historical data for these
> vehicles (I store only 2 months): something like 40 or 50 million rows
> for 500 vehicles.
>
> Using the keys VEHICLE_ID and GPS_TIME, the performance is very low...

Please post an example query and the EXPLAIN ANALYZE output.  The
table definition might be useful too.

> I need some ideas for better performance on this table

Do you have indexes where you need them?  Do you cluster on any of
the indexes?  Do you VACUUM and ANALYZE the database regularly?
Have you investigated whether you need to increase the statistics
on any columns?  Have you tuned postgresql.conf?  What version of
PostgreSQL are you using?
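
For a table like the one described, a minimal sketch of the sort of index
and query that usually helps (names are invented, not the real schema):

  CREATE TABLE dado_lido (
      vehicle_id  integer      NOT NULL,
      gps_time    timestamptz  NOT NULL,
      flags       integer,
      PRIMARY KEY (vehicle_id, gps_time)
  );

  -- A composite index (or the primary key above) leading with vehicle_id
  -- lets the period-per-vehicle selects use an index range scan:
  SELECT *
    FROM dado_lido
   WHERE vehicle_id = 1234
     AND gps_time BETWEEN '2005-02-01' AND '2005-03-01';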

-- 
Michael Fuhr
http://www.fuhr.org/~mfuhr/



Re: [PERFORM] Million of rows

2005-03-29 Thread Vinicius Bernardi
Right now I have this system running on MySQL, and all the examples I
have are for MySQL, but for the biggest client, which is starting now,
we will use PostgreSQL, so I need a way to answer those questions in
Postgres...
Ideas like TABLESPACES or other things...

Just looking for starting ideas...

Thanks

Vinicius Marques De Bernardi


On Tue, 29 Mar 2005 12:08:15 -0700, Michael Fuhr <[EMAIL PROTECTED]> wrote:
> On Tue, Mar 29, 2005 at 03:33:24PM -0300, Vinicius Bernardi wrote:
> >
> > But now the problem starts when I have to select historical data for these
> > vehicles (I store only 2 months): something like 40 or 50 million rows
> > for 500 vehicles.
> >
> > Using the keys VEHICLE_ID and GPS_TIME, the performance is very low...
> 
> Please post an example query and the EXPLAIN ANALYZE output.  The
> table definition might be useful too.
> 
> > I need some ideas for better performance on this table
> 
> Do you have indexes where you need them?  Do you cluster on any of
> the indexes?  Do you VACUUM and ANALYZE the database regularly?
> Have you investigated whether you need to increase the statistics
> on any columns?  Have you tuned postgresql.conf?  What version of
> PostgreSQL are you using?
> 
> --
> Michael Fuhr
> http://www.fuhr.org/~mfuhr/
>



[PERFORM] VACUUM on duplicate DB gives FSM and total pages discrepancies

2005-03-29 Thread Karim Nassar
I recently pg_dumpall'd my DB from a (used for testing) v20z install of
postgresql 8.0.1, and restored it to my production (but not yet in
service) v40z running the same version. The test DB has had multiple
millions of rows created/dropped during testing. The results of VACUUM
VERBOSE are giving me pause:

* v40z (hardly used after restore):

orfs=# vacuum full analyze verbose;

INFO:  free space map: 114 relations, 84 pages stored; 1824 total pages needed
DETAIL:  Allocated FSM size: 1000 relations + 30 pages = 1864 kB shared memory.
VACUUM


* v20z (after having undergone extensive tweaking, deletes, and inserts):

orfs=# vacuum full analyze verbose;

INFO:  free space map: 53 relations, 13502 pages stored; 9776 total pages needed
DETAIL:  Allocated FSM size: 1000 relations + 30 pages = 1864 kB shared memory.
VACUUM

1) Should I be concerned about the total pages needed? ISTM that using
that many more pages can't help but degrade relative performance on my
testing machine.

2) How is it that my FSM has different numbers of relations?

3) Is either of these effects normal for an oft-used (or not) DB?

FWIW:
                       v40z    v20z
maintenance_work_mem   262144  16384
shared_buffers         3       1000
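
(For context, the free space map figures above are governed by these
postgresql.conf parameters; a sketch only, with placeholder values, and
in 8.0 changing them requires a postmaster restart:)

  SHOW max_fsm_pages;       -- should comfortably exceed "total pages needed"
  SHOW max_fsm_relations;   -- 1000 is the 8.0 default
  -- in postgresql.conf, e.g.:
  --   max_fsm_pages = 20000
  --   max_fsm_relations = 1000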

Thanks,
-- 
Karim Nassar
Department of Computer Science
Box 15600, College of Engineering and Natural Sciences
Northern Arizona University,  Flagstaff, Arizona 86011
Office: (928) 523-5868 -=- Mobile: (928) 699-9221

