Re: [PERFORM] Large # of rows in query extremely slow, not using

2004-09-20 Thread Manfred Koizar
On Fri, 17 Sep 2004 19:23:44 -0500, Stephen Crowley
<[EMAIL PROTECTED]> wrote:
>Seq Scan [...] rows=265632
>  Filter: ((date = '2004-09-07'::date) AND ((stock)::text = 'MSFT'::text))
>Total runtime: 412703.000 ms
>
>random_page_cost and effective_cache_size are both default, 8 and 1000

random_page_cost defaults to 4.0, not 8.  And your
effective_cache_size setting is far too low for a modern machine.

>"Index Scan [...] rows=159618
>"  Index Cond: ((date = '2004-09-07'::date) AND ((stock)::text = 'ORCL'::text))"
>"Total runtime: 201009.000 ms"

Extrapolating this to 265000 rows you should be able to get the MSFT
result in ca. 330 seconds, if you can persuade the planner to choose an
index scan.  Fiddling with random_page_cost and effective_cache_size
might do the trick.
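For reference, a sketch of the kind of settings change meant here (illustrative numbers for a machine with around 1GB of RAM; tune to your own hardware):

```
# postgresql.conf -- illustrative values only, not a recommendation
random_page_cost = 2            # default 4.0; lower values favour index scans
effective_cache_size = 65536    # in 8kB pages; ~512MB of OS file cache
```

Both can also be tried per-session with SET before committing them to the
config file.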

>So now this is all in proportion and works as expected... the question
>is, why would the fact that it needs to be vacuumed cause such a huge
>hit in performance? When I vacuumed it did free up nearly 25% of the
>space.

So before the VACUUM a seq scan would have taken ca. 550 seconds.  Your
MSFT query with LIMIT 10 took ca. 350 seconds.  It's not implausible
that more than half of the table had to be scanned to find the first
ten rows matching the filter condition.

Servus
 Manfred

---(end of broadcast)---
TIP 8: explain analyze is your friend


[PERFORM] O_DIRECT setting

2004-09-20 Thread Guy Thornley
A recent comment on this (or perhaps another?) mailing list about Sun boxen
and the directio mount option has prompted me to read about O_DIRECT on the
open() manpage.

Has anybody tried this option? Ever taken any performance measurements?
I assume the way postgres manages its buffer memory (dealing with 8kB pages)
would be compatible with the restrictions:

Under Linux 2.4, transfer sizes, and the alignment of the user
buffer and file offset, must all be multiples of the logical
block size of the file system.

According to the manpage, O_DIRECT implies O_SYNC:

File I/O is done directly to/from user space buffers.  The I/O is
synchronous, i.e., at the completion of the read(2) or write(2)
system call, data is guaranteed to have been transferred.

At the moment I am fairly interested in trying this, and I would spend some
time with it, but I have my hands full with other projects. I'd imagine this
is more use with the revamped buffer manager in PG8.0 than the 7.x line, but
we are not using PG8.0 here yet.

Would people be interested in a performance benchmark? I need some benchmark
tips :)

Incidentally, postgres heap files suffer really, really bad fragmentation,
which affects sequential scan operations (VACUUM, ANALYZE, REINDEX ...)
quite drastically. We have in-house patches that somewhat alleviate this,
but they are not release quality. Has anybody else suffered this?

Guy Thornley


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


[PERFORM] vacuum full & max_fsm_pages question

2004-09-20 Thread Patrick Hatcher



 
Hello.
Couple of questions:

- Q1: Today I decided to do a vacuum full verbose analyze on a large table
that has been giving me slow performance.  And then I did it again.  I
noticed that after each run the values in my indexes and estimated row
versions changed.  What really got me wondering is the fact my indexes
report more rows than are in the table, and the estimated rows is less
than the actual amount.

The table is a read-only table that is updated once a week.  After
updating it is vacuumed full.  I've also tried reindexing but the numbers
still change.
Is this normal?  Below is a partial output for 4 consecutive vacuum full
analyzes.  No data was added nor was there anyone in the table.

- Q2: I have about a dozen 5M-plus-row tables.  I currently have my
max_fsm_pages set to 300,000.  As you can see in the vacuum full output I
supplied, one table is already over this amount.  Is there a limit on the
size of max_fsm_pages?

CONF settings:
# - Memory -

shared_buffers = 2000           # min 16, at least max_connections*2, 8KB each
sort_mem = 12288                # min 64, size in KB
#vacuum_mem = 8192              # min 1024, size in KB

# - Free Space Map -

max_fsm_pages = 300000          # min max_fsm_relations*16, 6 bytes each
max_fsm_relations = 500         # min 100, ~50 bytes each

Vacuum full information

#after second vacuum full
INFO:  index "emaildat_fkey" now contains 8053743 row versions in 25764 pages
DETAIL:  1895 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 2.38s/0.42u sec elapsed 11.11 sec.
INFO:  analyzing "cdm.cdm_email_data"
INFO:  "cdm_email_data": 65882 pages, 3000 rows sampled, 392410 estimated total rows

#after third vacuum full
INFO:  index "emaildat_fkey" now contains 8052738 row versions in 25769 pages
DETAIL:  890 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 2.08s/0.32u sec elapsed 4.36 sec.
INFO:  analyzing "cdm.cdm_email_data"
INFO:  "cdm_email_data": 65874 pages, 3000 rows sampled, 392363 estimated total rows

#after REINDEX and vacuum full
INFO:  index "emaildat_fkey" now contains 8052369 row versions in 25771 pages
DETAIL:  521 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 1.37s/0.35u sec elapsed 4.79 sec.
INFO:  analyzing "cdm.cdm_email_data"
INFO:  "cdm_email_data": 65869 pages, 3000 rows sampled, 392333 estimated total rows

#After vacuum full(s)
mdc_oz=# select count(*) from cdm.cdm_email_data;
  count
---------
 5433358
(1 row)

TIA
Patrick


Re: [PERFORM] vacuum full & max_fsm_pages question

2004-09-20 Thread Robert Treat
On Tuesday 21 September 2004 00:01, Patrick Hatcher wrote:
> Hello.
> Couple of questions:>
> - Q1: Today I decided to do a vacuum full verbose analyze on a large table
> that has been giving me slow performance.  And then I did it again.  I
> noticed that after each run the values in my indexes and estimate row
> version changed.  What really got me wondering is the fact my indexes
> report more rows than are in the table and then the estimated rows is less
> than the actual amount.
>
> The table is a read-only table that is updated 1/wk.  After updating it is
> vacuumed full.  I've also tried reindexing but the numbers still change. Is
> this normal?  Below is a partial output for 4 consecutive vacuum full
> analyzes.  No data was added nor was there anyone in the table.
>

This looks normal to me for a pre-7.4 database; if I'm right you're running
7.2?  Basically your indexes are overgrown, so each time you run vacuum you
shrink the number of pages involved, which changes the index row counts and
correspondingly changes the estimated count on the table as the sampled
pages change.


> - Q2: I have about a dozen 5M plus row tables.  I currently have my
> max_fsm_pages set to 300,000.  As you can see in vacuum full output I
> supplied, one table is already over this amount.  Is there a limit on the
> size of max_fsm_pages?
>

The limit is based on your memory... each page entry costs 6 bytes.  But
according to the output below you are not over 300,000 pages yet on that
table (though you might be on some other tables).

>
> CONF settings:
> # - Memory -
>
> shared_buffers = 2000   # min 16, at least max_connections*2, 8KB
> each sort_mem = 12288# min 64, size in KB
> #vacuum_mem = 8192  # min 1024, size in KB
>
> # - Free Space Map -
>
> max_fsm_pages = 300000  # min max_fsm_relations*16, 6 bytes each
> max_fsm_relations = 500 # min 100, ~50 bytes each
>
>
> Vacuum full information
> #after second vacuum full
> INFO:  index "emaildat_fkey" now contains 8053743 row versions in 25764
> pages DETAIL:  1895 index row versions were removed.
> 0 index pages have been deleted, 0 are currently reusable.
> CPU 2.38s/0.42u sec elapsed 11.11 sec.
> INFO:  analyzing "cdm.cdm_email_data"
> INFO:  "cdm_email_data": 65882 pages, 3000 rows sampled, 392410 estimated
> total rows
>
>
> #after third vacuum full
> INFO:  index "emaildat_fkey" now contains 8052738 row versions in 25769
> pages DETAIL:  890 index row versions were removed.
> 0 index pages have been deleted, 0 are currently reusable.
> CPU 2.08s/0.32u sec elapsed 4.36 sec.
> INFO:  analyzing "cdm.cdm_email_data"
> INFO:  "cdm_email_data": 65874 pages, 3000 rows sampled, 392363 estimated
> total rows
>
>
> #after REINDEX and  vacuum full
> INFO:  index "emaildat_fkey" now contains 8052369 row versions in 25771
> pages DETAIL:  521 index row versions were removed.
> 0 index pages have been deleted, 0 are currently reusable.
> CPU 1.37s/0.35u sec elapsed 4.79 sec.
> INFO:  analyzing "cdm.cdm_email_data"
> INFO:  "cdm_email_data": 65869 pages, 3000 rows sampled, 392333 estimated
> total rows
>
> #After vacuum full(s)
> mdc_oz=# select count(*) from cdm.cdm_email_data;
>   count
> -
>  5433358
> (1 row)
>

I do think the count(*) seems a bit off based on the vacuum output above.  I'm
guessing you either have blocking transactions in the way or you're not giving
us a complete copy/paste of the session involved.

-- 
Robert Treat
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings