Re: [PERFORM] Statement Timeout at User Level

2008-09-18 Thread Albe Laurenz
Gauri Kanekar wrote:
> Is it possible to put Statement timeout at User Level.
> Like If i have a user like 'guest', Can i put a statement 
> timeout for it.

If only all problems were that easily solved!

ALTER ROLE guest SET statement_timeout=1;

This will cause all statements longer than 10 seconds and issued
by "guest" to be aborted.

Yours,
Laurenz Albe

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] RAID arrays and performance

2008-09-18 Thread Matthew Wakeling

On Tue, 29 Jan 2008, Gregory Stark wrote:

So, this is FYI, and also an added encouragement to implement fadvise
prefetching in some form or another. How's that going by the way?


I have a patch which implements it for the low hanging fruit of bitmap index
scans. it does it using an extra trip through the buffer manager which is the
least invasive approach but not necessarily the best.


Gregory - what's the status of that patch at the moment? Will it be making 
it into a new version of Postgres, or are we waiting for it to be 
implemented fully?


It's just that our system is doing a lot of bitmap index scans at the 
moment, and it'd help to be able to spread them across the 16 discs in 
the RAID array. It's the bottleneck in our system at the moment.


Matthew

--
The email of the species is more deadly than the mail.

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Why does this query write to the disk?

2008-09-18 Thread Nikolas Everett
List,

I'm a bit confused as to why this query writes to the disk:
SELECT count(*)
FROMbigbigtable
WHERE customerid IN (SELECT customerid FROM
smallcustomertable)
AND x !=
'special'

AND y IS NULL

It writes a whole bunch of data to the disk that has the tablespace where
bigbigtable lives as well as writes a little data to the main disk.  It
looks like its is actually WAL logging these writes.

Here is the EXPLAIN ANALYZE:
Aggregate  (cost=46520194.16..46520194.17 rows=1 width=0) (actual
time=4892191.995..4892191.995 rows=1 loops=1)
  ->  Hash IN Join  (cost=58.56..46203644.01 rows=126620058 width=0) (actual
time=2.938..4840349.573 rows=79815986 loops=1)
Hash Cond: ((bigbigtable.customerid)::text =
(smallcustomertable.customerid)::text)
->  Seq Scan on bigbigtable  (cost=0.00..43987129.60 rows=126688839
width=11) (actual time=0.011..4681248.143 rows=128087340 loops=1)
  Filter: ((y IS NULL) AND ((x)::text <> 'special'::text))
->  Hash  (cost=35.47..35.47 rows=1847 width=18) (actual
time=2.912..2.912 rows=1847 loops=1)
  ->  Seq Scan on smallcustomertable  (cost=0.00..35.47
rows=1847 width=18) (actual time=0.006..1.301 rows=1847 loops=1)
Total runtime: 4892192.086 ms

Can someone point me to some documentation as to why this writes to disk?

Thanks,
Nik


Re: [PERFORM] Why does this query write to the disk?

2008-09-18 Thread Kevin Grittner
>>> "Nikolas Everett" <[EMAIL PROTECTED]> wrote: 
 
> I'm a bit confused as to why this query writes to the disk:
> SELECT count(*)
> FROMbigbigtable
> WHERE customerid IN (SELECT customerid FROM
> smallcustomertable)
> AND x !=
> 'special'
> 
> AND y IS NULL
> 
> It writes a whole bunch of data to the disk that has the tablespace
where
> bigbigtable lives as well as writes a little data to the main disk. 
It
> looks like its is actually WAL logging these writes.
 
It's probably writing hint bits to improve performance of subsequent
access to the table.  The issue is discussed here:
 
http://wiki.postgresql.org/wiki/Hint_Bits
 
-Kevin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Why does this query write to the disk?

2008-09-18 Thread Tom Lane
"Kevin Grittner" <[EMAIL PROTECTED]> writes:
> "Nikolas Everett" <[EMAIL PROTECTED]> wrote: 
>> I'm a bit confused as to why this query writes to the disk:
 
> It's probably writing hint bits to improve performance of subsequent
> access to the table.  The issue is discussed here:
> http://wiki.postgresql.org/wiki/Hint_Bits

Hint-bit updates wouldn't be WAL-logged.  If the table has been around a
long time, it might be freezing old tuples, which *would* be WAL-logged
(since 8.2 or so) --- but that would be a one-time, non-repeatable
behavior.  How sure are you that there was WAL output?

What I was thinking was more likely was that the hash table for the hash
join was spilling out to temp files.  That wouldn't be WAL-logged
either, but depending on your tablespace setup it might result in I/O on
some other disk than the table proper.

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Why does this query write to the disk?

2008-09-18 Thread Scott Carey
How big is your work_mem setting, and is this behavior affected by its size?

You can increase the work_mem on an individual connection before the test.

Simply:

set work_mem = '100MB'

to set it to 100 Megabytes.  If your issue is spilling data out of work_mem
to the temp storage, this setting will affect that.

On Thu, Sep 18, 2008 at 10:30 AM, Nikolas Everett <[EMAIL PROTECTED]> wrote:

> List,
>
> I'm a bit confused as to why this query writes to the disk:
> SELECT count(*)
> FROMbigbigtable
> WHERE customerid IN (SELECT customerid FROM
> smallcustomertable)
> AND x !=
> 'special'
>
> AND y IS NULL
>
> It writes a whole bunch of data to the disk that has the tablespace where
> bigbigtable lives as well as writes a little data to the main disk.  It
> looks like its is actually WAL logging these writes.
>
> Here is the EXPLAIN ANALYZE:
> Aggregate  (cost=46520194.16..46520194.17 rows=1 width=0) (actual
> time=4892191.995..4892191.995 rows=1 loops=1)
>   ->  Hash IN Join  (cost=58.56..46203644.01 rows=126620058 width=0)
> (actual time=2.938..4840349.573 rows=79815986 loops=1)
> Hash Cond: ((bigbigtable.customerid)::text =
> (smallcustomertable.customerid)::text)
> ->  Seq Scan on bigbigtable  (cost=0.00..43987129.60 rows=126688839
> width=11) (actual time=0.011..4681248.143 rows=128087340 loops=1)
>   Filter: ((y IS NULL) AND ((x)::text <> 'special'::text))
> ->  Hash  (cost=35.47..35.47 rows=1847 width=18) (actual
> time=2.912..2.912 rows=1847 loops=1)
>   ->  Seq Scan on smallcustomertable  (cost=0.00..35.47
> rows=1847 width=18) (actual time=0.006..1.301 rows=1847 loops=1)
> Total runtime: 4892192.086 ms
>
> Can someone point me to some documentation as to why this writes to disk?
>
> Thanks,
> Nik
>


Re: [PERFORM] Why does this query write to the disk?

2008-09-18 Thread Scott Carey
Under what conditions does EXPLAIN ANALYZE report spilling work_mem to
disk?  When does it not report work_mem or other overflow to disk?
I know that a planned disk-sort shows up.  I have also seen it report a
hash-agg on disk, but this was a while ago and rather difficult to reproduce
and I'm somewhat confident I have seen it spill to temp disk without
reporting it in EXPLAIN ANALYZE, but I could be wrong.

On Thu, Sep 18, 2008 at 11:13 AM, Tom Lane <[EMAIL PROTECTED]> wrote:

> "Kevin Grittner" <[EMAIL PROTECTED]> writes:
> > "Nikolas Everett" <[EMAIL PROTECTED]> wrote:
> >> I'm a bit confused as to why this query writes to the disk:
>
> > It's probably writing hint bits to improve performance of subsequent
> > access to the table.  The issue is discussed here:
> > http://wiki.postgresql.org/wiki/Hint_Bits
>
> Hint-bit updates wouldn't be WAL-logged.  If the table has been around a
> long time, it might be freezing old tuples, which *would* be WAL-logged
> (since 8.2 or so) --- but that would be a one-time, non-repeatable
> behavior.  How sure are you that there was WAL output?
>
> What I was thinking was more likely was that the hash table for the hash
> join was spilling out to temp files.  That wouldn't be WAL-logged
> either, but depending on your tablespace setup it might result in I/O on
> some other disk than the table proper.
>
>regards, tom lane
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>


Re: [PERFORM] Why does this query write to the disk?

2008-09-18 Thread Nikolas Everett
"Kevin Grittner" <[EMAIL PROTECTED]> writes:
> http://wiki.postgresql.org/wiki/Hint_Bits


On Thu, Sep 18, 2008 at 2:13 PM, Tom Lane <[EMAIL PROTECTED]> wrote:

> freezing old tuples

hash join was spilling out to temp files
>

Since this was a new table and the writes to the table's disk were very
large it was probably the hint bits.

The small table was about 1300 rows and my work_mem was 100MB so the writes
to the main disk probably was not hash spillage.  They were tiny, so I'm not
worried about  them.

Thanks very much,
Nik


Re: [PERFORM] Why does this query write to the disk?

2008-09-18 Thread Tom Lane
"Scott Carey" <[EMAIL PROTECTED]> writes:
> Under what conditions does EXPLAIN ANALYZE report spilling work_mem to
> disk?

For hash joins, it doesn't.  You might be thinking of the additional
reporting we added for sorts recently; but there's no comparable
logging for hash ...

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] RAID arrays and performance

2008-09-18 Thread Greg Smith

On Thu, 18 Sep 2008, Matthew Wakeling wrote:


On Tue, 29 Jan 2008, Gregory Stark wrote:


I have a patch which implements it for the low hanging fruit of bitmap 
index scans. it does it using an extra trip through the buffer manager 
which is the least invasive approach but not necessarily the best.


Gregory - what's the status of that patch at the moment? Will it be making it 
into a new version of Postgres, or are we waiting for it to be implemented 
fully?


It and a related fadvise patch have been floating around the project queue 
for a while now.  I just sorted through all the various patches and 
mesasges related to this area and updated the list at 
http://wiki.postgresql.org/wiki/CommitFestInProgress#Pending_patches 
recently, I'm trying to kick back a broader reviewed version of this 
concept right now.


It's just that our system is doing a lot of bitmap index scans at the moment, 
and it'd help to be able to spread them across the 16 discs in the RAID 
array. It's the bottleneck in our system at the moment.


If you have some specific bitmap index scan test case suggestions you can 
pass along (either publicly or in private to me, I can probably help 
anonymize them), that's one of the things that has been holding this up. 
Alternately, if you'd like to join in on testing this all out more help 
would certainly be welcome.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] RAID arrays and performance

2008-09-18 Thread Scott Marlowe
On Thu, Sep 18, 2008 at 1:30 PM, Greg Smith <[EMAIL PROTECTED]> wrote:
> If you have some specific bitmap index scan test case suggestions you can
> pass along (either publicly or in private to me, I can probably help
> anonymize them), that's one of the things that has been holding this up.
> Alternately, if you'd like to join in on testing this all out more help
> would certainly be welcome.

I posted in pgsql-perform about a problem that's using bitmap heap
scans that's really slow compared to just using nested-loops.  Don't
know if that is relevant or not.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] why does this use the wrong index?

2008-09-18 Thread Rainer Mager
I have two identical queries except for the date range. In the first case,
with the wider date range, the correct (I believe) index is used. In the
second case where the date range is smaller a different index is used and a
less efficient plan is chosen. In the second query the problem seems to be
CPU resoures; while it is running 1 core of the CPU is 100% busy.

Note, if I drop the ad_log_date index then this query is always fast, but
some other queries I do require that index.

So, What can I do to encourage Postgres to use the first index even when the
date range is smaller.




# explain analyze SELECT name FROM players AS foo WHERE EXISTS (SELECT 1
FROM ad_log WHERE player = foo.id AND  date(start_time) BETWEEN
E'2008-09-14' AND E'2008-09-18' LIMIT 1) ORDER BY name;
   QUERY
PLAN


-
 Sort  (cost=1573.74..1574.31 rows=230 width=13) (actual time=28.421..28.505
rows=306 loops=1)
   Sort Key: foo.name
   Sort Method:  quicksort  Memory: 28kB
   ->  Seq Scan on players foo  (cost=0.00..1564.72 rows=230 width=13)
(actual time=0.104..27.876 rows=306 loops=1)
 Filter: (subplan)
 SubPlan
   ->  Limit  (cost=0.01..3.39 rows=1 width=0) (actual
time=0.058..0.058 rows=1 loops=460)
 ->  Index Scan using ad_log_player_date on ad_log
(cost=0.01..34571.03 rows=10228 width=0) (actual time=0.056..0.056 rows=1
loops=460)
   Index Cond: ((player = $0) AND (date(start_time) >=
'2008-09-14'::date) AND (date(start_time) <= '2008-09-18'::date))
 Total runtime: 28.623 ms
(10 rows)



# explain analyze SELECT name FROM players AS foo WHERE EXISTS (SELECT 1
FROM ad_log WHERE player = foo.id AND  date(start_time) BETWEEN
E'2008-09-18' AND E'2008-09-18' LIMIT 1) ORDER BY name;
 QUERY PLAN

-
 Index Scan using players_name_key on players foo  (cost=0.00..8376.84
rows=230 width=13) (actual time=813.695..143452.810 rows=301 loops=1)
   Filter: (subplan)
   SubPlan
 ->  Limit  (cost=0.01..18.14 rows=1 width=0) (actual
time=311.846..311.846 rows=1 loops=460)
   ->  Index Scan using ad_log_date on ad_log  (cost=0.01..18.14
rows=1 width=0) (actual time=311.844..311.844 rows=1 loops=460)
 Index Cond: ((date(start_time) >= '2008-09-18'::date) AND
(date(start_time) <= '2008-09-18'::date))
 Filter: (player = $0)
 Total runtime: 143453.100 ms
(8 rows)



Thanks,

--Rainer


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance