Re: [PERFORM] Excessive memory used for INSERT

2014-12-18 Thread Alessandro Ipe
Hi,


A grep in a  nightly dump of this database did not return any AFTER trigger.
The only keys are primary on each daily table, through
ADD CONSTRAINT "MSG_-MM-DD_pkey" PRIMARY KEY (slot, msg);
and on the global table
ADD CONSTRAINT msg_pkey PRIMARY KEY (slot, msg);


Regards,


A.


On Wednesday 17 December 2014 12:49:03 Tom Lane wrote:
> Alessandro Ipe  writes:
> > My dtrigger definition is
> > CREATE TRIGGER msg_trigger BEFORE INSERT ON msg FOR EACH ROW EXECUTE
> > PROCEDURE msg_function(); so it seems that it is a BEFORE trigger.
> 
> Hm, no AFTER triggers anywhere?  Are there foreign keys, perhaps?
> 
>   regards, tom lane



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Excessive memory used for INSERT

2014-12-18 Thread Alessandro Ipe
On Thursday 18 December 2014 08:51:47 Torsten Förtsch wrote:
> On 17/12/14 16:14, Alessandro Ipe wrote:
> > 2014-12-15 17:54:07 GMT LOG: server process (PID 21897) was 
terminated
> > by signal 9: Killed
> 
> since it was killed by SIGKILL, maybe it's the kernel's OOM killer?

Indeed and this hopefully prevented postgresql to crash my whole system due to 
RAM exhaustion. But the problem remains : why an INSERT requires that huge 
amount of memory ?


Regards,


A.


Re: [PERFORM] Excessive memory used for INSERT

2014-12-18 Thread Alessandro Ipe
Hi,


I tried also with an upsert function
CREATE FUNCTION upsert_func(sql_insert text, sql_update text) RETURNS void
LANGUAGE plpgsql
AS $$
BEGIN
EXECUTE sql_update;
IF FOUND THEN
RETURN;
END IF;
BEGIN
EXECUTE sql_insert;
EXCEPTION WHEN OTHERS THEN
EXECUTE sql_update;
END;
RETURN;
END;
$$;
with the same result on the memory used... 

The tables hold 355000 rows in total.


Regards,


A.


On Thursday 18 December 2014 12:16:49 Alessandro Ipe wrote:
> Hi,
> 
> 
> A grep in a  nightly dump of this database did not return any AFTER trigger.
> The only keys are primary on each daily table, through
> ADD CONSTRAINT "MSG_-MM-DD_pkey" PRIMARY KEY (slot, msg);
> and on the global table
> ADD CONSTRAINT msg_pkey PRIMARY KEY (slot, msg);
> 
> 
> Regards,
> 
> 
> A.
> 
> On Wednesday 17 December 2014 12:49:03 Tom Lane wrote:
> > Alessandro Ipe  writes:
> > > My dtrigger definition is
> > > CREATE TRIGGER msg_trigger BEFORE INSERT ON msg FOR EACH 
ROW EXECUTE
> > > PROCEDURE msg_function(); so it seems that it is a BEFORE trigger.
> > 
> > Hm, no AFTER triggers anywhere?  Are there foreign keys, perhaps?
> > 
> > regards, tom lane



Re: [PERFORM] Excessive memory used for INSERT

2014-12-18 Thread Tom Lane
Alessandro Ipe  writes:
> Hi,
> I tried also with an upsert function
> CREATE FUNCTION upsert_func(sql_insert text, sql_update text) RETURNS void
> LANGUAGE plpgsql
> AS $$
> BEGIN
> EXECUTE sql_update;
> IF FOUND THEN
> RETURN;
> END IF;
> BEGIN
> EXECUTE sql_insert;
> EXCEPTION WHEN OTHERS THEN
> EXECUTE sql_update;
> END;
> RETURN;
> END;
> $$;
> with the same result on the memory used... 

If you want to provide a self-contained test case, possibly we could look
into it, but these fragmentary bits of what you're doing don't really
constitute an investigatable problem statement.

I will note that EXCEPTION blocks aren't terribly cheap, so if you're
reaching the "EXECUTE sql_insert" a lot of times that might have something
to do with it.

regards, tom lane


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Question about trigram GIST index

2014-12-18 Thread Robert DiFalco
So, for my use case I simply need to search for a case insensitive
substring. It need not be super exact. It seems like there are two ways I
can do this:

CREATE INDEX idx_users_name ON users USING GIST(lower(name) gist_trgm_ops);
SELECT * FROM users WHERE lower(name) LIKE '%john%';

Or I can do it like this:

CREATE INDEX idx_users_name ON users USING GIST(name gist_trgm_ops);
SELECT * FROM users WHERE name % 'john';

Unfortunately I cannot find any documentation on the trade-offs between
these two approaches. For my test dataset of 75K records the query speed
seems pretty damn similar.

So, I guess my question is, what is the difference for querying and insert
for the two approaches?

Thanks!


Re: [PERFORM] Question about trigram GIST index

2014-12-18 Thread Tom Lane
Robert DiFalco  writes:
> So, for my use case I simply need to search for a case insensitive
> substring. It need not be super exact. It seems like there are two ways I
> can do this:

> CREATE INDEX idx_users_name ON users USING GIST(lower(name) gist_trgm_ops);
> SELECT * FROM users WHERE lower(name) LIKE '%john%';

> Or I can do it like this:

> CREATE INDEX idx_users_name ON users USING GIST(name gist_trgm_ops);
> SELECT * FROM users WHERE name % 'john';

Hm, I don't see anything in the pg_trgm docs suggesting that % is
case-insensitive.  But in any case, I'd go with the former as being
more understandable to someone who knows standard SQL.

regards, tom lane


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Excessive memory used for INSERT

2014-12-18 Thread Alessandro Ipe
Hi,


I can send a full dump of my database (< 2MB) if it is OK for you.


Thanks,


A.


On Thursday 18 December 2014 12:05:45 Tom Lane wrote:
> Alessandro Ipe  writes:
> > Hi,
> > I tried also with an upsert function
> > CREATE FUNCTION upsert_func(sql_insert text, sql_update text) RETURNS 
void
> > 
> > LANGUAGE plpgsql
> > AS $$
> > 
> > BEGIN
> > EXECUTE sql_update;
> > IF FOUND THEN
> > 
> > RETURN;
> > 
> > END IF;
> > BEGIN
> > 
> > EXECUTE sql_insert;
> > EXCEPTION WHEN OTHERS THEN
> > EXECUTE sql_update;
> > END;
> > 
> > RETURN;
> > 
> > END;
> > $$;
> > with the same result on the memory used...
> 
> If you want to provide a self-contained test case, possibly we could look
> into it, but these fragmentary bits of what you're doing don't really
> constitute an investigatable problem statement.
> 
> I will note that EXCEPTION blocks aren't terribly cheap, so if you're
> reaching the "EXECUTE sql_insert" a lot of times that might have something
> to do with it.
> 
>   regards, tom lane



Re: [PERFORM] Question about trigram GIST index

2014-12-18 Thread Robert DiFalco
I know! I was surprised that % 'John' or % 'JOHN' or even % 'jOhn' all
returned the same result.

Besides readability would there be any technical differences between a GIST
index that is lower or not and using LIKE vs. %?

Thanks!


On Thu, Dec 18, 2014 at 9:18 AM, Tom Lane  wrote:
>
> Robert DiFalco  writes:
> > So, for my use case I simply need to search for a case insensitive
> > substring. It need not be super exact. It seems like there are two ways I
> > can do this:
>
> > CREATE INDEX idx_users_name ON users USING GIST(lower(name)
> gist_trgm_ops);
> > SELECT * FROM users WHERE lower(name) LIKE '%john%';
>
> > Or I can do it like this:
>
> > CREATE INDEX idx_users_name ON users USING GIST(name gist_trgm_ops);
> > SELECT * FROM users WHERE name % 'john';
>
> Hm, I don't see anything in the pg_trgm docs suggesting that % is
> case-insensitive.  But in any case, I'd go with the former as being
> more understandable to someone who knows standard SQL.
>
> regards, tom lane
>


Re: [PERFORM] Question about trigram GIST index

2014-12-18 Thread Giuseppe Broccolo
I'm not sure about the '%' operator, but I'm sure that the GIST index will
never be used in the

  SELECT * FROM users WHERE lower(name) LIKE '%john%';

query; it is used for left or right anchored search, such as 'john%' or
'%john'.

Giuseppe.
-- 
Giuseppe Broccolo - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
giuseppe.brocc...@2ndquadrant.it | www.2ndQuadrant.it


Re: [PERFORM] Question about trigram GIST index

2014-12-18 Thread Robert DiFalco
I'm pretty sure '%John%' uses the index.

explain analyze verbose SELECT name FROM wai_users WHERE lower(name) LIKE
'%john%';

  QUERY PLAN

--
 Bitmap Heap Scan on public.wai_users  (cost=53.45..1345.46 rows=900
width=14) (actual time=18.474..32.093 rows=1596 loops=1)
   Output: name
   Recheck Cond: (lower((wai_users.name)::text) ~~ '%john%'::text)
   ->  Bitmap Index Scan on idx_user_name  (cost=0.00..53.41 rows=900
width=0) (actual time=18.227..18.227 rows=1596 loops=1)
 Index Cond: (lower((wai_users.name)::text) ~~ '%john%'::text)
 Total runtime: 33.662 ms
(6 rows)


On Thu, Dec 18, 2014 at 10:00 AM, Giuseppe Broccolo <
giuseppe.brocc...@2ndquadrant.it> wrote:
>
> I'm not sure about the '%' operator, but I'm sure that the GIST index will
> never be used in the
>
>   SELECT * FROM users WHERE lower(name) LIKE '%john%';
>
> query; it is used for left or right anchored search, such as 'john%' or
> '%john'.
>
> Giuseppe.
> --
> Giuseppe Broccolo - 2ndQuadrant Italy
> PostgreSQL Training, Services and Support
> giuseppe.brocc...@2ndquadrant.it | www.2ndQuadrant.it
>


Re: [PERFORM] Question about trigram GIST index

2014-12-18 Thread Jeff Janes
On Thu, Dec 18, 2014 at 10:00 AM, Giuseppe Broccolo <
giuseppe.brocc...@2ndquadrant.it> wrote:
>
> I'm not sure about the '%' operator, but I'm sure that the GIST index will
> never be used in the
>
>   SELECT * FROM users WHERE lower(name) LIKE '%john%';
>
> query; it is used for left or right anchored search, such as 'john%' or
> '%john'.
>

The point of the gist_trgm_ops operator is specifically to overcome that
limitation.

It is pretty awesome.

Cheers,

Jeff
>
>


Re: [PERFORM] Question about trigram GIST index

2014-12-18 Thread Robert DiFalco
Jeff, I'm not seeing that limitation.

On Thu, Dec 18, 2014 at 10:33 AM, Jeff Janes  wrote:
>
> On Thu, Dec 18, 2014 at 10:00 AM, Giuseppe Broccolo <
> giuseppe.brocc...@2ndquadrant.it> wrote:
>>
>> I'm not sure about the '%' operator, but I'm sure that the GIST index
>> will never be used in the
>>
>>   SELECT * FROM users WHERE lower(name) LIKE '%john%';
>>
>> query; it is used for left or right anchored search, such as 'john%' or
>> '%john'.
>>
>
> The point of the gist_trgm_ops operator is specifically to overcome that
> limitation.
>
> It is pretty awesome.
>
> Cheers,
>
> Jeff
>>
>>


Re: [PERFORM] Question about trigram GIST index

2014-12-18 Thread Kevin Grittner
Giuseppe Broccolo wrote:

> I'm not sure about the '%' operator, but I'm sure that the GIST
> index will never be used in the
>
> SELECT * FROM users WHERE lower(name) LIKE '%john%';
>
> query; it is used for left or right anchored search, such as
> 'john%' or '%john'.
It *will* use a *trigram* index for a non-anchored search.

test=# create table words (word text not null);
CREATE TABLE
test=# copy words from '/usr/share/dict/words';
COPY 99171
test=# CREATE EXTENSION pg_trgm;
CREATE EXTENSION
test=# CREATE INDEX words_trgm ON words USING gist (word gist_trgm_ops);
CREATE INDEX
test=# vacuum analyze words;
VACUUM
test=# explain analyze select * from words where word like '%john%';

 QUERY PLAN
---
Bitmap Heap Scan on words  (cost=4.36..40.24 rows=10 width=9) (actual 
time=17.758..17.772 rows=8 loops=1)
   Recheck Cond: (word ~~ '%john%'::text)
   Rows Removed by Index Recheck: 16
   Heap Blocks: exact=4
   ->   Bitmap Index Scan on words_trgm  (cost=0.00..4.36 rows=10 width=0) 
(actual time=17.708..17.708 rows=24 loops=1)
 Index Cond: (word ~~ '%john%'::text)
Planning time: 0.227 ms
Execution time: 17.862 ms
(8 rows)

test=# explain analyze select * from words where word ilike '%john%';

 QUERY PLAN
--
Bitmap Heap Scan on words  (cost=44.05..556.57 rows=1002 width=9) (actual 
time=12.151..12.197 rows=24 loops=1)
   Recheck Cond: (word ~~* '%john%'::text)
   Heap Blocks: exact=4
   ->  Bitmap Index Scan on words_trgm  (cost=0.00..43.80 rows=1002 width=0) 
(actual time=12.124..12.124 rows=24 loops=1)
Index Cond: (word ~~* '%john%'::text)
Planning time: 0.392 ms
Execution time: 12.252 ms
(7 rows)

Note that a trigram index is case-insensitive; doing a
case-sensitive search requires an extra Recheck node to eliminate
the rows that match in the case-insensitive index scan but have
different capitalization.  Because of that case-sensitive is
slower.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance