Hello Andreas and Tom Thank you for replying.
Sorry, I re-created my questions. I was mis-pasted query log on previous question. (@~ operator is PGroonga extension (http://pgroonga.github.io)) Please ignore it. Best regards. 1, Problem. (1) Following query is exteme slow. (478sec) SELECT u.url FROM url_lists4 u, keywords4 k WHERE u.url like k.url AND k.name = 'esc_url'; (2) grep -f kwd.txt sample.txt (exec time under 1sec) 2, Questions (1) Is it possible to improve this query like the command ``grep -f keyword data``? (2) What kind of Index should I create on url_lists table? 3, Environment OS: CentOS7 PostgreSQL 9.4 4, sample source https://github.com/hiroyuki-sato/postgres_like_test 5, Create table drop table if exists url_lists4; create table url_lists4 ( id int not null primary key, url text not null ); create index ix_url_url_lists4 on url_lists4(url); drop table if exists keywords4; create table keywords4 ( id int not null primary key, name varchar(40) not null, url text not null ); create index ix_url_keywords4 on keywords4(url); create index ix_name_keywords4 on keywords4(name); \copy url_lists4(id,url) from 'sample.txt' with delimiter ','; \copy keywords4(id,name,url) from 'keyword.txt' with delimiter ','; vacuum url_lists4; vacuum keywords4; analyze url_lists4; analyze keywords4; 6, Query EXPLAIN SELECT u.url FROM url_lists4 u, keywords4 k WHERE u.url like k.url AND k.name = 'esc_url'; EXPLAIN ANALYZE SELECT u.url FROM url_lists4 u, keywords4 k WHERE u.url like k.url AND k.name = 'esc_url'; SELECT u.url FROM url_lists4 u, keywords4 k WHERE u.url like k.url AND k.name = 'esc_url'; 7, EXPLAIN QUERY PLAN ----------------------------------------------------------------------------- Nested Loop (cost=0.00..37510799.00 rows=12500000 width=57) Join Filter: (u.url ~~ k.url) -> Seq Scan on url_lists4 u (cost=0.00..10682.00 rows=500000 width=57) -> Materialize (cost=0.00..129.50 rows=5000 width=28) -> Seq Scan on keywords4 k (cost=0.00..104.50 rows=5000 width=28) Filter: ((name)::text = 'esc_url'::text) (6 rows) 8, EXPLAIN ANALYZE QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------- Nested Loop (cost=0.00..37510799.00 rows=12500000 width=57) (actual time=6011.642..478011.117 rows=4850 loops=1) Join Filter: (u.url ~~ k.url) Rows Removed by Join Filter: 2499995150 -> Seq Scan on url_lists4 u (cost=0.00..10682.00 rows=500000 width=57) (actual time=0.034..192.646 rows=500000 loops=1) -> Materialize (cost=0.00..129.50 rows=5000 width=28) (actual time=0.000..0.261 rows=5000 loops=500000) -> Seq Scan on keywords4 k (cost=0.00..104.50 rows=5000 width=28) (actual time=0.021..1.705 rows=5000 loops=1) Filter: ((name)::text = 'esc_url'::text) Planning time: 0.061 ms Execution time: 478011.773 ms (9 rows) 2015年12月28日(月) 3:39 Tom Lane <t...@sss.pgh.pa.us>: > Andreas Kretschmer <andr...@a-kretschmer.de> writes: > >> Tom Lane <t...@sss.pgh.pa.us> hat am 27. Dezember 2015 um 19:11 > geschrieben: > >> What in the world is this @~ operator? And what sort of index are > >> you using now, that can accept it? Are the rowcount estimates in > >> the EXPLAIN output accurate? (If they are, it's hardly surprising > >> that the query takes a long time.) > > > in a privat mail he called an other operator: ~~. I think, the @~ is an > error. > > Well, ~~ isn't directly indexable by btree indexes either, so there's > still something wrong with either the EXPLAIN output or the claimed > index definitions. > > regards, tom lane >