Hi,

For long strings, iterate_word_similarity() can run into long-running
tight-loops without honouring interrupts or statement_timeouts. For
example:

postgres=# set statement_timeout='1s';
SET
postgres=# select 1 where repeat('1.1',80000) %>> 'Lorem ipsum dolor sit amet';
?column?
----------
(0 rows)
Time: 29615.842 ms (00:29.616)

The associated perf report:

+ 99.98% 0.00% postgres postgres [.] ExecQual
+ 99.98% 0.00% postgres postgres [.] ExecEvalExprSwitchContext
+ 99.98% 0.00% postgres pg_trgm.so [.] strict_word_similarity_commutator_op
+ 99.98% 0.00% postgres pg_trgm.so [.] calc_word_similarity
+ 99.68% 99.47% postgres pg_trgm.so [.] iterate_word_similarity
0.21% 0.03% postgres postgres [.] pg_qsort
0.16% 0.00% postgres [kernel.kallsyms] [k] asm_sysvec_apic_timer_interrupt
0.16% 0.00% postgres [kernel.kallsyms] [k] sysvec_apic_timer_interrupt
0.16% 0.11% postgres [kernel.kallsyms] [k] __softirqentry_text_start
0.16% 0.00% postgres [kernel.kallsyms] [k] irq_exit_rcu

Adding CHECK_FOR_INTERRUPTS() ensures that such queries respond to
statement_timeout & Ctrl-C signals. With the patch applied, the
above query will interrupt more quickly:

postgres=# select 1 where repeat('1.1',80000) %>> 'Lorem ipsum dolor sit amet';
ERROR: canceling statement due to statement timeout
Time: 1000.768 ms (00:01.001)

Please find the patch attached. The patch does not show any performance
regressions when run against the above use-case. Thanks to SQLSmith
for indirectly leading me to this scenario.

-
Robins Tharakan
Amazon Web Services
Patch applied to commit - 80d690721973f6a031143a24a34b78a0225101a2

SQL repro script
================

CREATE EXTENSION IF NOT EXISTS pg_trgm;
set statement_timeout = '1s';
show statement_timeout;
\timing on

select 1 where repeat('1.1',80000) %>> 'Lorem ipsum dolor sit amet';


-- Check whether this change brought in any performance regressions
set statement_timeout='0';
show statement_timeout;

select COUNT(*) from generate_series(1,1) q(e) where repeat('1.1',10000) %>> 
('Lorem ipsum dolor sit amet'||e::text);
select COUNT(*) from generate_series(1,10) q(e) where repeat('1.1',10000) %>> 
('Lorem ipsum dolor sit amet'||e::text);
select COUNT(*) from generate_series(1,100) q(e) where repeat('1.1',10000) %>> 
('Lorem ipsum dolor sit amet'||e::text);



SQL script output
=================
CREATE EXTENSION
SET
 statement_timeout 
-------------------
 1s
(1 row)
Timing is on.

psql:/home/ubuntu/proj/sqlsmithdata/repro1.sql:11: ERROR:  canceling statement 
due to statement timeout
Time: 1000.792 ms (00:01.001)



SET
Time: 0.093 ms
 statement_timeout 
-------------------
 0
(1 row)

Time: 0.077 ms
 count 
-------
     0
(1 row)

Time: 473.487 ms
 count 
-------
     0
(1 row)

Time: 4726.628 ms (00:04.727)
 count 
-------
     0
(1 row)

Time: 47231.271 ms (00:47.231)
commit - 80d690721973f6a031143a24a34b78a0225101a2

SQL repro script
================

CREATE EXTENSION IF NOT EXISTS pg_trgm;
set statement_timeout = '1s';
show statement_timeout;
\timing on

select 1 where repeat('1.1',80000) %>> 'Lorem ipsum dolor sit amet';SELECT 1;


-- Check whether this change brought in any performance regressions
set statement_timeout='0';
show statement_timeout;

select COUNT(*) from generate_series(1,1) q(e) where repeat('1.1',10000) %>> 
('Lorem ipsum dolor sit amet'||e::text);
select COUNT(*) from generate_series(1,10) q(e) where repeat('1.1',10000) %>> 
('Lorem ipsum dolor sit amet'||e::text);
select COUNT(*) from generate_series(1,100) q(e) where repeat('1.1',10000) %>> 
('Lorem ipsum dolor sit amet'||e::text);



SQL script output
=================
CREATE EXTENSION
SET
 statement_timeout 
-------------------
 1s
(1 row)

Timing is on.

 ?column? 
----------
(0 rows)

Time: 29620.933 ms (00:29.621)
psql:/home/ubuntu/proj/sqlsmithdata/repro1.sql:11: ERROR:  canceling statement 
due to statement timeout
Time: 0.073 ms


SET
Time: 0.159 ms
 statement_timeout 
-------------------
 0
(1 row)

Time: 0.100 ms
 count 
-------
     0
(1 row)

Time: 473.449 ms
 count 
-------
     0
(1 row)

Time: 4725.483 ms (00:04.725)
 count 
-------
     0
(1 row)

Time: 47222.223 ms (00:47.222)

Attachment: v1_cfi_iterate_word_similarity.patch
Description: Binary data

Reply via email to