Hi, For long strings, iterate_word_similarity() can run into long-running tight-loops without honouring interrupts or statement_timeouts. For example:
postgres=# set statement_timeout='1s'; SET postgres=# select 1 where repeat('1.1',80000) %>> 'Lorem ipsum dolor sit amet'; ?column? ---------- (0 rows) Time: 29615.842 ms (00:29.616) The associated perf report: + 99.98% 0.00% postgres postgres [.] ExecQual + 99.98% 0.00% postgres postgres [.] ExecEvalExprSwitchContext + 99.98% 0.00% postgres pg_trgm.so [.] strict_word_similarity_commutator_op + 99.98% 0.00% postgres pg_trgm.so [.] calc_word_similarity + 99.68% 99.47% postgres pg_trgm.so [.] iterate_word_similarity 0.21% 0.03% postgres postgres [.] pg_qsort 0.16% 0.00% postgres [kernel.kallsyms] [k] asm_sysvec_apic_timer_interrupt 0.16% 0.00% postgres [kernel.kallsyms] [k] sysvec_apic_timer_interrupt 0.16% 0.11% postgres [kernel.kallsyms] [k] __softirqentry_text_start 0.16% 0.00% postgres [kernel.kallsyms] [k] irq_exit_rcu Adding CHECK_FOR_INTERRUPTS() ensures that such queries respond to statement_timeout & Ctrl-C signals. With the patch applied, the above query will interrupt more quickly: postgres=# select 1 where repeat('1.1',80000) %>> 'Lorem ipsum dolor sit amet'; ERROR: canceling statement due to statement timeout Time: 1000.768 ms (00:01.001) Please find the patch attached. The patch does not show any performance regressions when run against the above use-case. Thanks to SQLSmith for indirectly leading me to this scenario. - Robins Tharakan Amazon Web Services
Patch applied to commit - 80d690721973f6a031143a24a34b78a0225101a2 SQL repro script ================ CREATE EXTENSION IF NOT EXISTS pg_trgm; set statement_timeout = '1s'; show statement_timeout; \timing on select 1 where repeat('1.1',80000) %>> 'Lorem ipsum dolor sit amet'; -- Check whether this change brought in any performance regressions set statement_timeout='0'; show statement_timeout; select COUNT(*) from generate_series(1,1) q(e) where repeat('1.1',10000) %>> ('Lorem ipsum dolor sit amet'||e::text); select COUNT(*) from generate_series(1,10) q(e) where repeat('1.1',10000) %>> ('Lorem ipsum dolor sit amet'||e::text); select COUNT(*) from generate_series(1,100) q(e) where repeat('1.1',10000) %>> ('Lorem ipsum dolor sit amet'||e::text); SQL script output ================= CREATE EXTENSION SET statement_timeout ------------------- 1s (1 row) Timing is on. psql:/home/ubuntu/proj/sqlsmithdata/repro1.sql:11: ERROR: canceling statement due to statement timeout Time: 1000.792 ms (00:01.001) SET Time: 0.093 ms statement_timeout ------------------- 0 (1 row) Time: 0.077 ms count ------- 0 (1 row) Time: 473.487 ms count ------- 0 (1 row) Time: 4726.628 ms (00:04.727) count ------- 0 (1 row) Time: 47231.271 ms (00:47.231)
commit - 80d690721973f6a031143a24a34b78a0225101a2 SQL repro script ================ CREATE EXTENSION IF NOT EXISTS pg_trgm; set statement_timeout = '1s'; show statement_timeout; \timing on select 1 where repeat('1.1',80000) %>> 'Lorem ipsum dolor sit amet';SELECT 1; -- Check whether this change brought in any performance regressions set statement_timeout='0'; show statement_timeout; select COUNT(*) from generate_series(1,1) q(e) where repeat('1.1',10000) %>> ('Lorem ipsum dolor sit amet'||e::text); select COUNT(*) from generate_series(1,10) q(e) where repeat('1.1',10000) %>> ('Lorem ipsum dolor sit amet'||e::text); select COUNT(*) from generate_series(1,100) q(e) where repeat('1.1',10000) %>> ('Lorem ipsum dolor sit amet'||e::text); SQL script output ================= CREATE EXTENSION SET statement_timeout ------------------- 1s (1 row) Timing is on. ?column? ---------- (0 rows) Time: 29620.933 ms (00:29.621) psql:/home/ubuntu/proj/sqlsmithdata/repro1.sql:11: ERROR: canceling statement due to statement timeout Time: 0.073 ms SET Time: 0.159 ms statement_timeout ------------------- 0 (1 row) Time: 0.100 ms count ------- 0 (1 row) Time: 473.449 ms count ------- 0 (1 row) Time: 4725.483 ms (00:04.725) count ------- 0 (1 row) Time: 47222.223 ms (00:47.222)
v1_cfi_iterate_word_similarity.patch
Description: Binary data