I've attached a file containing the \d+ for all the tables involved and
the EXPLAIN ANALYZE for the query.
Yes, the crash happened under v11.6. I had tried downgrading when I
first encountered the problem.
While trying to put together this information the crash started
happening less frequently (I was only able to reproduce it it twice and
not in a row) and I am unable to confirm if SET
max_parallel_workers_per_gather=0 had any effect.
Also since I've been able to reproduce I'm currently unable to provide a
corefile or backtrace. I'll continue to try and reproduce the error so
I can get one or the other.
I did find a work around for the crash by making the view
(v_capacity_score) a materialized view.
Thanks
tim
On 3/28/20 6:30 PM, Justin Pryzby wrote:
I happened across this bugreport, which seems to have just enough information
to be interesting.
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=953204
|Version: 11.7-0+deb10u1
|2020-03-05 16:55:55.511 UTC [515] LOG: background worker "parallel worker"
(PID 884) was terminated by signal 11: Segmentation fault
|2020-03-05 16:55:55.511 UTC [515] DETAIL: Failed process was running:
|SELECT distinct student_prob.student_id, student_prob.score, student_name,
v_capacity_score.capacity
|FROM data JOIN model on model.id = 2 AND data_stage(data) =
model.target_begin_field_id
|JOIN student_prob ON data.crm_id = student_prob.student_id AND model.id =
student_prob.model_id AND (student_prob.additional_aid < 1)
|LEFT JOIN v_capacity_score ON data.crm_id = v_capacity_score.student_id AND
student_prob.model_id = v_capacity_score.model_id
|WHERE data.term_code = '202090' AND student_prob.score > 0
|ORDER BY student_prob.score DESC, student_name
|LIMIT 100 OFFSET 100 ;
Tim: it'd be nice to get more information, if and when possible:
- "explain" plan for that query;
- \d for the tables involved: constraints, inheritence, defaults;
- corefile or backtrace; it looks like there's two different crashes (maybe
same problem) so both would be useful;
- Can you reprodue the crash if you "SET max_parallel_workers_per_gather=0" ?
- Do you know if it crashed under v11.6 ?
If anyone wants to hack on the .deb:
https://packages.debian.org/buster/amd64/postgresql-11/download and (I couldn't
find the dbg package anywhere else)
https://snapshot.debian.org/package/postgresql-11/11.7-0%2Bdeb10u1/#postgresql-11-dbgsym_11.7-0:2b:deb10u1
$ mkdir pg11
$ cd pg11
$ wget -q
http://security.debian.org/debian-security/pool/updates/main/p/postgresql-11/postgresql-11_11.7-0+deb10u1_amd64.deb
$ ar x ./postgresql-11_11.7-0+deb10u1_amd64.deb
$ tar xf ./data.tar.xz
$ ar x postgresql-11-dbgsym_11.7-0+deb10u1_amd64.deb
$ tar tf data.tar.xz
$ gdb usr/lib/postgresql/11/bin/postgres
(gdb) set debug-file-directory usr/lib/debug/
(gdb) file usr/lib/postgresql/11/bin/postmaster
(gdb) info target
If I repeat the process Bernhard used (thanks for that) on the first crash in
libc6, I get:
(gdb) find /b 0x0000000000022320, 0x000000000016839b, 0xf9, 0x20, 0x77, 0x1f,
0xc5, 0xfd, 0x74, 0x0f, 0xc5, 0xfd, 0xd7, 0xc1, 0x85, 0xc0, 0x0f, 0x85, 0xdf,
0x00, 0x00, 0x00, 0x48, 0x83, 0xc7, 0x20, 0x83, 0xe1, 0x1f, 0x48, 0x83, 0xe7,
0xe0, 0xeb, 0x36, 0x66, 0x90, 0x83, 0xe1, 0x1f, 0x48, 0x83, 0xe7, 0xe0, 0xc5,
0xfd, 0x74, 0x0f, 0xc5, 0xfd, 0xd7, 0xc1, 0xd3, 0xf8, 0x85, 0xc0, 0x74, 0x1b,
0xf3, 0x0f, 0xbc, 0xc0, 0x48, 0x01, 0xf8, 0x48
0x15c17d <__strlen_avx2+13>
warning: Unable to access 1631 bytes of target memory at 0x167d3d, halting
search.
1 pattern found.
I'm tentatively guessing that heap_modify_tuple() is involved, since it calls
getmissingattr and (probably) fill_val. It looks like maybe some data
structure is corrupted which crashed two parallel workers, one in
fill_val()/strlen() and one in heap_deform_tuple()/getmissingattr(). Maybe
something not initialized in parallel worker, or a use-after-free? I'll stop
guessing.
Justin
Table
"public.ra_data"
Column | Type |
Collation | Nullable | Default | Storage | Stats target | Description
------------------------------------------------+------------------------+-----------+----------+---------+----------+--------------+-------------
tgt_accepted | boolean |
| not null | false | plain | |
tgt_active_paid | boolean |
| not null | false | plain | |
tgt_application_completed | boolean |
| not null | false | plain | |
tgt_application_started | boolean |
| not null | false | plain | |
tgt_cancelled_paid | boolean |
| not null | false | plain | |
tgt_engaged | boolean |
| not null | false | plain | |
tgt_paid | boolean |
| not null | false | plain | |
tgt_prospect | boolean |
| not null | false | plain | |
academic_year | integer |
| | | plain | |
acceptance_date | date |
| | | plain | |
act_six | boolean |
| not null | false | plain | |
act_six_stage | character varying(10) |
| | | extended | |
address_poltical_party | character varying(32) |
| | | extended | |
admissions_record_id | character varying(128) |
| | | extended | |
ap_dual_credit | boolean |
| not null | false | plain | |
app_source | character varying(128) |
| | | extended | |
application_completed_date | date |
| | | plain | |
application_date | date |
| | | plain | |
application_started_date | date |
| | | plain | |
applied_competition | boolean |
| | | plain | |
applied_competition_tier_1 | boolean |
| | | plain | |
applied_competition_tier_2 | boolean |
| | | plain | |
applied_competition_tier_3 | boolean |
| | | plain | |
athlete | boolean |
| not null | false | plain | |
athletic_recruit_type | character varying(8) |
| | | extended | |
campus_visit_date | date |
| | | plain | |
campus_visit_date_exists | boolean |
| not null | false | plain | |
christian_connector_program | boolean |
| | | plain | |
cipc_code_two | character varying(2) |
| | | extended | |
college_code | character varying(8) |
| | | extended | |
college_enrollment_status | character varying(32) |
| | | extended | |
conference_source | boolean |
| | | plain | |
congress_district_political_history | character varying(32) |
| | | extended | |
cram_student | boolean |
| | | plain | |
crm_id | character varying(32) |
| | | extended | |
days_to_app_completion | integer |
| | | plain | |
days_to_payment | integer |
| | | plain | |
decision_code | character varying(2) |
| | | extended | |
decision_date | date |
| | | plain | |
department_code | character varying(8) |
| | | extended | |
employee_student | boolean |
| not null | false | plain | |
erp_id | character varying(128) |
| | | extended | |
est_household_income | character varying(1) |
| | | extended | |
est_household_income_median | integer |
| | | plain | |
est_household_income_rank | integer |
| | | plain | |
expected_received_items_difference | integer |
| | | plain | |
facebook_group_membership | boolean |
| not null | false | plain | |
filed_fafsa | boolean |
| not null | false | plain | |
foreign_student | boolean |
| not null | false | plain | |
gender | character varying(32) |
| | | extended | |
gpa_points | numeric(12,2) |
| | | main | |
high_school_congress_district_id | integer |
| | | plain | |
high_school_county_code | integer |
| | | plain | |
high_school_csa | integer |
| | | plain | |
high_school_enrollment | integer |
| | | plain | |
high_school_name | character varying(128) |
| | | extended | |
high_school_ncessch_id | character varying(12) |
| | | extended | |
high_school_ncessch_num | integer |
| | | plain | |
high_school_urban_centric_locale | integer |
| | | plain | |
highest_act | integer |
| | | plain | |
highest_sat | integer |
| | | plain | |
hispanic_latino | boolean |
| | | plain | |
how_critical_is_fin_aid | integer |
| | | plain | |
hs_grad_date | date |
| | | plain | |
interview_date | date |
| | | plain | |
interview_date_exists | boolean |
| not null | false | plain | |
is_minority | boolean |
| | | plain | |
last_activity_date | date |
| | | plain | |
lead_days_application_started | integer |
| | | plain | |
lead_days_source | integer |
| | | plain | |
leadsource_category | character varying(32) |
| | | extended | |
leadsource_code | character varying(3) |
| | | extended | |
leadsource_date | date |
| | | plain | |
legacy | boolean |
| not null | false | plain | |
lilly_scholarship | boolean |
| not null | false | plain | |
mailingcity | character varying(128) |
| | | extended | |
mailingstate | character varying(32) |
| | | extended | |
major_code | character varying(8) |
| | | extended | |
market_status | character varying(32) |
| | | extended | |
marketing_automation_engagement_score | integer |
| | | plain | |
marketing_automation_engagement_score_relative | real |
| | | plain | |
meeting_date | date |
| | | plain | |
meeting_date_exists | boolean |
| not null | false | plain | |
missionary_kid | boolean |
| not null | false | plain | |
months_between_app_and_hs_grad | integer |
| | | plain | |
months_between_first_visit_and_interview | integer |
| | | plain | |
months_between_interview_and_enroll | integer |
| | | plain | |
neighbor_state | boolean |
| not null | false | plain | |
number_admissions_records | integer |
| | | plain | |
number_admissions_records_won | integer |
| | | plain | |
other_college_applied_to_1 | character varying(128) |
| | | extended | |
other_college_applied_to_2 | character varying(128) |
| | | extended | |
other_college_applied_to_3 | character varying(128) |
| | | extended | |
owner | character varying(128) |
| | | extended | |
packaged | boolean |
| not null | false | plain | |
parent_works_at_other_xian_college__c | boolean |
| not null | false | plain | |
pastor_kid | boolean |
| not null | false | plain | |
percent_discount | numeric(12,2) |
| | | main | |
pre_major | boolean |
| not null | false | plain | |
primary_top_50 | boolean |
| not null | false | plain | |
purchase_source | character varying(128) |
| | | extended | |
race | character varying(128) |
| | | extended | |
red_flag | boolean |
| not null | false | plain | |
reg_eng_101_placement | boolean |
| | | plain | |
reg_reading_math_proficiency | character varying(16) |
| | | extended | |
school_type | character varying(128) |
| | | extended | |
student_initiative_indicator | boolean |
| | | plain | |
student_name | character varying(255) |
| | | extended | |
student_services_fml | character varying(4) |
| | | extended | |
student_type | character varying(128) |
| | | extended | |
student_type_code | character varying(1) |
| | | extended | |
term | character varying(32) |
| | | extended | |
term_code | character varying(32) |
| | | extended | |
test_score_submitted | boolean |
| not null | false | plain | |
third_cultural_student | boolean |
| not null | false | plain | |
time_to_decision | integer |
| | | plain | |
total_athletic_scholarship | integer |
| | | plain | |
total_award | integer |
| | | plain | |
total_inst_grant_award | integer |
| | | plain | |
total_visits | integer |
| | | plain | |
visit_included_campus_tour | boolean |
| | | plain | |
visit_included_chapel | boolean |
| | | plain | |
visit_included_class_visit | boolean |
| | | plain | |
visit_included_meeting | boolean |
| | | plain | |
visit_included_overnight | boolean |
| | | plain | |
zip | character varying(5) |
| | | extended | |
zip_income_class | integer |
| | | plain | |
zip_median_income | integer |
| | | plain | |
zip_miles_from_campus | integer |
| | | plain | |
google_maps_latitude | real |
| | | plain | |
google_maps_longitude | real |
| | | plain | |
google_maps_formatted_address | character varying(255) |
| | | extended | |
google_civic_party | character varying(255) |
| | | extended | |
google_civic_republican_score | integer |
| | | plain | |
google_civic_democrat_score | integer |
| | | plain | |
google_civic_other_score | integer |
| | | plain | |
test_z_score | real |
| not null | 0.00 | plain | |
mailingaddress | character varying(128) |
| | | extended | |
mailingcountry | character varying(128) |
| | | extended | |
Table
"public.ra_model"
Column | Type | Collation | Nullable |
Default | Storage | Stats target | Description
-----------------------+------------------------+-----------+----------+-----------------------------------+----------+--------------+-------------
id | integer | | not null |
nextval('model_id_seq'::regclass) | plain | |
target_begin_field_id | integer | | |
| plain | |
target_end_field_id | integer | | not null |
| plain | |
algorithm | character varying(50) | | not null |
| extended | |
index | integer | | |
| plain | |
model_start_label | character varying(128) | | |
| extended | |
model_end_label | character varying(128) | | |
| extended | |
active | boolean | | not null | true
| plain | |
Indexes:
"model_pkey" PRIMARY KEY, btree (id) CLUSTER
Referenced by:
TABLE "ra_model_field" CONSTRAINT "model_id_fkey" FOREIGN KEY (model_id)
REFERENCES ra_model(id) ON DELETE CASCADE
Table "public.ra_student_prob"
Column | Type | Collation | Nullable |
Default | Storage | Stats target | Description
----------------+-----------------------+-----------+----------+-------------------------+----------+--------------+-------------
student_id | character varying(32) | | not null |
| extended | |
model_id | integer | | not null |
| plain | |
score | numeric(7,4) | | not null | 0.00
| main | |
additional_aid | integer | | not null | 0
| plain | |
date_complete | date | | |
| plain | |
term_code | character varying(8) | | not null |
NULL::character varying | extended | |
Indexes:
"student_prob_pkey" PRIMARY KEY, btree (student_id, model_id,
additional_aid, term_code) CLUSTER
View "public.v_capacity_score"
Column | Type | Collation | Nullable |
Default | Storage | Description
-----------------------------+-----------------------+-----------+----------+---------+----------+-------------
est_household_income_median | integer | | |
| plain |
score | numeric(7,4) | | |
| main |
student_id | character varying(32) | | |
| extended |
model_id | integer | | |
| plain |
ii | numeric | | |
| main |
capacity | numeric | | |
| main |
View definition:
SELECT ra_data.est_household_income_median,
ra_student_prob.score,
ra_data.crm_id AS student_id,
ra_student_prob.model_id,
income_index(ra_data.*) AS ii,
ceil(income_index(ra_data.*) * ra_student_prob.score / 400::numeric *
10::numeric) AS capacity
FROM ra_data
JOIN ra_student_prob ON ra_data.crm_id::text =
ra_student_prob.student_id::text AND ra_student_prob.score > 0::numeric AND
ra_student_prob.additional_aid < 1
WHERE ra_student_prob.score > 0::numeric AND
ra_data.est_household_income_median > 0 AND ra_student_prob.score < 100::numeric
ORDER BY ra_student_prob.score DESC, ra_data.est_household_income_median DESC;
EXPLAIN ANALYZE
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=218664.91..218665.61 rows=20 width=136) (actual
time=8398.604..8399.052 rows=100 loops=1)
-> Unique (cost=218661.41..218665.61 rows=120 width=136) (actual
time=8398.201..8398.869 rows=200 loops=1)
-> Sort (cost=218661.41..218661.71 rows=120 width=136) (actual
time=8398.198..8398.539 rows=200 loops=1)
Sort Key: ra_student_prob.score DESC, ra_data.student_name,
ra_student_prob.student_id, ra_data.academic_year,
ra_data.admissions_record_id, ra_data.owner, ra_data.last_activity_date,
ra_data.department_code, ra_data.college_code, ra_data.major_code,
ra_data.mailingcity, ra_data.mailingstate, v_capacity_score.capacity
Sort Method: external merge Disk: 7696kB
-> Hash Left Join (cost=124847.28..218657.27 rows=120
width=136) (actual time=6193.330..8032.022 rows=61577 loops=1)
Hash Cond: ((ra_student_prob.model_id =
v_capacity_score.model_id) AND ((ra_data.crm_id)::text =
(v_capacity_score.student_id)::text))
-> Gather (cost=1001.53..94237.33 rows=120 width=108)
(actual time=6.686..1600.142 rows=61577 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Nested Loop (cost=1.54..93225.33 rows=50
width=108) (actual time=6.269..1858.789 rows=20526 loops=3)
-> Hash Join (cost=1.11..93128.48 rows=108
width=92) (actual time=6.235..1079.633 rows=20526 loops=3)
Hash Cond: (data_stage(ra_data.*) =
ra_model.target_begin_field_id)
-> Parallel Seq Scan on ra_data
(cost=0.00..92695.18 rows=27692 width=1738) (actual time=5.497..762.502
rows=21721 loops=3)
Filter: ((term_code)::text =
'202090'::text)
Rows Removed by Filter: 63342
-> Hash (cost=1.10..1.10 rows=1
width=8) (actual time=0.017..0.018 rows=1 loops=3)
Buckets: 1024 Batches: 1 Memory
Usage: 9kB
-> Seq Scan on ra_model
(cost=0.00..1.10 rows=1 width=8) (actual time=0.009..0.012 rows=1 loops=3)
Filter: (id = 2)
Rows Removed by Filter: 7
-> Index Scan using student_prob_pkey on
ra_student_prob (cost=0.42..0.89 rows=1 width=20) (actual time=0.028..0.030
rows=1 loops=61577)
Index Cond: (((student_id)::text =
(ra_data.crm_id)::text) AND (model_id = 2) AND (additional_aid < 1))
Filter: (score > '0'::numeric)
-> Hash (cost=122306.60..122306.60 rows=64676 width=47)
(actual time=6186.526..6186.527 rows=95218 loops=1)
Buckets: 65536 Batches: 2 Memory Usage: 2934kB
-> Subquery Scan on v_capacity_score
(cost=114113.78..122306.60 rows=64676 width=47) (actual time=5747.870..6105.377
rows=95218 loops=1)
-> Gather Merge (cost=114113.78..121659.84
rows=64676 width=88) (actual time=5747.867..5986.930 rows=95218 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Sort (cost=113113.76..113194.60
rows=32338 width=88) (actual time=5730.412..5754.249 rows=31739 loops=3)
Sort Key: ra_student_prob_1.score
DESC, ra_data_1.est_household_income_median DESC
Sort Method: quicksort Memory:
3280kB
Worker 0: Sort Method: quicksort
Memory: 3232kB
Worker 1: Sort Method: quicksort
Memory: 3232kB
-> Parallel Hash Join
(cost=9102.29..110691.49 rows=32338 width=88) (actual time=434.927..5621.711
rows=31739 loops=3)
Hash Cond:
((ra_data_1.crm_id)::text = (ra_student_prob_1.student_id)::text)
-> Parallel Seq Scan on
ra_data ra_data_1 (cost=0.00..92695.18 rows=70280 width=1665) (actual
time=0.993..4852.820 rows=55406 loops=3)
Filter:
(est_household_income_median > 0)
Rows Removed by
Filter: 29657
-> Parallel Hash
(cost=8486.17..8486.17 rows=49290 width=20) (actual time=431.536..431.539
rows=56394 loops=3)
Buckets: 262144
(originally 131072) Batches: 1 (originally 1) Memory Usage: 12416kB
-> Parallel Seq Scan
on ra_student_prob ra_student_prob_1 (cost=0.00..8486.17 rows=49290 width=20)
(actual time=0.010..251.982 rows=56394 loops=3)
Filter: ((score
> '0'::numeric) AND (additional_aid < 1) AND (score > '0'::numeric) AND (score
< '100'::numeric) AND (model_id = 2))
Rows Removed by
Filter: 105746
Planning Time: 0.775 ms
Execution Time: 8402.217 ms
(47 rows)