Hello, > Is it true that the SELECTs have no joins in them ?
Yes there is a lot of LEFT JOIN. > When it doesn't work, you could check SELECT * FROM pg_stat_activity, and >SELECT pg_blocking_pids(pid), * FROM pg_locks, to see what's going on. I can't see any blocking queries blocking pg_locks, pg_blocking_pids. > It'd be very useful to get "explain analyze" for a working query and for a > stuck query. It sound like the stuck query never finishes, so maybe the second > part is impossible (?) We run an explain analysis and we see some very interesting stuff going on. It seems without explicitly adding a `ANALYZE`, the query has a cost of over billions, so the query is not stuck but took forever. When I run the same scripts with an ANALYZE right before running the query, the query is exec is 50secondes and the cost is normal Explain analyze WITHOUT ANALYZE https://explain.depesz.com/s/RaSr Explain analyze same query WITH ANALYZE BEFORE https://explain.depesz.com/s/tYVl The configuration is tuned by aws aurora, but this issue happens also with a default config. allow_system_table_mods,off application_name,DataGrip 2021.1.3 archive_command,(disabled) archive_mode,off archive_timeout,5min array_nulls,on authentication_timeout,1min autovacuum,on autovacuum_analyze_scale_factor,0.05 autovacuum_analyze_threshold,50 autovacuum_freeze_max_age,200000000 autovacuum_max_workers,12 autovacuum_multixact_freeze_max_age,400000000 autovacuum_naptime,5s autovacuum_vacuum_cost_delay,1ms autovacuum_vacuum_cost_limit,1200 autovacuum_vacuum_scale_factor,0.1 autovacuum_vacuum_threshold,50 autovacuum_work_mem,-1 backend_flush_after,0 backslash_quote,safe_encoding bgwriter_delay,200ms bgwriter_flush_after,0 bgwriter_lru_maxpages,100 bgwriter_lru_multiplier,2 bonjour,off bytea_output,hex check_function_bodies,on checkpoint_completion_target,0.9 checkpoint_flush_after,0 checkpoint_timeout,15min checkpoint_warning,30s client_encoding,UTF8 client_min_messages,notice commit_delay,0 commit_siblings,5 constraint_exclusion,partition cpu_index_tuple_cost,0.005 cpu_operator_cost,0.0025 cpu_tuple_cost,0.01 cursor_tuple_fraction,0.1 DateStyle,"ISO, MDY" db_user_namespace,off deadlock_timeout,1s debug_pretty_print,on debug_print_parse,off debug_print_plan,off debug_print_rewritten,off default_statistics_target,500 default_text_search_config,pg_catalog.simple default_transaction_deferrable,off default_transaction_isolation,read committed default_transaction_read_only,off dynamic_library_path,$libdir effective_cache_size,4GB effective_io_concurrency,600 enable_bitmapscan,on enable_gathermerge,on enable_hashagg,on enable_hashjoin,on enable_indexonlyscan,on enable_indexscan,on enable_material,on enable_mergejoin,on enable_nestloop,on enable_parallel_append,on enable_parallel_hash,on enable_partition_pruning,on enable_partitionwise_aggregate,off enable_partitionwise_join,off enable_seqscan,on enable_sort,on enable_tidscan,on escape_string_warning,on event_source,PostgreSQL exit_on_error,off extra_float_digits,3 force_parallel_mode,off from_collapse_limit,8 fsync,off full_page_writes,off geqo,on geqo_effort,5 geqo_generations,0 geqo_pool_size,0 geqo_seed,0 geqo_selection_bias,2 geqo_threshold,12 gin_fuzzy_search_limit,0 gin_pending_list_limit,4MB hot_standby,off hot_standby_feedback,on huge_pages,try idle_in_transaction_session_timeout,25min ignore_checksum_failure,off ignore_system_indexes,off IntervalStyle,postgres jit,off jit_above_cost,100000 jit_debugging_support,off jit_dump_bitcode,off jit_expressions,on jit_inline_above_cost,500000 jit_optimize_above_cost,500000 jit_profiling_support,off jit_provider,llvmjit jit_tuple_deforming,on join_collapse_limit,8 lc_monetary,C lc_numeric,C lc_time,C listen_addresses,* lock_timeout,0 lo_compat_privileges,off maintenance_work_mem,2GB max_connections,100 max_files_per_process,1000 max_locks_per_transaction,256 max_logical_replication_workers,4 max_parallel_maintenance_workers,12 max_parallel_workers,12 max_parallel_workers_per_gather,6 max_pred_locks_per_page,2 max_pred_locks_per_relation,-2 max_pred_locks_per_transaction,64 max_prepared_transactions,0 max_replication_slots,10 max_stack_depth,6MB max_standby_archive_delay,30s max_standby_streaming_delay,14s max_sync_workers_per_subscription,2 max_wal_senders,0 max_wal_size,8GB max_worker_processes,12 min_parallel_index_scan_size,512kB min_parallel_table_scan_size,8MB min_wal_size,2GB old_snapshot_threshold,-1 operator_precedence_warning,off parallel_leader_participation,off parallel_setup_cost,1000 parallel_tuple_cost,0.1 password_encryption,md5 port,5432 post_auth_delay,0 pre_auth_delay,0 quote_all_identifiers,off random_page_cost,1.1 restart_after_crash,on row_security,on search_path,public seq_page_cost,1 session_replication_role,origin shared_buffers,1GB standard_conforming_strings,on statement_timeout,0 superuser_reserved_connections,3 synchronize_seqscans,on synchronous_commit,on syslog_facility,local0 syslog_ident,postgres syslog_sequence_numbers,on syslog_split_messages,on tcp_keepalives_count,9 tcp_keepalives_idle,7200 tcp_keepalives_interval,75 temp_buffers,8MB temp_file_limit,-1 TimeZone,UTC trace_notify,off trace_recovery_messages,log trace_sort,off track_activities,on track_activity_query_size,4kB track_commit_timestamp,off track_counts,on track_functions,none track_io_timing,off transform_null_equals,off update_process_title,on vacuum_cleanup_index_scale_factor,0.1 vacuum_cost_delay,0 vacuum_cost_limit,200 vacuum_cost_page_dirty,20 vacuum_cost_page_hit,1 vacuum_cost_page_miss,0 vacuum_defer_cleanup_age,0 vacuum_freeze_min_age,50000000 vacuum_freeze_table_age,150000000 vacuum_multixact_freeze_min_age,5000000 vacuum_multixact_freeze_table_age,150000000 wal_buffers,16MB wal_compression,off wal_level,minimal wal_log_hints,off wal_receiver_status_interval,10s wal_receiver_timeout,30s wal_retrieve_retry_interval,5s wal_sender_timeout,1min wal_sync_method,fdatasync wal_writer_delay,200ms wal_writer_flush_after,1MB work_mem,2GB xmlbinary,base64 xmloption,content zero_damaged_pages,off On Thu, Jul 8, 2021 at 2:33 PM Justin Pryzby <pry...@telsasoft.com> wrote: > On Thu, Jul 08, 2021 at 01:00:28AM +0200, Allan Barrielle wrote: > > All is fine, and can work great. > > But sometimes, some queries that used to take about 20 secs to complete > can > > suddenly end in 5mins. > > Important all queries have the same shape -> CREATE TABLE SELECT AS *(a > bit > > of transform) FROM TABLE). No update, nothing, it’s dead simple. > > > > Nothing works, it’s very random, some query won’t simply work ( even > after > > hours ). > > When it doesn't work, you could check SELECT * FROM pg_stat_activity, and > SELECT pg_blocking_pids(pid), * FROM pg_locks, to see what's going on. > > > Important all queries have the same shape -> CREATE TABLE SELECT AS *(a > bit > > of transform) FROM TABLE). No update, nothing, it’s dead simple. > > We are just trying to copy a table from schema1, to schema2, to schema3 > and > > finally schema3. That’s it. > > Is it true that the SELECTs have no joins in them ? > > Did this ever work better or differently under different versions of > postgres ? > > > Why does the query never end even after hours ? Why there is no log about > > where the query is stuck. > > Please send your nondefault config. > https://wiki.postgresql.org/wiki/Server_Configuration > > Also enable logging (I just added this to the wiki). > https://wiki.postgresql.org/wiki/Slow_Query_Questions#Enable_Logging > > It'd be very useful to get "explain analyze" for a working query and for a > stuck query. It sound like the stuck query never finishes, so maybe the > second > part is impossible (?) > > But it'd be good to get at least "explain" output. You'd have to edit > your sql > script to run an "explain" before each query, and run it, logging the > ouput, > until you capture the plan for a stuck query. Save the output and send > here, > along with the query plan for a working query. > > -- > Justin >