Re: Advice needed: query performance deteriorates by 2000% within 1 minute

2022-03-01 Thread Michael Lewis
And I would absolutely crank up autovacuum and analyze settings. Turn up the cost limits, turn down the cost delays, decrease the scale factor. Whatever you need to do such that autovacuum runs often. No need to schedule a manual vacuum at all. Just don't wait until 20% of the table is dead before

Re: Advice needed: query performance deteriorates by 2000% within 1 minute

2022-03-01 Thread Michael Lewis
If you expect to have high cache hits and/or have ssd or similar fast storage, random page cost should be more like 1-2 rather than the default 4. When using jsonb, you'd normally have estimates based solely on the constants for the associated datatype (1/3 or 2/3 for a nullable boolean for instanc

Re: Advice needed: query performance deteriorates by 2000% within 1 minute

2022-02-24 Thread Michael Lewis
You are getting row estimate 48 in both cases, so it seems perhaps tied to the free space map that will mean more heap lookups from the index, to the point where the planner thinks that doing sequential scan is less costly. What is random_page_cost set to? Do you have default autovacuum/analyze se

Re: Slow query fixed by replacing equality with a nested query

2022-01-24 Thread Michael Lewis
On Mon, Jan 24, 2022 at 6:22 AM Valentin Janeiko wrote: > I have rewritten the query using JOINs. I had to make one of them a > FULL JOIN, but otherwise JOINs seem like a good idea. > I have added the new query to the (same) gist: > > https://gist.github.com/valeneiko/89f8cbe26db7ca2651b47524462b

Re: Slow query fixed by replacing equality with a nested query

2022-01-24 Thread Michael Lewis
On Fri, Jan 21, 2022 at 4:37 AM wrote: > I have done a few simple experiments in the past comparing CTEs like this > to JOINS, but the resultant query plans were the same. CTEs seemed easier > to follow when troubleshooting issues, so I left them as such. Do JOINs > become better than CTEs at a c

Re: Slow query fixed by replacing equality with a nested query

2022-01-20 Thread Michael Lewis
I don't see any reference to cte1. Is that expected? I'm unclear why these sets are not just inner join'd on resource_surrogate_id. It seems like that column it is being selected as Sid1 in each CTE, and then the next one does the below. Why? where resource_surrogate_id IN (SELECT Sid1 FROM cte_p

Re: Out of memory error

2021-11-23 Thread Michael Lewis
It seems like that function has some syntax errors, and also doesn't do what you want since I presume the "from employee" bit would mean you get many rows inserted into that temp table for all the existing data and not the one row you are operating on at the moment the trigger fires. It is worth n

Re: performance of analytical query

2021-11-12 Thread Michael Lewis
On Thu, Nov 11, 2021 at 7:42 PM Justin Pryzby wrote: > BTW, we disable nested loops for the our analytic report queries. I have > never > been able to avoid pathological plans any other way. > Curious, do you see any problems from that? Are there certain nodes that really are best suited to a n

Re: Lock contention high

2021-10-25 Thread Michael Lewis
On Mon, Oct 25, 2021, 5:36 PM Andres Freund wrote: If your hot data set is actually larger than s_b, I'd recommend trying a larger s_b. It's plausible that a good chunk of lock contention is from that. How much larger might you go? Any write ups on lock contention as it relates to shared buffers

Re: Postgres views cannot use both union and join/where

2021-10-19 Thread Michael Lewis
On Tue, Oct 19, 2021 at 3:48 PM Mithran Kulasekaran < mithranakulaseka...@gmail.com> wrote: > create view template_view (id, name, description, is_staged) as > select t.id,t.name, t.description, false as is_staged > from template t > left join template_staging ts on t.name = ts.name and

Re: Query out of memory

2021-10-19 Thread Michael Lewis
Check explain plan, change work mem to 100MBs and then check explain plan again. If it changed, then try explain analyze. Work mem is limit is used per node in the plan, so especially with partitioned tables, that limit is way too high.

Re: Troubleshooting a long running delete statement

2021-10-06 Thread Michael Lewis
On Wed, Oct 6, 2021 at 12:00 PM Dirschel, Steve < steve.dirsc...@thomsonreuters.com> wrote: > Here is what I could see in Postgres: > >- When I did an explain on the delete I could see it was full scanning >the table. I did a full scan of the table interactively in less than 1 >second

Re: hashjoins, index loops to retrieve pk/ux constrains in pg12

2021-09-27 Thread Michael Lewis
s of data, rather than do nested loop index scans. I think that's how that works. With the lack of flexibility to change the query, you might be able to set a few configs for the user that runs these schema checks. If you can find changes that make an overall improvement. *Michael Lewis | Dat

Re: hashjoins, index loops to retrieve pk/ux constrains in pg12

2021-09-23 Thread Michael Lewis
I believe that this is a planning problem with the number of tables/joins involved in the query you have written. If you take a look at the definition of the views in information_schema that you are using and read about from_collapse_limit/join_collapse_limit, you may see that this is a bit painful

Re: Postgres chooses slow query plan from time to time

2021-09-13 Thread Michael Lewis
Autovacuum will only run for freezing, right? Insert only tables don't get autovacuumed/analyzed until PG13 if I remember right.

Re: Better performance no-throw conversion?

2021-09-08 Thread Michael Lewis
On Wed, Sep 8, 2021 at 11:33 AM Tom Lane wrote: > "l...@laurent-hasson.com" writes: > > Some databases such as SQLServer (try_cast) or BigQuery (safe.cast) > offer not-throw conversion. > > ... > > I couldn't find a reference to such capabilities in Postgres and > wondered if I missed it, and if

Re: Big performance slowdown from 11.2 to 13.3

2021-07-21 Thread Michael Lewis
I'm not seeing the valueof the CTE. Why not access assessmenticcqa_raw directly in the main query and only do GROUP BY once? Do you have many values in iccqar_ques_code which are not used in this query? >

Re: AWS forcing PG upgrade from v9.6 a disaster

2021-05-28 Thread Michael Lewis
The plan is also influenced by cost related and memory related config settings such as random_page_cost and work_mem, right? Hence the questions if configs are matching or newer versions are using very conservative (default) settings.

Re: Partition with check constraint with "like"

2021-05-20 Thread Michael Lewis
On Thu, May 20, 2021, 8:38 PM Justin Pryzby wrote: > On Fri, May 21, 2021 at 02:36:14AM +, Nagaraj Raj wrote: > > Thank you. This is a great help. > > But "a" have some records with alpha and numeric. > > So then you should make one or more partitions FROM ('1')TO('9'). > What about 0? Sorr

Re: SELECT Query taking 200 ms on PostgreSQL compared to 4 ms on Oracle after migration.

2021-04-03 Thread Michael Lewis
It seems like something is missing. Is this table partitioned? How long ago was migration done? Has vacuum freeze and analyze of tables been done? Was index created after populating data or reindexed after perhaps? What version are you using?

Re: Is there a known bug with SKIP LOCKED and "tuple to be locked was already moved to another partition due to concurrent update"?

2021-03-12 Thread Michael Lewis
https://www.postgresql-archive.org/CPU-hogged-by-concurrent-SELECT-FOR-UPDATE-SKIP-LOCKED-td6150480.html David Rowley on 20 Aug 2020- "When updates occur in a non-partitioned table we can follow item pointer chains to find the live row and check if the WHERE clause still matches to determine if th

Re: Fwd: different execution time for the same query (and same DB status)

2021-03-10 Thread Michael Lewis
I would increase shared_buffers to 1GB or more. Also, it would be very interesting to see these queries executed with JIT off.

Re: different execution time for the same query (and same DB status)

2021-03-08 Thread Michael Lewis
You don't mention shared_buffers, which is quite low by default. Not sure of the memory of your docker container, but it might be prudent to increase shared_buffers to keep as much data as possible in memory rather than needing to read from disk by a second run. To test the possibility Tom Lane sug

Re: Potential performance issues related to group by and covering index

2021-03-02 Thread Michael Lewis
> > If we want to do anything much smarter than that like trying every > combination of the GROUP BY clause, then plan times are likely going > to explode. The join order search is done based on the chosen query > pathkeys, which in many queries is the pathkeys for the GROUP BY > clause (see standa

Re: Query performance issue

2021-02-16 Thread Michael Lewis
What indexes exist on those tables? How many rows do you expect to get back in total? Is the last_contacted_anychannel_dttm clause restrictive, or does that include most of the prospect table (check pg_stats for the histogram if you don't know). and (a."shared_paddr_with_customer_ind" = 'N')

Re: Slow query and wrong row estimates for CTE

2021-02-16 Thread Michael Lewis
> >Sort Method: external > merge Disk: 30760kB >Worker 0: Sort Method: > external merge Disk: 30760kB >Worker 1: Sort Method: > external me

Re: How to deal with analyze gathering irrelevant stats

2021-01-11 Thread Michael Lewis
On Mon, Jan 11, 2021 at 9:52 AM Rémi Chatenay wrote: > I'd say it's a 1 insert for 5 - 10 updates. > > As for the index on the status, it's because we have a job that runs every > night that deals with conversations in specific statuses. Having a low > cardinality index that changes frequently se

Re: How to deal with analyze gathering irrelevant stats

2021-01-11 Thread Michael Lewis
What is the usage pattern of the conversations table? Is getting many inserts during the day, or updates of status mostly? Why have an index on the status column at all? My guess would be that there are 2-10 statuses, but many many rows in the table for most of those statuses. Having a low cardina

Re: Autovacuum not functioning for large tables but it is working for few other small tables.

2021-01-08 Thread Michael Lewis
By the way, please do not top-post (reply above, quoting the full email after) in these groups. On Fri, Jan 8, 2021 at 5:00 AM M Tarkeshwar Rao < m.tarkeshwar@ericsson.com> wrote: > Hi all, > > As we know, the VACUUM VERBOSE output has a lot of dependencies from > production end and is indef

Re: Slow recursive CTE query questions, with row estimate and n_distinct issues

2020-12-28 Thread Michael Lewis
On Mon, Dec 28, 2020 at 7:51 AM Christopher Baines wrote: > derivation_inputs: > COUNT(*): 285422539 > reltuples: 285422528 > > derivation_id: > COUNT(DISTINCT): 7508610 > n_distinct: 4336644 (~57% of the true value) > > derivation_output_id: > COUNT(DISTINCT): 5539406 >

Re: PostgeSQL JSONB Column with various type of data

2020-12-04 Thread Michael Lewis
On Fri, Dec 4, 2020 at 9:21 AM Riswana Rahman wrote: > CREATE OR REPLACE FUNCTION jsonbNull(jsonb_column JSONB) > > returns boolean as $$ > > declare > > isPoint text := jsonb_typeof(jsonb_column) ; > > begin > > CASE isPoint > > WHEN 'arra

Re: Temporarily disable not null constraints

2020-12-03 Thread Michael Lewis
On Thu, Dec 3, 2020 at 1:00 PM Nagaraj Raj wrote: > Hi, > > Can we disable not null constraints temporarily in the session-based > transaction, like we disable FK constraints? > > SET session_replication_role = ‘replica’; > alter table table_name disable trigger user;” > > above two options are w

Re: Adding nextval() to a select caused hang/very slow execution

2020-11-04 Thread Michael Lewis
On Wed, Nov 4, 2020 at 12:12 PM Eric Raskin wrote: > OK - I see. And to add insult to injury, I tried creating a temporary > table to store the intermediate results. Then I was going to just do an > insert... select... to insert the rows. That would de-couple the > nextval() from the query. >

Re: Understanding bad estimate (related to FKs?)

2020-11-02 Thread Michael Lewis
> > The query I asked about in the original post of this thread has 13 > relations in it. IIUC, that's 13! or > 6 billion possible plans. How did > the planner pick one plan out of 6 billion? I'm curious, both for practical > purposes (I want my query to run well) and also because it's fascinating.

Re: query plan using partial index expects a much larger number of rows than is possible

2020-10-29 Thread Michael Lewis
On Wed, Oct 28, 2020 at 5:30 PM Tom Lane wrote: > "Olivier Poquet" writes: > > Looking at it in more detail, I found that the planner is assuming that > I'll get millions of rows back even when I do a simple query that does an > index scan on my partial index: > > We don't look at partial-index

Re: Understanding bad estimate (related to FKs?)

2020-10-26 Thread Michael Lewis
On Mon, Oct 26, 2020 at 11:14 AM Philip Semanchuk < phi...@americanefficient.com> wrote: > >> The item I'm focused on is node 23. The estimate is for 7 rows, actual > is 896 (multiplied by 1062 loops). I'm confused about two things in this > node. > >> > >> The first is Postgres' estimate. The con

Re: Query Performance / Planner estimate off

2020-10-21 Thread Michael Lewis
On Wed, Oct 21, 2020, 8:42 AM Mats Olsen wrote: > > On 10/21/20 2:38 PM, Sebastian Dressler wrote: > > Hi Mats, > > On 20. Oct 2020, at 11:37, Mats Julian Olsen > wrote: > > [...] > > 1) Vanilla plan (16 min) : https://explain.depesz.com/s/NvDR > 2) enable_nestloop=off (4 min): https://explain.d

Re: CPU Consuming query. Sequential scan despite indexing.

2020-10-19 Thread Michael Lewis
Reply to the group, not just me please. Btw, when you do reply to the group, it is best practice on these lists to reply in-line and not just reply on top with all prior messages quoted. On Sun, Oct 18, 2020 at 3:23 AM aditya desai wrote: > I tried vacuum full and execution time came down to hal

Re: CPU Consuming query. Sequential scan despite indexing.

2020-10-15 Thread Michael Lewis
What version by the way? Do you get a faster execution if you disable sequential scan? Or set parallel workers per gather to 0? Your estimates look decent as do cache hits, so other than caching data or upgrading hardware, not sure what else there is to be done. Although... you are hitting 70k blo

Re: Slow Query

2020-10-14 Thread Michael Lewis
Based on the execution plan, it looks like the part that takes 13 seconds of the total 14.4 seconds is just calculating the max time used in the where clause. Anytime I see an OR involved in a plan gone off the rails, I always always check if re-writing the query some other way may be faster. How's

Re: Slow Query

2020-10-14 Thread Michael Lewis
Is there no index on thread.spool? What about notification.user? How about message.time (without thread as a leading column). Those would all seem very significant. Your row counts are very low to have a query perform so badly. Work_mem could probably be increased above 4MB, but it isn't hurting th

Re: Too many waits on extension of relation

2020-10-05 Thread Michael Lewis
What is relation 266775 of database 196511? Is it cms_c207c1e2_0ce7_422c_aafb_77d43f61e563.cms_item or some system catalog table? When I search google for "ExclusiveLock on extension of relation" I find one thread about shared_buffers being very high but not big enough to fit the entire data in th

Re: Single column vs composite partial index

2020-09-18 Thread Michael Lewis
Index Cond: ((load_dttm)::date >= (CURRENT_DATE - 7))" There is no need to cast the load_dttm field to a date in the query. The plain index on the field would be usable if you skipped that. In your example, you show creating the single column index but it isn't getting used because of the type cas

Re: Performance Issue (Not using Index when joining two tables).

2020-09-13 Thread Michael Lewis
Filter: ((("current_user"())::text <> ANY ('{wfnadmin,skipvpd}'::text[])) AND f_sel_policy_all(vpd_key, 'CH R_EMP_POSITION'::character varying) AND f_sel_policy_prod_locale((cep.*)::character varying, prod_locale_code)) This looks like some stuff for row level security perhaps. My understanding is

Re: Query performance issue

2020-09-05 Thread Michael Lewis
On Fri, Sep 4, 2020, 4:20 PM Nagaraj Raj wrote: > Hi Mechel, > > I added the index as you suggested and the planner going through the > bitmap index scan,heap and the new planner is, > HaOx | explain.depesz.com > > HaOx | explain.depesz.com > >

Re: Query performance issue

2020-09-04 Thread Michael Lewis
"Subquery Scan on rec (cost=1628601.89..1676580.92 rows=7381 width=41) (actual time=22171.986..23549.079 rows=1236042 loops=1)" " Filter: (rec.mpos = 1)" " Rows Removed by Filter: 228737" " Buffers: shared hit=45 read=1166951" " I/O Timings: read=29.530" " -> WindowAgg (cost=1628601.89..1658127.45

Re: Query performance issue

2020-09-04 Thread Michael Lewis
Note- you may need to vacuum* the table to get full benefit of index only scan by updating the visibility map. I think index only scan is skipped in favor of just checking visibility when the visibility map is stale. *NOT full

Re: Query performance issue

2020-09-04 Thread Michael Lewis
CREATE INDEX receiving_item_delivered_received ON receiving_item_delivered_received USING btree ( eventtype, replenishmenttype, serial_no, eventtime DESC ); > More work_mem as Tomas suggests, but also, the above index should find the candidate rows by the first two keys, and then be able to skip t

Re: Too few rows expected by Planner on partitioned tables

2020-08-26 Thread Michael Lewis
On Wed, Aug 26, 2020, 1:37 AM Julian Wolf wrote: > Hi Justin, > > thank you very much for your help and sorry for the late answer. > > After testing around with your suggestions, it actually was the daterange > type which caused all the problems. Messing around with the statistics > value improve

Re: CPU hogged by concurrent SELECT..FOR UPDATE SKIP LOCKED

2020-08-20 Thread Michael Lewis
On Thu, Aug 20, 2020 at 4:40 PM Jim Jarvie wrote: > On 20-Aug.-2020 17:42, Michael Lewis wrote: > > Can you share an explain analyze for the query that does the select for > update? I wouldn't assume that partition pruning is possible at all with > hash, and it would be inter

Re: CPU hogged by concurrent SELECT..FOR UPDATE SKIP LOCKED

2020-08-20 Thread Michael Lewis
Can you share an explain analyze for the query that does the select for update? I wouldn't assume that partition pruning is possible at all with hash, and it would be interesting to see how it is finding those rows. >

Re: CPU hogged by concurrent SELECT..FOR UPDATE SKIP LOCKED

2020-08-20 Thread Michael Lewis
Great to hear that some of the issues are now mitigated. Though, perhaps you actually require that ORDER BY if items are expected to be sitting in the queue quite some time because you have incoming queue items in a burst pattern and have to play catch up sometimes. If so, I highly suspect the inde

Re: CPU hogged by concurrent SELECT..FOR UPDATE SKIP LOCKED

2020-08-18 Thread Michael Lewis
Also, have you checked how bloated your indexes are getting? Do you run default autovacuum settings? Did you update to the new default 2ms cost delay value? With a destructive queue, it would be very important to ensure autovacuum is keeping up with the churn. Share your basic table structure and i

Re: CPU hogged by concurrent SELECT..FOR UPDATE SKIP LOCKED

2020-08-18 Thread Michael Lewis
On Tue, Aug 18, 2020 at 6:22 PM Jim Jarvie wrote: > There is some ordering on the select [ ORDER BY q_id] so each block of 250 > is sequential-ish queue items; I just need them more or less in the order > they were queued so as near FIFO as possible without being totally strict > on absolute sequ

Re: CPU hogged by concurrent SELECT..FOR UPDATE SKIP LOCKED

2020-08-18 Thread Michael Lewis
Message queue... Are rows deleted? Are they updated once or many times? Have you adjusted fillfactor on table or indexes? How many rows in the table currently or on average? Is there any ordering to which rows you update? It seems likely that one of the experts/code contributors will chime in and

Re: Hstore index for full text search

2020-08-11 Thread Michael Lewis
On Tue, Aug 11, 2020 at 4:46 PM Tom Lane wrote: > A GIN index on an hstore column only provides the ability to search for > exact matches to hstore key strings. There are a few bells and whistles, > like the ability to AND or OR such conditions. But basically it's just an > exact-match engine,

Re: Hstore index for full text search

2020-08-11 Thread Michael Lewis
Hash Cond: (o.courier_id = cc.id) Filter: (((o.tracker_code)::text ~~* '%1654323%'::text) OR ((table_cus.name)::text ~~* '%1654323%'::text) OR ((au.username)::text ~~ '%1654323%'::text) OR ((o.source)::text ~~* '%1654323%'::text) OR ((o.ops -> 'shop'::text) ~~* '%1654323%'::text) OR ((o.ops -> 'cam

Re: Sudden insert performance degradation

2020-07-13 Thread Michael Lewis
Is this an insert only table and perhaps not being picked up by autovacuum? If so, try a manual "vacuum analyze" before/after each batch run perhaps. You don't mention updates, but also have been adjusting fillfactor so I am not not sure.

Re: Unclamped row estimates whith OR-ed subplans

2020-06-19 Thread Michael Lewis
> > While you're waiting, you might think about recasting the query to > avoid the OR. Perhaps you could do a UNION of two scans of the > transactions table? > Minor note- use UNION ALL to avoid the dedupe work if you already know those will be distinct sets, or having duplicates is fine.

Re: simple query running for ever

2020-06-16 Thread Michael Lewis
On Tue, Jun 16, 2020 at 2:35 PM Nagaraj Raj wrote: > I wrote a simple query, and it is taking too long, not sure what is wrong > in it, even its not giving EXPLAIN ANALYZE. > More context is needed. Please review- https://wiki.postgresql.org/wiki/Slow_Query_Questions

Re: view reading information_schema is slow in PostgreSQL 12

2020-06-15 Thread Michael Lewis
On Fri, Jun 12, 2020 at 12:26 PM regrog wrote: > I'm facing performance issues migrating from postgres 10 to 12 (also from > 11 > to 12) even with a new DB. > Th performance difference is huge 300ms in pg10 vs 3 minutes in pg12. > > I have a view that abstracts the data in the database: > > CREAT

Re: Request to help on GIS Query improvement suggestion.

2020-05-22 Thread Michael Lewis
Your indexes and operators are not compatible. You have added a btree index on md5 function result and are not using md5 in your query, and also using LIKE operator not one of the supported ones. I believe it might use a btree operator (plain value, not md5 result) if you are always searching for "

Re: AutoVacuum and growing transaction XID's

2020-05-08 Thread Michael Lewis
autovacuum_naptime being only 5 seconds seems too frequent. A lock_timeout might be 1-5 seconds depending on your system. Usually, DDL can fail and wait a little time rather than lock the table for minutes and have all reads back up behind the DDL. Given you have autovacuum_vacuum_cost_limit set t

Re: AutoVacuum and growing transaction XID's

2020-05-07 Thread Michael Lewis
It is trying to do a vacuum freeze. Do you have autovacuum turned off? Any settings changed from default related to autovacuum? https://www.postgresql.org/docs/9.6/routine-vacuuming.html Read 24.1.5. Preventing Transaction ID Wraparound Failures These may also be of help- https://info.crunchydata

Re: good book or any other resources for Postgresql

2020-05-04 Thread Michael Lewis
I don't know the others, but have enjoyed and learned a great deal from The Art of PostgreSQL. >

Re: Duplicate WHERE condition changes performance and plan

2020-05-04 Thread Michael Lewis
Why not vacuum analyze both tables to ensure stats are up to date? Have you customized default_statistics_target from 100? It may be that 250 would give you a more complete sample of the table without increasing the size of the stats tables too much such that planning time increases hugely. Do yo

Re: 600 million rows of data. Bad hardware or need partitioning?

2020-05-03 Thread Michael Lewis
What kinds of storage (ssd or old 5400 rpm)? What else is this machine running? What configs have been customized such as work_mem or random_page_cost?

Re: The query plan get all columns but I'm using only one column.

2020-04-30 Thread Michael Lewis
> > In staging environment we have disabled autovacuum since that it is a > testing environment and the database are restored very often. > But in production environment it is enabled autovacuum=on > > The weird case is that production was slow and staging environment was > faster. > You haven't s

Re: The query plan get all columns but I'm using only one column.

2020-04-29 Thread Michael Lewis
It is generally a very bad idea to turn off autovacuum. When it is causing problems, it is likely that it needs to run more often to keep up with the work, rather than not run at all. Certainly if it is turned off, it would be critical to have a regularly scheduled process to vacuum analyze all tab

Re: The query plan get all columns but I'm using only one column.

2020-04-25 Thread Michael Lewis
The example is nonsensical so I expect it is too contrived to be useful for analyzing the actual problem. Additionally, the total query time is under 1ms and most of it is planning time. Use a prepared statement or do something else to reduce planning time like reducing statistics target if that a

Re: PostgreSQL does not choose my indexes well

2020-04-23 Thread Michael Lewis
> > "unless the table is mostly marked all-visible" Is that taken into account during planning when evaluating index scan vs sequential scan?

Re: Best way to delete big amount of records from big table

2020-03-27 Thread Michael Lewis
If you can afford the time, I am not sure the reason for the question. Just run it and be done with it, yes? A couple of thoughts- 1) That is a big big transaction if you are doing all the cleanup in a single function call. Will this be a production system that is still online for this archiving?

Re: JOIN on partitions is very slow

2020-03-23 Thread Michael Lewis
On Mon, Mar 23, 2020 at 1:40 AM daya airody wrote: > Yes. I can tweak the query. Version of postgres is 9.5.15. I have about 20 > partitions for company_sale_account table. > I do have an index on company name. > > I need to use DISTINCT as i need to remove the duplicates. > DISTINCT is a sign o

Re: JOIN on partitions is very slow

2020-03-22 Thread Michael Lewis
Are you able to tweak the query or is that generated by an ORM? What version of Postgres? Which configs have you changed from default? How many partitions do you have? Is there an index on company name? Anytime I see distinct keyword, I expect it to be a performance bottleneck and wonder about rew

Re: Many DataFileRead - IO waits

2020-02-28 Thread Michael Lewis
If no updates or deletes are happening on the table, it would be best practice to set up a scheduled manual vacuum analyze to ensure statistics and the visibility map is updated. Other than creating the index on the first two columns only, I'm out of ideas. Hopefully someone running Postgres at lar

Re: Many DataFileRead - IO waits

2020-02-27 Thread Michael Lewis
How big is ix_tabledata_intid_timestampdate_intotherid3_intotherid2 on disk? If you create another index with same fields, how much space does it take? Real question- are you vacuuming aggressively enough for your workload? Your index name seems to indicate that intotherid3 would be the third key,

Re: much slower query in production

2020-02-26 Thread Michael Lewis
> > UPDATE multicards >SET defacements = COALESCE( count, 0 ) > FROM ( SELECT multicard_uid, COUNT(*) AS count FROM tickets GROUP BY > multicard_uid ) AS sub > WHERE uid = multicard_uid OR multicard_uid is null; > I expect this should work. Not sure of performance of course.

Re: much slower query in production

2020-02-26 Thread Michael Lewis
By the way, I expect the time is cut in half while heap fetches stays similar because the index is now in OS cache on the second run and didn't need to be fetched from disk. Definitely need to check on vacuuming as Justin says. If you have a fairly active system, you would need to run this query ma

Re: How to avoid UPDATE performance degradation in a transaction

2020-02-14 Thread Michael Lewis
If your trigger is supposed to change certain fields, you could return OLD instead of NEW if those fields have not been changed by the trigger. You could also check an updated_on timestamp field to verify if the row has already been modified and potentially skip the trigger altogether. Just a coupl

Re: Query optimization advice for beginners

2020-01-27 Thread Michael Lewis
You've got two references to order_basketitemdetail both aliased to bid and ALSO a table called order_basketitembatch aliased to bib. I assume that confuses the planner, but even if it doesn't it certainly confuses any new developers trying to understand the query's intention. The biggest thing th

Re: Bad query plan decision when using multiple column index - postgresql uses only first column then filters

2020-01-16 Thread Michael Lewis
base. I ran the query for > all of them. The only one miss-behaving is "92". I ran the query with > random values for Ver (invalid values), the query plan always attempts to > use the index using both values. > I looked into "most_common_values" in pg_stats, this val

Re: Bad query plan decision when using multiple column index - postgresql uses only first column then filters

2020-01-16 Thread Michael Lewis
Does the behavior change with different values of Ver column? I'd be curious of the fraction in the MCVs frequency list in stats indicates that rows with Ver = 92 are rare and therefore the index on only Ver column is sufficient to find the rows quickly. What is reltuples for this table by the way?

Re: Seeking reason behind performance gain in 12 with HashAggregate

2020-01-13 Thread Michael Lewis
I am not at all familiar with PostGIS so perhaps this is a silly question, is bloat an issue on the older instance? Correlation isn't causation, but half the buffers scanned and half the runtime in the v12 plan has me curious why that might be. >

Re: Specific query taking time to process

2019-12-11 Thread Michael Lewis
This seems beyond me at this point, but I am curious if you also vacuumed alf_node_properties and alf_node tables and checked when they last got (auto)vacuumed/analyzed. With default configs for autovacuum parameters and tables with that many rows, they don't qualify for autovacuum very often. I do

Re: Specific query taking time to process

2019-12-10 Thread Michael Lewis
On Mon, Dec 9, 2019 at 3:39 PM Fahiz Mohamed wrote: > I ran "explain analyse" on this query and I got following result. (We have > 2 identical DB instances and they consist of same data. Instane 1 took 20+ > second to process and instance 2 took less than a second) > > Instance 1: (This is used b

Re: Specific query taking time to process

2019-12-09 Thread Michael Lewis
> > There is a specific search query I am running to get list of Documents and > their metadata from several table in the DB. > We are running Postgres 9.6.9 on Amazon RDS (db.m5.4xlarge instance) > > Our current DB consists of 500GB of data and indexes. Most of the rows in > table are consist of 4

Re: autovacuum locking question

2019-12-05 Thread Michael Lewis
On Thu, Dec 5, 2019 at 3:26 PM Mike Schanne wrote: > I am concerned that if the autovacuum is constantly canceled, then the > table never gets cleaned and its performance will continue to degrade over > time. Is it expected for the vacuum to be canceled by an insert in this > way? > > > > We are

Re: Make recently inserted/updated records available in the buffer/cache

2019-12-03 Thread Michael Lewis
On Tue, Dec 3, 2019 at 11:46 AM Sachin Divekar wrote: > I am also going to use SKIP LOCKED to _select for update_. Any suggestions > on tuning parameters for SKIP LOCKED? > I am not aware of any. Either you use it because it fits your need, or not. Note- please don't top-post (reply and include

Re: Make recently inserted/updated records available in the buffer/cache

2019-12-03 Thread Michael Lewis
"I am going to use it as a queue" You may want to look at lowering fillfactor if this queue is going to have frequent updates, and also make autovacuum/analyze much more aggressive assuming many updates and deletes.

Re: Re[4]: Postgresql planning time too high

2019-11-22 Thread Michael Lewis
As a matter of habit, I put all inner joins that may limit the result set as the first joins, then the left joins that have where conditions on them. I am not sure whether the optimizer sees that only those tables are needed to determine which rows will be in the end result and automatically priori

Re: Simple DELETE on modest-size table runs 100% CPU forever

2019-11-14 Thread Michael Lewis
> If I leave out the "analyze", here's what I get (note that the > categories_staging_N table's name changes every time; it's > created on demand as "create table categories_staging_n(id integer)"). > How/when are they created? In the same statement? After create, are you analyzing these tables? I

Re: Slow "not in array" operation

2019-11-12 Thread Michael Lewis
h=0) (actual time=1493.823..1493.823 rows=7203173 loops=1) > >Index Cond: (project_id = 123) > > Planning Time: 1.273 ms > > Execution Time: 95132.766 ms > > (15 rows) > > > On Tue, Nov 12, 2019 at 8:20 PM Michael Lewis wrote: > >>

Re: Slow "not in array" operation

2019-11-12 Thread Michael Lewis
It is very interesting to me that the optimizer chose a parallel sequential scan rather than an index scan on either of your indexes that start with project_id that also reference trashed_at. 1) Are you running on SSD type storage? Has random_page_cost been lowered to 1-1.5 or so (close to 1 assum

Re: Slow "not in array" operation

2019-11-12 Thread Michael Lewis
What's the plan for the slow one? What's the time to just count all rows? >

Re: Slow planning, fast execution for particular 3-table query

2019-11-06 Thread Michael Lewis
Is default_statistics_target set above default 100? I would assume that would reflect in the size of pg_statistic, but wanted to ask since increasing that from 100 to 1000 was the only time I have seen planning time explode. Are other queries slow to plan?

Reading explain plans- row estimates/actuals on lower nodes vs next level up

2019-10-17 Thread Michael Lewis
30 actual and 3350 vs 3320)... why does the higher node have such a different estimate vs actual ratio? *Michael Lewis | Software Engineer* *Entrata*

Re: Query slow again after adding an `OR` operation (was: Slow PostgreSQL 10.6 query)

2019-10-09 Thread Michael Lewis
Are you prefixing this auto generated query with set join_collapse_limit = 30, or are you changing the default and reloading config? That is, can you impact ONLY this query with these config changes? I wouldn't assume so, so any hack/query hint like turning off hashjoins (which seem to be chosen in

Re: Query slows when used with view

2019-10-09 Thread Michael Lewis
> > When you join to a view, the view sticks together, as if they were all in > parentheses. But when you substitute the text of a view into another > query, then they are all on the same level and can be parsed differently. > > Consider the difference between "1+1 * 3", and "(1+1) * 3" > I thou

Re: Query slows when used with view

2019-10-09 Thread Michael Lewis
> > Those are not equivalent queries. Read up on the syntax of FROM; > particularly, that JOIN binds more tightly than comma. > I see this- "A JOIN clause combines two FROM items, which for convenience we will refer to as “tables”, though in reality they can be any type of FROM item. Use parenth

Re: distinct on extract returns composite type

2019-10-07 Thread Michael Lewis
As long as we are on the performance list and not general, it might be worth noting that partitioning should be defined directly on the data and not on a function result I believe. If you always do the extract year and extract quarter thing, it may work out just fine. But just a regular btree index

  1   2   >