Re: Failed upgrade from 12.11 to 14.4
I was unaware this was a user-created view. Dropping the view did the trick, thanks for the help.

On Wed, Mar 1, 2023 at 1:54 PM Tom Lane wrote:
> Arthur Ramsey writes:
> > "Database instance is in a state that cannot be upgraded: pg_restore: from TOC entry 1264; 1259 32392758 VIEW pg_stat_activity_allusers master
> > pg_restore: error: could not execute query: ERROR: column reference "query" is ambiguous
> > LINE 32: "get_sa"."query"
> >
> > CREATE VIEW "publiczugmgeaxkmwph5jgobvwmvzb9freg0s7"."pg_stat_activity_allusers" AS
> > SELECT "get_sa"."datid", "get_sa"."datname", "get_sa"."pid", "get_sa"."usesysid", "get_sa"."usename", "get_sa"."application_name", "get_sa"."client_addr", "get_sa"."client_hostname", "get_sa"."client_port", "get_sa"."backend_start", "get_sa"."xact_start", "get_sa"."query_start", "get_sa"."state_change", "get_sa"."wait_event_type", "get_sa"."wait_event", "get_sa"."state", "get_sa"."backend_xid", "get_sa"."backend_xmin", "get_sa"."query"
> > FROM "publiczugmgeaxkmwph5jgobvwmvzb9freg0s7"."get_sa"() "get_sa"("datid", "datname", "pid", "usesysid", "usename", "application_name", "client_addr", "client_hostname", "client_port", "backend_start", "xact_start", "query_start", "state_change", "wait_event_type", "wait_event", "state", "backend_xid", "backend_xmin", "query", "backend_type");"
>
> You're really expecting us to intuit a lot from this amount of detail, aren't you? But okay, I'll guess: I think this is a view on pg_stat_activity, and it's not accounting for the fact that pg_stat_activity gained some columns between v12 and v14.
>
> Probably your best bet is to drop that view, do the upgrade, and recreate the view with adjustments.
>
> regards, tom lane
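For anyone hitting the same thing, a rough sketch of the drop-before-upgrade approach Tom describes. The catalog query only finds views that mention pg_stat_activity by name; views built on wrapper functions, like the get_sa() one above, need checking by hand. The DROP VIEW uses the schema and view name from the error message.

-- Find user-created views that reference pg_stat_activity directly:
SELECT schemaname, viewname
FROM pg_views
WHERE schemaname NOT IN ('pg_catalog', 'information_schema')
  AND definition ILIKE '%pg_stat_activity%';

-- Drop the view, run the upgrade, then recreate it against the v14 column list
-- (pg_stat_activity gained leader_pid in v13 and query_id in v14):
DROP VIEW "publiczugmgeaxkmwph5jgobvwmvzb9freg0s7"."pg_stat_activity_allusers";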
Re: Postgres Index and Updates
On Thu, 2023-03-02 at 15:53 +0800, Navindren Baskaran wrote:
> We would like to understand in which scenario an index on a table will be rebuilt.

I assume that you mean "updated" or "modified" rather than rebuilt from scratch.

> Example: if we have a table which has two columns and one of them is indexed, which of the below operations triggers an index rebuild:
>
> 1. updating the value of the indexed column.
> 2. updating the value of the non-indexed column.

If the indexed column is updated, a new index entry has to be added.

If the other column is updated, it depends. If the updated column is not indexed and there is enough room for the new row version in the same table block, the index doesn't have to be modified. Otherwise it is.

See https://www.cybertec-postgresql.com/en/hot-updates-in-postgresql-for-better-performance/

Yours,
Laurenz Albe
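A quick way to see this in practice (a sketch; the table here is made up, and the counters in pg_stat_user_tables show whether updates were HOT):

CREATE TABLE t (id int PRIMARY KEY, payload text) WITH (fillfactor = 90);
INSERT INTO t VALUES (1, 'a');

-- Updating only the non-indexed column can be a HOT update (no index change)
-- as long as the new row version fits in the same block:
UPDATE t SET payload = 'b' WHERE id = 1;

-- Updating the indexed column always creates a new index entry:
UPDATE t SET id = 2 WHERE id = 1;

-- Counters are reported with a small delay:
SELECT n_tup_upd, n_tup_hot_upd
FROM pg_stat_user_tables
WHERE relname = 't';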
Re: Postgres Index and Updates
On Thu, Mar 2, 2023 at 10:08 AM Laurenz Albe wrote:
> On Thu, 2023-03-02 at 15:53 +0800, Navindren Baskaran wrote:
> If the other column is updated, it depends. If the updated column is not indexed and there is enough room for the new row version in the same table block, the index doesn't have to be modified. Otherwise it is.
>
> See https://www.cybertec-postgresql.com/en/hot-updates-in-postgresql-for-better-performance/

Very interesting article, thanks Laurenz. I wasn't aware of HOT and its relation to fillfactor.

At the risk of hijacking this thread a bit, although still on topic I think: what's the 2023 status of that zheap table storage work?

And more specifically, about a particular data model of mine. I have a 3-level hierarchical parents-children-grandchildren table structure:

1) a "root" table, with tens to thousands (~200K max) of (small) rows.
2) a "folder" table, with 20 to 50 (small) rows *per* "root" row.
3) several "value" tables, with again 20 to 100 (large to very large) rows per "folder" row.

The root and folder tables must maintain a "last modified" timestamp for their respective subtrees, which must be maintained via triggers (how else?). That makes those tables update-heavy, no?

So from your article, those two tables, with smaller rows (and fewer rows total in general), should have larger fillfactors to increase the chances of a HOT update? Am I interpreting your article (and its linked articles) correctly for this situation?

TIA, --DD
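A sketch of how that could be tried out and measured (the table names "root" and "folder" come from the description above, the fillfactor value is only an example; note that the linked article reserves the free space HOT needs by setting fillfactor below the default of 100):

-- Leave some free space in each page so updated rows can stay in the same block:
ALTER TABLE root   SET (fillfactor = 80);
ALTER TABLE folder SET (fillfactor = 80);
-- fillfactor only affects newly written pages; existing ones need a rewrite:
VACUUM FULL root;
VACUUM FULL folder;

-- Then watch what fraction of updates are HOT:
SELECT relname, n_tup_upd, n_tup_hot_upd,
       round(100.0 * n_tup_hot_upd / NULLIF(n_tup_upd, 0), 1) AS hot_pct
FROM pg_stat_user_tables
WHERE relname IN ('root', 'folder');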
Getting the exact SQL from inside an event trigger
Hello all,

We are using event triggers to capture DDL for subsequent replay on a logical replica. The intention is to write the DDL statement to a table, inside the same transaction that executes the DDL, and have a separate process on the replica notice changes in this table and execute whatever it finds. We have declared a trigger on the ddl_command_end event for this purpose.

We can get the SQL by running current_query() inside the trigger; oddly, the pg_event_trigger_ddl_commands() function does have an attribute called "command", typed as "pg_ddl_command", but there are no SQL functions that can operate on this type, including turning it into a string.

This process works for a simple case, e.g. "CREATE TABLE t()". However, in other cases --- e.g. "DO $$ BEGIN CREATE TABLE t(); CREATE TABLE s(); END; $$;" --- the trigger will fire for each CREATE TABLE but current_query() will evaluate to the entire DO block. This makes it difficult to capture just the actual statement that is being executed.

I am looking for a way to get the precise statement that is being executed from within the ddl_command_end event trigger function. Does anyone know if this is possible?

Thanks,
-Joe
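For reference, a minimal sketch of the setup described above (the table and function names are made up; this variant records current_query(), so it has exactly the DO-block limitation mentioned):

CREATE TABLE ddl_log (
    id              bigserial PRIMARY KEY,
    logged_at       timestamptz NOT NULL DEFAULT now(),
    command_tag     text,
    object_identity text,
    query           text
);

CREATE OR REPLACE FUNCTION log_ddl() RETURNS event_trigger
LANGUAGE plpgsql AS $$
DECLARE
    r record;
BEGIN
    -- One row per command reported by the event trigger machinery;
    -- current_query() is the whole top-level statement, not the sub-command.
    FOR r IN SELECT * FROM pg_event_trigger_ddl_commands() LOOP
        INSERT INTO ddl_log (command_tag, object_identity, query)
        VALUES (r.command_tag, r.object_identity, current_query());
    END LOOP;
END;
$$;

CREATE EVENT TRIGGER capture_ddl ON ddl_command_end
    EXECUTE FUNCTION log_ddl();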
Re: Getting the exact SQL from inside an event trigger
On Thu, Mar 02, 2023 at 11:12:37AM +, Joe Wildish wrote: > We are using event triggers to capture DDL for subsequent replay on a logical > replica. This might be a bit different answer from what you expect, but have you seen pgl_ddl_deploy project? Best regards, depesz
Re: Getting the exact SQL from inside an event trigger
Hi Depesz,

On Thu, 2 Mar 2023, at 12:29, hubert depesz lubaczewski wrote:
> This might be a bit different answer from what you expect, but have you seen pgl_ddl_deploy project?

Thanks --- I was unaware of this project, so I will take a look. However, we are operating under a limitation that the publisher is in a hosted provider, so we cannot install extensions unless they are "blessed"; hence I am looking for a solution that doesn't require that.

I did go and look at what pgaudit does WRT this problem, as I know that extension can emit just the statement being executed. I didn't fully understand all the detail, but I did notice that it maintains a stack of statements for the current execution, presumably so it knows whether the trigger has been called already, e.g. in the case of a DO block. This indicates to me that I can't do what I would like in pure PL/pgSQL triggers.

-Joe
Re: Postgres undeterministically uses a bad plan, how to convince it otherwise?
On 16/02/2023 17:15, Ron wrote:
> On 2/16/23 09:47, cen wrote:
> > Hi,
> >
> > I am running the same application (identical codebase) as two separate instances to index (save) different sets of data. Both run PostgreSQL 13.
> >
> > The queries are the same but the content in the actual databases is different. One database is around 1TB and the other around 300GB.
> >
> > There is a problem with a paginated select query with a join and an order. Depending on which column you order by (primary or FK) the query is either instant or takes minutes.
> >
> > So on one database, ordering by primary is instant but on the other it is slow and vice versa. Switching the columns around in the slow case fixes the issue.
> >
> > All relevant columns are indexed.
>
> What does EXPLAIN ANALYZE say?

I finally managed to get back to this and have a similar sample query. In this case, changing the WHERE clause to use the joined table's column slows the query down. The initial case was on the ORDER BY column, but the symptoms are the same, I think.

I understand that even though both columns are indexed, the indexes are completely different, but the point is: how would one know in advance which one will be faster when designing the query? And as I mentioned in my initial text, the fast case of columns can switch around as the database grows.

Fast case:

SELECT
   t0."status",
   b1."timestamp"
FROM
   "transactions" AS t0
INNER JOIN
   "blocks" AS b1 ON b1."number" = t0."block_number"
WHERE
   (((t0."to_address_hash" = '\x3012c'))
    OR (t0."from_address_hash" = '\x3012c')
    OR (t0."created_contract_address_hash" = '\x3012c'))
   AND (t0."block_number" >= 30926000)
   AND (t0."block_number" <= 31957494)
ORDER BY t0."block_number" DESC
LIMIT 150 OFFSET 300;

Plan:

Limit  (cost=15911.73..23367.09 rows=150 width=16) (actual time=205.093..305.423 rows=150 loops=1)
  ->  Gather Merge  (cost=1001.03..812143.50 rows=16320 width=16) (actual time=36.140..305.333 rows=450 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        ->  Nested Loop  (cost=1.00..809259.75 rows=6800 width=16) (actual time=2.662..155.779 rows=153 loops=3)
              ->  Parallel Index Scan Backward using transactions_block_number_index on transactions t0  (cost=0.56..753566.08 rows=6800 width=8) (actual time=0.224..145.998 rows=153 loops=3)
                    Index Cond: ((block_number >= 30926000) AND (block_number <= 31957494))
                    Filter: ((to_address_hash = '\x3012c'::bytea) OR (from_address_hash = '\x3012c'::bytea) OR (created_contract_address_hash = '\x3012c'::bytea))
                    Rows Removed by Filter: 22471
              ->  Index Scan using blocks_number_index on blocks b1  (cost=0.44..8.18 rows=1 width=16) (actual time=0.059..0.060 rows=1 loops=460)
                    Index Cond: (number = t0.block_number)
Planning Time: 0.513 ms
Execution Time: 305.541 ms

--

Slow case:

SELECT
   t0."status",
   b1."timestamp"
FROM
   "transactions" AS t0
INNER JOIN
   "blocks" AS b1 ON b1."number" = t0."block_number"
WHERE
   (((t0."to_address_hash" = '\x3012c'))
    OR (t0."from_address_hash" = '\x3012c')
    OR (t0."created_contract_address_hash" = '\x3012c'))
   AND (b1."number" >= 30926000) -- using col from joined table instead
   AND (b1."number" <= 31957494) -- using col from joined table instead
ORDER BY t0."block_number" DESC
LIMIT 150 OFFSET 300;

Plan:

Limit  (cost=1867319.63..1877754.02 rows=150 width=16) (actual time=95830.704..95962.116 rows=150 loops=1)
  ->  Gather Merge  (cost=1846450.83..2015348.94 rows=2428 width=16) (actual time=95805.872..95962.075 rows=450 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        ->  Merge Join  (cost=1845450.81..2014068.67 rows=1012 width=16) (actual time=95791.362..95824.633 rows=159 loops=3)
              Merge Cond: (t0.block_number = b1.number)
              ->  Sort  (cost=1845402.63..1845823.81 rows=168474 width=8) (actual time=95790.194..95790.270 rows=186 loops=3)
                    Sort Key: t0.block_number DESC
                    Sort Method: external merge  Disk: 2496kB
                    Worker 0: Sort Method: external merge  Disk: 2408kB
                    Worker 1: Sort Method: external merge  Disk: 2424kB
                    ->  Parallel Bitmap Heap Scan on transactions t0  (cost=39601.64..1828470.76 rows=168474 width=8) (actual time=7274.149..95431.494 rows=137658 loops=3)
                          Recheck Cond: ((to_address_hash = '\x3012c'::bytea) OR (from_address_hash = '\x3012c'::bytea) OR (created_contract_address_hash = '\x3012c'::bytea))
                          Rows Removed by Index Recheck: 716205
                          Heap Blocks: exact=7043 lossy=134340
                          ->  BitmapOr  (cost=39601.64..39601.64 rows=404359 width=0) (actual time=7264.127..7264.130 rows=0 loops=1)
                                ->  Bitmap Index Scan on transac
Converting row elements into a arrays?
Postgresql 12.13

Given the sample below, I'm looking for how to generate this output. It's like GROUP BY, but generating an array instead of an aggregate number.

 f1 | f2_array
----+-----------
  1 | {1,2,3}
  2 | {1,2,3,4}
  3 | {1,2}

The ultimate goal is to somehow use pg_index.indkey to get column names from pg_attribute.

create table foo (f1 int, f2 int);
insert into foo values (1, 1);
insert into foo values (1, 2);
insert into foo values (1, 3);
insert into foo values (2, 1);
insert into foo values (2, 2);
insert into foo values (2, 3);
insert into foo values (2, 4);
insert into foo values (3, 1);
insert into foo values (3, 2);

select * from foo order by f1, f2;
 f1 | f2
----+----
  1 |  1
  1 |  2
  1 |  3
  2 |  1
  2 |  2
  2 |  3
  2 |  4
  3 |  1
  3 |  2
(9 rows)

-- 
Born in Arizona, moved to Babylonia.
Re: Converting row elements into a arrays?
On 02/03/2023 20:58, Ron wrote:
> Postgresql 12.13
>
> Given the sample below, I'm looking for how to generate this output. It's like GROUP BY, but generating an array instead of an aggregate number.
>
>  f1 | f2_array
> ----+-----------
>   1 | {1,2,3}
>   2 | {1,2,3,4}
>   3 | {1,2}

Something like this (off the top of my head)?

select f1, array_agg(f2) as f2_array group by f1;

Hope that helps (and that it's right!).

Ray.

> The ultimate goal is to somehow use pg_index.indkey to get column names from pg_attribute.
>
> create table foo (f1 int, f2 int);
> insert into foo values (1, 1);
> insert into foo values (1, 2);
> insert into foo values (1, 3);
> insert into foo values (2, 1);
> insert into foo values (2, 2);
> insert into foo values (2, 3);
> insert into foo values (2, 4);
> insert into foo values (3, 1);
> insert into foo values (3, 2);
>
> select * from foo order by f1, f2;
>  f1 | f2
> ----+----
>   1 |  1
>   1 |  2
>   1 |  3
>   2 |  1
>   2 |  2
>   2 |  3
>   2 |  4
>   3 |  1
>   3 |  2
> (9 rows)

-- 
Raymond O'Donnell // Galway // Ireland
r...@rodonnell.ie
Re: Converting row elements into a arrays?
On 02/03/2023 21:01, Ray O'Donnell wrote: On 02/03/2023 20:58, Ron wrote: Postgresql 12.13 Given the sample below, I'm looking for how to generate this output. It's like GROUP BY, but generating an array instead of an aggreate number. f1 | f2_array +- 1 | {1,2,3} 2 | {1,2,3,4} 3 | {1,2} Something like this (off the top of my head)? - select f1, array_agg(f2) as f2_array group by f1; ... from foo ... R. -- Raymond O'Donnell // Galway // Ireland r...@rodonnell.ie
Re: Converting row elements into a arrays?
On Thu, Mar 2, 2023 at 1:58 PM Ron wrote: > Postgresql 12.13 > > Given the sample below, I'm looking for how to generate this output. It's > like GROUP BY, but generating an array instead of an aggreate number. > > Group By creates groups, that's it. How you aggregate the data that are in those groups is determined by which function you call. Sure, a function like "count()" produces a single number, but there are other functions. Even a whole documentation section listing them. https://www.postgresql.org/docs/current/functions-aggregate.html David J.
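For example, with the foo table from the original post, a few of those other aggregates side by side (just a sketch):

select f1,
       array_agg(f2 order by f2)              as f2_array,
       string_agg(f2::text, ',' order by f2)  as f2_csv,
       jsonb_agg(f2 order by f2)              as f2_json
from foo
group by f1
order by f1;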
Re: Converting row elements into a arrays?
On 3/2/23 13:58, Ron wrote: Postgresql 12.13 Given the sample below, I'm looking for how to generate this output. It's like GROUP BY, but generating an array instead of an aggreate number. f1 | f2_array +- 1 | {1,2,3} 2 | {1,2,3,4} 3 | {1,2} The ultimate goal is to somehow use pg_index.indkey to get column names from pg_attribute. create table foo (f1 int, f2 int); insert into foo values (1, 1); insert into foo values (1, 2); insert into foo values (1, 3); insert into foo values (2, 1); insert into foo values (2, 2); insert into foo values (2, 3); insert into foo values (2, 4); insert into foo values (3, 1); insert into foo values (3, 2); select * from foo order by f1, f2; f1 | f2 + 1 | 1 1 | 2 1 | 3 2 | 1 2 | 2 2 | 3 2 | 4 3 | 1 3 | 2 (9 rows) In which environment are you accessing that array? psql only?
Re: Converting row elements into a arrays?
On 3/2/23 15:34, David G. Johnston wrote: On Thu, Mar 2, 2023 at 1:58 PM Ron wrote: Postgresql 12.13 Given the sample below, I'm looking for how to generate this output. It's like GROUP BY, but generating an array instead of an aggreate number. Group By creates groups, that's it. How you aggregate the data that are in those groups is determined by which function you call. Sure, a function like "count()" produces a single number, but there are other functions. Even a whole documentation section listing them. https://www.postgresql.org/docs/current/functions-aggregate.html David J. I'm used to the bog standard COUNT, AVG, MIN, MAX, SUM. It didn't occur to me that there would be others... -- Born in Arizona, moved to Babylonia.
Re: Converting row elements into a arrays?
On 3/2/23 15:45, Rob Sargent wrote: On 3/2/23 13:58, Ron wrote: Postgresql 12.13 Given the sample below, I'm looking for how to generate this output. It's like GROUP BY, but generating an array instead of an aggreate number. f1 | f2_array +- 1 | {1,2,3} 2 | {1,2,3,4} 3 | {1,2} The ultimate goal is to somehow use pg_index.indkey to get column names from pg_attribute. create table foo (f1 int, f2 int); insert into foo values (1, 1); insert into foo values (1, 2); [snip] In which environment are you accessing that array? psql only? Correct. -- Born in Arizona, moved to Babylonia.
Re: Converting row elements into a arrays?
On 3/2/23 15:01, Ray O'Donnell wrote: On 02/03/2023 20:58, Ron wrote: Postgresql 12.13 Given the sample below, I'm looking for how to generate this output. It's like GROUP BY, but generating an array instead of an aggreate number. f1 | f2_array +- 1 | {1,2,3} 2 | {1,2,3,4} 3 | {1,2} Something like this (off the top of my head)? - select f1, array_agg(f2) as f2_array group by f1; Hope that helps (and that it's right!). That did it. -- Born in Arizona, moved to Babylonia.
Re: Converting row elements into a arrays?
On 3/2/23 14:49, Ron wrote:
> On 3/2/23 15:45, Rob Sargent wrote:
> > On 3/2/23 13:58, Ron wrote:
> > > Postgresql 12.13
> > >
> > > Given the sample below, I'm looking for how to generate this output. It's like GROUP BY, but generating an array instead of an aggregate number.
> > >
> > >  f1 | f2_array
> > > ----+-----------
> > >   1 | {1,2,3}
> > >   2 | {1,2,3,4}
> > >   3 | {1,2}
> > >
> > > The ultimate goal is to somehow use pg_index.indkey to get column names from pg_attribute.
> > >
> > > create table foo (f1 int, f2 int);
> > > insert into foo values (1, 1);
> > > insert into foo values (1, 2);
> > > [snip]
> >
> > In which environment are you accessing that array? psql only?
>
> Correct.

For multiple columns it's a little messier (if you have different data types in your columns):

create table foo (f1 int, f2 int, f3 text);
insert into foo values (1, 1, 'asdf');
insert into foo values (1, 2, 'qerr');

select f1, array_agg(array[f2::text, f3]) from foo group by f1;
 f1 |      array_agg
----+---------------------
  1 | {{1,asdf},{2,qerr}}
(1 row)
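And circling back to the "ultimate goal" from the original post, a sketch of turning pg_index.indkey into column names via pg_attribute (the table name foo is just the example from this thread; expression columns show up in indkey as attnum 0 and are skipped here):

select i.indexrelid::regclass as index_name,
       array_agg(a.attname order by k.ord) as column_names
from pg_index i
cross join lateral unnest(i.indkey::smallint[]) with ordinality as k(attnum, ord)
join pg_attribute a
  on a.attrelid = i.indrelid
 and a.attnum   = k.attnum
where i.indrelid = 'foo'::regclass
group by i.indexrelid;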
Re: DISTINCT *and* ORDER BY in aggregate functions on expressions(!)y
On Tue, 28 Feb 2023, Alban Hertroys wrote:
> Perhaps you can use a lateral cross join to get the result of jsonb_build_object as a jsonb value to pass around?

I don’t see how. (But then I’ve not yet worked with lateral JOINs.)

But I c̲a̲n̲ just generate the objects first, I t̲h̲i̲n̲k̲, given one of them corresponds to exactly one of the rows of an m:n-linked table and nothing else. Something like…

WITH cgwaj AS (
    SELECT cgwa.id AS id,
           jsonb_build_object(
               'weekday',   cgwa.weekday,
               'forenoon',  cgwa.forenoon,
               'afternoon', cgwa.afternoon,
               'evening',   cgwa.evening) AS obj
    FROM core_generalworkavailability cgwa
),
-- … same for opening times
SELECT cp.email, …,
       -- …
       jsonb_build_object('possible_work_times', COALESCE(
           jsonb_agg(DISTINCT cgwaj.obj
                     ORDER BY cgwaj.obj->>'weekday', cgwaj.obj->>'forenoon',
                              cgwaj.obj->>'afternoon', cgwaj.obj->>'evening')
           FILTER (WHERE cgwaj.id IS NOT NULL))) ||
       -- …
FROM core_person cp
-- …
LEFT JOIN core_person_possible_work_times cppwt ON cppwt.person_id=cp.id
LEFT JOIN cgwaj ON cgwaj.id=cppwt.generalworkavailability_id
-- …

That is, add a CTE for each m:n-attached table whose “value” is an object, not a single field, and keep the id field; LEFT JOIN that (instead of the original table), and then we have a field to use in ORDER BY. I think. I’ve not yet tried it (I don’t have access to that DB normally, I was just helping out).

This avoids sub-SELECTs in the sense of needing to run one for each user row, because the innermost JSON object building needs to be done for each (connected (if the query is not filtering on specific users)) row of the “property table” anyway. (And even if filtered, that can be passed down.)

bye,
//mirabilos
-- 
As long as you don't pull dirty tricks, and I mean *really* dirty tricks, like XORing both pointers of a doubly linked list and storing them in a single word, Boehm works splendidly. -- Andreas Bogk on boehm-gc in d.a.s.r
Re: Postgres undeterministically uses a bad plan, how to convince it otherwise?
On Fri, 3 Mar 2023 at 02:20, cen wrote:
> I understand that even though both columns are indexed, the indexes are completely different but the point is, how would one know in advance which one will be faster when designing the query?

Likely to be safe, you'd just include both.

The problem is that the query planner makes use of equivalence classes to deduce equivalence in quals. If you have a query such as:

select * from t1 inner join t2 on t1.x = t2.y where t1.x = 3;

then the planner can deduce that t2.y must also be 3 and that qual can be pushed down to the scan level. If t2.y = 3 is quite selective and there's an index on that column, then this deduction is likely going to be a very good win, as the alternative of not using it requires looking at all rows in t2.

The problem is that the equivalence class code can only deduce equality. If we had written:

select * from t1 inner join t2 on t1.x = t2.y where t1.x > 2 and t1.x < 4;

then we'd not have gotten quite as optimal a plan. Providing we're doing an inner join, then we could just write both sets of quals to force the planner's hand:

select * from t1 inner join t2 on t1.x = t2.y where t1.x > 2 and t1.x < 4 and t2.y > 2 and t2.y < 4;

You could likely do this. I still hope to improve this in the planner one day. A few other things are getting closer, which sets the bar a bit lower on getting something like this committed. There's some relevant discussion in [1].

David

[1] https://postgr.es/m/CAKJS1f9FK_X_5HKcPcSeimy16Owe3EmPmmGsGWLcKkj_rW9s6A%40mail.gmail.com
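Applied to the query from up-thread, that advice would mean writing the range quals against both sides of the join (a sketch based on cen's fast/slow examples; BETWEEN is just shorthand for the same inclusive bounds):

SELECT t0."status", b1."timestamp"
FROM "transactions" AS t0
INNER JOIN "blocks" AS b1 ON b1."number" = t0."block_number"
WHERE ((t0."to_address_hash" = '\x3012c')
       OR (t0."from_address_hash" = '\x3012c')
       OR (t0."created_contract_address_hash" = '\x3012c'))
  AND t0."block_number" BETWEEN 30926000 AND 31957494  -- qual on the transactions side
  AND b1."number"       BETWEEN 30926000 AND 31957494  -- same qual repeated on the blocks side
ORDER BY t0."block_number" DESC
LIMIT 150 OFFSET 300;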