Re: Table and data comparison
Yes, PostgreSQL. Why not just do this via a SQL join? (See the sketch at the end of this message.) On Tue, 3 Sept 2024, 17:34 arun chirappurath, wrote: > Hi All, > > Do we have any open-source utility to compare two table data not structure > and then provide some sort of reports? > > Row by row comparison > > Thanks, > Arun >
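For example, assuming both tables have identical column lists (t1 and t2 are placeholder names), set operations give a row-by-row diff:

    -- Rows present in one table but not the other, labelled by side.
    SELECT 'only_in_t1' AS side, * FROM (TABLE t1 EXCEPT TABLE t2) AS d
    UNION ALL
    SELECT 'only_in_t2' AS side, * FROM (TABLE t2 EXCEPT TABLE t1) AS d;

A FULL OUTER JOIN on the primary key works too, and additionally tells you whether the same key exists on both sides with different values.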
Re: Will hundred of thousands of this type of query cause Parsing issue
I would usually try to avoid such long IN clauses. Why not try this: create a temp table of one column, bulk insert all your IDs into that table, then change your query to join to the temp table (see the sketch at the end of this message)? This also has the advantage of scaling to thousands of values. On Fri, 13 Sept 2024, 16:35 Wong, Kam Fook (TR Technology), < kamfook.w...@thomsonreuters.com> wrote: > Follow Postgres expert, > > We have a flavor of this type of query with long in-list/bind variables > (see below). We notice that some of the bind variables come in as 0 which > causes the optimizer to choose to full scan two of the following 3 tables. > One thought to fix a full table scan is to chop off the not-needed bind > variables (proven to work after some tests). But my other worry is will > cause parsing issues because the app will be executing > 100k/sec with this > type of query. > > I am an Oracle DBA, and this change for sure will generate a different > query id. Which in turn generates tons of extra parsing to the DB because > all soft and hard parsing occurs at the DB level. But my understanding for > Postgres is parsing occurs at the client jdbc level. Am I understanding > this correctly? > > > In summary/my concern: > > 1) Where does query parsing occur? > 2) Will this cause extra parsing to the posgress DB? Any pg system table > to measure parsing? > > > > SELECT abc, efg from DOCLOC a, COLLECTION b WHERE a.colum1 IN ($1, $2, > $3, $4, $5, ... [several hundred consecutive bind parameters, elided here; the original message runs past $597 and is truncated]
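The temp-table approach sketched above could look like this (wanted_ids is a made-up name, and the join condition between DOCLOC and COLLECTION is not shown in the original query, so an assumed condition stands in for it):

    -- One-column temp table holding the IDs; it disappears at session end.
    CREATE TEMP TABLE wanted_ids (id bigint PRIMARY KEY);

    -- Bulk-load the values from the client, e.g. via COPY:
    COPY wanted_ids (id) FROM STDIN;

    -- Join instead of a long IN list; works the same for 10 or 100,000 IDs,
    -- and the query text (and so its query id) stays stable.
    SELECT abc, efg
    FROM docloc a
    JOIN wanted_ids w ON w.id = a.colum1
    JOIN collection b ON b.colum1 = a.colum1;  -- assumed join condition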
Re: Index Partition Size Double of its Table Partition?
Are you able to cluster the table? The idea is that rows ordered in the same way as the index might reduce its size; see the sketch at the end of this message. On Wed, 30 Oct 2024, 16:29 Don Seiler, wrote: > On Wed, Oct 30, 2024 at 11:23 AM Peter Geoghegan wrote: > >> >> If a substantial amount of the index was written by CREATE INDEX (and >> not by retail inserts) then my theory is unlikely to be correct. It >> could just be that you managed to absorb most inserts in one >> partition, but not in the other. That's probably possible when there >> are only relatively small differences in the number of inserts that >> need to use of the space left behind by fillfactor in each case. In >> general page splits tend to come in distinct "waves" after CREATE >> INDEX is run. >> > > What do you mean by "absorb" the inserts? > > It sounds like the answer will be "No", but: Would rebuilding the index > after the month-end (when inserts have stopped on this partition) change > anything? > > Don. > -- > Don Seiler > www.seiler.us >
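For reference, a minimal sketch of that, with made-up partition and index names:

    -- Rewrite the partition in index order. Note this takes an ACCESS
    -- EXCLUSIVE lock for the duration, and the ordering is not maintained
    -- for rows inserted afterwards.
    CLUSTER docs_p202410 USING docs_p202410_pkey;
    ANALYZE docs_p202410;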
Re: Performance Issue with Hash Partition Query Execution in PostgreSQL 16
Thanks for the correction. At this point I would try modifying plan_cache_mode for the session which uses the bind variable: alter it so that plan_cache_mode = force_custom_plan (see the sketch at the end of this message). One hypothesis is that a bad plan got cached for that SQL pattern. Obviously, when you run it *manually* you are always getting a *custom* plan, as it's not a prepared statement. On Sat, 9 Nov 2024, 03:46 ravi k, wrote: > Sorry, it was typo. Bind variable is bigint only. > > Thanks > > On Fri, 8 Nov, 2024, 7:09 pm David Mullineux, wrote: > >> Just spotted a potential problem. The indexed column is a bigint. Are >> you, in your prepared statement passing a string or a big int ? >> I notice your plan is doing an implicit type conversion when you run it >> manually. >> Sometimes the wrong type will make it not use the index. >> >> On Fri, 8 Nov 2024, 03:07 ravi k, wrote: >> >>> Hi , >>> >>> Thanks for the suggestions. >>> >>> Two more observations: >>> >>> 1) no sequence scan noticed from pg_stat_user_tables ( hope stats are >>> accurate in postgres 16) if parameter sniffing happens the possibility of >>> going to sequence scan is more right. >>> >>> 2) no blockings or IO issue during the time. >>> >>> 3) even with limit clause if touch all partitions also it could have >>> been completed in milliseconds as this is just one record. >>> >>> 4) auto_explain in prod we cannot enable as this is expensive and with >>> high TPS we may face latency issues and lower environment this issue cannot >>> be reproduced,( this is happening out of Million one case) >>> >>> This looks puzzle to us, just in case anyone experianced pls share your >>> experience. >>> >>> Regards, >>> Ravi >>> >>> On Thu, 7 Nov, 2024, 3:41 am David Mullineux, wrote: >>> >>>> It might be worth eliminating the use of cached plans here. Is your app >>>> using prepared statements at all? >>>> Point is that if the optimizer sees the same prepared query , 5 times, >>>> the it locks the plan that it found at that time. This is a good trade off >>>> as it avoids costly planning-time for repetitive queries. But if you are >>>> manually querying, the a custom plan will be generated anew. >>>> A quick analyze of the table should reset the stats and invalidate any >>>> cached plans. >>>> This may not be your problem just worth eliminating it from the list >>>> of potential causes. >>>> >>>> On Wed, 6 Nov 2024, 17:14 Ramakrishna m, wrote: >>>> >>>>> Hi Team, >>>>> >>>>> One of the queries, which retrieves a single record from a table with >>>>> 16 hash partitions, is taking more than 10 seconds to execute. In >>>>> contrast, >>>>> when we run the same query manually, it completes within milliseconds. >>>>> This >>>>> issue is causing exhaustion of the application pools. Do we have any bugs >>>>> in postgrs16 hash partitions? Please find the attached log, table, and >>>>> execution plan. >>>>> >>>>> size of the each partitions : 300GB >>>>> Index Size : 12GB >>>>> >>>>> Postgres Version : 16.x >>>>> Shared Buffers : 75 GB >>>>> Effective_cache : 175 GB >>>>> Work _mem : 4MB >>>>> Max_connections : 3000 >>>>> >>>>> OS : Ubuntu 22.04 >>>>> Ram : 384 GB >>>>> CPU : 64 >>>>> >>>>> Please let us know if you need any further information or if there are >>>>> additional details required. >>>>> >>>>> >>>>> Regards, >>>>> Ram. >>>>>
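A sketch of that change (app_user is a placeholder role name):

    -- For the current session only:
    SET plan_cache_mode = force_custom_plan;

    -- Or persist it for the application's role, so every new session
    -- picks it up:
    ALTER ROLE app_user SET plan_cache_mode = force_custom_plan;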
Re: Help with query optimizer
Hi, It's usually helpful to give some detail on what 'slow' means, and then to define what 'fast enough' means, just so we know when to stop optimizing. You should also help by giving the DDL of those tables and any indexes they have. The plan shows me that there are expensive seq scans on two tables: conversations and tasks. So it would be good to get the pg_stats entries for the columns used in the filters on those tables; see the sketch at the end of this message. I would then look at what indexes you have on those tables to find out whether some are not being used, and why (there are lots of possible reasons). Thanks
On Tue, 1 Oct 2024, 08:53 Siraj G, wrote: > Hello Experts! > > Can you please help with the attached query as it is taking time to execute. The execution plan for the same is below: > Thank you!
> Nested Loop Left Join  (cost=8176.99..168603.11 rows=1 width=294) (actual time=3301.886..3303.715 rows=0 loops=1)
>   Join Filter: (conversation.id = "jobReminders"."conversationId")
>   Buffers: shared hit=743696 read=78069 dirtied=16 written=10
>   I/O Timings: shared read=7224.068 write=0.287
>   ->  Gather  (cost=8176.99..168472.90 rows=1 width=246) (actual time=3301.885..3303.711 rows=0 loops=1)
>         Workers Planned: 2
>         Workers Launched: 2
>         Buffers: shared hit=743696 read=78069 dirtied=16 written=10
>         I/O Timings: shared read=7224.068 write=0.287
>         ->  Nested Loop  (cost=7176.99..167472.80 rows=1 width=246) (actual time=3286.866..3286.874 rows=0 loops=3)
>               Buffers: shared hit=743696 read=78069 dirtied=16 written=10
>               I/O Timings: shared read=7224.068 write=0.287
>               ->  Nested Loop  (cost=7176.56..61685.83 rows=59320 width=174) (actual time=68.447..1978.593 rows=57917 loops=3)
>                     Buffers: shared hit=284214 read=56135 dirtied=11 written=9
>                     I/O Timings: shared read=4546.889 write=0.242
>                     ->  Hash Join  (cost=7176.13..20450.20 rows=15571 width=142) (actual time=68.196..211.016 rows=17903 loops=3)
>                           Hash Cond: (conversation."mailBoxId" = "mailBox".id)
>                           Buffers: shared hit=7428 read=11254 dirtied=9 written=3
>                           I/O Timings: shared read=85.214 write=0.085
>                           ->  Parallel Hash Join  (cost=6944.87..20143.53 rows=26942 width=118) (actual time=67.736..197.476 rows=21937 loops=3)
>                                 Hash Cond: (conversation."taskId" = task.id)
>                                 Buffers: shared hit=6678 read=11254 dirtied=9 written=3
>                                 I/O Timings: shared read=85.214 write=0.085
>                                 ->  Parallel Seq Scan on conversations conversation  (cost=0.00..13029.34 rows=64502 width=102) (actual time=0.017..79.191 rows=51677 loops=3)
>                                       Filter: ((NOT "isCompose") AND ("parentConversationId" IS NULL) AND (id <> '559702f9-55a8-47c7-9b5c-93b29baabf3d'::uuid))
>                                       Rows Removed by Filter: 622
>                                       Buffers: shared hit=3605 read=8608 dirtied=5 written=3
>                                       I/O Timings: shared read=57.862 write=0.085
>                                 ->  Parallel Hash  (cost=6583.67..6583.67 rows=28896 width=16) (actual time=67.046..67.047 rows=22266 loops=3)
>                                       Buckets: 131072 Batches: 1 Memory Usage: 4224kB
>                                       Buffers: shared hit=3073 read=2646 dirtied=4
>                                       I/O Timings: shared read=27.352
>                                       ->  Parallel Seq Scan on tasks task  (cost=0.00..6583.67 rows=28896 width=16) (actual time=0.015..48.469 rows=22266 loops=3)
>                                             Filter: (("deletedAt" IS NULL) AND (type = 'MESSAGE'::enum_tasks_type))
>                                             Rows Removed by Filter: 32166
>                                             Buffers: shared hit=3073 read=2646 dirtied=4
>                                             I/O Timings: shared read=27.352
>                           ->  Hash  (cost=230.80..230.80 rows=37 width=24) (actual time=0.382..0.383 rows=37 loops=3)
>                                 Buckets: 1024 Batches: 1 Memory Usage: 11kB
>                                 Buffers: shared hit=690
>                                 ->  Seq Scan on "mailBoxes" "mailBox"  (cost=0.00..230.80 rows=37 width=24) (actual time=0.021..0.367 rows=37 loops=3)
>                                       Filter: (status = ANY ('{ACTIVE,SYNCING}'::"enum_mailBoxes_status"[]))
>                                       Rows Removed by Filter: 27
>                                       Buffers: shared hit=690
>                     ->  Index Scan using idx_message_fk_conversation_i... [the plan is truncated here in the original]
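Something like this would pull the planner statistics for the filtered columns, with the table and column names taken from the plan above:

    SELECT tablename, attname, null_frac, n_distinct, most_common_vals
    FROM pg_stats
    WHERE tablename IN ('conversations', 'tasks')
      AND attname IN ('isCompose', 'parentConversationId', 'deletedAt', 'type');

    -- And the indexes that already exist on those tables:
    SELECT indexname, indexdef
    FROM pg_indexes
    WHERE tablename IN ('conversations', 'tasks');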
Re: Performance Issue with Hash Partition Query Execution in PostgreSQL 16
It might be worth eliminating the use of cached plans here. Is your app using prepared statements at all? The point is that once the optimizer has seen the same prepared query about five times, it may lock in a generic plan for it. This is a good trade-off, as it avoids costly planning time for repetitive queries, but it means a manually typed query gets a fresh custom plan while the application's prepared one may not (see the sketch at the end of this message). A quick ANALYZE of the table should reset the stats and invalidate any cached plans. This may not be your problem; it's just worth eliminating it from the list of potential causes. On Wed, 6 Nov 2024, 17:14 Ramakrishna m, wrote: > Hi Team, > > One of the queries, which retrieves a single record from a table with 16 > hash partitions, is taking more than 10 seconds to execute. In contrast, > when we run the same query manually, it completes within milliseconds. This > issue is causing exhaustion of the application pools. Do we have any bugs > in postgrs16 hash partitions? Please find the attached log, table, and > execution plan. > > size of the each partitions : 300GB > Index Size : 12GB > > Postgres Version : 16.x > Shared Buffers : 75 GB > Effective_cache : 175 GB > Work _mem : 4MB > Max_connections : 3000 > > OS : Ubuntu 22.04 > Ram : 384 GB > CPU : 64 > > Please let us know if you need any further information or if there are > additional details required. > > > Regards, > Ram. >
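One way to observe this in psql, sketched against a hypothetical orders table:

    -- With the default plan_cache_mode = auto, a prepared statement can
    -- switch to a cached generic plan after it has been planned five times.
    PREPARE q (bigint) AS SELECT * FROM orders WHERE customer_id = $1;

    -- Run this six or more times and compare the plans printed:
    EXPLAIN (ANALYZE) EXECUTE q(42);

    -- ANALYZE refreshes the stats and invalidates cached plans on the table:
    ANALYZE orders;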
Re: Performance Issue with Hash Partition Query Execution in PostgreSQL 16
Just spotted a potential problem. The indexed column is a bigint. Are you, in your prepared statement, passing a string or a bigint? I notice your plan does an implicit type conversion when you run it manually. Sometimes the wrong type will prevent the index from being used; see the sketch at the end of this message. On Fri, 8 Nov 2024, 03:07 ravi k, wrote: > Hi , > > Thanks for the suggestions. > > Two more observations: > > 1) no sequence scan noticed from pg_stat_user_tables ( hope stats are > accurate in postgres 16) if parameter sniffing happens the possibility of > going to sequence scan is more right. > > 2) no blockings or IO issue during the time. > > 3) even with limit clause if touch all partitions also it could have been > completed in milliseconds as this is just one record. > > 4) auto_explain in prod we cannot enable as this is expensive and with > high TPS we may face latency issues and lower environment this issue cannot > be reproduced,( this is happening out of Million one case) > > This looks puzzle to us, just in case anyone experianced pls share your > experience. > > Regards, > Ravi > > On Thu, 7 Nov, 2024, 3:41 am David Mullineux, wrote: > >> It might be worth eliminating the use of cached plans here. Is your app >> using prepared statements at all? >> Point is that if the optimizer sees the same prepared query , 5 times, >> the it locks the plan that it found at that time. This is a good trade off >> as it avoids costly planning-time for repetitive queries. But if you are >> manually querying, the a custom plan will be generated anew. >> A quick analyze of the table should reset the stats and invalidate any >> cached plans. >> This may not be your problem just worth eliminating it from the list of >> potential causes. >> >> On Wed, 6 Nov 2024, 17:14 Ramakrishna m, wrote: >> >>> Hi Team, >>> >>> One of the queries, which retrieves a single record from a table with 16 >>> hash partitions, is taking more than 10 seconds to execute. In contrast, >>> when we run the same query manually, it completes within milliseconds. This >>> issue is causing exhaustion of the application pools. Do we have any bugs >>> in postgrs16 hash partitions? Please find the attached log, table, and >>> execution plan. >>> >>> size of the each partitions : 300GB >>> Index Size : 12GB >>> >>> Postgres Version : 16.x >>> Shared Buffers : 75 GB >>> Effective_cache : 175 GB >>> Work _mem : 4MB >>> Max_connections : 3000 >>> >>> OS : Ubuntu 22.04 >>> Ram : 384 GB >>> CPU : 64 >>> >>> Please let us know if you need any further information or if there are >>> additional details required. >>> >>> >>> Regards, >>> Ram. >>> >>
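A quick way to see the effect, sketched against a hypothetical accounts(id bigint) table with a btree index on id:

    -- A bigint value compares directly against the column; the index is usable:
    EXPLAIN SELECT * FROM accounts WHERE id = 42;

    -- If the bound value arrives as numeric (or text), the column side may be
    -- cast instead, which defeats a plain btree index on id:
    EXPLAIN SELECT * FROM accounts WHERE id::numeric = 42.0;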
Re: Error creating materialized view
On Mon, 25 Nov 2024, 06:08 Shaun Robinson, wrote: > Hi, > > I'm currently testing an application with Postgres 17.2 and am getting an > error when creating a materialized view which works in version 16 and > below. The sql works fine running as a query, but adding the > create materialized view breaks it. > > The error comes when calling a custom function and the error is that a > relation doesn't exist (which it does as it works within the same query > when not creating a view). > > Is this a known issue in the version 17.2? > > Thanks > Shaun > Can you please post the DDL statements so we can see? Thanks
Re: Memory settings when running postgres in a docker container
I don't get why you think all memory will be used. When you say shared_buffers = 16GB, effective_cache_size = 48GB, then only the 16GB of shared buffers is actually allocated. effective_cache_size doesn't cause any memory to be allocated; it's just a hint to the optimizer about how much of the OS page cache it can assume is available. (On the /dev/shm question, see the sketch at the end of this message.) On Wed, 20 Nov 2024, 11:16 Koen De Groote, wrote: > Assuming a machine with: > > * 16 CPU cores > * 64GB RAM > > Set to 500 max connections > > A tool like this: https://pgtune.leopard.in.ua/ > > Will output recommended settings: > > max_connections = 500 > shared_buffers = 16GB > effective_cache_size = 48GB > maintenance_work_mem = 2GB > checkpoint_completion_target = 0.9 > wal_buffers = 16MB > default_statistics_target = 100 > random_page_cost = 1.1 > effective_io_concurrency = 200 > work_mem = 8388kB > huge_pages = try > min_wal_size = 1GB > max_wal_size = 4GB > max_worker_processes = 16 > max_parallel_workers_per_gather = 4 > max_parallel_workers = 16 > max_parallel_maintenance_workers = 4 > > And they basically use up all the memory of the machine. > > 16GB shared buffers, 48GB effective cache size, 8MB of work_mem for some > reason... > > This seems rather extreme. I feel there should be free memory for > emergencies and monitoring solutions. > > And then there's the fact that postgres on this machine will be run in a > docker container. Which, on Linux, receives 64MB of /dev/shm shared memory > by default, but can be increased. > > I feel like I should probably actually lower my upper limit for memory, > regardless of what the machine actually has, so I can have free memory, and > also not bring the container process itself into danger. > > Is it as straightforward as putting my limit on, say 20GB, and then giving > more /dev/shm to the container? Or is there more to consider? > > Regards, > Koen De Groote >
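If you do decide to cap the container, the usual Docker knobs look like this; the values here are purely illustrative, not recommendations:

    # Hard memory cap for the whole container, plus a larger /dev/shm
    # (PostgreSQL's parallel workers allocate dynamic shared memory there).
    docker run -d \
      --memory=20g \
      --shm-size=2g \
      -e POSTGRES_PASSWORD=secret \
      postgres:16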
Re: Wired behaviour from SELECT
Instead of nickname, you probably want the predicate where password = 'Arbol' ... or am I missing something? (See the sketch at the end of this message.) On Fri, 22 Nov 2024, 20:13 David G. Johnston, wrote: > On Fri, Nov 22, 2024 at 1:07 PM Arbol One wrote: > >> The below sql statement produces the right output >> SELECT nickname, password FROM password WHERE id='0938105618107N1'; >> nickname | password >> -+-- >> Piccard@@21 | Arbol >> (1 row) >> However, this sql statement produces the wrong output >> SELECT nickname, password FROM password WHERE nickname='Arbol'; >> nickname | password >> -+-- >> (0 rows) >> What am I doing wrong? >> > Naming a column in your table the same name as the table is problematic > generally. As for the query, if they are both intended to return the same > row the value Arbol is in the password column, not the nickname column. > You seem to have reversed the data for the two columns. > > David J. > >
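In other words, sketched with the poster's own table and data:

    -- To look the row up by the value stored in the password column:
    SELECT nickname, password FROM password WHERE password = 'Arbol';

    -- A nickname lookup has to match what is actually in that column:
    SELECT nickname, password FROM password WHERE nickname = 'Piccard@@21';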