Correct way of using complex expressions as partitioning key
Hello!

Is there some correct way to use complex expressions as a key for a partitioned table? Inserting works as expected, but SELECT scans all partitions unless the complete partition key expression is used as a predicate:

test=# create table test (
           id text,
           v1 bigint,
           v2 bigint
       ) partition by range (((v1 + v2) % 10));
CREATE TABLE

test=# create table test_1 partition of test for values from (0) to (1);
CREATE TABLE

test=# create table test_2 partition of test for values from (1) to (2);
CREATE TABLE

test=# insert into test values (1, 100, 101);
INSERT 0 1

test=# insert into test values (1, 100, 100);
INSERT 0 1

test=# select * from test_1;
 id | v1  | v2
----+-----+-----
 1  | 100 | 100
(1 row)

test=# select * from test_2;
 id | v1  | v2
----+-----+-----
 1  | 100 | 101
(1 row)

test=# explain analyze select * from test where v1 = 100 and v2 = 100;
                                              QUERY PLAN
------------------------------------------------------------------------------------------------------
 Append  (cost=0.00..52.11 rows=2 width=48) (actual time=0.011..0.017 rows=1 loops=1)
   ->  Seq Scan on test_1  (cost=0.00..26.05 rows=1 width=48) (actual time=0.011..0.011 rows=1 loops=1)
         Filter: ((v1 = 100) AND (v2 = 100))
   ->  Seq Scan on test_2  (cost=0.00..26.05 rows=1 width=48) (actual time=0.004..0.004 rows=0 loops=1)
         Filter: ((v1 = 100) AND (v2 = 100))
         Rows Removed by Filter: 1
 Planning Time: 0.457 ms
 Execution Time: 0.036 ms
(8 rows)

test=# explain analyze select * from test where ((v1 + v2) % 10) = 0 and v1 = 100 and v2 = 100;
                                              QUERY PLAN
------------------------------------------------------------------------------------------------------
 Seq Scan on test_1 test  (cost=0.00..34.08 rows=1 width=48) (actual time=0.010..0.011 rows=1 loops=1)
   Filter: ((v1 = 100) AND (v2 = 100) AND (((v1 + v2) % '10'::bigint) = 0))
 Planning Time: 0.131 ms
 Execution Time: 0.031 ms
(4 rows)
Re: Correct way of using complex expressions as partitioning key
On Fri, 2023-10-27 at 07:27 +0300, Alexander Rumyantsev wrote:
> Is there some correct way to use complex expressions as a key for a partitioned
> table? Inserting works as expected, but SELECT scans all partitions unless the
> complete partition key expression is used as a predicate
>
> test=# create table test (
>            id text,
>            v1 bigint,
>            v2 bigint
>        ) partition by range (((v1 + v2) % 10));
> CREATE TABLE
>
> test=# explain analyze select * from test where v1 = 100 and v2 = 100;
> [no partition pruning]
>
> test=# explain analyze select * from test where ((v1 + v2) % 10) = 0 and v1 = 100 and v2 = 100;
> [partition pruning]

Yes, you only get partition pruning if the WHERE clause contains a comparison
with the partitioning key. There is no way around that.

Yours,
Laurenz Albe
need SQL logic to find out ways to implement a checksum to validate table data is migrated correctly
Hi

Need SQL logic / different approaches to find out ways:

a) to implement a checksum to validate that table data is migrated correctly

Any guidance?

Regards
Bharani SV
Re: Correct way of using complex expressions as partitioning key
Alexander Rumyantsev schrieb am 27.10.2023 um 06:27:
> Hello!
>
> Is there some correct way to use complex expressions as a key for a partitioned
> table? Inserting works as expected, but SELECT scans all partitions unless the
> complete partition key expression is used as a predicate
>
> test=# create table test (
>            id text,
>            v1 bigint,
>            v2 bigint
>        ) partition by range (((v1 + v2) % 10));
> CREATE TABLE
>
> test=# create table test_1 partition of test for values from (0) to (1);
> CREATE TABLE
>
> test=# create table test_2 partition of test for values from (1) to (2);
> CREATE TABLE
>
> test=# insert into test values (1, 100, 101);
> INSERT 0 1
>
> test=# insert into test values (1, 100, 100);
> INSERT 0 1
>
> test=# select * from test_1;
>  id | v1  | v2
> ----+-----+-----
>  1  | 100 | 100
> (1 row)
>
> test=# select * from test_2;
>  id | v1  | v2
> ----+-----+-----
>  1  | 100 | 101
> (1 row)

It seems you are trying to simulate hash partitioning using that expression.
Why not use hash partitioning directly then?

    create table test (
        id text,
        v1 bigint,
        v2 bigint
    ) partition by hash (v1, v2);

    create table test_1 partition of test for values with (modulus 2, remainder 0);
    create table test_2 partition of test for values with (modulus 2, remainder 1);

Then the select will result in:

                                              QUERY PLAN
    ---------------------------------------------------------------------------------------------------------
     Seq Scan on test_1 test  (cost=0.00..17.49 rows=1 width=48) (actual time=0.015..0.016 rows=1 loops=1)
       Filter: ((v1 = 100) AND (v2 = 100))
     Planning Time: 0.159 ms
     Execution Time: 0.037 ms

Demo: https://dbfiddle.uk/wuopLYeQ
Re: need SQL logic to find out ways to implement a checksum to validate table data is migrated correctly
Hi

Why do you need checksums? Can you not employ a full outer join? My thought
behind this is that checksumming is quite costly CPU-wise, and you have to
fiddle with each and every relevant column, be it as part of the checksum
string or as part of the join. The join would have the advantage that you
could, if you wanted, also implement logic to tell you precisely in which
columns the difference is encountered.

Kind regards

Thiemo

Am 27.10.2023 um 10:56 schrieb Y_esteembsv-forum:
> Hi
> Need SQL logic / different approaches to find out ways:
> a) to implement a checksum to validate that table data is migrated correctly
> Any guidance?
> Regards
> Bharani SV
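[For illustration, a minimal sketch of the full outer join approach described above, assuming both the source and target tables are reachable from one database (e.g. via a foreign data wrapper or a staging schema) and share a key column; all object and column names here (source_table, target_table, id, col1, col2) are hypothetical:

    -- rows missing on either side, plus rows present on both sides whose columns differ
    SELECT coalesce(s.id, t.id) AS id,
           (s.id IS NULL) AS missing_in_source,
           (t.id IS NULL) AS missing_in_target,
           ((s.col1, s.col2) IS DISTINCT FROM (t.col1, t.col2)) AS columns_differ
    FROM source_table s
    FULL OUTER JOIN target_table t ON t.id = s.id
    WHERE s.id IS NULL
       OR t.id IS NULL
       OR (s.col1, s.col2) IS DISTINCT FROM (t.col1, t.col2);

An empty result means every row migrated identically; a non-empty result shows exactly which rows, and with an extended select list which columns, differ.]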
Re: need SQL logic to find out ways to implement a checksum to validate table data is migrated correctly
OP might be referring to migrating from, say, Oracle to PostgreSQL when oracle_fdw is not available.

On 10/27/23 04:34, Thiemo Kellner wrote:
> Hi
> Why do you need checksums? Can you not employ a full outer join? My thought
> behind this is that checksumming is quite costly CPU-wise, and you have to
> fiddle with each and every relevant column, be it as part of the checksum
> string or as part of the join. The join would have the advantage that you
> could, if you wanted, also implement logic to tell you precisely in which
> columns the difference is encountered.
> Kind regards
> Thiemo
>
> Am 27.10.2023 um 10:56 schrieb Y_esteembsv-forum:
>> Hi
>> Need SQL logic / different approaches to find out ways:
>> a) to implement a checksum to validate that table data is migrated correctly
>> Any guidance?
>> Regards
>> Bharani SV

-- 
Born in Arizona, moved to Babylonia.
Re: need SQL logic to find out ways to implement a checksum to validate table data is migrated correctly
On 10/27/23 03:56, Y_esteembsv-forum wrote:
> Hi
> Need SQL logic / different approaches to find out ways:
> a) to implement a checksum to validate that table data is migrated correctly

When migrating from Oracle to PostgreSQL, on both the Oracle side and the
PostgreSQL side, I dumped each table (for the big ones, I split them into
views) as a tab-separated value file ordered by the primary key and piped
that into md5sum. Had to play a few games with capitalization in blob/bytea
columns. That way, I could verify that each table's data was identical.

-- 
Born in Arizona, moved to Babylonia.
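[As a rough sketch of the PostgreSQL side of that technique — the database, table, and key column names below are hypothetical, and the Oracle-side extract must produce byte-identical formatting, ordering, and encoding for the hashes to be comparable:

    # dump the table in primary-key order as tab-separated text and hash it
    psql -X -d mydb -c "COPY (SELECT * FROM some_table ORDER BY id) TO STDOUT" | md5sum

If the resulting hash matches the one computed from the Oracle extract, the table contents can be considered identical.]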
BRIN index maintenance on table without primary key
Hello list. Key characteristics of my case:

+ HUGE table with 20G rows, ca 2TB
+ May be partitioned (have both versions on two test clusters ATM)
+ Plenty of inserts, no updates
+ No primary key - we had one IDENTITY bigint column until recently, but it
  proved useless and inefficient (index too big for mem) so we deleted it.
+ All queries are filtering on a not-unique not-null integer column.
+ On this column we have a BRIN INDEX since insert order is /mostly/ incremental.

So the question is: how to maintain the physical order of the tuples?

Even though the insertions populate the index column /mostly/ incrementally,
there are outliers and it's certain that over time the BRIN index will
deteriorate. I'm monitoring the "correlation" and I want to run a command to
"defrag" it when it drops below 0.9.

+ Can't run CLUSTER:
  ERROR: cannot cluster on index "tst_brin_idx" because access method does not support clustering
+ Can't run pg_repack, from [1]:
  > Target table must have a PRIMARY KEY, or at least a UNIQUE total index on a NOT NULL column

[1] https://reorg.github.io/pg_repack/

Any ideas?

Thank you in advance,
Dimitris
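[For reference, the "correlation" being monitored here can be read from the planner statistics roughly like this (a sketch only; the table name tst and column name grp are hypothetical, and ANALYZE must have run recently for the value to be current):

    -- 1.0 means the physical row order matches the column's logical order perfectly
    SELECT tablename, attname, correlation
    FROM pg_stats
    WHERE tablename = 'tst'
      AND attname = 'grp';
]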
Re: Disk wait problem... may not be hardware...
On 2023-10-26 22:03:25 -0400, p...@pfortin.com wrote:
> Are there any extra PG low level logs that can be turned on?

Have you looked at the query plans as I recommended? (You might also want to
enable track_io_timing to get extra information, but comparing just the query
plans of fast and slow queries would be a first step.)

You haven't shown any postgresql logs or your settings, so it's hard to know
what you have already turned on. There are a ton of logging related parameters.

> I've only found logging in the journal;

By "the journal" you mean the one maintained by journald? (Did you mention
what OS you are using? From several outputs I'm guessing it is Linux.)

On Linux systems postgresql is usually set up to log into files somewhere
under /var/log/postgresql (see the log_directory entry in postgresql.conf)
but your setup might be different, of course.

        hp

-- 
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | h...@hjp.at          |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"
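[For context, the comparison being asked for looks roughly like the following; track_io_timing is a regular server setting, though changing it may require superuser or an explicit GRANT, and the table name is taken from later messages in this thread:

    SET track_io_timing = on;
    -- with track_io_timing enabled, the BUFFERS output also reports I/O read/write
    -- times, which shows whether a slow run is actually waiting on the disk
    EXPLAIN (ANALYZE, BUFFERS) SELECT count(*) FROM ncvhis_2016_12_03;

Running this for one table that feels fast and one that feels slow, and comparing the two plans, is the first step Peter suggests.]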
pg_checksums?
Hi,

I have a few questions about pg_checksums. Long story short, checksums were
disabled in our environment but are getting enabled now (again!) because of
block corruptions which destroyed a few databases in a database cluster. And
before you say "told you so", the decision to disable checksums was not mine.
Some people just have to learn the hard way.

Anyway, re-enabling data checksums raises a few questions:

a) why isn't it possible to enable checksumming while a database cluster is up?

b) why isn't it possible to check whether checksums are enabled or not?

c) in a Patroni cluster consisting of a primary and a sync standby, is it safe
to enable checksumming in the replica, then switch over and enable it in the
ex-primary, i.e. now the new replica, without any penalty? Or do I have to
perform a reinit to really get them in sync again, even though patronictl
happily reports them to be in sync?

d) how long does it take to enable checksums in general? Minimizing down-time
is crucial. Does it depend on the database cluster size, or the number of
files it uses, or what can be taken as a criterion to estimate the necessary
down-time?

Thanks in advance for your insights.

Cheers
Paul
Re: pg_checksums?
On 10/27/23 13:34, Paul Förster wrote:
> Hi,
>
> I have a few questions about pg_checksums. Long story short, checksums were
> disabled in our environment but are getting enabled now (again!) because of
> block corruptions which destroyed a few databases in a database cluster. And
> before you say "told you so", the decision to disable checksums was not mine.
> Some people just have to learn the hard way.
>
> Anyway, re-enabling data checksums raises a few questions:
>
> a) why isn't it possible to enable checksumming while a database cluster is up?

Data might be changing.

> b) why isn't it possible to check whether checksums are enabled or not?

(This is my tiny test instance.)

$ pg_controldata | grep checksum
Data page checksum version:          0

postgres=# show data_checksums;
 data_checksums
----------------
 off
(1 row)

> c) in a Patroni cluster consisting of a primary and a sync standby, is it safe
> to enable checksumming in the replica, then switch over and enable it in the
> ex-primary, i.e. now the new replica, without any penalty? Or do I have to
> perform a reinit to really get them in sync again, even though patronictl
> happily reports them to be in sync?
>
> d) how long does it take to enable checksums in general? Minimizing down-time
> is crucial. Does it depend on the database cluster size, or the number of
> files it uses, or what can be taken as a criterion to estimate the necessary
> down-time?
>
> Thanks in advance for your insights.
>
> Cheers
> Paul

-- 
Born in Arizona, moved to Babylonia.
Re: pg_checksums?
Hi Ron,

> On Oct 27, 2023, at 21:02, Ron wrote:
>> b) why isn't it possible to check whether checksums are enabled or not?
>
> (This is my tiny test instance.)
>
> $ pg_controldata | grep checksum
> Data page checksum version:          0
>
> postgres=# show data_checksums;
>  data_checksums
> ----------------
>  off
> (1 row)

this helps a lot. Thanks very much. Now, there are only two other questions.

Cheers
Paul
Re: pg_checksums?
> On 27 Oct 2023, at 20:34, Paul Förster wrote:
>
> a) why isn't it possible to enable checksumming while a database cluster is
> up?

It is surprisingly complicated to enable checksums on a live cluster, a patch
was submitted a while back but ultimately never made it into postgres. The
below threads may shine some light on the problem:

https://www.postgresql.org/message-id/flat/CABUevEx8KWhZE_XkZQpzEkZypZmBp3GbM9W90JLp%3D-7OJWBbcg%40mail.gmail.com
https://www.postgresql.org/message-id/flat/560A2239-5DE2-4B9C-92BC-878C6822F47C%40yesql.se#10b665b2193445a3d7f24be7917a952c

-- 
Daniel Gustafsson
Re: pg_checksums?
On Fri, Oct 27, 2023 at 10:45:16PM +0200, Daniel Gustafsson wrote:
> > On 27 Oct 2023, at 20:34, Paul Förster wrote:
> >
> > a) why isn't it possible to enable checksumming while a database cluster is
> > up?
>
> It is surprisingly complicated to enable checksums on a live cluster, a patch
> was submitted a while back but ultimately never made it into postgres. The
> below threads may shine some light on the problem:
>
> https://www.postgresql.org/message-id/flat/CABUevEx8KWhZE_XkZQpzEkZypZmBp3GbM9W90JLp%3D-7OJWBbcg%40mail.gmail.com
> https://www.postgresql.org/message-id/flat/560A2239-5DE2-4B9C-92BC-878C6822F47C%40yesql.se#10b665b2193445a3d7f24be7917a952c

Yeah, it was a big surprise that this feature was so hard to implement
because we have _no_ infrastructure for having multiple data layouts
active in a live system. The discussion eventually made that clear.

If we have more features that need this kind of dynamic ability, we
might revisit this feature too.

-- 
  Bruce Momjian        https://momjian.us
  EDB                  https://enterprisedb.com

  Only you can decide what is important to you.
Re: pg_checksums?
Hi Bruce, hi Daniel,

> On Oct 27, 2023, at 23:21, Bruce Momjian wrote:
>
> On Fri, Oct 27, 2023 at 10:45:16PM +0200, Daniel Gustafsson wrote:
>>> On 27 Oct 2023, at 20:34, Paul Förster wrote:
>>
>>> a) why isn't it possible to enable checksumming while a database cluster is
>>> up?
>>
>> It is surprisingly complicated to enable checksums on a live cluster, a patch
>> was submitted a while back but ultimately never made it into postgres. The
>> below threads may shine some light on the problem:
>>
>> https://www.postgresql.org/message-id/flat/CABUevEx8KWhZE_XkZQpzEkZypZmBp3GbM9W90JLp%3D-7OJWBbcg%40mail.gmail.com
>> https://www.postgresql.org/message-id/flat/560A2239-5DE2-4B9C-92BC-878C6822F47C%40yesql.se#10b665b2193445a3d7f24be7917a952c
>
> Yeah, it was a big surprise that this feature was so hard to implement
> because we have _no_ infrastructure for having multiple data layouts
> active in a live system. The discussion eventually made that clear.
>
> If we have more features that need this kind of dynamic ability, we
> might revisit this feature too.

Ok, I see. But unfortunately, my questions c) and d) are still unanswered.
I'd especially be interested in an answer to c), i.e. is it *safe* to
"pg_checksums -e" the replica instance in a Patroni cluster, switch over,
and then do the other one?

Cheers
Paul
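[For context, the mechanics under discussion look roughly like the following. This is only a sketch of the per-node offline procedure: pg_checksums requires a cleanly shut down instance, in a Patroni setup the stop/start would normally go through Patroni rather than pg_ctl, and none of this answers whether the switchover sequence in question c) is safe:

    # on the stopped replica, with $PGDATA pointing at its data directory
    pg_ctl -D "$PGDATA" stop
    pg_checksums --enable -D "$PGDATA"
    pg_checksums --check  -D "$PGDATA"    # optional verification pass
    pg_ctl -D "$PGDATA" start
]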
Re: PgAdmin view
On Thu, Oct 26, 2023 at 5:56 AM Shaozhong SHI wrote:
> If a PgAdmin view is created and tables it queries are missing, what will
> happen?
>
> If a PgAdmin view is created, and it may take 20 hours to complete, what
> will happen?
>
> Would views automatically run when you start PgAdmin?

Views are rewritten into the query when the query is planned.

merlin
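[A minimal illustration of that point: a view stores no data and runs nothing by itself; the planner expands the view's definition into whatever query references it. The table and view names below are hypothetical:

    CREATE TABLE big_table (id bigint, payload text);
    CREATE VIEW recent AS SELECT * FROM big_table WHERE id > 1000;

    -- creating the view is instant; work happens only when the view is queried
    EXPLAIN SELECT count(*) FROM recent;
    -- the plan shows a scan of big_table with the view's WHERE clause folded in,
    -- not a separate execution of the view
]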
Re: Disk wait problem... may not be hardware...
Peter,

Thanks for your patience; I've been feeling pressure to get this resolved, so
I have been lax in providing info here. Hope the following helps...

On Fri, 27 Oct 2023 19:07:11 +0200 Peter J. Holzer wrote:
> On 2023-10-26 22:03:25 -0400, p...@pfortin.com wrote:
> > Are there any extra PG low level logs that can be turned on?

$ uname -a
Linux pf.pfortin.com 6.5.3-server-1.mga10 #1 SMP PREEMPT_DYNAMIC Sat Sep 16 00:04:28 UTC 2023 x86_64 GNU/Linux

PostgreSQL 15.4 on x86_64-mageia-linux-gnu, compiled by gcc (Mageia 12.3.0-3.mga9) 12.3.0, 64-bit

Operating System: Mageia 10
KDE Plasma Version: 5.27.8
KDE Frameworks Version: 5.111.0
Qt Version: 5.15.7
Kernel Version: 6.5.3-server-1.mga10 (64-bit)
Graphics Platform: X11
Processors: 20 × 12th Gen Intel® Core™ i7-12700K
Memory: 125.5 GiB of RAM
Graphics Processor: AMD Radeon RX 6600 XT
Manufacturer: Dell Inc.
Product Name: XPS 8950

> Have you looked at the query plans as I recommended? (You might also
> want to enable track_io_timing to get extra information, but comparing
> just the query plans of fast and slow queries would be a first step)

I didn't see how that would help since other than the table name the queries
are identical. Curious: are you implying PG stores tables differently? The
tables are probably greater than 90-95% identical rows. We're still
investigating how to best store that type of data which changes a small
number of rows each week...

The query is the built-in row-count query in SQL-workbench/J which is visible
in the attached files.

> explain (analyze, buffers) select count(*) from ncvhis_2016_12_03;

 Finalize Aggregate  (cost=404669.65..404669.66 rows=1 width=8) (actual time=844.158..847.309 rows=1 loops=1)
   Buffers: shared hit=248 read=25022
   ->  Gather  (cost=404669.43..404669.65 rows=2 width=8) (actual time=844.133..847.301 rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         Buffers: shared hit=248 read=25022
         ->  Partial Aggregate  (cost=403669.43..403669.45 rows=1 width=8) (actual time=838.772..838.772 rows=1 loops=3)
               Buffers: shared hit=248 read=25022
               ->  Parallel Index Only Scan using ncvhis_2016_12_03_voted_party_cd_idx on ncvhis_2016_12_03  (cost=0.44..372735.05 rows=12373755 width=0) (actual time=18.277..592.473 rows=9900389 loops=3)
                     Heap Fetches: 0
                     Buffers: shared hit=248 read=25022
 Planning Time: 0.069 ms
 JIT:
   Functions: 8
   Options: Inlining false, Optimization false, Expressions true, Deforming true
   Timing: Generation 0.284 ms, Inlining 0.000 ms, Optimization 0.268 ms, Emission 3.590 ms, Total 4.143 ms
 Execution Time: 847.498 ms

> explain (analyze, buffers) select count(*) from ncvhis_2020_08_01;

 Finalize Aggregate  (cost=438377.94..438377.95 rows=1 width=8) (actual time=624.700..628.024 rows=1 loops=1)
   Buffers: shared hit=286 read=27084
   ->  Gather  (cost=438377.73..438377.94 rows=2 width=8) (actual time=624.669..628.016 rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         Buffers: shared hit=286 read=27084
         ->  Partial Aggregate  (cost=437377.73..437377.74 rows=1 width=8) (actual time=619.297..619.297 rows=1 loops=3)
               Buffers: shared hit=286 read=27084
               ->  Parallel Index Only Scan using ncvhis_2020_08_01_voted_party_cd_idx on ncvhis_2020_08_01  (cost=0.44..403856.78 rows=13408379 width=0) (actual time=0.029..357.697 rows=10724282 loops=3)
                     Heap Fetches: 0
                     Buffers: shared hit=286 read=27084
 Planning Time: 0.072 ms
 JIT:
   Functions: 8
   Options: Inlining false, Optimization false, Expressions true, Deforming true
   Timing: Generation 0.285 ms, Inlining 0.000 ms, Optimization 0.218 ms, Emission 3.933 ms, Total 4.435 ms
 Execution Time: 628.216 ms

> explain (analyze, buffers) select count(*) from ncvhis_2020_10_31;

 Finalize Aggregate  (cost=438179.73..438179.74 rows=1 width=8) (actual time=1090.209..1092.976 rows=1 loops=1)
   Buffers: shared hit=276 read=27095
   ->  Gather  (cost=438179.52..438179.73 rows=2 width=8) (actual time=1090.141..1092.967 rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         Buffers: shared hit=276 read=27095
         ->  Partial Aggregate  (cost=437179.52..437179.53 rows=1 width=8) (actual time=1084.766..1084.767 rows=1 loops=3)
               Buffers: shared hit=276 read=27095
               ->  Parallel Index Only Scan using ncvhis_2020_10_31_voted_party_cd_idx on ncvhis_2020_10_31  (cost=0.44..403675.53 rows=13401595 width=0) (actual time=222.686..816.447 rows=10720916 loops=3)
                     Heap Fetches: 0
                     Buffers: shared hit=276 read=27095
 Planning Time: 0.131 ms
 JIT:
   Functions: 8
   Options: Inlining false, Optimization false, Expressions true, Deforming true
   Timing: Generation 0.322 ms, Inlining 0.000 ms, Optimization 0.261 ms, Emission 3.648 ms, Total 4.231 ms
 Execution Time: 1093.209 ms

Those responses are reasonably quick...

A suggestion from the Linux list was to run "fstrim -v /mnt/work" which is the
NVMe mount point for the DB:

$ fstrim -nv /mnt/work
/mnt/work: 0 B (dry run) trimmed
$ fstrim -v /mnt/work
/mnt/work: 2 TiB (2248993087488 bytes) trimmed
bu
Re: Disk wait problem... may not be hardware...
On 10/27/23 16:46, p...@pfortin.com wrote:
> Peter,
>
> Thanks for your patience; I've been feeling pressure to get this resolved, so
> I have been lax in providing info here. Hope the following helps...
>
> Something I hadn't noticed before: SQL-workbench/J (build 129.6) displays an
> execution timer at the bottom of the window. I see all the queries reporting
> 0-47ms, which is plenty fast. It's the results that often take a long time to
> display.

Which is not surprising as building GUI elements is an expensive operation.

If the time to display is your wait issue then this is not really a disk issue.

What happens if you use psql as the client?

-- 
Adrian Klaver
adrian.kla...@aklaver.com
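[For comparison, timing the same query from psql reports only the server round trip and leaves out any GUI grid rendering; the database name below is hypothetical:

    $ psql -d mydb
    mydb=# \timing on
    Timing is on.
    mydb=# select count(*) from ncvhis_2016_12_03;

psql prints a "Time: ..." line after each result; if that stays small while SQL-workbench/J feels slow, the delay is in result rendering, not in PostgreSQL.]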