Re: glibc update 2.31 to 2.38

2024-09-22 Thread Shaheed Haque
I've been working on Unix-like systems for decades, and though I thought I understood most of the issues to do with i18n/l10n, I've only just started using Postgres, and what I don't understand is why these changes ONLY seem to affect Postgres. Or is it more that it also affects text editors and the like,

Re: Database schema for "custom fields"

2024-09-10 Thread Shaheed Haque
The relational purists will have their concerns, but especially given what you described about your performance and volumetrics, there is a reason why JSON(b) is a thing. For type checking, and more, I've had success with a multi-key approach so that one entry might comprise: - A "name" - A "type" - A
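A minimal sketch of the multi-key idea described above (the table and key names here are hypothetical, not from the original message): each custom field is stored as a small JSONB object carrying its own name and type alongside the value, which a GIN index can then help query.

```sql
-- Hypothetical table: "fields" holds a list of typed custom-field entries.
CREATE TABLE record_with_custom_fields (
    id     bigserial PRIMARY KEY,
    fields jsonb NOT NULL DEFAULT '[]'
);

INSERT INTO record_with_custom_fields (fields) VALUES
  ('[{"name": "colour", "type": "str", "value": "red"},
    {"name": "weight", "type": "int", "value": 42}]');

-- A GIN index on "fields" can assist containment (@>) lookups.
CREATE INDEX idx_fields ON record_with_custom_fields USING gin (fields);

-- Find rows that have a custom field called "weight".
SELECT id
FROM record_with_custom_fields
WHERE fields @> '[{"name": "weight"}]';
```

The "type" key is purely advisory to the application; Postgres itself does not enforce it, which is the trade-off the relational purists are worried about.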

Re: Monitoring logical replication

2024-09-03 Thread Shaheed Haque
Mostly to close the loop on this, now that I have things going seemingly reliably... On Tue, 18 Jun 2024 at 14:33, Ron Johnson wrote: > On Tue, Jun 18, 2024 at 5:03 AM Shaheed Haque > wrote: > >> Hi all, >> >> Is there an "official" pairing of LSN values o

Re: Postgres Logical Replication - how to see what subscriber is doing with received data?

2024-09-02 Thread Shaheed Haque
t result: the LSNs can tell you if you are not in sync, but not the reason why. (Again, I understand that row counts worked for me, but might not work for others). Thanks for your kind help and pointers! Regards, > Muhammad Ikram > > > > On Mon, Sep 2, 2024 at 12:42 PM Shaheed Haq

Re: Postgres Logical Replication - how to see what subscriber is doing with received data?

2024-09-02 Thread Shaheed Haque
gnose which knobs to tweak (whether in application, PG, the OS or the network) is basically black magic when all we really have is a pair of opaque LSNs. > > Regards, > Muhammad Ikram > > > On Sun, Sep 1, 2024 at 9:22 PM Shaheed Haque > wrote: > >> Since nobody

Re: Postgres Logical Replication - how to see what subscriber is doing with received data?

2024-09-01 Thread Shaheed Haque
Since nobody more knowledgeable has replied... I'm very interested in this area and still surprised that there is no official/convenient/standard way to approach this (see https://www.postgresql.org/message-id/CAHAc2jdAHvp7tFZBP37awcth%3DT3h5WXCN9KjZOvuTNJaAAC_hg%40mail.gmail.com ). Based partly
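The kind of LSN comparison under discussion can be sketched with the standard monitoring views (the views and functions below are stock PostgreSQL; interpreting the lag numbers is still up to you, which is the thread's complaint):

```sql
-- On the publisher: how far each logical slot's confirmed position lags
-- behind the current WAL insert position, in bytes.
SELECT slot_name,
       pg_current_wal_lsn() AS current_lsn,
       confirmed_flush_lsn,
       pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn) AS lag_bytes
FROM pg_replication_slots
WHERE slot_type = 'logical';

-- On the subscriber: the last LSN received and the last position
-- reported back upstream, per subscription worker.
SELECT subname, received_lsn, latest_end_lsn, latest_end_time
FROM pg_stat_subscription;
```

As the thread notes, matching LSNs tell you the two sides are caught up, but a persistent gap does not by itself explain *why*.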

Re: pg_dump restores as expected on some machines and reports duplicate keys on others

2024-06-23 Thread Shaheed Haque
First, thanks for all the kind replies. To my eternal shame, after spending hours trying to debug this, I found, buried deep in one of my own initialisation scripts, the creation of a handful of "seed" database objects which, of course, caused all my woes. Thanks again, Shaheed

pg_dump restores as expected on some machines and reports duplicate keys on others

2024-06-22 Thread Shaheed Haque
Hi, I am using Postgres 14 on AWS RDS and am seeing the output of pg_dump be restored as expected by pg_restore on some database instances, and fail with reports of duplicate keys on other database instances: - My deployments are always a pair, one "logic VM" for Django etc and one "RDS ins

Re: Monitoring logical replication

2024-06-18 Thread Shaheed Haque
ry useful. On Tue, 30 Jan 2024 at 11:27, Shaheed Haque wrote: > This is great, thank you for posting. I'm currently a subcontinent or two > away from my dev env, but will compare your approach with mine (you are > using some facilities of psql I'm not familiar with). At l

Re: Purpose of pg_dump tar archive format?

2024-06-04 Thread Shaheed Haque
On Tue, 4 Jun 2024 at 20:47, Gavin Roy wrote: > > On Tue, Jun 4, 2024 at 3:15 PM Ron Johnson > wrote: > >> >> But why tar instead of custom? That was part of my original question. >> > > I've found it pretty useful for programmatically accessing data in a dump > for large databases outside of th

Re: Purpose of pg_dump tar archive format?

2024-06-04 Thread Shaheed Haque
We use it. I bet lots of others do too. On Tue, 4 Jun 2024, 18:06 Adrian Klaver, wrote: > Reply to list also. > Ccing list > > On 6/4/24 10:03 AM, Ron Johnson wrote: > > > > > If you don't need the tar format then don't use it. > > > > > > That's neither the purpose nor the point of my quest

Re: PostgreSQL as advanced job queuing system

2024-03-22 Thread Shaheed Haque
Generally, I'd suggest you think carefully about the nature of the jobs, and draw up a list of must-have properties (performance of course, but also things like whether jobs have to survive planned or unplanned outages, be visible across a WAN, numbers of readers and writers, delivery guarantees, e
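For the common case where jobs must survive outages and be shared by many workers, the usual Postgres building block is SELECT ... FOR UPDATE SKIP LOCKED; a sketch with hypothetical table and column names:

```sql
-- Hypothetical durable queue table.
CREATE TABLE job_queue (
    id      bigserial PRIMARY KEY,
    payload jsonb NOT NULL,
    state   text NOT NULL DEFAULT 'pending'
);

-- Each worker claims one job; SKIP LOCKED means workers never block on
-- rows another worker has already claimed in its open transaction.
WITH next_job AS (
    SELECT id
    FROM job_queue
    WHERE state = 'pending'
    ORDER BY id
    LIMIT 1
    FOR UPDATE SKIP LOCKED
)
UPDATE job_queue
SET state = 'running'
FROM next_job
WHERE job_queue.id = next_job.id
RETURNING job_queue.id, job_queue.payload;
```

Delivery guarantees and visibility across a WAN are exactly the properties this pattern does and does not give you: jobs are transactional and crash-safe, but all workers must reach the one database.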

Re: creating a subset DB efficiently ?

2024-03-09 Thread Shaheed Haque
I go more or less the other way. I have a script which follows the child...parent...grandparent...etc foreign keys in the source database and dumps only the records belonging to the selected "project" (your terminology, in my case it is "client"). I save the dumped data to an archive file. The scr
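The per-"project" dump described above can be approximated one table at a time with psql's \copy, walking the foreign keys by hand from parent to child (all table and column names below are hypothetical):

```sql
-- Dump only the rows belonging to client 42, following the
-- child -> parent foreign keys of a hypothetical schema.
\copy (SELECT * FROM client  WHERE id = 42) TO 'client.csv'  CSV
\copy (SELECT * FROM project WHERE client_id = 42) TO 'project.csv' CSV
\copy (SELECT * FROM task    WHERE project_id IN
         (SELECT id FROM project WHERE client_id = 42)) TO 'task.csv' CSV
```

A script that discovers the FK graph from information_schema (as the message describes) is just the generalisation of this, emitting one such statement per table in dependency order.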

Re: Improving pg_dump performance when handling large numbers of LOBs

2024-02-05 Thread Shaheed Haque
Might it be worth a modest amount of time using some basic profiling to see where the time is going? A week is a looonnngg time, even for 150e6 operations. For example, if there is an unexpectedly high IO load, some temporary M.2 storage might help? On Tue, 6 Feb 2024, 01:36 Ron Johnson, wrote: > O

Re: Scriptable way to validate a pg_dump restore ?

2024-01-30 Thread Shaheed Haque
On Tue, 30 Jan 2024, 05:02 Adrian Klaver, wrote: > On 1/29/24 11:35, Shaheed Haque wrote: > > > > > > On Tue, 30 Jan 2024, 00:27 Adrian Klaver, > <mailto:adrian.kla...@aklaver.com>> wrote: > > > > On 1/29/24 10:12, Shaheed Haque wrote: > >

Re: Monitoring logical replication

2024-01-30 Thread Shaheed Haque
psql -t regdns | head -1 | xargs) > if [ $COUNT -eq $NUMTAB ]; then > echo "OK: Subscription '$PUB' contains $NUMTAB tables - > that is OK. Checking for initial-sync status ..." > logger -t $PREFIX "OK: Subscription '$PUB' contains

Re: Scriptable way to validate a pg_dump restore ?

2024-01-29 Thread Shaheed Haque
On Tue, 30 Jan 2024, 00:27 Adrian Klaver, wrote: > On 1/29/24 10:12, Shaheed Haque wrote: > > > > > > Yes. But I was under the impression that the initial copy of logical > > replication was the same? > > > > Are you taking about the copy_data option to WI

Re: Scriptable way to validate a pg_dump restore ?

2024-01-29 Thread Shaheed Haque
On Mon, 29 Jan 2024, 23:57 Adrian Klaver, wrote: > On 1/29/24 09:28, Shaheed Haque wrote: > > > > > > Right, for me, state, not just record count is what I'm interested in > > (for the initial full table copy part of replication). So, given the > > explanati

Re: Scriptable way to validate a pg_dump restore ?

2024-01-29 Thread Shaheed Haque
On Mon, 29 Jan 2024, 22:52 Adrian Klaver, wrote: > On 1/29/24 00:12, Laura Smith wrote: > > Hi > > > > Let's say I've got a scenario where I'm doing a pg_dump replication > rather than online streaming, e.g. due to air-gap or whatever. > > > > Is there a scriptable way to validate the restore ?

Re: Scriptable way to validate a pg_dump restore ?

2024-01-29 Thread Shaheed Haque
I'd also like to know how to do this. The current approaches seem, afaict, to involve making checks on both ends of the connection. Even given the inherently racy nature of the issue, that seems unwieldy to me. https://www.postgresql.org/message-id/CAHAc2jdAHvp7tFZBP37awcth%3DT3h5WXCN9KjZOvuTNJaAAC_hg%40m
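A crude version of the both-ends check discussed in this thread is to run the same per-table count query on source and restored databases and diff the output; it proves cardinality, not content, which is part of why the thread finds it unsatisfying:

```sql
-- Run on both databases and compare. n_live_tup is cheap but only a
-- planner estimate; count(*) is exact but scans each table.
SELECT schemaname, relname, n_live_tup
FROM pg_stat_user_tables
ORDER BY schemaname, relname;

-- Exact count for a specific table of interest ('mytable' is a placeholder):
SELECT count(*) FROM mytable;
```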

Monitoring logical replication

2023-10-07 Thread Shaheed Haque
Hi, I've been playing with logical replication (currently on PG14), specifically in an AWS RDS Postgres context, but NOT using AWS' own replication tooling. I'm generally familiar with the challenges of distributed systems (such as causality, time synchronisation etc), but not especially familiar wit

Re: Is it possible to index "deep" into a JSONB column?

2022-05-31 Thread Shaheed Haque
I'm unable to make "CREATE INDEX...USING gin ((snapshot -> '$.employee.* ? (@.pay_graph <> 0 || @.last_run_of_employment == true || @.state.employment[last][2] == 0)'))" trigger that. On Tue, 31 May 2022 at 10:16, Shaheed Haque wrote: > > OK, I w

Re: Is it possible to index "deep" into a JSONB column?

2022-05-31 Thread Shaheed Haque
gin ((snapshot -> '$.employee.*')); CREATE INDEX ... USING gin ((snapshot -> '$.employee')); CREATE INDEX bar ON paiyroll_payrun USING gin ((snapshot -> '$.employee.*.pay_graph')); Any thoughts on indexing for this case? If it makes any difference, I

Re: Is it possible to index "deep" into a JSONB column?

2022-05-30 Thread Shaheed Haque
oyee": {"a12": {"date_of_birth": >> "2000-01-01"},"b56": {"date_of_birth": "2000-02-02"}}}') >> ; >> >> select jsonb_path_query_array( jsonb_col, '$.employee.*.date_of_birth' ) >> from jsonb_table; >> >>

Re: Is it possible to index "deep" into a JSONB column?

2022-05-30 Thread Shaheed Haque
Hi Bryn, On Mon, 30 May 2022 at 03:12, Bryn Llewellyn wrote: ... > > Try this: > >snapshot -> ‘employee’->>’date_of_birth’ > Syntactically, that works: create index bryn on paiyroll_payrun using btree ((snapshot -> 'employee'->>'date_of_birth')); But IIUC it is looking for 'date_of_bir

Is it possible to index "deep" into a JSONB column?

2022-05-29 Thread Shaheed Haque
Suppose I have a JSONB field called "snapshot". I can create a GIN index on it like this: create index idx1 on mytable using gin (snapshot); In principle, I believe this allows index-assisted access to keys and values nested in arrays and inner objects but in practice, it seems the planner "oft
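One way to make the planner actually use a GIN index for "deep" matches, assuming the "snapshot"/"mytable" names from the message, is to index with the jsonb_path_ops operator class and query with a jsonpath operator; this is a sketch, not the thread's confirmed resolution:

```sql
-- jsonb_path_ops indexes are smaller and faster than the default, but
-- support only the containment (@>) and jsonpath (@?, @@) operators.
CREATE INDEX idx_snapshot_path ON mytable USING gin (snapshot jsonb_path_ops);

-- The query must use an indexable operator for the planner to consider
-- the index; chains of -> / ->> are not indexable this way.
SELECT id
FROM mytable
WHERE snapshot @? '$.employee.*.date_of_birth ? (@ == "2000-01-01")';
```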

Re: JSONB index not in use, but is TOAST the real cause of slow query?

2022-05-29 Thread Shaheed Haque
On Sun, 29 May 2022, 15:58 Tom Lane, wrote: > Shaheed Haque writes: > > Unfortunately, the real query which I think should behave very > > similarly is still at the several-seconds level despite using the > > index. ... > > > -> Bitmap Heap Scan on paiyroll

Re: JSONB index not in use, but is TOAST the real cause of slow query?

2022-05-28 Thread Shaheed Haque
apAnd (cost=26.88..26.88 rows=1 width=0) (actual time=0.038..0.039 rows=0 loops=1) -> Bitmap Index Scan on paiyroll_payrun_company_id_ce341888 (cost=0.00..6.56 rows=304 width=0) (actual time=0.016..0.016 rows=304 loops=1) Index Cond: (company_id = 173) -> B

Re: JSONB index not in use, but is TOAST the real cause of slow query?

2022-05-28 Thread Shaheed Haque
knobs you suggested and will report back with anything of note. One last thought about TOAST. If the cost of the '->' operator retrieving the data cannot be obviated, is there any way to tweak how that works? Thanks, Shaheed On Sat, 28 May 2022 at 19:41, Tom Lane wrote: > > Sh

JSONB index not in use, but is TOAST the real cause of slow query?

2022-05-28 Thread Shaheed Haque
Hi, I have a database table with a modest number of rows (<1000) but where one column in the table is a JSONB "snapshot" which can be a few MB in size. Generally, this is a great fit for the read-write access patterns involved, but there is a read-query which is VERY slow. I've searched via Googl
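Two knobs that are sometimes relevant when a multi-MB JSONB column dominates query time are the column's TOAST storage strategy and an expression index over just the scalar being filtered on. A sketch, assuming the paiyroll_payrun/snapshot names that appear elsewhere in this thread:

```sql
-- EXTERNAL stores the value out-of-line but uncompressed, trading disk
-- space for cheaper access; only newly written rows are affected.
ALTER TABLE paiyroll_payrun ALTER COLUMN snapshot SET STORAGE EXTERNAL;

-- Indexing the one scalar the query filters on lets the filter run
-- without detoasting the whole multi-MB document per row
-- ('some_key' is a placeholder).
CREATE INDEX idx_snapshot_scalar
    ON paiyroll_payrun ((snapshot ->> 'some_key'));
```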

Re: Using JSONB with nested key-value structures, and skipping/wildcarding the key

2020-06-02 Thread Shaheed Haque
> > Hi, > > On Mon, 1 Jun 2020 at 23:50, Alban Hertroys wrote: > > On 1 Jun 2020, at 20:18, Shaheed Haque wrote: > > > > Hi, > > > > I'm using Django's ORM to access Postgres12. My "MyModel" table has a > JSONB column called 'snapshot'

Using JSONB with nested key-value structures, and skipping/wildcarding the key

2020-06-01 Thread Shaheed Haque
Hi, I'm using Django's ORM to access Postgres12. My "MyModel" table has a JSONB column called 'snapshot'. In Python terms, each row's 'snapshot' looks like this: == snapshot = { 'pay_definition' : { '1234': {..., 'name': 'foo', ...}, '99': {..., 'name': 'ba
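For the structure shown above, skipping over the unknown keys ('1234', '99', ...) is what the jsonpath wildcard does; a sketch, where the table name for Django's MyModel is an assumption:

```sql
-- Collect the 'name' of every pay_definition entry, whatever its key.
SELECT jsonb_path_query_array(snapshot, '$.pay_definition.*.name')
FROM myapp_mymodel;
```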