Re: Partitioning options

2024-02-21 Thread Alec Lazarescu
Hi, Justin.

The example link has self-contained DDL that creates the partitions (in
flat vs. composite mode for comparison) and then creates the FKs on
each, showing the marked speed difference for the same net number of
partitions (1200 flat vs. 80x15 = 1200 composite):
https://www.postgresql.org/message-id/CAE%2BE%3DSQacy6t_3XzCWnY1eiRcNWfz4pp02FER0N7mU_F%2Bo8G_Q%40mail.gmail.com
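
To make the shape of the comparison concrete, here is a minimal sketch
of the two layouts (table and column names are invented for
illustration, not taken from the linked thread):

```sql
-- Flat: one parent with 1200 direct partitions (only one shown).
CREATE TABLE events_flat (tenant_id int, bucket int, payload text)
  PARTITION BY LIST (tenant_id);
CREATE TABLE events_flat_1 PARTITION OF events_flat FOR VALUES IN (1);
-- ... repeated up to events_flat_1200

-- Composite: 80 list partitions, each hash-subpartitioned 15 ways.
CREATE TABLE events_nested (tenant_id int, bucket int, payload text)
  PARTITION BY LIST (tenant_id);
CREATE TABLE events_nested_1 PARTITION OF events_nested
  FOR VALUES IN (1) PARTITION BY HASH (bucket);
CREATE TABLE events_nested_1_0 PARTITION OF events_nested_1
  FOR VALUES WITH (MODULUS 15, REMAINDER 0);
-- ... 15 hash partitions under each of the 80 list partitions
```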

Alec

On Tue, Feb 20, 2024 at 11:59 AM Justin  wrote:
>
>
> On Sun, Feb 18, 2024 at 5:20 PM Alec Lazarescu  wrote:
>>
>> "Would probably look at a nested partitioning"
>>
>> I'm not the original poster, but I have a schema with nested
>> (composite) partitions and I do run into some significant
>> inefficiencies compared to flat partitions in various schema metadata
>> operations (queries to get the list of tables, creating foreign keys,
>> etc.) in tables with 1,000+ total partitions.
>>
>> One example: 
>> https://www.postgresql.org/message-id/CAE%2BE%3DSQacy6t_3XzCWnY1eiRcNWfz4pp02FER0N7mU_F%2Bo8G_Q%40mail.gmail.com
>>
>> Alec
>>
>>
>
> Hi Alec,
>
> I would need to see the DDL of the partitions and the queries accessing
> these partitions to have an opinion.
>
> Thank you
> Justin




How to schema-qualify "IS DISTINCT FROM" on trigger definition? (I created a db that cannot be dump/restored)

2024-02-21 Thread Vick Khera
I've created a database which my vendor (Supabase) cannot
dump/restore/upgrade. Ultimately, it comes down to this trigger statement,
and the fact that the underlying operations needed to perform the `IS
DISTINCT FROM` comparison in the WHEN clause need to be found in the
`public` schema. During the restore, the search path is empty, so it fails.

Full example file is below.

The trigger:

CREATE TRIGGER record_content_update
  BEFORE UPDATE OF content, embedding ON t1
  FOR EACH ROW
  WHEN (((new.content <> old.content) OR (new.embedding IS DISTINCT FROM old.embedding)))
  EXECUTE FUNCTION t1_content_update_handler();

The content field is JSONB, and the embedding field is a vector from the
pgvector extension.

I make a backup using pg_dump, and upon restore it errors out with this:

psql:dump1.sql:122: ERROR:  operator does not exist: public.vector =
public.vector
LINE 1: ... (((new.content <> old.content) OR (new.embedding IS DISTINC...
 ^
HINT:  No operator matches the given name and argument types. You might
need to add explicit type casts.

The ^ is under the "IS DISTINCT" in case the formatting makes it unclear.

If I make the operator just `<>` the pg_dump properly schema-qualifies it
in the dump as

new.embedding OPERATOR(public.<>) old.embedding

but I need to account for NULLs. I cannot find a way to schema-qualify
the `IS DISTINCT FROM` comparison.

How do I make this trigger definition survive pg_dump/pg_restore? I cannot
alter the dump file between the steps.

I'm running version: psql (PostgreSQL) 15.5. For my tests I'm on FreeBSD
14, but Supabase runs whatever Linux they use, with Pg version 15.1.

Full reproduction steps:

Save the file below as create.sql then run these commands:

createdb -U postgres t1
psql -U postgres -f create.sql t1
pg_dump -U postgres t1 > dump.sql
createdb -U postgres t2
psql -U postgres -f dump.sql t2

On the last step, the above referenced error will occur.

Is there a way to fix this, or is it a "don't do that" situation?

The only workaround I can think of is to move the IS DISTINCT FROM test to
be inside my trigger function.
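
That workaround might look like this (a sketch: the null-safe comparison
moves into the plpgsql body, where the expression is parsed at call time
under the session's normal search_path, so pg_dump has nothing left to
schema-qualify in the trigger definition):

```sql
CREATE OR REPLACE FUNCTION t1_content_update_handler() RETURNS TRIGGER
   LANGUAGE plpgsql
   AS $$
 BEGIN
   -- Do the null-safe test here instead of in the trigger's WHEN
   -- clause; skip rows where nothing relevant changed.
   IF NEW.content <> OLD.content
      OR NEW.embedding IS DISTINCT FROM OLD.embedding THEN
     RAISE INFO '% trigger called for id=%', TG_TABLE_NAME, OLD.id;
   END IF;
   RETURN NEW;
 END;
$$;

CREATE OR REPLACE TRIGGER record_content_update
  BEFORE UPDATE OF content, embedding ON t1
  FOR EACH ROW
  EXECUTE FUNCTION t1_content_update_handler();
```

The trigger then fires on every qualifying UPDATE, at the cost of a
function call for rows the old WHEN clause would have filtered out.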

--- create.sql file ---
CREATE EXTENSION IF NOT EXISTS "vector";

CREATE TABLE t1 (
  id SERIAL PRIMARY KEY,
  content JSONB DEFAULT '{}'::JSONB NOT NULL,
  embedding vector
);

CREATE FUNCTION t1_content_update_handler() RETURNS TRIGGER
   LANGUAGE plpgsql
   AS $$
 BEGIN
   RAISE INFO '% trigger called for id=%', TG_TABLE_NAME, OLD.id;
   RETURN NEW;
 END;
$$;

CREATE TRIGGER record_content_update
  BEFORE UPDATE OF content, embedding ON t1
  FOR EACH ROW
  WHEN (((new.content <> old.content) OR (new.embedding IS DISTINCT FROM old.embedding)))
  EXECUTE FUNCTION t1_content_update_handler();
--- end ---


Re: How to schema-qualify "IS DISTINCT FROM" on trigger definition? (I created a db that cannot be dump/restored)

2024-02-21 Thread Tom Lane
Vick Khera  writes:
> I've created a database which my vendor (Supabase) cannot
> dump/restore/upgrade. Ultimately, it comes down to this trigger statement,
> and the fact that the underlying operations needed to perform the `IS
> DISTINCT FROM` comparison in the WHEN clause need to be found in the
> `public` schema. During the restore, the search path is empty, so it fails.

Yeah.  We've had some discussions about inventing a version of IS
DISTINCT FROM (and some other SQL-spec syntaxes with the same problem)
that supports schema-qualification of the underlying operator.  But
it hasn't gotten further than preliminary discussion.

For the moment, I think the only feasible solution is for your trigger
function to set the search path it needs by adding a "SET search_path
= whatever" clause to the function's CREATE command.
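
Applied to the function upthread, that clause might look like this
(body unchanged, just the added SET):

```sql
CREATE FUNCTION t1_content_update_handler() RETURNS TRIGGER
   LANGUAGE plpgsql
   SET search_path = public
   AS $$
 BEGIN
   RAISE INFO '% trigger called for id=%', TG_TABLE_NAME, OLD.id;
   RETURN NEW;
 END;
$$;
```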

regards, tom lane




Discover PostgreSQL's Graph Power with Apache AGE!

2024-02-21 Thread Nandhini Jayakumar
Hello PostgreSQL Community,

Excited to share how Apache AGE enhances PostgreSQL with smooth graph
features! It handles complex data and supports both SQL and Cypher. Join
our awesome community, check out the tutorials, and let's dive into those
data projects!

More info: Apache AGE GitHub & website.


Regards,
Nandhini Jayakumar.


How do you optimize the disk IO when you cannot assume a file will start at a boundary then?

2024-02-21 Thread Siddharth Jain
Hi All,

I understand the storage layer in databases goes to great lengths to ensure that:
- a row does not cross a block boundary
- reads/writes/allocations happen in units of blocks
etc. The motivation is that at the OS level, reads and writes happen in
pages (blocks), not individual bytes. I am only concerned with SSDs, but
I think the principle applies to HDDs as well.

But how can we do all this when we are not even guaranteed that the
beginning of a file will be aligned with a block boundary?

Further, I don't see any APIs exposing I/O operations in terms of blocks.
All the file I/O APIs I see expose a file as a randomly accessible
contiguous byte buffer. Would it not have been easier if there were APIs
that exposed I/O operations in terms of blocks?

can someone explain this to me?

Sid