"Would probably look at a nested partitioning" I'm not the original poster, but I have a schema with nested (composite) partitions and I do run into some significant inefficiencies compared to flat partitions in various schema metadata operations (queries to get the list of tables, creating foreign keys, etc.) in tables with 1,000+ total partitions.
One example: https://www.postgresql.org/message-id/CAE%2BE%3DSQacy6t_3XzCWnY1eiRcNWfz4pp02FER0N7mU_F%2Bo8G_Q%40mail.gmail.com Alec On Sun, Feb 11, 2024 at 8:25 AM Justin <zzzzz.g...@gmail.com> wrote: > > Hi Marc, > > Nested partitioning still allows for simple data deletion by dropping the > table that falls in that date range. > > Probably thinking of partitioning by multicolomn rules which is very complex > to set up > > On Fri, Feb 9, 2024, 10:29 AM Marc Millas <marc.mil...@mokadb.com> wrote: >> >> >> >> >> On Thu, Feb 8, 2024 at 10:25 PM Justin <zzzzz.g...@gmail.com> wrote: >>> >>> Hi Sud, >>> >>> Would not look at HASH partitioning as it is very expensive to add or >>> subtract the number of partitions. >>> >>> Would probably look at a nested partitioning using customer ID using range >>> or list of IDs then by transaction date, Its easy to add partitions and >>> balance the partitions segments. >> >> >> I'll not do that because, then, when getting rid of obsolete data, you must >> delete a huge number of records, and vacuum each partition. >> if partitioning by date, you will ease greatly the cleaning, by just getting >> rid of obsolete partitions which is quite speedy.( no delete, no vacuum, no >> index updates, ...) >> Marc >> >>> >>> Keep in mind that SELECT queries being used on the partition must use the >>> partitioning KEY in the WHERE clause of the query or performance will >>> suffer. >>> >>> Suggest doing a query analysis before deploying partition to confirm the >>> queries WHERE clauses matched the planned partition rule. I suggest that >>> 80% of the queries of the executed queries must match the partition rule if >>> not don't deploy partitioning or change all the queries in the application >>> to match the partition rule >>> >>> >>> On Thu, Feb 8, 2024 at 3:51 PM Greg Sabino Mullane <htamf...@gmail.com> >>> wrote: >>>>> >>>>> Out of curiosity, As OP mentioned that there will be Joins and also >>>>> filters on column Customer_id column , so why don't you think that >>>>> subpartition by customer_id will be a good option? I understand List >>>>> subpartition may not be an option considering the new customer_ids gets >>>>> added slowly in the future(and default list may not be allowed) and also >>>>> OP mentioned, there is skewed distribution of data for customer_id >>>>> column. However what is the problem if OP will opt for HASH subpartition >>>>> on customer_id in this situation? >>>> >>>> >>>> It doesn't really gain you much, given you would be hashing it, the >>>> customers are unevenly distributed, and OP talked about filtering on the >>>> customer_id column. A hash partition would just be a lot more work and >>>> complexity for us humans and for Postgres. Partitioning for the sake of >>>> partitioning is not a good thing. Yes, smaller tables are better, but they >>>> have to be smaller targeted tables. >>>> >>>> sud wrote: >>>> >>>>> 130GB of storage space as we verified using the "pg_relation_size" >>>>> function, for a sample data set. >>>> >>>> >>>> You might also want to closely examine your schema. At that scale, every >>>> byte saved per row can add up. >>>> >>>> Cheers, >>>> Greg >>>>