I suggest just minor variations on language.

On Thu, Jun 06, 2019 at 04:43:48PM +1200, David Rowley wrote:
> diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
> index cce1618fc1..ab26630199 100644
> --- a/doc/src/sgml/ddl.sgml
> +++ b/doc/src/sgml/ddl.sgml
> @@ -4674,6 +4675,76 @@ EXPLAIN SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01';
>  </itemizedlist>
>  </para>
>  </sect2>
> +
> + <sect2 id="ddl-partitioning-declarative-best-practices">
> +  <title>Declarative Partitioning Best Practices</title>
> +
> +  <para>
> +   The choice of how to partition a table should be considered carefully as

Either say "HOW to partition a table should be CONSIDERED carefully" or "the choice ... should be MADE carefully"?

> +  <para>
> +   One of the most critical design decisions will be the column or columns
> +   which you partition your data by.  Often the best choice will be to

"BY WHICH you partition your data"?

> +  <para>
> +   Choosing the number of partitions to divide the table into is also a

"the TARGET number of partitions BY WHICH to divide the table"?

> +   critical decision to make.  Not having enough partitions may mean that
> +   indexes remain too large and that data locality remains poor which could
> +   result in poor cache hit ratios.  However, dividing the table into too
> +   many partitions can also cause issues.  Too many partitions can mean
> +   slower query planning times and higher memory consumption during both
> +   query planning and execution.  It's also important to consider what
> +   changes may occur in the future when choosing how to partition your table.
> +   For example, if you choose to have one partition per customer and you
> +   currently have a small number of large customers, what will the

"currently have ONLY a small number"?

> +   implications be if in several years you obtain a large number of small
> +   customers.  In this case, it may be better to choose to partition by
> +   <literal>HASH</literal> and choose a reasonable amount of partitions

"reasonable NUMBER of partitions"?

> +  <para>
> +   It is also important to consider the overhead of partitioning during
> +   query planning and execution.
> +   The query planner is generally able to
> +   handle partition hierarchies up a few thousand partitions fairly well,
> +   providing that the vast majority of them can be pruned during query

"provided"?  I would say: "provided that typical queries prune all but a small number of partitions during planning time".

> +   <command>DELETE</command> commands.  Also, even if most queries are
> +   able to prune a high number of partitions during query planning, it still

"LARGE number"?

> +   may be undesirable to have a large number of partitions as each partition

"may still be undesirable"?

> +   also will obtain a relation cache entry in each session which uses the

"will REQUIRE a relation cache entry"?  Or "OCCUPY"?

> +  <para>
> +   With data warehouse type workloads it can make sense to use a larger
> +   number of partitions than with an OLTP type workload.  Generally, in data
> +   warehouses, query planning time is less of a concern as the majority of
> +   processing time is generally spent during query execution.  With either of

Remove the second "generally".

> +   these two types of workload, it is important to make the right decisions
> +   early as re-partitioning large quantities of data can be painstakingly

"early COMMA"?  "PAINFULLY slow"?

> +   When performance is critical, performing workload simulations to
> +   assist in making the correct decisions can be beneficial.

I would say: "Simulations of the intended workload are beneficial for optimizing partitioning strategy."

Thanks,
Justin
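P.S. If the HASH suggestion would benefit from an illustration, something along these lines might work (the table and column names here are made up for the example, not taken from the patch):

```sql
-- Hypothetical sketch: partition by HASH on the customer column, using a
-- fixed, modest number of partitions rather than one partition per customer.
CREATE TABLE orders (
    customer_id int NOT NULL,
    order_date  date NOT NULL,
    amount      numeric
) PARTITION BY HASH (customer_id);

-- Eight partitions; rows are routed by hash of customer_id.
CREATE TABLE orders_p0 PARTITION OF orders
    FOR VALUES WITH (MODULUS 8, REMAINDER 0);
CREATE TABLE orders_p1 PARTITION OF orders
    FOR VALUES WITH (MODULUS 8, REMAINDER 1);
-- ... and similarly for REMAINDER 2 through 7.
```

This keeps the partition count stable as the number of customers grows, which is the point the paragraph is making.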