I suggest just minor variations on language.

On Thu, Jun 06, 2019 at 04:43:48PM +1200, David Rowley wrote:
> diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
> index cce1618fc1..ab26630199 100644
> --- a/doc/src/sgml/ddl.sgml
> +++ b/doc/src/sgml/ddl.sgml
> @@ -4674,6 +4675,76 @@ EXPLAIN SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01';
>  </itemizedlist>
>  </para>
>  </sect2>
> +
> + <sect2 id="ddl-partitioning-declarative-best-practices">
> +  <title>Declarative Partitioning Best Practices</title>
> +
> +  <para>
> +   The choice of how to partition a table should be considered carefully as

Either say "HOW to partition a table should be CONSIDERED carefully" or "the choice ... should be MADE carefully"?

> +  <para>
> +   One of the most critical design decisions will be the column or columns
> +   which you partition your data by.  Often the best choice will be to

"BY WHICH you partition your data"?

> +  <para>
> +   Choosing the number of partitions to divide the table into is also a

"the TARGET number of partitions BY WHICH to divide the table"?

> +   critical decision to make.  Not having enough partitions may mean that
> +   indexes remain too large and that data locality remains poor which could
> +   result in poor cache hit ratios.  However, dividing the table into too
> +   many partitions can also cause issues.  Too many partitions can mean
> +   slower query planning times and higher memory consumption during both
> +   query planning and execution.  It's also important to consider what
> +   changes may occur in the future when choosing how to partition your table.
> +   For example, if you choose to have one partition per customer and you
> +   currently have a small number of large customers, what will the

"currently have ONLY a small number"?

> +   implications be if in several years you obtain a large number of small
> +   customers.  In this case, it may be better to choose to partition by
> +   <literal>HASH</literal> and choose a reasonable amount of partitions

"reasonable NUMBER of partitions"?

> +  <para>
> +   It is also important to consider the overhead of partitioning during
> +   query planning and execution.
> +   The query planner is generally able to
> +   handle partition hierarchies up a few thousand partitions fairly well,
> +   providing that the vast majority of them can be pruned during query

"provided"?  I would say: "provided that typical queries prune all but a small number of partitions during planning time".

> +   <command>DELETE</command> commands.  Also, even if most queries are
> +   able to prune a high number of partitions during query planning, it still

"LARGE number"?

> +   may be undesirable to have a large number of partitions as each partition

"may still be undesirable"?

> +   also will obtain a relation cache entry in each session which uses the

"will REQUIRE a relation cache entry"?  Or "OCCUPY"?

> +  <para>
> +   With data warehouse type workloads it can make sense to use a larger
> +   number of partitions than with an OLTP type workload.  Generally, in data
> +   warehouses, query planning time is less of a concern as the majority of
> +   processing time is generally spent during query execution.  With either of

Remove the second "generally".

> +   these two types of workload, it is important to make the right decisions
> +   early as re-partitioning large quantities of data can be painstakingly

"early COMMA"?  "PAINFULLY slow"?

> +   When performance is critical, performing workload simulations to
> +   assist in making the correct decisions can be beneficial.

I would say: "Simulations of the intended workload are beneficial for optimizing partitioning strategy."

Thanks,
Justin
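P.S. If the HASH suggestion would benefit from an illustration, something along these lines might work (the table and column names here are made up for the example, not taken from the patch):

```sql
-- Hypothetical sketch: partition by HASH on the customer column, using a
-- fixed, modest number of partitions rather than one partition per customer.
CREATE TABLE orders (
    customer_id int NOT NULL,
    order_date  date NOT NULL,
    amount      numeric
) PARTITION BY HASH (customer_id);

-- Eight partitions; rows are routed by hash of customer_id.
CREATE TABLE orders_p0 PARTITION OF orders
    FOR VALUES WITH (MODULUS 8, REMAINDER 0);
CREATE TABLE orders_p1 PARTITION OF orders
    FOR VALUES WITH (MODULUS 8, REMAINDER 1);
-- ... and similarly for REMAINDER 2 through 7.
```

This keeps the partition count stable as the number of customers grows, which is the point the paragraph is making.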