Re: [HACKERS] Dynamic Partitioning using Segment Visibility Maps

Richard Huxton Fri, 04 Jan 2008 03:02:42 -0800

Simon Riggs wrote:

On Fri, 2008-01-04 at 10:22 +0000, Richard Huxton wrote:
Simon Riggs wrote:
We would keep a dynamic visibility map at *segment* level, showing which
segments have all rows as 100% visible. No freespace map data would be
held at this level.
Small dumb-user question.
I take it you've considered some more flexible consecutive-run-of-blocks
unit of flagging rather than file-segments. That obviously complicates
the tracking but means you can cope with infrequent updates as well as
mark most of the "most recent" segment for log-style tables.
I'm writing the code to abstract that away, so yes.

Now you mention it, it does seem straightforward to have a table storage
parameter for partition size, which defaults to 1GB. The partition size
is simply a number of consecutive blocks, as you say.

The smaller the partition size the greater the overhead of managing it.

Oh, obviously, but with smaller partition sizes this also becomes useful
for low-end systems as well as high-end ones. Skipping 80% of a seq-scan
on a date-range query is a win for even small (by your standards)
tables. I shouldn't be surprised if the sensible-number-of-partitions
remained more-or-less constant as you scaled the hardware, but the
partition size grew.
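
To make the block-based partitioning concrete, here is a rough sketch of a
per-partition "all rows visible" map where the partition is a configurable
run of consecutive blocks. This is not code from the patch: the class and
method names are hypothetical, and the 8kB block size is an assumption
(it is PostgreSQL's default). Only the 1GB default comes from the thread.

```python
BLOCK_SIZE = 8192                   # assumed block size in bytes
DEFAULT_PARTITION_BYTES = 1 << 30   # the 1GB default mentioned in the thread


class PartitionVisibilityMap:
    """Hypothetical per-partition 'all rows visible' flags."""

    def __init__(self, partition_blocks=DEFAULT_PARTITION_BYTES // BLOCK_SIZE):
        # A partition is simply this many consecutive blocks.
        self.partition_blocks = partition_blocks
        self.all_visible = set()    # partition numbers currently flagged

    def partition_of(self, block):
        # Map a block number to the partition that contains it.
        return block // self.partition_blocks

    def mark_all_visible(self, partition):
        self.all_visible.add(partition)

    def note_write(self, block):
        # A write to any block clears the flag for its whole partition.
        self.all_visible.discard(self.partition_of(block))

    def is_all_visible(self, block):
        return self.partition_of(block) in self.all_visible
```

With a smaller partition_blocks value, one update clears the flag for fewer
blocks, at the cost of more flags to track, which is the overhead trade-off
discussed above.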

Also I've been looking at read-only tables and compression, as you may
know. My idea was that in the future we could mark segments as either

- read-only
- compressed
- able to be shipped off to hierarchical storage

Those ideas work best if the partitioning is based around the physical
file sizes we use for segments.


I can see why you've chosen file segments. It certainly makes things easier.

Hmm - thinking about the date-range scenario above, it occurs to me that
for seq-scan purposes the correct partition size depends upon the
data value you are interested in. What I want to know is what blocks Jan
07 covers (or rather what blocks it doesn't) rather than knowing blocks
1-9999999 cover 2005-04-12 to 2007-10-13. Of course that means that
you'd eventually want different partition sizes tracking visibility for
different columns (e.g. id, timestamp).
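
A minimal sketch of the date-range case, assuming each partition records
the minimum and maximum timestamp it contains. The function name, the
five-partition layout and the per-partition boundaries are invented for
illustration; only the overall 2005-04-12 to 2007-10-13 span is taken from
the example above.

```python
from datetime import date


def partitions_to_scan(ranges, lo, hi):
    """Return partition numbers whose [min, max] range overlaps [lo, hi]."""
    return [p for p, (pmin, pmax) in sorted(ranges.items())
            if pmax >= lo and pmin <= hi]


# Hypothetical min/max timestamps for five partitions of a log-style table.
ts_ranges = {
    0: (date(2005, 4, 12), date(2006, 1, 31)),
    1: (date(2006, 2, 1), date(2006, 12, 30)),
    2: (date(2006, 12, 31), date(2007, 2, 3)),
    3: (date(2007, 2, 4), date(2007, 6, 30)),
    4: (date(2007, 7, 1), date(2007, 10, 13)),
}

# Which partitions can contain rows from January 2007?
jan07 = partitions_to_scan(ts_ranges, date(2007, 1, 1), date(2007, 1, 31))
skipped = len(ts_ranges) - len(jan07)
```

Here only partition 2 overlaps January 2007, so four of the five partitions
are skipped. With a single range covering the whole table, nothing could be
excluded, which is why the useful partition size depends on the column
being queried.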

I suspect the same would be true for read-only/compressed/archived
flags, but I can see how they are tightly linked to physical files
(particularly the last two).


--
  Richard Huxton
  Archonet Ltd

