Ühel kenal päeval, N, 2007-02-01 kell 12:31, kirjutas Tom Lane: > Hannu Krosing <[EMAIL PROTECTED]> writes: > > A more radical variation of the "restricted-use archive table" approach > > is storing all tuple visibility info in a separate file. > > At first it seems to just add overhead, but for lots (most ? ) usecases > > the separately stored visibility should be highly compressible, so for > > example for bulk-loaded tables you could end up with one bit per page > > saying that all tuples on this page are visible. > > The more you compress, the slower and more complicated it will be to > access the information. I'd put my money on this being a net loss in > the majority of scenarios.
define "majority" :) In real life it is often faster to access compressed information, especially if it is stored in something like trie where compression and indeked access are the same thing. the most gain will of course come from bulk-loaded data, where the "compressed" representation can just say something like "pages 1 to 200000 are all visible starting from transaction 5000 and so is first half of page 200001, second half of 200001 and pages up to 250000 are visible from trx 6000. In this case the visibility info will always stay in L1 cache and thus be really fast, maybe even free if we account for cache reloads and such. But it may be better to still have a bitmap there for sake of simplicity and have some of it be accessible from L2 cache (200000 pages times say 2 bits is still only 100kB bitmap for 1.6GB of data). Of course there are cases where this approach is worse, sometimes much woorse, than current one, but the possibility of independently compressing visibility info and making some types of VACUUM vastly cheaper may make it a net win in several cases. Also, for higly dynamic tables a separate visibility heap might also speed up bitmap scans, as access to heap happens only for visible tuples. This can also be one way to get rid of need to write full data tuples thrice - first the original write, then commit bits and then deleted bits. instead we can just write the bits once for a whole set of tuples. Usually most of visibility info can be thrown out quite soon, as the active transaction window advances, so the first level of "compression" is just thtrowing out cmin/cmax and setting commit bit, then setting the tuple to just "visible". -- ---------------- Hannu Krosing Database Architect Skype Technologies OÜ Akadeemia tee 21 F, Tallinn, 12618, Estonia Skype me: callto:hkrosing Get Skype for free: http://www.skype.com ---------------------------(end of broadcast)--------------------------- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match