On Fri, May 19, 2006 at 10:02:50PM +0300, Hannu Krosing wrote:
> On one fine day, Fri, 2006-05-19 at 14:53, Tom Lane wrote:
> > "Jim C. Nasby" <[EMAIL PROTECTED]> writes:
> > > On Fri, May 19, 2006 at 09:29:03AM +0200, Martijn van Oosterhout wrote:
> > >> I'm seeing 250,000 blocks being cut down to 9,500 blocks. That's almost
> > >> unbelievable. What's in the table? It would seem to imply that our
> > >> tuple format is far more compressible than we expected.
> > >
> > > It's just SELECT count(*) FROM (SELECT * FROM accounts ORDER BY bid) a;
> > > If the tape routines were actually storing visibility information, I'd
> > > expect that to be pretty compressible in this case since all the tuples
> > > were presumably created in a single transaction by pgbench.
> >
> > It's worse than that: IIRC what passes through a heaptuple sort are
> > tuples manufactured by heap_form_tuple, which will have consistently
> > zeroed header fields. However, the above isn't very helpful since the
> > rest of us have no idea what that "accounts" table contains. How wide
> > is the tuple data, and what's in it?
>
> Was he not using pgbench data?

I am. For reference:

bench=# \d accounts
        Table "public.accounts"
  Column  |     Type      | Modifiers
----------+---------------+-----------
 aid      | integer       | not null
 bid      | integer       |
 abalance | integer       |
 filler   | character(84) |

> > (This suggests that we might try harder to strip unnecessary header info
> > from tuples being written to tape inside tuplesort.c. I think most of
> > the required fields could be reconstructed given the TupleDesc.)
>
> I guess that tapefiles compress better than the average table because they
> are sorted, and thus at least a little more repetitive than the rest.
> If there are varlen types, then they usually also have an abundance of
> small 4-byte integers, which should also compress at least better than
> 4/1, maybe a lot better.

If someone wants to provide a patch that strips out the headers I can
test that as well.
-- 
Jim C. Nasby, Sr. Engineering Consultant      [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461
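
As a rough illustration of the compressibility argument above (250,000 blocks down to 9,500 is roughly 26:1), here is a minimal sketch that builds synthetic rows shaped like the accounts table, sorts them by bid, and compresses the resulting byte stream with zlib. It is not a model of tuplesort.c or of the on-tape format. The 24-byte zeroed header, the little-endian integer packing, the random assignment of bid, and the helper names (make_row, build_stream, compression_ratio) are all assumptions made for the example.

```python
import random
import struct
import zlib

HEADER_BYTES = 24   # ASSUMPTION: size of a zeroed tuple header; not taken from the thread
FILLER_WIDTH = 84   # matches filler character(84) in \d accounts


def make_row(aid, bid, abalance, with_header=True):
    """Serialize one synthetic accounts row as bytes."""
    header = bytes(HEADER_BYTES) if with_header else b""
    filler = b" " * FILLER_WIDTH  # character(n) values are space-padded
    return header + struct.pack("<iii", aid, bid, abalance) + filler


def build_stream(n_rows, n_branches, with_header=True, seed=0):
    """Build a byte stream of rows sorted by bid, as ORDER BY bid would emit them."""
    rng = random.Random(seed)
    # ASSUMPTION: bid is assigned at random here; abalance starts at 0.
    rows = [(aid, rng.randrange(1, n_branches + 1), 0)
            for aid in range(1, n_rows + 1)]
    rows.sort(key=lambda r: r[1])
    return b"".join(make_row(a, b, bal, with_header) for a, b, bal in rows)


def compression_ratio(data):
    """Raw size divided by zlib-compressed size."""
    return len(data) / len(zlib.compress(data))


if __name__ == "__main__":
    with_hdr = build_stream(10000, 10, with_header=True)
    no_hdr = build_stream(10000, 10, with_header=False)
    print("with header:    %d bytes raw, ratio %.1f"
          % (len(with_hdr), compression_ratio(with_hdr)))
    print("without header: %d bytes raw, ratio %.1f"
          % (len(no_hdr), compression_ratio(no_hdr)))
```

Comparing the two raw sizes shows how much a zeroed header of the assumed size adds to each row before compression. The ratios only indicate how repetitive this synthetic data is, and the real numbers would have to come from testing an actual patch.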