Jim,

On 2/26/06 10:37 AM, "Jim C. Nasby" <[EMAIL PROTECTED]> wrote:
> So the cutover point (on your system with very fast IO) is 4:1
> compression (is that 20 or 25%?).

Actually, the size of the gzipped binary file on disk was 65MB, compared to 177.5MB uncompressed, so the compressed file is about 37% of the original size, or 2.73:1.

> But that's assuming that PostgreSQL can read data as fast as dd, which
> we all know isn't the case.

Actually, I had factored that in already. The filesystem delivered 1,200MB/s out of cache in this case - the 300MB/s is what we routinely get from a Postgres seqscan per instance on this system.

> That's also assuming a pretty top-notch IO subsystem.

True - actually I'm pretty impressed with the 75MB/s gunzip speed.

> Based on that, I'd argue that 10% is probably a better setting, though
> it would be good to test an actual case (does dbt3 produce fields large
> enough to ensure that most of them will be toasted?)

No, unfortunately not. O'Reilly's jobs data have 65K rows, so that would work.

How do we implement LZW compression on toasted fields? I've never done it!

> Given the variables involved, maybe it makes sense to add a GUC?

Dunno - I'm not sure how the current scheme works; this is new to me. We had considered using a compression mechanism for table data, but had some completely different ideas, more along the lines of a compressing heap store. The main problem, as I see it, is the CPU required to get there at reasonable performance, as you point out. However, the trend is inevitable - we'll soon have more CPU than we could otherwise use to work with...

- Luke
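For reference, here's the cutover arithmetic as a quick Python sketch. The rates are the ones quoted above; the one assumption is that the 75MB/s gunzip figure is throughput over the compressed bytes rather than the decompressed output:

  # Back-of-envelope cutover estimate. Assumption: the 75MB/s gunzip
  # figure is throughput over the compressed bytes; 300MB/s is the
  # plain (uncompressed) seqscan rate per instance quoted above.

  seqscan_mb_s = 300.0        # uncompressed seqscan rate per instance
  gunzip_mb_s  = 75.0         # gunzip rate over compressed input (assumed)

  uncompressed_mb = 177.5
  compressed_mb   = 65.0
  ratio = uncompressed_mb / compressed_mb        # ~2.73:1 (about 37%)

  # Logical data rate when scanning the compressed copy, assuming the
  # decompressor (not the I/O subsystem) is the bottleneck:
  effective_mb_s = gunzip_mb_s * ratio           # ~205MB/s

  # Compression ratio at which the compressed scan breaks even with
  # the plain seqscan:
  cutover = seqscan_mb_s / gunzip_mb_s           # 4:1

  print("ratio %.2f:1, effective %.0fMB/s vs %.0fMB/s plain, cutover %.1f:1"
        % (ratio, effective_mb_s, seqscan_mb_s, cutover))

At 2.73:1 that works out to roughly 205MB/s of logical data, which is why the plain 300MB/s seqscan still wins on this system and the break-even sits at Jim's 4:1 figure.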