Jim,

On 2/26/06 10:37 AM, "Jim C. Nasby" <[EMAIL PROTECTED]> wrote:
> So the cutover point (on your system with very fast IO) is 4:1
> compression (is that 20 or 25%?).

Actually, the size of the gzipped binary file on disk was 65MB, compared to 177.5MB uncompressed, so the compressed file is about 37% of the original size, or 2.73:1.

> But that's assuming that PostgreSQL can read data as fast as dd, which
> we all know isn't the case.

Actually, I had factored that in already. The filesystem delivered 1,200MB/s out of cache in this case - the 300MB/s is what we routinely get from a Postgres seqscan per instance on this system.

> That's also assuming a pretty top-notch IO subsystem.

True - actually I'm pretty impressed with the 75MB/s gunzip speed.

> Based on that, I'd argue that 10% is probably a better setting, though
> it would be good to test an actual case (does dbt3 produce fields large
> enough to ensure that most of them will be toasted?)

No, unfortunately not. O'Reilly's jobs data have 65K rows, so that would work.

How do we implement LZW compression on toasted fields? I've never done it!

> Given the variables involved, maybe it makes sense to add a GUC?

Dunno - I'm not sure how the current scheme works; this is new to me. We had considered using a compression mechanism for table data, but had some completely different ideas, more along the lines of a compressing heap store. The main problem, as I see it, is the CPU required to get there at reasonable performance, as you point out. However, the trend is inevitable - we'll soon have more CPU than we could otherwise use to work with...

- Luke
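For reference, here's the cutover arithmetic as a quick Python sketch. The rates are the ones quoted above; the one assumption is that the 75MB/s gunzip figure is throughput over the compressed bytes rather than the decompressed output:

  # Back-of-envelope cutover estimate. Assumption: the 75MB/s gunzip
  # figure is throughput over the compressed bytes; 300MB/s is the
  # plain (uncompressed) seqscan rate per instance quoted above.

  seqscan_mb_s = 300.0        # uncompressed seqscan rate per instance
  gunzip_mb_s  = 75.0         # gunzip rate over compressed input (assumed)

  uncompressed_mb = 177.5
  compressed_mb   = 65.0
  ratio = uncompressed_mb / compressed_mb        # ~2.73:1 (about 37%)

  # Logical data rate when scanning the compressed copy, assuming the
  # decompressor (not the I/O subsystem) is the bottleneck:
  effective_mb_s = gunzip_mb_s * ratio           # ~205MB/s

  # Compression ratio at which the compressed scan breaks even with
  # the plain seqscan:
  cutover = seqscan_mb_s / gunzip_mb_s           # 4:1

  print("ratio %.2f:1, effective %.0fMB/s vs %.0fMB/s plain, cutover %.1f:1"
        % (ratio, effective_mb_s, seqscan_mb_s, cutover))

At 2.73:1 that works out to roughly 205MB/s of logical data, which is why the plain 300MB/s seqscan still wins on this system and the break-even sits at Jim's 4:1 figure.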