On 3/19/19 10:59 AM, Chris Travers wrote:
>
> On Mon, Mar 18, 2019 at 11:09 PM Tomas Vondra
> <tomas.von...@2ndquadrant.com> wrote:
>
>     On 3/15/19 12:52 PM, Ildus Kurbangaliev wrote:
>     > On Fri, 15 Mar 2019 14:07:14 +0400
>     > David Steele <da...@pgmasters.net> wrote:
>     >
>     >> On 3/7/19 11:50 AM, Alexander Korotkov wrote:
>     >>> On Thu, Mar 7, 2019 at 10:43 AM David Steele
>     >>> <da...@pgmasters.net> wrote:
>     >>>
>     >>>     On 2/28/19 5:44 PM, Ildus Kurbangaliev wrote:
>     >>>
>     >>>      > there are another set of patches.
>     >>>      > Only rebased to current master.
>     >>>      >
>     >>>      > Also I will change status on commitfest to 'Needs review'.
>     >>>
>     >>>     This patch has seen periodic rebases but no code review that I
>     >>>     can see since last January 2018.
>     >>>
>     >>>     As Andres noted in [1], I think that we need to decide if this
>     >>>     is a feature that we want rather than just continuing to push it
>     >>>     from CF to CF.
>     >>>
>     >>> Yes. I took a look at code of this patch. I think it's in pretty
>     >>> good shape. But high level review/discussion is required.
>     >>
>     >> OK, but I think this patch can only be pushed one more time, maximum,
>     >> before it should be rejected.
>     >>
>     >> Regards,
>     >
>     > Hi,
>     > in my opinion this patch is usually skipped not because it is not
>     > needed, but because of its size. It is not hard to maintain it until
>     > committers will have time for it or I will get actual response that
>     > nobody is going to commit it.
>     >
>
>     That may be one of the reasons, yes. But there are other reasons, which
>     I think may be playing a bigger role.
>
>     There's one practical issue with how the patch is structured - the docs
>     and tests are in separate patches towards the end of the patch series,
>     which makes it impossible to commit the preceding parts. This needs to
>     change.
>     Otherwise the patch size kills the patch as a whole.
>
>     But there's a more important cost/benefit issue, I think. When I look at
>     patches as a committer, I naturally have to weigh how much time I spend
>     on getting it in (and then dealing with fallout from bugs etc) vs. what
>     I get in return (measured in benefits for community, users). This patch
>     is pretty large and complex, so the "costs" are quite high, while the
>     benefit from the patch itself is the ability to pick between pglz and
>     zlib. Which is not great, and so people tend to pick other patches.
>
>     Now, I understand there's a lot of potential benefits further down the
>     line, like column-level compression (which I think is the main goal
>     here). But that's not included in the patch, so the gains are somewhat
>     far in the future.
>
> Not discussing whether any particular committer should pick this up but
> I want to discuss an important use case we have at Adjust for this sort
> of patch.
>
> The PostgreSQL compression strategy is something we find inadequate for
> at least one of our large deployments (a large debug log spanning
> 10PB+). Our current solution is to set storage so that it does not
> compress and then run on ZFS to get compression speedups on spinning disks.
>
> But running PostgreSQL on ZFS has some annoying costs because we have
> copy-on-write on copy-on-write, and when you add file fragmentation... I
> would really like to be able to get away from having to do ZFS as an
> underlying filesystem. While we have good write throughput, read
> throughput is not as good as I would like.
>
> An approach that would give us better row-level compression would allow
> us to ditch the COW filesystem under PostgreSQL approach.
>
> So I think the benefits are actually quite high particularly for those
> dealing with volume/variety problems where things like JSONB might be a
> go-to solution. Similarly I could totally see having systems which
> handle large amounts of specialized text having extensions for dealing
> with these.
>
Sure, I don't disagree - the proposed compression approach may be a big
win for some deployments further down the road, no doubt about it. But
as I said, it's unclear when we get there (or if the interesting stuff
will be in some sort of extension, which I don't oppose in principle).

>
>     But hey, I think there are committers working for postgrespro, who might
>     have the motivation to get this over the line. Of course, assuming that
>     there are no serious objections to having this functionality or how it's
>     implemented ... But I don't think that was the case.
>
> While I am not currently able to speak for questions of how it is
> implemented, I can say with very little doubt that we would almost
> certainly use this functionality if it were there and I could see plenty
> of other cases where this would be a very appropriate direction for some
> other projects as well.
>

Well, I guess the best thing you can do to move this patch forward is to
actually try that on your real-world use case, and report your results
and possibly do a review of the patch.

IIRC there was an extension [1] leveraging this custom compression
interface for better jsonb compression, so perhaps that would work for
you (not sure if it's up to date with the current patch, though).

[1] https://www.postgresql.org/message-id/20171130182009.1b492eb2%40wp.localdomain

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services