Re: Compressed TOAST Slicing

2019-04-16 Thread Andrey Borodin
> On 9 Apr 2019, at 22:30, Tom Lane wrote: > > The proposal is kind of cute, but I'll bet it's a net loss for > small copy lengths --- likely we'd want some cutoff below which > we do it with the dumb byte-at-a-time loop. True. I've made a simple extension to compare decompression time on

Re: Compressed TOAST Slicing

2019-04-09 Thread Tom Lane
Andres Freund writes: > On 2019-04-09 10:12:56 -0700, Paul Ramsey wrote: >> Wow, well beyond slicing, just being able to decompress 25% faster is a win >> for pretty much any TOAST use case. I guess the $100 question is: >> portability? The whole reason for the old-skool code that’s there now wa

Re: Compressed TOAST Slicing

2019-04-09 Thread Andrey Borodin
> On 9 Apr 2019, at 22:20, Andres Freund wrote: > > Just use memmove? It's usually as fast these days. No, unfortunately, memmove handles overlap in an incompatible way. In pglz the side effects of overlapping addresses are necessary; memmove avoids them. I.e. bytes 01234 ^ copy here thre
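The point about overlapping addresses can be illustrated with a small C sketch. This is not the pglz source; the function names lz_overlap_copy and memmove_copy are hypothetical and exist only for this illustration. It assumes the usual LZ back-reference convention: copy len bytes starting off bytes behind the write pointer, where len may exceed off.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Forward byte-at-a-time copy (hypothetical helper, not from pglz).
 * A byte written early in the loop can be read again later, so a
 * source pattern of length `off` repeats until `len` bytes are written.
 */
void lz_overlap_copy(char *dp, size_t off, size_t len)
{
    assert(off > 0);
    while (len--)
    {
        *dp = *(dp - off);
        dp++;
    }
}

/*
 * memmove copies as if the source were first saved to a temporary
 * buffer, so bytes written during the copy never feed back into it.
 */
void memmove_copy(char *dp, size_t off, size_t len)
{
    assert(off > 0);
    memmove(dp, dp - off, len);
}
```

With a zero-filled buffer that starts with "abc", off = 3 and len = 6, the byte loop yields "abcabcabc", while memmove copies the original six bytes ("abc" plus three zero bytes) and leaves "abcabc".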

Re: Compressed TOAST Slicing

2019-04-09 Thread Andres Freund
On 2019-04-09 10:12:56 -0700, Paul Ramsey wrote: > > > On Apr 9, 2019, at 10:09 AM, Andrey Borodin wrote: > > > > He advised me to use an algorithm that splits copied regions into smaller > > non-overlapping subregions with exponentially increasing size. > > > > while (off <= len) > > { > >me

Re: Compressed TOAST Slicing

2019-04-09 Thread Andrey Borodin
> On 9 Apr 2019, at 22:12, Paul Ramsey wrote: > > Wow, well beyond slicing, just being able to decompress 25% faster is a win > for pretty much any TOAST use case. I guess the $100 question is: > portability? The whole reason for the old-skool code that’s there now was > concerns about

Re: Compressed TOAST Slicing

2019-04-09 Thread Paul Ramsey
> On Apr 9, 2019, at 10:09 AM, Andrey Borodin wrote: > > He advised me to use an algorithm that splits copied regions into smaller > non-overlapping subregions with exponentially increasing size. > > while (off <= len) > { >memcpy(dp, dp - off, off); >len -= off; >dp += off; >off
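The quoted loop is cut off in this archive, so the following is only a sketch of how such a doubling copy could be completed. The step `off += off` and the trailing memcpy are assumptions, not text from the thread, and the names byte_copy and lz_doubling_copy are hypothetical. The byte loop is included as a reference to compare against.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Reference: the plain byte-at-a-time overlapping copy. */
void byte_copy(char *dp, size_t off, size_t len)
{
    assert(off > 0);
    while (len--)
    {
        *dp = *(dp - off);
        dp++;
    }
}

/*
 * Doubling copy (assumed completion of the truncated loop).
 * Each memcpy reads [dp - off, dp) and writes [dp, dp + off), which
 * never overlap. After a pass, twice as many bytes of the repeating
 * pattern exist behind dp, so the copy size can double.
 */
void lz_doubling_copy(char *dp, size_t off, size_t len)
{
    assert(off > 0);
    while (off <= len)
    {
        memcpy(dp, dp - off, off);
        len -= off;
        dp += off;
        off += off;             /* assumption: double the distance */
    }
    /* Remaining len < off, so source and destination cannot overlap. */
    memcpy(dp, dp - off, len);
}
```

The doubling keeps the number of memcpy calls logarithmic in len / off, but each call has fixed overhead, which is the kind of cost that motivates a cutoff for very short copies.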

Re: Compressed TOAST Slicing

2019-04-09 Thread Andrey Borodin
Hi! > On 12 Mar 2019, at 10:22, Andrey Borodin wrote: > > 3. And I'd use memmove despite the comment why we do not do that. It is > SSE-optimized and cache-optimized nowadays. So, I've pushed the idea a little further and showed that decompression byte-copy loop to Vladimir Leskov. while (len--)

Re: Compressed TOAST Slicing

2019-04-02 Thread Stephen Frost
Greetings, * Darafei "Komяpa" Praliaskouski (m...@komzpa.net) wrote: > > I'll plan to push this tomorrow with the above change (and a few > > additional comments to explain what all is going on..). > > Is everything ok? Can it be pushed? This has been pushed now. Thanks! Stephen

Re: Compressed TOAST Slicing

2019-04-01 Thread Komяpa
Hi! > I'll plan to push this tomorrow with the above change (and a few > additional comments to explain what all is going on..). Is everything ok? Can it be pushed? I'm looking here, haven't found it pushed and worry about this. https://github.com/postgres/postgres/commits/master

Re: Compressed TOAST Slicing

2019-03-30 Thread Stephen Frost
Greetings, * Paul Ramsey (pram...@cleverelephant.ca) wrote: > > On Mar 19, 2019, at 4:47 AM, Stephen Frost wrote: > > * Paul Ramsey (pram...@cleverelephant.ca) wrote: > >>> On Mar 18, 2019, at 7:34 AM, Robert Haas wrote: > >>> +1. I think Paul had it right originally. > >> > >> In that spirit,

Re: Compressed TOAST Slicing

2019-03-19 Thread Paul Ramsey
> On Mar 19, 2019, at 4:47 AM, Stephen Frost wrote: > > Greetings, > > * Paul Ramsey (pram...@cleverelephant.ca) wrote: >>> On Mar 18, 2019, at 7:34 AM, Robert Haas wrote: >>> +1. I think Paul had it right originally. >> >> In that spirit, here is a “one pglz_decompress function, new param

Re: Compressed TOAST Slicing

2019-03-19 Thread Stephen Frost
Greetings, * Paul Ramsey (pram...@cleverelephant.ca) wrote: > > On Mar 18, 2019, at 7:34 AM, Robert Haas wrote: > > +1. I think Paul had it right originally. > > In that spirit, here is a “one pglz_decompress function, new parameter” > version for commit. Alright, I've been working through th

Re: Compressed TOAST Slicing

2019-03-18 Thread Paul Ramsey
> On Mar 18, 2019, at 7:34 AM, Robert Haas wrote: > > On Mon, Mar 18, 2019 at 10:14 AM Tom Lane wrote: >> Stephen Frost writes: >>> * Andres Freund (and...@anarazel.de) wrote: I don't think that should stop us from breaking the API. You've got to do quite low level stuff to need pgl

Re: Compressed TOAST Slicing

2019-03-18 Thread Robert Haas
On Mon, Mar 18, 2019 at 10:14 AM Tom Lane wrote: > Stephen Frost writes: > > * Andres Freund (and...@anarazel.de) wrote: > >> I don't think that should stop us from breaking the API. You've got to > >> do quite low level stuff to need pglz directly, in which case such an > >> API change should be

Re: Compressed TOAST Slicing

2019-03-18 Thread Tom Lane
Stephen Frost writes: > * Andres Freund (and...@anarazel.de) wrote: >> I don't think that should stop us from breaking the API. You've got to >> do quite low level stuff to need pglz directly, in which case such an >> API change should be the least of your problems between major versions. > Agree

Re: Compressed TOAST Slicing

2019-03-17 Thread Stephen Frost
Greetings, * Andres Freund (and...@anarazel.de) wrote: > On 2019-03-12 14:42:14 +0900, Michael Paquier wrote: > > On Mon, Mar 11, 2019 at 08:38:56PM +, Regina Obe wrote: > > > I tested on windows mingw64 (as of a week ago) and confirmed the > > > patch applies cleanly and significantly faster

Re: Compressed TOAST Slicing

2019-03-13 Thread Paul Ramsey
> On Mar 13, 2019, at 9:32 AM, Andrey Borodin wrote: > > > >> On 13 Mar 2019, at 21:05, Paul Ramsey >> wrote: >> >> Here is a new (final?) patch ... >> >> > > This check > > @@ -744,6 +748,8 @@ pglz_decompress(const char *source, int32 slen, char > *dest, >

Re: Compressed TOAST Slicing

2019-03-13 Thread Andrey Borodin
> On 13 Mar 2019, at 21:05, Paul Ramsey wrote: > > Here is a new (final?) patch ... > > This check @@ -744,6 +748,8 @@ pglz_decompress(const char *source, int32 slen, char *dest, { *dp = dp[-off];

Re: Compressed TOAST Slicing

2019-03-13 Thread Paul Ramsey
On Mar 13, 2019, at 8:25 AM, Paul Ramsey wrote: On Mar 13, 2019, at 3:09 AM, Tomas Vondra wrote: On 3/13/19 3:19 AM, Michael Paquier wrote: On Tue, Mar 12, 2019 at 07:01:17PM -0700, Andres Freund wrote: I don't think this is even close to popul

Re: Compressed TOAST Slicing

2019-03-13 Thread Paul Ramsey
> On Mar 13, 2019, at 3:09 AM, Tomas Vondra > wrote: > > On 3/13/19 3:19 AM, Michael Paquier wrote: >> On Tue, Mar 12, 2019 at 07:01:17PM -0700, Andres Freund wrote: >>> I don't think this is even close to popular enough to incur the >>> maybe of a separate function / more complicated interfa

Re: Compressed TOAST Slicing

2019-03-13 Thread Tomas Vondra
On 3/13/19 3:19 AM, Michael Paquier wrote: > On Tue, Mar 12, 2019 at 07:01:17PM -0700, Andres Freund wrote: >> I don't think this is even close to popular enough to incur the >> maybe of a separate function / more complicated interface. By this >> logic we can change basically no APIs anymore. >

Re: Compressed TOAST Slicing

2019-03-12 Thread Michael Paquier
On Tue, Mar 12, 2019 at 07:01:17PM -0700, Andres Freund wrote: > I don't think this is even close to popular enough to incur the > maybe of a separate function / more complicated interface. By this > logic we can change basically no APIs anymore. Well, if folks here think that it is not worth wor

Re: Compressed TOAST Slicing

2019-03-12 Thread Andres Freund
On March 12, 2019 6:58:12 PM PDT, Michael Paquier wrote: >On Tue, Mar 12, 2019 at 11:08:15AM -0700, Paul Ramsey wrote: >>> On Mar 12, 2019, at 9:45 AM, Paul Ramsey >wrote: >>> I was going to say that the function is only used twice in the code >>> base, but I see it’s now used four times. So m

Re: Compressed TOAST Slicing

2019-03-12 Thread Michael Paquier
On Tue, Mar 12, 2019 at 11:08:15AM -0700, Paul Ramsey wrote: >> On Mar 12, 2019, at 9:45 AM, Paul Ramsey wrote: >> I was going to say that the function is only used twice in the code >> base, but I see it’s now used four times. So maybe leave the old >> signature in place and add the new one for m

Re: Compressed TOAST Slicing

2019-03-12 Thread Paul Ramsey
> On Mar 11, 2019, at 10:22 PM, Andrey Borodin wrote: > > Hi! > >> On 21 Feb 2019, at 23:50, Paul Ramsey >> wrote: >> >> Merci! Attached are updated patches. >> > > As noted before, patches are extremely useful. > So, I've looked into the code too. > > I've got some questions about

Re: Compressed TOAST Slicing

2019-03-12 Thread Paul Ramsey
> On Mar 12, 2019, at 9:45 AM, Paul Ramsey wrote: > > > >> On Mar 12, 2019, at 9:13 AM, Andres Freund wrote: >> >> On 2019-03-12 14:42:14 +0900, Michael Paquier wrote: >>> On Mon, Mar 11, 2019 at 08:38:56PM +, Regina Obe wrote: I tested on windows mingw64 (as of a week ago) and con

Re: Compressed TOAST Slicing

2019-03-12 Thread Paul Ramsey
> On Mar 12, 2019, at 9:13 AM, Andres Freund wrote: > > On 2019-03-12 14:42:14 +0900, Michael Paquier wrote: >> On Mon, Mar 11, 2019 at 08:38:56PM +, Regina Obe wrote: >>> I tested on windows mingw64 (as of a week ago) and confirmed the >>> patch applies cleanly and significantly faster fo

Re: Compressed TOAST Slicing

2019-03-12 Thread Andres Freund
On 2019-03-12 14:42:14 +0900, Michael Paquier wrote: > On Mon, Mar 11, 2019 at 08:38:56PM +, Regina Obe wrote: > > I tested on windows mingw64 (as of a week ago) and confirmed the > > patch applies cleanly and significantly faster for left, substr > > tests than head. > > int32 > pglz_decompr

Re: Compressed TOAST Slicing

2019-03-12 Thread Andrey Borodin
> On 12 Mar 2019, at 19:40, Paul Ramsey wrote: > >> On Mar 11, 2019, at 10:42 PM, Michael Paquier wrote: >> >> int32 >> pglz_decompress(const char *source, int32 slen, char *dest, >> - int32 rawsize) >> + int32 rawsize, bool i

Re: Compressed TOAST Slicing

2019-03-12 Thread Paul Ramsey
> On Mar 11, 2019, at 10:42 PM, Michael Paquier wrote: > > On Mon, Mar 11, 2019 at 08:38:56PM +, Regina Obe wrote: >> I tested on windows mingw64 (as of a week ago) and confirmed the >> patch applies cleanly and significantly faster for left, substr >> tests than head. > > int32 > pglz_d

Re: Compressed TOAST Slicing

2019-03-11 Thread Michael Paquier
On Mon, Mar 11, 2019 at 08:38:56PM +, Regina Obe wrote: > I tested on windows mingw64 (as of a week ago) and confirmed the > patch applies cleanly and significantly faster for left, substr > tests than head. int32 pglz_decompress(const char *source, int32 slen, char *dest, -

Re: Compressed TOAST Slicing

2019-03-11 Thread Andrey Borodin
Hi! > On 21 Feb 2019, at 23:50, Paul Ramsey wrote: > > Merci! Attached are updated patches. > As noted before, patches are extremely useful. So, I've looked into the code too. I've got some questions about pglz_decompress() changes: 1. + if (dp >=

Re: Compressed TOAST Slicing

2019-03-11 Thread Regina Obe
The following review has been posted through the commitfest application: make installcheck-world: tested, passed Implements feature: tested, passed Spec compliant: tested, passed Documentation: not tested No need for documentation as this is a performance improvement pa

Re: Compressed TOAST Slicing

2019-03-11 Thread Alvaro Herrera
On 2019-Mar-11, Darafei Praliaskouski wrote: > The feature is super valuable for complex PostGIS-enabled databases. After having to debug a perf problem in this area, I agree, +1 for the patch. Thanks -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Su

Re: Compressed TOAST Slicing

2019-03-11 Thread Darafei Praliaskouski
The following review has been posted through the commitfest application: make installcheck-world: not tested Implements feature: tested, passed Spec compliant: not tested Documentation: not tested I have read the patch and have no problems with it. The feature is super

Re: Compressed TOAST Slicing

2019-02-21 Thread Paul Ramsey
On Wed, Feb 20, 2019 at 1:12 PM Stephen Frost wrote: > > * Paul Ramsey (pram...@cleverelephant.ca) wrote: > > On Wed, Feb 20, 2019 at 10:50 AM Daniel Verite > > wrote: > > > > > > What about starts_with(string, prefix)? > > Thanks, I'll add that. > > That sounds good to me, I look forward to an

Re: Compressed TOAST Slicing

2019-02-20 Thread Stephen Frost
Greetings, * Paul Ramsey (pram...@cleverelephant.ca) wrote: > On Wed, Feb 20, 2019 at 10:50 AM Daniel Verite > wrote: > > > > Paul Ramsey wrote: > > > > > Oddly enough, I couldn't find many/any things that were sensitive to > > > left-end decompression. The only exception is "LIKE this%"

Re: Compressed TOAST Slicing

2019-02-20 Thread Tomas Vondra
On 2/20/19 7:50 PM, Robert Haas wrote: > On Wed, Feb 20, 2019 at 1:45 PM Paul Ramsey wrote: >> What this does not support: any function that probably wants >> less-than-everything, but doesn’t know how big a slice to look >> for. Stephen thinks I should put an iterator on decompression, >> whic

Re: Compressed TOAST Slicing

2019-02-20 Thread Tom Lane
Paul Ramsey writes: >> On Feb 20, 2019, at 10:37 AM, Simon Riggs wrote: >> If we add one set of code now and need to add another different one later, >> we will have 2 sets of code that do similar things. > Note that adding an iterator isn’t adding two ways to do the same thing, > since the ite

Re: Compressed TOAST Slicing

2019-02-20 Thread Daniel Verite
Paul Ramsey wrote: > > text_starts_with(arg1,arg2) in varlena.c does a full decompression > > of arg1 when it could limit itself to the length of the smaller arg2: > > Nice catch, I didn't find that one as it's not user visible, seems to > be only called in spgist (!!) It's also exposed

Re: Compressed TOAST Slicing

2019-02-20 Thread Paul Ramsey
On Wed, Feb 20, 2019 at 10:50 AM Daniel Verite wrote: > > Paul Ramsey wrote: > > > Oddly enough, I couldn't find many/any things that were sensitive to > > left-end decompression. The only exception is "LIKE this%" which > > clearly would be helped, but unfortunately wouldn't be a quick >

Re: Compressed TOAST Slicing

2019-02-20 Thread Robert Haas
On Wed, Feb 20, 2019 at 1:45 PM Paul Ramsey wrote: > What this does not support: any function that probably wants > less-than-everything, but doesn’t know how big a slice to look for. Stephen > thinks I should put an iterator on decompression, which would be an > interesting piece of work. Havi

Re: Compressed TOAST Slicing

2019-02-20 Thread Daniel Verite
Paul Ramsey wrote: > Oddly enough, I couldn't find many/any things that were sensitive to > left-end decompression. The only exception is "LIKE this%" which > clearly would be helped, but unfortunately wouldn't be a quick > drop-in, but a rather major reorganization of the regex handling.

Re: Compressed TOAST Slicing

2019-02-20 Thread Paul Ramsey
> On Feb 20, 2019, at 10:37 AM, Simon Riggs wrote: > > -1, I think this is blowing up the complexity of an already useful patch, > even though there's no increase in complexity due to the patch proposed > here. I totally get wanting incremental decompression for jsonb, but I > don't see why Paul

Re: Compressed TOAST Slicing

2019-02-20 Thread Simon Riggs
On Wed, 20 Feb 2019 at 16:27, Andres Freund wrote: > > > Sure, but we have the choice between something that benefits just a few > > cases or one that benefits more widely. > > > > If we all only work on the narrow use cases that are right in front of us > > at the present moment then we would no

Re: Compressed TOAST Slicing

2019-02-20 Thread Robert Haas
On Wed, Feb 20, 2019 at 11:27 AM Andres Freund wrote: > -1, I think this is blowing up the complexity of an already useful patch, > even though there's no increase in complexity due to the patch proposed > here. I totally get wanting incremental decompression for jsonb, but I > don't see why Paul

Re: Compressed TOAST Slicing

2019-02-20 Thread Andres Freund
On 2019-02-20 08:39:38 +, Simon Riggs wrote: > On Tue, 19 Feb 2019 at 23:09, Paul Ramsey wrote: > > > On Sat, Feb 16, 2019 at 7:25 AM Simon Riggs wrote: > > > > > Could we get a similarly optimized implementation of -> operator for > > JSONB as well? > > > Are there any other potential uses

Re: Compressed TOAST Slicing

2019-02-20 Thread Simon Riggs
On Tue, 19 Feb 2019 at 23:09, Paul Ramsey wrote: > On Sat, Feb 16, 2019 at 7:25 AM Simon Riggs wrote: > > > Could we get a similarly optimized implementation of -> operator for > JSONB as well? > > Are there any other potential uses? Best to fix em all up at once and > then move on to other thi

Re: Compressed TOAST Slicing

2019-02-19 Thread Юрий Соколов
Some time ago I posted a PoC patch with an alternative TOAST compression scheme: instead of "compress-then-chunk" I suggested "chunk-then-compress". It decreases the compression ratio, but allows efficient arbitrary slicing. Wed, 20 Feb 2019, 2:09 Paul Ramsey pram...@cleverelephant.ca: > On Sat, Feb 16
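Why "chunk-then-compress" allows arbitrary slicing can be sketched with a little arithmetic. The sketch assumes that every chunk covers a fixed number of raw (uncompressed) bytes and is compressed independently. The name chunks_for_slice, the ChunkRange type and the raw_chunk_size parameter are hypothetical and do not come from the PoC patch.

```c
#include <assert.h>
#include <stddef.h>

/* Inclusive range of chunk indexes that a slice touches. */
typedef struct
{
    size_t first;
    size_t last;
} ChunkRange;

/*
 * If each chunk holds exactly raw_chunk_size uncompressed bytes
 * (an assumption of this sketch), a slice maps directly to chunk
 * indexes, and only those chunks need to be fetched and decompressed.
 */
ChunkRange chunks_for_slice(size_t offset, size_t length,
                            size_t raw_chunk_size)
{
    ChunkRange r;

    assert(raw_chunk_size > 0 && length > 0);
    r.first = offset / raw_chunk_size;
    r.last = (offset + length - 1) / raw_chunk_size;
    return r;
}
```

With "compress-then-chunk", by contrast, the chunk boundaries fall at arbitrary positions in the compressed stream, so reaching a byte at a given raw offset means decompressing from the start.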

Re: Compressed TOAST Slicing

2019-02-19 Thread Paul Ramsey
On Sat, Feb 16, 2019 at 7:25 AM Simon Riggs wrote: > Could we get a similarly optimized implementation of -> operator for JSONB > as well? > Are there any other potential uses? Best to fix em all up at once and then > move on to other things. Thanks. Oddly enough, I couldn't find many/any thi

Re: Compressed TOAST Slicing

2019-02-16 Thread Simon Riggs
On Thu, 6 Dec 2018 at 20:54, Paul Ramsey wrote: > On Sun, Dec 2, 2018 at 7:03 AM Rafia Sabih > wrote: > > > > The idea looks good and believing your performance evaluation it seems > > like a practical one too. > > Thank you kindly for the review! > Sounds good. Could we get a similarly optim

Re: Compressed TOAST Slicing

2019-02-15 Thread Andres Freund
Hi Stephen, On 2018-12-06 12:54:18 -0800, Paul Ramsey wrote: > On Sun, Dec 2, 2018 at 7:03 AM Rafia Sabih > wrote: > > > > The idea looks good and believing your performance evaluation it seems > > like a practical one too. > > Thank you kindly for the review! > > > A comment explaining how th

Re: Compressed TOAST Slicing

2018-12-06 Thread Paul Ramsey
On Sun, Dec 2, 2018 at 7:03 AM Rafia Sabih wrote: > > The idea looks good and believing your performance evaluation it seems > like a practical one too. Thank you kindly for the review! > A comment explaining how this check differs for is_slice case would be helpful. > Looks like PG indentation

Re: Compressed TOAST Slicing

2018-12-02 Thread Rafia Sabih
On Fri, Nov 2, 2018 at 11:55 PM Paul Ramsey wrote: > > As threatened, I have also added a patch to left() to also use sliced access. Hi Paul, The idea looks good and believing your performance evaluation it seems like a practical one too. I had a look at this patch and here are my initial comme

Re: Compressed TOAST Slicing

2018-11-02 Thread Paul Ramsey
As threatened, I have also added a patch to left() to also use sliced access. compressed-datum-slicing-20190102a.patch Description: Binary data compressed-datum-slicing-left-20190102a.patch Description: Binary data

Re: Compressed TOAST Slicing

2018-11-02 Thread Paul Ramsey
On Thu, Nov 1, 2018 at 4:02 PM Tom Lane wrote: > Paul Ramsey writes: > > On Thu, Nov 1, 2018 at 2:29 PM Stephen Frost wrote: > >> and secondly, why we wouldn't consider > >> handling a non-zero offset. A non-zero offset would, of course, still > >> require decompressing from the start and then

Re: Compressed TOAST Slicing

2018-11-01 Thread Tom Lane
Paul Ramsey writes: > On Thu, Nov 1, 2018 at 2:29 PM Stephen Frost wrote: >> and secondly, why we wouldn't consider >> handling a non-zero offset. A non-zero offset would, of course, still >> require decompressing from the start and then just throwing away what we >> skip over, but we're going t

Re: Compressed TOAST Slicing

2018-11-01 Thread Paul Ramsey
On Thu, Nov 1, 2018 at 2:29 PM Stephen Frost wrote: > Greetings, > > * Paul Ramsey (pram...@cleverelephant.ca) wrote: > > The attached patch adds in a code path to do a partial decompression of > the > > TOAST entry, when the requested slice is at the start of the object. > > There two things tha

Re: Compressed TOAST Slicing

2018-11-01 Thread Stephen Frost
Greetings, * Paul Ramsey (pram...@cleverelephant.ca) wrote: > The attached patch adds in a code path to do a partial decompression of the > TOAST entry, when the requested slice is at the start of the object. Neat! > As usual, doing less work is faster. Definitely. > Interesting note to motiva