Re: [PATCH] Compression dictionaries for JSONB

2024-01-26 Thread vignesh C
On Wed, 17 Jan 2024 at 19:52, Aleksander Alekseev wrote: > > Hi Shubham, > > > > > 8272749e added a few more arguments to CastCreate(). Here is the > > > > rebased patch. > > > > > > After merging afbfc029 [1] the patch needed a rebase. PFA v10. > > > > > > The patch is still in a PoC state and t

Re: [PATCH] Compression dictionaries for JSONB

2024-01-17 Thread Nikita Malakhov
Hi, Aleksander, there was a quite straightforward answer regarding Pluggable TOAST in another thread - the Pluggable TOAST feature is not desired by the community, and advanced TOAST mechanics would be accepted as part of problematic datatypes' extended functionality, on a par with in and out functi

Re: [PATCH] Compression dictionaries for JSONB

2024-01-17 Thread Aleksander Alekseev
Hi again, > Yes it does for a while now. Until we reach any agreement regarding > questions (1) and (2) personally I don't see the point in submitting > rebased patches. We can continue the discussion but mark the CF entry > as RwF for now it will be helpful. Sorry, what I actually meant were the

Re: [PATCH] Compression dictionaries for JSONB

2024-01-17 Thread Aleksander Alekseev
Hi Shubham, > > > 8272749e added a few more arguments to CastCreate(). Here is the rebased > > > patch. > > > > After merging afbfc029 [1] the patch needed a rebase. PFA v10. > > > > The patch is still in a PoC state and this is exactly why comments and > > suggestions from the community are most

Re: [PATCH] Compression dictionaries for JSONB

2024-01-17 Thread Shubham Khanna
On Wed, Jan 17, 2024 at 4:16 PM Aleksander Alekseev wrote: > > Hi hackers, > > > 8272749e added a few more arguments to CastCreate(). Here is the rebased > > patch. > > After merging afbfc029 [1] the patch needed a rebase. PFA v10. > > The patch is still in a PoC state and this is exactly why com

Re: [PATCH] Compression dictionaries for JSONB

2023-10-12 Thread Aleksander Alekseev
Hi hackers, I would like to continue discussing compression dictionaries. > So I summarized the requirements we agreed on so far and ended up with > the following list: [...] Again, here is the summary of our current agreements, at least how I understand them. Please feel free to correct me wher

Re: [PATCH] Compression dictionaries for JSONB

2023-05-17 Thread Nikita Malakhov
Hi hackers! As discussed above, I've created a new thread on the Extension of the TOAST pointer subject - https://www.postgresql.org/message-id/flat/CAN-LCVMq2X%3Dfhx7KLxfeDyb3P%2BBXuCkHC0g%3D9GF%2BJD4izfVa0Q%40mail.gmail.com Please check and comment. On Thu, Apr 27, 2023 at 1:43 PM Nikita Malakh

Re: [PATCH] Compression dictionaries for JSONB

2023-04-27 Thread Nikita Malakhov
Hi, I think I should open a new thread related to TOAST pointer refactoring based on Pluggable TOAST, COPY and looping in retrieving new TOAST value OID issues. On Wed, Apr 26, 2023 at 4:00 PM Aleksander Alekseev < aleksan...@timescale.com> wrote: > Hi Nikita, > > > The External TOAST pointer is

Re: [PATCH] Compression dictionaries for JSONB

2023-04-26 Thread Aleksander Alekseev
Hi Nikita, > The External TOAST pointer is very limited in the amount of service data > it can keep; that's why we introduced the Custom TOAST pointers in the > Pluggable TOAST. But keep in mind that changing the TOAST pointer > structure requires a lot of quite heavy modifications in the core -

Re: [PATCH] Compression dictionaries for JSONB

2023-04-18 Thread Aleksander Alekseev
Matthias, Nikita, Many thanks for the feedback! > Any type with typlen < 0 should work, right? Right. > The use of dictionaries should be dependent on only the use of a > compression method that supports pre-computed compression > dictionaries. I think storage=MAIN + compression dictionaries sh

Re: [PATCH] Compression dictionaries for JSONB

2023-04-18 Thread Nikita Malakhov
Hi, I don't think it's a good idea to interfere with the storage strategies. Dictionary should be a kind of storage option, like a compression, but not the strategy declining all others. >> While thinking about what a user interface could look like it occurred >> to me that what we are discussing c

Re: [PATCH] Compression dictionaries for JSONB

2023-04-18 Thread Matthias van de Meent
On Tue, 18 Apr 2023 at 17:28, Aleksander Alekseev wrote: > > Hi Andres, > > > As I said, I don't think we should extend dictionaries. For this to work > > we'll > > likely need a new / extended compressed toast datum header of some form, > > with > > a reference to the dictionary. That'd likely

Re: [PATCH] Compression dictionaries for JSONB

2023-04-18 Thread Aleksander Alekseev
Hi Andres, > As I said, I don't think we should extend dictionaries. For this to work we'll > likely need a new / extended compressed toast datum header of some form, with > a reference to the dictionary. That'd likely be needed even with updatable > dictionaries, as we IIRC don't know which colum

Re: [PATCH] Compression dictionaries for JSONB

2023-02-10 Thread Andres Freund
Hi, On 2023-02-10 21:22:14 +0300, Nikita Malakhov wrote: > If I understand Andres' message correctly - the proposition is to > make use of compression dictionaries automatic, possibly just setting > a parameter when the table is created, something like > CREATE TABLE t ( ..., t JSONB USE DICTIONAR

Re: [PATCH] Compression dictionaries for JSONB

2023-02-10 Thread Nikita Malakhov
Hi! If I understand Andres' message correctly - the proposition is to make use of compression dictionaries automatic, possibly just setting a parameter when the table is created, something like CREATE TABLE t ( ..., t JSONB USE DICTIONARY); The question is then how to create such dictionaries auto

Re: [PATCH] Compression dictionaries for JSONB

2023-02-09 Thread Andres Freund
Hi, On February 9, 2023 2:50:57 AM PST, Aleksander Alekseev wrote: >Hi Andres, > >> > So to clarify, are we talking about tuple-level compression? Or >> > perhaps page-level compression? >> >> Tuple level. > >> although my own patch proposed attribute-level compression, not >> tuple-level one,

Re: [PATCH] Compression dictionaries for JSONB

2023-02-09 Thread Aleksander Alekseev
Hi Andres, > > So to clarify, are we talking about tuple-level compression? Or > > perhaps page-level compression? > > Tuple level. > although my own patch proposed attribute-level compression, not > tuple-level one, it is arguably closer to tuple-level approach than > page-level one Just wanted

Re: [PATCH] Compression dictionaries for JSONB

2023-02-07 Thread Aleksander Alekseev
Hi, > > The complexity of page-level compression is significant, as pages are > > currently a base primitive of our persistency and consistency scheme. > > +many > > It's also not all a panacea performance-wise, datum-level decompression can > often be deferred much longer than page level decompre

Re: [PATCH] Compression dictionaries for JSONB

2023-02-07 Thread Alvaro Herrera
On 2023-Feb-05, Aleksander Alekseev wrote: > Since PostgreSQL is not a specialized document-oriented DBMS I think we > better focus our (far from being infinite) resources on something more > people would benefit from: AIO/DIO [1] or perhaps getting rid of > freezing [2], to name a few examples. Fo

Re: [PATCH] Compression dictionaries for JSONB

2023-02-06 Thread Nikita Malakhov
Hi, On updating dictionary - >You cannot "just" refresh a dictionary used once to compress an >object, because you need it to decompress the object too. and when you have many - updating an existing dictionary requires going through all objects compressed with it in the whole database. It's a ve

Re: [PATCH] Compression dictionaries for JSONB

2023-02-06 Thread Andres Freund
Hi, On 2023-02-06 16:16:41 +0100, Matthias van de Meent wrote: > On Mon, 6 Feb 2023 at 15:03, Aleksander Alekseev > wrote: > > > > Hi, > > > > I see your point regarding the fact that creating dictionaries on a > > training set is too beneficial to neglect it. Can't argue with this. > > > > What

Re: [PATCH] Compression dictionaries for JSONB

2023-02-06 Thread Matthias van de Meent
On Mon, 6 Feb 2023 at 15:03, Aleksander Alekseev wrote: > > Hi, > > I see your point regarding the fact that creating dictionaries on a > training set is too beneficial to neglect it. Can't argue with this. > > What puzzles me though is: what prevents us from doing this on a page > level as sugges

Re: [PATCH] Compression dictionaries for JSONB

2023-02-06 Thread Aleksander Alekseev
Hi, > > So to clarify, are we talking about tuple-level compression? Or > > perhaps page-level compression? > > Tuple level. > > What I think we should do is basically this: > > When we compress datums, we know the table being targeted. If there's a > pg_attribute parameter indicating we should, w

Re: [PATCH] Compression dictionaries for JSONB

2023-02-05 Thread Andres Freund
Hi, On 2023-02-05 20:05:51 +0300, Aleksander Alekseev wrote: > > I don't think we'd want much of the infrastructure introduced in the > > patch for type agnostic cross-row compression. A dedicated "dictionary" > > type as a wrapper around other types IMO is the wrong direction. This > > should be

Re: [PATCH] Compression dictionaries for JSONB

2023-02-05 Thread Aleksander Alekseev
Hi, > I assume that manually specifying dictionary entries is a consequence of > the prototype state? I don't think this is something humans are very > good at, just analyzing the data to see what's useful to dictionarize > seems more promising. No, humans are not good at it. The idea was to aut

Re: [PATCH] Compression dictionaries for JSONB

2023-02-05 Thread Andres Freund
Hi, On 2023-02-05 13:41:17 +0300, Aleksander Alekseev wrote: > > I don't think the approaches in either of these threads are > > promising. They add a lot of complexity, require implementation effort > > for each type, manual work by the administrator for column, etc. > > I would like to point out

Re: [PATCH] Compression dictionaries for JSONB

2023-02-05 Thread Aleksander Alekseev
Hi, > I don't think the approaches in either of these threads are > promising. They add a lot of complexity, require implementation effort > for each type, manual work by the administrator for column, etc. I would like to point out that compression dictionaries don't require per-type work. Curren

Re: [PATCH] Compression dictionaries for JSONB

2023-02-04 Thread Andres Freund
Hi, On 2023-02-03 14:39:31 +0400, Pavel Borisov wrote: > On Fri, 3 Feb 2023 at 14:04, Alvaro Herrera wrote: > > > > This patch came up at the developer meeting in Brussels yesterday. > > https://wiki.postgresql.org/wiki/FOSDEM/PGDay_2023_Developer_Meeting#v16_Patch_Triage > > > > First, as far as

Re: [PATCH] Compression dictionaries for JSONB

2023-02-03 Thread Pavel Borisov
On Fri, 3 Feb 2023 at 14:04, Alvaro Herrera wrote: > > This patch came up at the developer meeting in Brussels yesterday. > https://wiki.postgresql.org/wiki/FOSDEM/PGDay_2023_Developer_Meeting#v16_Patch_Triage > > First, as far as I can tell, there is a large overlap between this patch > and "Plug

Re: [PATCH] Compression dictionaries for JSONB

2023-02-03 Thread Alvaro Herrera
This patch came up at the developer meeting in Brussels yesterday. https://wiki.postgresql.org/wiki/FOSDEM/PGDay_2023_Developer_Meeting#v16_Patch_Triage First, as far as I can tell, there is a large overlap between this patch and "Pluggable toaster" patch. The approaches are completely different,

Re: [PATCH] Compression dictionaries for JSONB

2022-11-17 Thread Aleksander Alekseev
Hi hackers, > 8272749e added a few more arguments to CastCreate(). Here is the rebased > patch. After merging afbfc029 [1] the patch needed a rebase. PFA v10. The patch is still in a PoC state and this is exactly why comments and suggestions from the community are most welcome! Particularly I w

Re: [PATCH] Compression dictionaries for JSONB

2022-11-04 Thread Aleksander Alekseev
Hi hackers, > For the record, Nikita and I agreed offlist that Nikita will join this > effort as a co-author in order to implement the suggested improvements > (and perhaps some improvements that were not suggested yet). Meanwhile > I'm going to keep the current version of the patch up to date wit

Re: [PATCH] Compression dictionaries for JSONB

2022-10-06 Thread Aleksander Alekseev
Hi hackers, > For the record, Nikita and I agreed offlist that Nikita will join this > effort as a co-author in order to implement the suggested improvements > (and perhaps some improvements that were not suggested yet). Meanwhile > I'm going to keep the current version of the patch up to date wit

Re: [PATCH] Compression dictionaries for JSONB

2022-09-02 Thread Aleksander Alekseev
Hi hackers, Here is the rebased version of the patch. > I invite anyone interested to join this effort as a co-author! (since, > honestly, rewriting the same feature over and over again alone is > quite boring :D). > Overall structure could look like this: > pg_dict >| >| dictionary

Re: [PATCH] Compression dictionaries for JSONB

2022-08-19 Thread Nikita Malakhov
Hi hackers! I have a part question, part proposal for the future development of this feature: what if we use the pg_dict table not to store dictionaries but to store dictionaries' metadata, with the actual dictionaries stored in separate tables, as is done with TOAST tables (i.e. pg_dict. --> p

Re: [PATCH] Compression dictionaries for JSONB

2022-08-01 Thread Aleksander Alekseev
Hi hackers, > So far we seem to have a consensus to: > > 1. Use bytea instead of NameData to store dictionary entries; > > 2. Assign monotonically ascending IDs to the entries instead of using > Oids, as it is done with pg_class.relnatts. In order to do this we > should either add a corresponding

Re: [PATCH] Compression dictionaries for JSONB

2022-07-27 Thread Matthias van de Meent
On Wed, 27 Jul 2022 at 09:36, Simon Riggs wrote: > > On Sun, 17 Jul 2022 at 19:15, Nikita Malakhov wrote: > > > For use in a special Toaster for the JSON datatype, compression dictionaries > > seem to be a very valuable addition, but now I > > have to agree that this feature in its current state is competi

Re: [PATCH] Compression dictionaries for JSONB

2022-07-27 Thread Simon Riggs
On Sun, 17 Jul 2022 at 19:15, Nikita Malakhov wrote: > we suggest that as an improvement compression should be put inside the > Toaster as an option, > thus the Toaster could have maximum benefits from knowledge of data internal > structure (and in future use JSON Schema). Very much agreed. >

Re: [PATCH] Compression dictionaries for JSONB

2022-07-18 Thread Aleksander Alekseev
Hi Nikita, Thanks for your feedback! > Aleksander, I've carefully gone over the discussion and still have some questions > to ask - > > 1) Is there any means of measuring the overhead of dictionaries over the vanilla > implementation? IMO it is a must because > JSON is widely used functionality. Also, as

Re: [PATCH] Compression dictionaries for JSONB

2022-07-17 Thread Nikita Malakhov
Hi hackers! Aleksander, I've carefully gone over the discussion and still have some questions to ask - 1) Is there any means of measuring the overhead of dictionaries over the vanilla implementation? IMO it is a must because JSON is widely used functionality. Also, as it was mentioned before, to check the

Re: [PATCH] Compression dictionaries for JSONB

2022-07-12 Thread Aleksander Alekseev
Hi Nikita, > Aleksander, please point me in the right direction if it was mentioned > before, I have a few questions: Thanks for your feedback. These are good questions indeed. > 1) It is not clear to me how you see the life cycle of such a > dictionary. If it is meant to keep growing wit

Re: [PATCH] Compression dictionaries for JSONB

2022-07-12 Thread Nikita Malakhov
Hi hackers! Aleksander, please point me in the right direction if it was mentioned before, I have a few questions: 1) It is not clear to me how you see the life cycle of such a dictionary. If it is meant to keep growing without cleaning up/rebuilding it could affect performance in an undesir

Re: [PATCH] Compression dictionaries for JSONB

2022-07-11 Thread Aleksander Alekseev
Hi hackers, > I invite anyone interested to join this effort as a co-author! Here is v5. Same as v4 but with a fixed compiler warning (thanks, cfbot). Sorry for the noise. -- Best regards, Aleksander Alekseev v5-0001-Compression-dictionaries-for-JSONB.patch Description: Binary data

Re: [PATCH] Compression dictionaries for JSONB

2022-07-11 Thread Aleksander Alekseev
Hi hackers, > OK, I see your point now. And I think this is a very good point. > Basing "Compression dictionaries" on the API provided by "pluggable > TOASTer" can also be less hacky than what I'm currently doing with > `typmod` argument. I'm going to switch the implementation at some > point, unl

Re: [PATCH] Compression dictionaries for JSONB

2022-07-04 Thread Aleksander Alekseev
Hi Matthias, > > Although there is also a high-level idea (according to the > > presentations) to share common data between different TOASTed values, > > similarly to what compression dictionaries do, by looking at the > > current feedback and considering the overall complexity and the amount > >

Re: [PATCH] Compression dictionaries for JSONB

2022-07-04 Thread Matthias van de Meent
Hi Aleksander, On Fri, 17 Jun 2022 at 17:04, Aleksander Alekseev wrote: >> These are just my initial thoughts I would like to share though. I may >> change my mind after diving deeper into a "pluggable TOASTer" patch. > > I familiarized myself with the "pluggable TOASTer" thread and joined > the d

Re: [PATCH] Compression dictionaries for JSONB

2022-06-28 Thread Aleksander Alekseev
Hi Simon, Many thanks for your feedback! I'm going to submit an updated version of the patch in a bit. I just wanted to reply to some of your questions / comments. > Dictionaries have no versioning. [...] > Does the order of entries in the dictionary allow us to express a priority? > i.e. to a

Re: [PATCH] Compression dictionaries for JSONB

2022-06-23 Thread Simon Riggs
On Thu, 2 Jun 2022 at 14:30, Aleksander Alekseev wrote: > > I saw there was some previous discussion about dictionary size. It > > looks like this approach would put all dictionaries into a shared OID > > pool. Since I don't know what a "standard" use case is, is there any > > risk of OID exhaust

Re: [PATCH] Compression dictionaries for JSONB

2022-06-17 Thread Aleksander Alekseev
Hi Matthias, > These are just my initial thoughts I would like to share though. I may > change my mind after diving deeper into a "pluggable TOASTer" patch. I familiarized myself with the "pluggable TOASTer" thread and joined the discussion [1]. I'm afraid so far I failed to understand your sugg

Re: [PATCH] Compression dictionaries for JSONB

2022-06-15 Thread Aleksander Alekseev
Hi Matthias, > The bulk of the patch > should still be usable, but I think that the way it interfaces with > the CREATE TABLE (column ...) APIs would need reworking to build on > top of the api's of the "pluggable toaster" patches (so, creating > toasters instead of types). I think that would allo

Re: [PATCH] Compression dictionaries for JSONB

2022-06-05 Thread Matthias van de Meent
On Fri, 13 May 2022 at 10:09, Aleksander Alekseev wrote: > > Hi hackers, > > > Here is the 2nd version of the patch: > > > > - Includes changes named above > > - Fixes a warning reported by cfbot > > - Fixes some FIXME's > > - The patch includes some simple tests now > > - A proper commit message w

Re: [PATCH] Compression dictionaries for JSONB

2022-06-02 Thread Jacob Champion
On Thu, Jun 2, 2022 at 6:30 AM Aleksander Alekseev wrote: > > I saw there was some previous discussion about dictionary size. It > > looks like this approach would put all dictionaries into a shared OID > > pool. Since I don't know what a "standard" use case is, is there any > > risk of OID exhaus

Re: [PATCH] Compression dictionaries for JSONB

2022-06-02 Thread Aleksander Alekseev
Hi Jacob, Many thanks for your feedback! > I saw there was some previous discussion about dictionary size. It > looks like this approach would put all dictionaries into a shared OID > pool. Since I don't know what a "standard" use case is, is there any > risk of OID exhaustion for larger deployme

Re: [PATCH] Compression dictionaries for JSONB

2022-06-01 Thread Jacob Champion
On Wed, Jun 1, 2022 at 1:44 PM Aleksander Alekseev wrote: > This is a follow-up thread to `RFC: compression dictionaries for JSONB` [1]. > I would like to share my current progress in order to get early feedback. The > patch is currently in a draft state but implements the basic functionality. I

Re: [PATCH] Compression dictionaries for JSONB

2022-05-13 Thread Aleksander Alekseev
Hi hackers, > Here is the 2nd version of the patch: > > - Includes changes named above > - Fixes a warning reported by cfbot > - Fixes some FIXME's > - The patch includes some simple tests now > - A proper commit message was added Here is the rebased version of the patch. Changes compared to v2 ar

Re: [PATCH] Compression dictionaries for JSONB

2022-04-25 Thread Aleksander Alekseev
Hi Zhihong, Many thanks for your feedback! > For src/backend/catalog/pg_dict.c, please add license header. Fixed. > + elog(ERROR, "skipbytes > decoded_size - outoffset"); > > Include the values for skipbytes, decoded_size and outoffset. In fact, this code should never be executed, an

Re: [PATCH] Compression dictionaries for JSONB

2022-04-22 Thread Zhihong Yu
On Fri, Apr 22, 2022 at 1:30 AM Aleksander Alekseev < aleksan...@timescale.com> wrote: > Hi hackers, > > This is a follow-up thread to `RFC: compression dictionaries for JSONB` > [1]. I would like to share my current progress in order to get early > feedback. The patch is currently in a draft stat

[PATCH] Compression dictionaries for JSONB

2022-04-22 Thread Aleksander Alekseev
Hi hackers, This is a follow-up thread to `RFC: compression dictionaries for JSONB` [1]. I would like to share my current progress in order to get early feedback. The patch is currently in a draft state but implements the basic functionality. I did my best to account for all the great feedback I p