On Wed, 17 Jan 2024 at 19:52, Aleksander Alekseev
wrote:
>
> Hi Shubham,
>
> > > > 8272749e added a few more arguments to CastCreate(). Here is the
> > > > rebased patch.
> > >
> > > After merging afbfc029 [1] the patch needed a rebase. PFA v10.
> > >
> > > The patch is still in a PoC state and t
Hi,
Aleksander, there was a quite straightforward answer regarding Pluggable TOAST
in other thread - the Pluggable TOAST feature is not desired by the community,
and advanced TOAST mechanics would be accepted as parts of problematic
datatypes extended functionality, on a par with in and out functi
Hi again,
> Yes it does for a while now. Until we reach any agreement regarding
> questions (1) and (2) personally I don't see the point in submitting
> rebased patches. We can continue the discussion, but marking the CF entry
> as RwF for now would be helpful.
Sorry, what I actually meant were the
Hi Shubham,
> > > 8272749e added a few more arguments to CastCreate(). Here is the rebased
> > > patch.
> >
> > After merging afbfc029 [1] the patch needed a rebase. PFA v10.
> >
> > The patch is still in a PoC state and this is exactly why comments and
> > suggestions from the community are most
On Wed, Jan 17, 2024 at 4:16 PM Aleksander Alekseev
wrote:
>
> Hi hackers,
>
> > 8272749e added a few more arguments to CastCreate(). Here is the rebased
> > patch.
>
> After merging afbfc029 [1] the patch needed a rebase. PFA v10.
>
> The patch is still in a PoC state and this is exactly why com
Hi hackers,
I would like to continue discussing compression dictionaries.
> So I summarized the requirements we agreed on so far and ended up with
> the following list: [...]
Again, here is the summary of our current agreements, at least how I
understand them. Please feel free to correct me wher
Hi hackers!
As discussed above, I've created a new thread on the Extension of the TOAST
pointer subject -
https://www.postgresql.org/message-id/flat/CAN-LCVMq2X%3Dfhx7KLxfeDyb3P%2BBXuCkHC0g%3D9GF%2BJD4izfVa0Q%40mail.gmail.com
Please check and comment.
On Thu, Apr 27, 2023 at 1:43 PM Nikita Malakh
Hi,
I think I should open a new thread related to TOAST pointer refactoring
based on Pluggable TOAST, COPY and looping in retrieving new TOAST
value OID issues.
On Wed, Apr 26, 2023 at 4:00 PM Aleksander Alekseev <
aleksan...@timescale.com> wrote:
> Hi Nikita,
>
> > The External TOAST pointer is
Hi Nikita,
> The External TOAST pointer is very limited in the amount of service data
> it could keep, that's why we introduced the Custom TOAST pointers in the
> Pluggable TOAST. But keep in mind that changing the TOAST pointer
> structure requires a lot of quite heavy modifications in the core -
Matthias, Nikita,
Many thanks for the feedback!
> Any type with typlen < 0 should work, right?
Right.
> The use of dictionaries should be dependent on only the use of a
> compression method that supports pre-computed compression
> dictionaries. I think storage=MAIN + compression dictionaries sh
Hi,
I don't think it's a good idea to interfere with the storage strategies.
A dictionary should be a kind of storage option, like compression, but not
a strategy that excludes all others.
>> While thinking about what a user interface could look like it occurred
>> to me that what we are discussing c
On Tue, 18 Apr 2023 at 17:28, Aleksander Alekseev
wrote:
>
> Hi Andres,
>
> > As I said, I don't think we should extend dictionaries. For this to work we'll
> > likely need a new / extended compressed toast datum header of some form, with
> > a reference to the dictionary. That'd likely
Hi Andres,
> As I said, I don't think we should extend dictionaries. For this to work we'll
> likely need a new / extended compressed toast datum header of some form, with
> a reference to the dictionary. That'd likely be needed even with updatable
> dictionaries, as we IIRC don't know which colum
Hi,
On 2023-02-10 21:22:14 +0300, Nikita Malakhov wrote:
> If I understand Andres' message correctly - the proposition is to
> make use of compression dictionaries automatic, possibly just setting
> a parameter when the table is created, something like
> CREATE TABLE t ( ..., t JSONB USE DICTIONAR
Hi!
If I understand Andres' message correctly - the proposition is to
make use of compression dictionaries automatic, possibly just setting
a parameter when the table is created, something like
CREATE TABLE t ( ..., t JSONB USE DICTIONARY);
The question is then how to create such dictionaries auto
Hi,
On February 9, 2023 2:50:57 AM PST, Aleksander Alekseev
wrote:
>Hi Andres,
>
>> > So to clarify, are we talking about tuple-level compression? Or
>> > perhaps page-level compression?
>>
>> Tuple level.
>
>> although my own patch proposed attribute-level compression, not
>> tuple-level one,
Hi Andres,
> > So to clarify, are we talking about tuple-level compression? Or
> > perhaps page-level compression?
>
> Tuple level.
> although my own patch proposed attribute-level compression, not
> tuple-level one, it is arguably closer to tuple-level approach than
> page-level one
Just wanted
Hi,
> > The complexity of page-level compression is significant, as pages are
> > currently a base primitive of our persistency and consistency scheme.
>
> +many
>
> It's also not all a panacea performance-wise, datum-level decompression can
> often be deferred much longer than page level decompre
On 2023-Feb-05, Aleksander Alekseev wrote:
> Since PostgreSQL is not a specialized document-oriented DBMS I think we
> better focus our (far from being infinite) resources on something more
> people would benefit from: AIO/DIO [1] or perhaps getting rid of
> freezing [2], to name a few examples.
Fo
Hi,
On updating dictionary -
>You cannot "just" refresh a dictionary used once to compress an
>object, because you need it to decompress the object too.
and when you have many - updating an existing dictionary requires
going through all objects compressed with it in the whole database.
It's a ve
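
[Illustration, not part of the original message.] The point about needing the
same dictionary for decompression can be seen with any compressor that
supports preset dictionaries. The following is a minimal sketch using Python's
zlib module, which is not what the patch uses; the dictionary contents and the
helper names are hypothetical and chosen only for the example.

```python
import zlib

# Hypothetical preset dictionary: byte strings expected to recur in documents.
DICTIONARY = b'{"name": "", "email": "", "created_at": ""}'


def compress_with_dict(data: bytes, zdict: bytes) -> bytes:
    """Compress data using a preset dictionary."""
    comp = zlib.compressobj(zdict=zdict)
    return comp.compress(data) + comp.flush()


def decompress_with_dict(blob: bytes, zdict: bytes) -> bytes:
    """Decompress data; the same dictionary must be supplied."""
    decomp = zlib.decompressobj(zdict=zdict)
    return decomp.decompress(blob) + decomp.flush()


def fails_without_dict(blob: bytes) -> bool:
    """Return True if the blob cannot be decompressed without a dictionary."""
    try:
        zlib.decompressobj().decompress(blob)
    except zlib.error:
        return True
    return False


document = (
    b'{"name": "Alice", "email": "alice@example.com", '
    b'"created_at": "2023-02-06"}'
)
blob = compress_with_dict(document, DICTIONARY)
restored = decompress_with_dict(blob, DICTIONARY)
```

Because every stored value depends on the exact dictionary it was compressed
with, replacing that dictionary in place would leave old values unreadable
unless they are all rewritten, which is the cost described above.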
Hi,
On 2023-02-06 16:16:41 +0100, Matthias van de Meent wrote:
> On Mon, 6 Feb 2023 at 15:03, Aleksander Alekseev
> wrote:
> >
> > Hi,
> >
> > I see your point regarding the fact that creating dictionaries on a
> > training set is too beneficial to neglect it. Can't argue with this.
> >
> > What
On Mon, 6 Feb 2023 at 15:03, Aleksander Alekseev
wrote:
>
> Hi,
>
> I see your point regarding the fact that creating dictionaries on a
> training set is too beneficial to neglect it. Can't argue with this.
>
> What puzzles me though is: what prevents us from doing this on a page
> level as sugges
Hi,
> > So to clarify, are we talking about tuple-level compression? Or
> > perhaps page-level compression?
>
> Tuple level.
>
> What I think we should do is basically this:
>
> When we compress datums, we know the table being targeted. If there's a
> pg_attribute parameter indicating we should, w
Hi,
On 2023-02-05 20:05:51 +0300, Aleksander Alekseev wrote:
> > I don't think we'd want much of the infrastructure introduced in the
> > patch for type agnostic cross-row compression. A dedicated "dictionary"
> > type as a wrapper around other types IMO is the wrong direction. This
> > should be
Hi,
> I assume that manually specifying dictionary entries is a consequence of
> the prototype state? I don't think this is something humans are very
> good at, just analyzing the data to see what's useful to dictionarize
> seems more promising.
No, humans are not good at it. The idea was to aut
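
[Illustration, not part of the original message.] One simple way to analyze
the data instead of asking a human to list entries is to count how often each
object key occurs in a sample of documents and keep the most frequent ones.
This is only a sketch under that assumption; the function names and the
ranking rule are hypothetical, and a real trainer (for example the one shipped
with zstd) works on byte sequences rather than JSON keys.

```python
import json
from collections import Counter


def collect_keys(value, counter: Counter) -> None:
    """Recursively count object keys in a parsed JSON value."""
    if isinstance(value, dict):
        for key, child in value.items():
            counter[key] += 1
            collect_keys(child, counter)
    elif isinstance(value, list):
        for child in value:
            collect_keys(child, counter)


def suggest_entries(documents, max_entries: int = 3) -> list:
    """Return the most frequent keys as candidate dictionary entries."""
    counter = Counter()
    for doc in documents:
        collect_keys(json.loads(doc), counter)
    # Most frequent first; ties are broken alphabetically for determinism.
    ranked = sorted(counter.items(), key=lambda kv: (-kv[1], kv[0]))
    return [key for key, _ in ranked[:max_entries]]


sample = [
    '{"user": {"name": "a", "email": "x"}, "tags": ["t"]}',
    '{"user": {"name": "b"}, "tags": []}',
    '{"user": {"name": "c", "email": "y"}}',
]
candidates = suggest_entries(sample)
```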
Hi,
On 2023-02-05 13:41:17 +0300, Aleksander Alekseev wrote:
> > I don't think the approaches in either of these threads is
> > promising. They add a lot of complexity, require implementation effort
> > for each type, manual work by the administrator for column, etc.
>
> I would like to point out
Hi,
> I don't think the approaches in either of these threads is
> promising. They add a lot of complexity, require implementation effort
> for each type, manual work by the administrator for column, etc.
I would like to point out that compression dictionaries don't require
per-type work.
Curren
Hi,
On 2023-02-03 14:39:31 +0400, Pavel Borisov wrote:
> On Fri, 3 Feb 2023 at 14:04, Alvaro Herrera wrote:
> >
> > This patch came up at the developer meeting in Brussels yesterday.
> > https://wiki.postgresql.org/wiki/FOSDEM/PGDay_2023_Developer_Meeting#v16_Patch_Triage
> >
> > First, as far as
On Fri, 3 Feb 2023 at 14:04, Alvaro Herrera wrote:
>
> This patch came up at the developer meeting in Brussels yesterday.
> https://wiki.postgresql.org/wiki/FOSDEM/PGDay_2023_Developer_Meeting#v16_Patch_Triage
>
> First, as far as I can tell, there is a large overlap between this patch
> and "Plug
This patch came up at the developer meeting in Brussels yesterday.
https://wiki.postgresql.org/wiki/FOSDEM/PGDay_2023_Developer_Meeting#v16_Patch_Triage
First, as far as I can tell, there is a large overlap between this patch
and "Pluggable toaster" patch. The approaches are completely different,
Hi hackers,
> 8272749e added a few more arguments to CastCreate(). Here is the rebased
> patch.
After merging afbfc029 [1] the patch needed a rebase. PFA v10.
The patch is still in a PoC state and this is exactly why comments and
suggestions from the community are most welcome! Particularly I w
Hi hackers,
> For the record, Nikita and I agreed offlist that Nikita will join this
> effort as a co-author in order to implement the suggested improvements
> (and perhaps some improvements that were not suggested yet). Meanwhile
> I'm going to keep the current version of the patch up to date wit
Hi hackers,
> For the record, Nikita and I agreed offlist that Nikita will join this
> effort as a co-author in order to implement the suggested improvements
> (and perhaps some improvements that were not suggested yet). Meanwhile
> I'm going to keep the current version of the patch up to date wit
Hi hackers,
Here is the rebased version of the patch.
> I invite anyone interested to join this effort as a co-author! (since,
> honestly, rewriting the same feature over and over again alone is
> quite boring :D).
> Overall structure could look like this:
> pg_dict
>|
>| dictionary
Hi hackers!
I've got a part question, part proposal for the future development of this
feature:
What if we use the pg_dict table not to store dictionaries but to store
dictionaries' meta, with the actual dictionaries stored in separate tables,
as is done with TOAST tables (i.e. pg_dict. --> p
Hi hackers,
> So far we seem to have a consensus to:
>
> 1. Use bytea instead of NameData to store dictionary entries;
>
> 2. Assign monotonically ascending IDs to the entries instead of using
> Oids, as it is done with pg_class.relnatts. In order to do this we
> should either add a corresponding
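
[Illustration, not part of the original message.] The two points of consensus
quoted above can be pictured with a toy in-memory structure: entries are
arbitrary byte strings, so they may contain zero bytes and are not limited to
the fixed length of NameData, and each dictionary hands out its own ascending
integer IDs instead of drawing from the shared OID pool. The class and method
names are hypothetical and do not reflect the patch's catalog layout.

```python
class EntryDictionary:
    """Toy dictionary with byte-string entries and ascending IDs."""

    def __init__(self) -> None:
        self._next_id = 0          # per-dictionary counter, not a global OID
        self._id_by_entry = {}     # bytes -> int
        self._entry_by_id = {}     # int -> bytes

    def add(self, entry: bytes) -> int:
        """Add an entry and return its ID; re-adding returns the old ID."""
        existing = self._id_by_entry.get(entry)
        if existing is not None:
            return existing
        entry_id = self._next_id
        self._next_id += 1
        self._id_by_entry[entry] = entry_id
        self._entry_by_id[entry_id] = entry
        return entry_id

    def lookup(self, entry_id: int) -> bytes:
        """Return the entry stored under the given ID."""
        return self._entry_by_id[entry_id]


d = EntryDictionary()
first = d.add(b"created_at")
second = d.add(b"\x00binary\x00entry")
again = d.add(b"created_at")
```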
On Wed, 27 Jul 2022 at 09:36, Simon Riggs wrote:
>
> On Sun, 17 Jul 2022 at 19:15, Nikita Malakhov wrote:
>
> > For using in special Toaster for JSON datatype compression dictionaries
> > seem to be very valuable addition, but now I
> > have to agree that this feature in current state is competi
On Sun, 17 Jul 2022 at 19:15, Nikita Malakhov wrote:
> we suggest that as an improvement compression should be put inside the
> Toaster as an option,
> thus the Toaster could have maximum benefits from knowledge of data internal
> structure (and in future use JSON Schema).
Very much agreed.
>
Hi Nikita,
Thanks for your feedback!
> Aleksander, I've carefully gone over discussion and still have some questions
> to ask -
>
> 1) Is there any means of measuring overhead of dictionaries over vanilla
> implementation? IMO it is a must because
> JSON is a widely used functionality. Also, as
Hi hackers!
Aleksander, I've carefully gone over discussion and still have some
questions to ask -
1) Is there any means of measuring overhead of dictionaries over vanilla
implementation? IMO it is a must because
JSON is a widely used functionality. Also, as it was mentioned before, to
check the
Hi Nikita,
> Aleksander, please point me in the right direction if it was mentioned
> before, I have a few questions:
Thanks for your feedback. These are good questions indeed.
> 1) It is not clear for me, how do you see the life cycle of such a
> dictionary? If it is meant to keep growing wit
Hi hackers!
Aleksander, please point me in the right direction if it was mentioned
before, I have a few questions:
1) It is not clear for me, how do you see the life cycle of such a
dictionary? If it is meant to keep growing without
cleaning up/rebuilding it could affect performance in an undesir
Hi hackers,
> I invite anyone interested to join this effort as a co-author!
Here is v5. Same as v4 but with a fixed compiler warning (thanks,
cfbot). Sorry for the noise.
--
Best regards,
Aleksander Alekseev
v5-0001-Compression-dictionaries-for-JSONB.patch
Description: Binary data
Hi hackers,
> OK, I see your point now. And I think this is a very good point.
> Basing "Compression dictionaries" on the API provided by "pluggable
> TOASTer" can also be less hacky than what I'm currently doing with
> `typmod` argument. I'm going to switch the implementation at some
> point, unl
Hi Matthias,
> > Although there is also a high-level idea (according to the
> > presentations) to share common data between different TOASTed values,
> > similarly to what compression dictionaries do, by looking at the
> > current feedback and considering the overall complexity and the amount
> >
Hi Alexander,
On Fri, 17 Jun 2022 at 17:04, Aleksander Alekseev
wrote:
>> These are just my initial thoughts I would like to share though. I may
>> change my mind after diving deeper into a "pluggable TOASTer" patch.
>
> I familiarized myself with the "pluggable TOASTer" thread and joined
> the d
Hi Simon,
Many thanks for your feedback!
I'm going to submit an updated version of the patch in a bit. I just
wanted to reply to some of your questions / comments.
> Dictionaries have no versioning. [...]
> Does the order of entries in the dictionary allow us to express a priority?
> i.e. to a
On Thu, 2 Jun 2022 at 14:30, Aleksander Alekseev
wrote:
> > I saw there was some previous discussion about dictionary size. It
> > looks like this approach would put all dictionaries into a shared OID
> > pool. Since I don't know what a "standard" use case is, is there any
> > risk of OID exhaust
Hi Matthias,
> These are just my initial thoughts I would like to share though. I may
> change my mind after diving deeper into a "pluggable TOASTer" patch.
I familiarized myself with the "pluggable TOASTer" thread and joined
the discussion [1].
I'm afraid so far I failed to understand your sugg
Hi Matthias,
> The bulk of the patch
> should still be usable, but I think that the way it interfaces with
> the CREATE TABLE (column ...) APIs would need reworking to build on
> top of the api's of the "pluggable toaster" patches (so, creating
> toasters instead of types). I think that would allo
On Fri, 13 May 2022 at 10:09, Aleksander Alekseev
wrote:
>
> Hi hackers,
>
> > Here is the 2nd version of the patch:
> >
> > - Includes changes named above
> > - Fixes a warning reported by cfbot
> > - Fixes some FIXME's
> > - The patch includes some simple tests now
> > - A proper commit message w
On Thu, Jun 2, 2022 at 6:30 AM Aleksander Alekseev
wrote:
> > I saw there was some previous discussion about dictionary size. It
> > looks like this approach would put all dictionaries into a shared OID
> > pool. Since I don't know what a "standard" use case is, is there any
> > risk of OID exhaus
Hi Jacob,
Many thanks for your feedback!
> I saw there was some previous discussion about dictionary size. It
> looks like this approach would put all dictionaries into a shared OID
> pool. Since I don't know what a "standard" use case is, is there any
> risk of OID exhaustion for larger deployme
On Wed, Jun 1, 2022 at 1:44 PM Aleksander Alekseev
wrote:
> This is a follow-up thread to `RFC: compression dictionaries for JSONB` [1].
> I would like to share my current progress in order to get early feedback. The
> patch is currently in a draft state but implements the basic functionality. I
Hi hackers,
> Here is the 2nd version of the patch:
>
> - Includes changes named above
> - Fixes a warning reported by cfbot
> - Fixes some FIXME's
> - The patch includes some simple tests now
> - A proper commit message was added
Here is the rebased version of the patch. Changes compared to v2 ar
Hi Zhihong,
Many thanks for your feedback!
> For src/backend/catalog/pg_dict.c, please add license header.
Fixed.
> + elog(ERROR, "skipbytes > decoded_size - outoffset");
>
> Include the values for skipbytes, decoded_size and outoffset.
In fact, this code should never be executed, an
On Fri, Apr 22, 2022 at 1:30 AM Aleksander Alekseev <
aleksan...@timescale.com> wrote:
> Hi hackers,
>
> This is a follow-up thread to `RFC: compression dictionaries for JSONB`
> [1]. I would like to share my current progress in order to get early
> feedback. The patch is currently in a draft stat
Hi hackers,
This is a follow-up thread to `RFC: compression dictionaries for JSONB`
[1]. I would like to share my current progress in order to get early
feedback. The patch is currently in a draft state but implements the basic
functionality. I did my best to account for all the great feedback I
p