Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2020-03-06 Thread Peter Geoghegan
On Fri, Mar 6, 2020 at 11:00 AM Andres Freund wrote: > > Pushed. > > Congrats! Thanks Andres! -- Peter Geoghegan

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2020-03-06 Thread Andres Freund
On 2020-02-26 14:43:27 -0800, Peter Geoghegan wrote: > On Mon, Feb 24, 2020 at 4:54 PM Peter Geoghegan wrote: > > Attached is v34, which has this change. My plan is to commit something > > very close to this on Wednesday morning (barring any objections). > > Pushed. Congrats!

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2020-02-27 Thread Peter Geoghegan
On Wed, Feb 26, 2020 at 10:03 PM Fujii Masao wrote: > Thanks for committing this nice feature! You're welcome! > Here is one minor comment. > > + deduplicate_items > + storage parameter > > This should be > > deduplicate_items storage parameter I pushed a fix for this. Thanks --

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2020-02-26 Thread Fujii Masao
On 2020/02/27 7:43, Peter Geoghegan wrote: On Mon, Feb 24, 2020 at 4:54 PM Peter Geoghegan wrote: Attached is v34, which has this change. My plan is to commit something very close to this on Wednesday morning (barring any objections). Pushed. Thanks for committing this nice feature! Her

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2020-02-26 Thread Peter Geoghegan
On Mon, Feb 24, 2020 at 4:54 PM Peter Geoghegan wrote: > Attached is v34, which has this change. My plan is to commit something > very close to this on Wednesday morning (barring any objections). Pushed. I'm going to delay committing the pageinspect patch until tomorrow, since I haven't thought

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2020-02-20 Thread Peter Geoghegan
On Thu, Feb 20, 2020 at 10:58 AM Peter Geoghegan wrote: > I think that there is a good chance that it just won't matter. The > number of indexes that won't be able to support deduplication will be > very small in practice. The important exceptions are INCLUDE indexes > and nondeterministic collati

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2020-02-20 Thread Peter Geoghegan
On Thu, Feb 20, 2020 at 7:38 AM Anastasia Lubennikova wrote: > I don't think this patch really needs more nitpicking ) But when has that ever stopped it? :-) > User can discover this with a complex query to pg_index and pg_opclass. > To simplify this, we can probably wrap this into function or

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2020-02-20 Thread Anastasia Lubennikova
On 19.02.2020 22:16, Peter Geoghegan wrote: On Wed, Feb 19, 2020 at 8:14 AM Anastasia Lubennikova wrote: Thank you for this work. I've looked through the patches and they seem to be ready for commit. I haven't yet read recent documentation and readme changes, so maybe I'll send some more feedba

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2020-02-19 Thread Peter Geoghegan
On Wed, Feb 19, 2020 at 11:16 AM Peter Geoghegan wrote: > On Wed, Feb 19, 2020 at 8:14 AM Anastasia Lubennikova > wrote: > > Thank you for this work. I've looked through the patches and they seem > > to be ready for commit. > > I haven't yet read recent documentation and readme changes, so maybe

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2020-02-19 Thread Peter Geoghegan
On Wed, Feb 19, 2020 at 8:14 AM Anastasia Lubennikova wrote: > Thank you for this work. I've looked through the patches and they seem > to be ready for commit. > I haven't yet read recent documentation and readme changes, so maybe > I'll send some more feedback tomorrow. Great. > Is there any sp

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2020-02-19 Thread Anastasia Lubennikova
On 14.02.2020 05:57, Peter Geoghegan wrote: Attached is v33, which adds the last piece we need: opclass infrastructure that tells nbtree whether or not deduplication can be applied safely. This is based on work by Anastasia that was shared with me privately. Thank you for this work. I've looked

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2020-01-13 Thread Peter Geoghegan
On Fri, Jan 10, 2020 at 1:36 PM Peter Geoghegan wrote: > * HEIKKI: Do we only generate one posting list in one WAL record? I > would assume it's better to deduplicate everything on the page, since > we're modifying it anyway. Still thinking about this one. > * HEIKKI: Does xl_btree_vacuum WAL re

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2020-01-08 Thread Peter Geoghegan
On Wed, Jan 8, 2020 at 5:56 AM Heikki Linnakangas wrote: > On 04/01/2020 03:47, Peter Geoghegan wrote: > > Attached is v28, which fixes bitrot from my recent commits to refactor > > VACUUM-related code in nbtpage.c. > > I started to read through this gigantic patch. Oh come on, it's not that big.

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-12-17 Thread Peter Geoghegan
On Thu, Dec 12, 2019 at 6:21 PM Peter Geoghegan wrote: > Still waiting for some review of the first patch, to get it out of the > way. Anastasia? I plan to commit this first patch [1] in the next day or two, barring any objections. It's clear that the nbtree "pin scan" VACUUM code is totally unn

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-12-17 Thread Peter Geoghegan
On Tue, Dec 17, 2019 at 5:18 PM Bruce Momjian wrote: > On Tue, Dec 17, 2019 at 03:30:33PM -0800, Peter Geoghegan wrote: > > With many real world unique indexes, the true reason behind most or > > all B-Tree page splits is "version churn". I view these page splits as > > a permanent solution to a t

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-12-17 Thread Bruce Momjian
On Tue, Dec 17, 2019 at 03:30:33PM -0800, Peter Geoghegan wrote: > With many real world unique indexes, the true reason behind most or > all B-Tree page splits is "version churn". I view these page splits as > a permanent solution to a temporary problem -- we *permanently* > degrade the index struc

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-12-17 Thread Peter Geoghegan
On Tue, Dec 17, 2019 at 1:58 PM Bruce Momjian wrote: > > Attached is v26, which adds this new criteria/heuristic for unique > > indexes. We now seem to consistently get good results with unique > > indexes. > > In the past we tried to increase the number of cases where HOT updates > can happen but

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-12-17 Thread Bruce Momjian
On Thu, Dec 12, 2019 at 06:21:20PM -0800, Peter Geoghegan wrote: > On Tue, Dec 3, 2019 at 12:13 PM Peter Geoghegan wrote: > > The new criteria/heuristic for unique indexes is very simple: If a > > unique index has an existing item that is a duplicate on the incoming > > item at the point that we m

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-12-03 Thread Peter Geoghegan
On Tue, Nov 12, 2019 at 3:21 PM Peter Geoghegan wrote: > * Decided to go back to turning deduplication on by default with > non-unique indexes, and off by default using unique indexes. > > The unique index stuff was regressed enough with INSERT-heavy > workloads that I was put off, despite my init

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-11-25 Thread Michael Paquier
On Mon, Nov 18, 2019 at 05:26:37PM -0800, Peter Geoghegan wrote: > Attached is v24. This revision doesn't fix the problem with > xl_btree_insert record bloat, but it does fix the bitrot against the > master branch that was caused by commit 50d22de9. (This patch has had > a surprisingly large number

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-11-16 Thread Te
Moin, On 2019-11-16 01:04, Peter Geoghegan wrote: On Fri, Nov 15, 2019 at 5:16 AM Robert Haas wrote: Hmm. Well, maybe I'm just behind the times. But that same wikipedia article also says that deduplication works on large chunks "such as entire files or large sections of files" thus differentia

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-11-15 Thread Peter Geoghegan
On Fri, Nov 15, 2019 at 5:43 PM Mark Dilger wrote: > On 11/13/19 11:51 AM, Peter Geoghegan wrote: > > Can you suggest an alternative? > > Dupression This suggestion makes me feel better about "deduplication". -- Peter Geoghegan

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-11-15 Thread Mark Dilger
On 11/13/19 11:51 AM, Peter Geoghegan wrote: Can you suggest an alternative? Dupression -- Mark Dilger

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-11-15 Thread Peter Geoghegan
On Wed, Sep 11, 2019 at 2:04 PM Peter Geoghegan wrote: > > I haven't measured how these changes affect WAL size yet. > > Do you have any suggestions on how to automate testing of new WAL records? > > Is there any suitable place in regression tests? > > I don't know about the regression tests (I do

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-11-15 Thread Peter Geoghegan
On Fri, Nov 15, 2019 at 5:16 AM Robert Haas wrote: > Hmm. Well, maybe I'm just behind the times. But that same wikipedia > article also says that deduplication works on large chunks "such as > entire files or large sections of files" thus differentiating it from > compression algorithms which work

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-11-15 Thread Robert Haas
On Wed, Nov 13, 2019 at 2:51 PM Peter Geoghegan wrote: > "Deduplication" never means that you get rid of duplicates. According > to Wikipedia's deduplication article: "Whereas compression algorithms > identify redundant data inside individual files and encodes this > redundant data more efficientl

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-11-13 Thread Peter Geoghegan
On Wed, Nov 13, 2019 at 11:33 AM Robert Haas wrote: > On Tue, Nov 12, 2019 at 6:22 PM Peter Geoghegan wrote: > > * Disabled deduplication in system catalog indexes by deeming it > > generally unsafe. > > I (continue to) think that deduplication is a terrible name, because > you're not getting rid

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-11-13 Thread Robert Haas
On Tue, Nov 12, 2019 at 6:22 PM Peter Geoghegan wrote: > * Disabled deduplication in system catalog indexes by deeming it > generally unsafe. I (continue to) think that deduplication is a terrible name, because you're not getting rid of the duplicates. You are using a compressed representation of

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-10-02 Thread Peter Geoghegan
On Mon, Sep 30, 2019 at 7:39 PM Peter Geoghegan wrote: > I've found that my "regular pgbench, but with a low cardinality index > on pgbench_accounts(abalance)" benchmark works best with the specific > heuristics used in the patch, especially over many hours. I ran pgbench without the pgbench_acco

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-09-27 Thread Peter Geoghegan
On Fri, Sep 27, 2019 at 9:43 AM Anastasia Lubennikova wrote: > Attached is v19. Cool. > * By default deduplication is on for non-unique indexes and off for > unique ones. I think that it makes sense to enable deduplication by default -- even with unique indexes. It looks like deduplication can

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-09-25 Thread Peter Geoghegan
On Wed, Sep 25, 2019 at 8:05 AM Anastasia Lubennikova wrote: > Attached is v18. In this version bt_dedup_one_page() is refactored so that: > - no temp page is used, all updates are applied to the original page. > - each posting tuple wal logged separately. > This also allowed to simplify btree_xlo

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-09-25 Thread Anastasia Lubennikova
24.09.2019 3:13, Peter Geoghegan wrote: On Wed, Sep 18, 2019 at 7:25 PM Peter Geoghegan wrote: I attach version 16. This revision merges your recent work on WAL logging with my recent work on simplifying _bt_dedup_one_page(). See my e-mail from earlier today for details. I attach version 17. T

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-09-24 Thread Peter Geoghegan
On Mon, Sep 23, 2019 at 5:13 PM Peter Geoghegan wrote: > I attach version 17. I attach a patch that applies on top of v17. It adds support for deduplication within unique indexes. Actually, this is a snippet of code that appeared in my prototype from August 5 (we need very little extra code for t

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-09-18 Thread Peter Geoghegan
On Wed, Sep 18, 2019 at 10:43 AM Peter Geoghegan wrote: > This also suggests that making _bt_dedup_one_page() do raw page adds > and page deletes to the page in shared_buffers (i.e. don't use a temp > buffer page) could pay off. As I went into at the start of this > e-mail, unnecessarily doing exp

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-09-18 Thread Peter Geoghegan
On Tue, Sep 17, 2019 at 9:43 AM Anastasia Lubennikova wrote: > > 3. Third, there is incremental writing of the page itself -- avoiding > > using a temp buffer. Not sure where I stand on this. > > I think it's a good idea. memmove must be much faster than copying > items tuple by tuple. > I'll sen

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-09-17 Thread Anastasia Lubennikova
16.09.2019 21:58, Peter Geoghegan wrote: On Mon, Sep 16, 2019 at 8:48 AM Anastasia Lubennikova wrote: I tested patch with nbtree_wal_test, and found out that the real issue is not the dedup WAL records themselves, but the full page writes that they trigger. Here are test results (config is sta

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-09-16 Thread Peter Geoghegan
On Mon, Sep 16, 2019 at 8:48 AM Anastasia Lubennikova wrote: > Attached is v14 based on v12 (v13 changes are not merged). > > In this version, I fixed the bug you mentioned and also fixed nbtinsert, > so that it doesn't save newposting in xlog record anymore. Cool. > I tested patch with nbtree_w

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-09-16 Thread Anastasia Lubennikova
13.09.2019 4:04, Peter Geoghegan wrote: On Wed, Sep 11, 2019 at 2:04 PM Peter Geoghegan wrote: I think that the new WAL record has to be created once per posting list that is generated, not once per page that is deduplicated -- that's the only way that I can see that avoids a huge increase in t

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-09-11 Thread Peter Geoghegan
On Wed, Sep 11, 2019 at 3:09 PM Peter Geoghegan wrote: > Hmm. So v12 seems to have some problems with the WAL logging for > posting list splits. With wal_debug = on and > wal_consistency_checking='all', I can get a replica to fail > consistency checking very quickly when "make installcheck" is run

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-09-11 Thread Peter Geoghegan
On Wed, Sep 11, 2019 at 5:38 AM Anastasia Lubennikova wrote: > Attached is v12, which contains WAL optimizations for posting split and > page > deduplication. Hmm. So v12 seems to have some problems with the WAL logging for posting list splits. With wal_debug = on and wal_consistency_checking='al

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-09-11 Thread Peter Geoghegan
On Wed, Sep 11, 2019 at 5:38 AM Anastasia Lubennikova wrote: > I reviewed them and everything looks good except the idea of not > splitting dead posting tuples. > According to comments to scan->ignore_killed_tuples in genam.c:107, > it may lead to incorrect tuple order on a replica. > I don't sure

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-09-11 Thread Anastasia Lubennikova
09.09.2019 22:54, Peter Geoghegan wrote: Attached is v11, which makes the kill_prior_tuple optimization work with posting list tuples. The only catch is that it can only work when all "logical tuples" within a posting list are known-dead, since of course there is only one LP_DEAD bit available f

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-08-31 Thread Peter Geoghegan
On Thu, Aug 29, 2019 at 10:10 PM Peter Geoghegan wrote: > I see some Valgrind errors on v9, all of which look like the following > two sample errors I go into below. I've found a fix for these Valgrind issues. It's a matter of making sure that _bt_truncate() sizes new pivot tuples properly, which

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-08-29 Thread Peter Geoghegan
On Thu, Aug 29, 2019 at 5:07 PM Peter Geoghegan wrote: > I agree that v9 might be ever so slightly more space efficient than v5 > was, on balance. I see some Valgrind errors on v9, all of which look like the following two sample errors I go into below. First one: ==11193== VALGRINDERROR-BEGIN =

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-08-29 Thread Peter Geoghegan
On Thu, Aug 29, 2019 at 5:13 AM Anastasia Lubennikova wrote: > Your explanation helped me to understand that this approach can be > extended to > the case of insertion into posting list, that doesn't trigger posting > split, > and that nbtsplitloc indeed doesn't need to know about posting tuples >

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-08-29 Thread Anastasia Lubennikova
28.08.2019 6:19, Peter Geoghegan wrote: On Fri, Aug 16, 2019 at 8:56 AM Anastasia Lubennikova wrote: Now the algorithm is the following: - In case page split is needed, pass both tuples to _bt_split(). _bt_findsplitloc() is now aware of upcoming replacement of origtup with neworigtup, so it

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-08-27 Thread Peter Geoghegan
On Fri, Aug 16, 2019 at 8:56 AM Anastasia Lubennikova wrote: > Now the algorithm is the following: > - In case page split is needed, pass both tuples to _bt_split(). > _bt_findsplitloc() is now aware of upcoming replacement of origtup with > neworigtup, so it uses correct item size where needed.

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-08-23 Thread Anastasia Lubennikova
23.08.2019 7:33, Peter Geoghegan wrote: On Wed, Aug 21, 2019 at 10:19 AM Anastasia Lubennikova wrote: I'm going to look through the patch once more to update nbtxlog comments, where needed and answer to your remarks that are still left in the comments. Have you been using amcheck's rootdescend

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-08-22 Thread Peter Geoghegan
On Wed, Aug 21, 2019 at 10:19 AM Anastasia Lubennikova wrote: > I'm going to look through the patch once more to update nbtxlog > comments, where needed and > answer to your remarks that are still left in the comments. Have you been using amcheck's rootdescend verification? I see this problem wit

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-08-21 Thread Anastasia Lubennikova
20.08.2019 4:04, Peter Geoghegan wrote: On Fri, Aug 16, 2019 at 8:56 AM Anastasia Lubennikova wrote: It seems that now all replace operations are crash-safe. The new patch passes all regression tests, so I think it's ready for review again. I'm looking at it now. I'm going to spend a signific

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-08-19 Thread Peter Geoghegan
On Fri, Aug 16, 2019 at 8:56 AM Anastasia Lubennikova wrote: > Now the algorithm is the following: > > - If bt_findinsertloc() found out that tuple belongs to existing posting > tuple's > TID interval, it sets 'in_posting_offset' variable and passes it to > _bt_insertonpg() > > - If 'in_posting_o

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-08-19 Thread Peter Geoghegan
On Tue, Aug 13, 2019 at 8:45 AM Anastasia Lubennikova wrote: > > I still need to think about the exact details of alignment within > > _bt_insertonpg_in_posting(). I'm worried about boundary cases there. I > > could be wrong. > Could you explain more about these cases? > Now I don't understand the

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-08-16 Thread Anastasia Lubennikova
13.08.2019 18:45, Anastasia Lubennikova wrote:   I also added a nearby FIXME comment to _bt_insertonpg_in_posting() -- I don't think think that the code for splitting a posting list in two is currently crash-safe. Good catch. It seems, that I need to rearrange the code. I'll send updated patch

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-08-13 Thread Anastasia Lubennikova
06.08.2019 4:28, Peter Geoghegan wrote: Attached is v5, which is based on your v4. The three main differences between this and v4 are: * Removed BT_COMPRESS_THRESHOLD stuff, for the reasons explained in my July 24 e-mail. We can always add something like this back during performance validation o

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-08-05 Thread Peter Geoghegan
On Wed, Jul 31, 2019 at 9:23 AM Anastasia Lubennikova wrote: > > * Included my own pageinspect hack to visualize the minimum TIDs in > > posting lists. It's broken out into a separate patch file. The code is > > very rough, but it might help someone else, so I thought I'd include > > it. > Cool, I

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-07-31 Thread Anastasia Lubennikova
24.07.2019 4:22, Peter Geoghegan wrote: Attached is a revised version of your v2 that fixes this issue -- I'll call this v3. In general, my goal for the revision was to make sure that all of my old tests from the v12 work passed, and to make sure that amcheck can detect almost any possible probl

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-07-31 Thread Rafia Sabih
On Thu, 25 Jul 2019 at 05:49, Peter Geoghegan wrote: > > On Wed, Jul 24, 2019 at 3:06 PM Peter Geoghegan wrote: > > There seems to be a kind of "synergy" between the nbtsplitloc.c > > handling of pages that have lots of duplicates and posting list > > compression. It seems as if the former mechan

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-07-24 Thread Peter Geoghegan
On Wed, Jul 24, 2019 at 3:06 PM Peter Geoghegan wrote: > There seems to be a kind of "synergy" between the nbtsplitloc.c > handling of pages that have lots of duplicates and posting list > compression. It seems as if the former mechanism "sets up the bowling > pins", while the latter mechanism "kn

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-07-24 Thread Peter Geoghegan
On Tue, Jul 23, 2019 at 6:22 PM Peter Geoghegan wrote: > Attached is a revised version of your v2 that fixes this issue -- I'll > call this v3. Remember that index that I said was 5.5x smaller with the patch applied, following retail insertions (a single big INSERT ... SELECT ...)? Well, it's 6.5

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-07-23 Thread Peter Geoghegan
On Fri, Jul 19, 2019 at 7:24 PM Peter Geoghegan wrote: > Hmm. So, the attached test case fails amcheck verification for me with > the latest version of the patch: Attached is a revised version of your v2 that fixes this issue -- I'll call this v3. In general, my goal for the revision was to make

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-07-19 Thread Peter Geoghegan
On Fri, Jul 19, 2019 at 12:32 PM Peter Geoghegan wrote: > On Fri, Jul 19, 2019 at 10:53 AM Anastasia Lubennikova > wrote: > > Patch 0002 (must be applied on top of 0001) implements preserving of > > correct TID order > > inside posting list when inserting new tuples. > > This version passes all r

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-07-19 Thread Peter Geoghegan
On Fri, Jul 19, 2019 at 10:53 AM Anastasia Lubennikova wrote: > Patch 0002 (must be applied on top of 0001) implements preserving of > correct TID order > inside posting list when inserting new tuples. > This version passes all regression tests including amcheck test. > I also used following scrip

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-07-19 Thread Anastasia Lubennikova
17.07.2019 19:36, Anastasia Lubennikova: There is one major issue left - preserving TID order in posting lists. For a start, I added a sort into BTreeFormPostingTuple function. It turned out to be not very helpful, because we cannot check this invariant lazily. Now I work on patching _bt_bins

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-07-17 Thread Anastasia Lubennikova
11.07.2019 21:19, Peter Geoghegan wrote: On Thu, Jul 11, 2019 at 8:34 AM Rafia Sabih wrote: Hi, Peter, Rafia, thanks for the review. New version is attached. + elog(DEBUG4, "insert_itupprev_to_page. compressState->ntuples %d IndexTupleSize %zu free %zu", + compressState->ntuples, IndexTupleS

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-07-11 Thread Peter Geoghegan
On Thu, Jul 11, 2019 at 10:42 AM Peter Geoghegan wrote: > > I think unique indexes may benefit from deduplication not only because > > of NULL values. Non-HOT updates produce duplicates of non-NULL values > > in unique indexes. And those duplicates can take significant space. > > I agree that we

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-07-11 Thread Peter Geoghegan
On Thu, Jul 11, 2019 at 8:34 AM Rafia Sabih wrote: > I tried this patch and found the improvements impressive. However, > when I tried with multi-column indexes it wasn't giving any > improvement, is it the known limitation of the patch? It'll only deduplicate full duplicates. It works with multi

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-07-11 Thread Peter Geoghegan
On Thu, Jul 11, 2019 at 8:09 AM Alexander Korotkov wrote: > BTW, I think deduplication could cause some small performance > degradation in some particular cases, because page-level locks became > more coarse grained once pages hold more tuples. However, this > doesn't seem like something we shoul

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-07-11 Thread Peter Geoghegan
On Thu, Jul 11, 2019 at 8:02 AM Alexander Korotkov wrote: > Could you please elaborate more on preserving the logical contents? I > can understand it as following: "B-Tree should have the same structure > and invariants as if each TID in posting list be a separate tuple". That's exactly what I m

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-07-11 Thread Peter Geoghegan
On Thu, Jul 11, 2019 at 7:30 AM Bruce Momjian wrote: > Wow, I never thought of that. The only things I know we lock until > transaction end are rows we update (against concurrent updates), and > additions to unique indexes. By definition, indexes with many > duplicates are not unique, so that do

Fwd: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-07-11 Thread Rafia Sabih
On Sun, 7 Jul 2019 at 01:08, Peter Geoghegan wrote: > * Maybe we could do compression with unique indexes when inserting > values with NULLs? Note that we now treat an insertion of a tuple with +1 I tried this patch and found the improvements impressive. However, when I tried with multi-column i

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-07-11 Thread Alexander Korotkov
On Thu, Jul 11, 2019 at 7:53 AM Peter Geoghegan wrote: > Anyway, I think that *hundreds* or even *thousands* of rows are > effectively locked all at once when a bitmap index needs to be updated > in these other systems -- and I mean a heavyweight lock that lasts > until the xact commits or aborts,

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-07-11 Thread Alexander Korotkov
Hi Peter, Thank you very much for your attention to this patch. Let me comment some points of your message. On Sun, Jul 7, 2019 at 2:09 AM Peter Geoghegan wrote: > On Thu, Jul 4, 2019 at 5:06 AM Anastasia Lubennikova > wrote: > > The new version of the patch is attached. > > This version is eve

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-07-11 Thread Bruce Momjian
On Wed, Jul 10, 2019 at 09:53:04PM -0700, Peter Geoghegan wrote: > Anyway, I think that *hundreds* or even *thousands* of rows are > effectively locked all at once when a bitmap index needs to be updated > in these other systems -- and I mean a heavyweight lock that lasts > until the xact commits o

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-07-10 Thread Peter Geoghegan
On Sat, Jul 6, 2019 at 4:08 PM Peter Geoghegan wrote: > I took a closer look at this patch, and have some general thoughts on > its design, and specific feedback on the implementation. I have some high level concerns about how the patch might increase contention, which could make queries slower.

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-07-06 Thread Peter Geoghegan
On Thu, Jul 4, 2019 at 5:06 AM Anastasia Lubennikova wrote: > The new version of the patch is attached. > This version is even simpler than the previous one, > thanks to the recent btree design changes and all the feedback I received. > I consider it ready for review and testing. I took a closer

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-07-05 Thread Bruce Momjian
On Thu, Jul 4, 2019 at 05:06:09PM -0700, Peter Geoghegan wrote: > This result is very impressive. We'll need to revisit what the right > trade-off is for the compression scheme, which Heikki had some > thoughts on when we left off 3 years ago, but that should be a lot > easier now. I am very encou

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-07-04 Thread Peter Geoghegan
On Thu, Jul 4, 2019 at 10:38 AM Peter Geoghegan wrote: > I tried this on my own "UK land registry" test data [1], which was > originally used for the v12 nbtree work. My test case has a low > cardinality, multi-column text index. I chose this test case because > it was convenient for me. > > On v1

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-07-04 Thread Peter Geoghegan
On Thu, Jul 4, 2019 at 5:06 AM Anastasia Lubennikova wrote: > i - number of distinct values in the index. > So i=1 means that all rows have the same key, > and i=1000 means that all keys are different. > > i / old size (MB) / new size (MB) > 121588 > 100021590 > 100

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-07-04 Thread Anastasia Lubennikova
The new version of the patch is attached. This version is even simpler than the previous one, thanks to the recent btree design changes and all the feedback I received. I consider it ready for review and testing. [feature overview] This patch implements the deduplication of btree non-pivot tuples