Re: [HACKERS] Brain dump: btree collapsing

2003-02-28 Thread Oleg Bartunov
Alvaro, you're welcome to put your hands on to GiST. Me and Teodor hoped to add concurrency last year but we were too busy. I consider concurrency issue the most important. There are also some thoughts about improvement of GiST interface, though. Regards, Oleg On Thu, 27

Re: [HACKERS] Brain dump: btree collapsing

2003-02-27 Thread Tom Lane
"Christopher Kings-Lynne" <[EMAIL PROTECTED]> writes: >> The other is that now I am left without a graduate project :-( > [ various good suggestions ] FWIW, I would much rather see effort put into GiST than hash indexes. I think that btree and GiST will be our two primary index types in the long

Re: [HACKERS] Brain dump: btree collapsing

2003-02-27 Thread Christopher Kings-Lynne
> Two things I regret: one is being unable to see the changes as patches > the way you applied them, to get a sense of how the code evolved. > Unfortunately the interface to CVS via web does not allow me to see it, > or I don't know how to use it. It's not that important, however, > because I was

Re: [HACKERS] Brain dump: btree collapsing

2003-02-27 Thread Christopher Kings-Lynne
> Two things I regret: one is being unable to see the changes as patches > the way you applied them, to get a sense of how the code evolved. > Unfortunately the interface to CVS via web does not allow me to see it, > or I don't know how to use it. It's not that important, however, > because I was

Re: [HACKERS] Brain dump: btree collapsing

2003-02-27 Thread Tom Lane
Alvaro Herrera <[EMAIL PROTECTED]> writes: > Putting the freelist on FSM rather on the metapage still strikes me as > kind of strange; I remember you said the metapage was not enough space > for all the possible candidate pages, but the FSM is even more limited. Well, on modern machines there shou

Re: [HACKERS] Brain dump: btree collapsing

2003-02-27 Thread Alvaro Herrera
On Wed, Feb 12, 2003 at 05:42:44PM -0500, Tom Lane wrote: > I've been thinking hard for the last few days about how to do space > reclamation in b-tree indexes, i.e., recycle pages that are in > no-longer-useful portions of the tree structure. Hi Tom, I've seen your applied changes. It looks rea

Re: [HACKERS] Brain dump: btree collapsing

2003-02-18 Thread Curtis Faith
> tom lane wrote: > > Hm. A single lock that must be grabbed for operations anywhere in > > the index is a concurrency bottleneck. I replied a bit later: > I don't think it would be that bad. It's not a lock but a > mutex that would only be held for relatively brief duration. > It looks like

Re: [HACKERS] Brain dump: btree collapsing

2003-02-14 Thread Tom Lane
"Curtis Faith" <[EMAIL PROTECTED]> writes: >> "Stored in the index"? And how will you do that portably? > Sorry for the lack of rigorous language. I meant that there would be one > mutex per index stored in the header or internal data structures > associated with each index somewhere. Probably

Re: [HACKERS] Brain dump: btree collapsing

2003-02-14 Thread Curtis Faith
tom lane wrote: > Hmmm ... that might be made to work, but it would complicate > inserts. By the time an insert navigates to the page it > should insert on, it might find the page is dead, and then it > would have no easy way to get to the replacement page (unless > you want to dedicate another

Re: [HACKERS] Brain dump: btree collapsing

2003-02-14 Thread Tom Lane
"Curtis Faith" <[EMAIL PROTECTED]> writes: > 4) This could easily be reordered into: > buf = ReadBuffer(rel, blkno); /* pin next page > */ > LockBuffer(buf, BUFFER_LOCK_UNLOCK);/* release lock on > current page */ > LockBuffer(buf, BT_READ);

Re: [HACKERS] Brain dump: btree collapsing

2003-02-14 Thread Curtis Faith
I previously wrote: > 5) A mutex/spinlock that was stored in the index could be > acquired by the scan code like this: > > buf = ReadBuffer(rel, blkno); /* pin next page > */ > > SpinLockAcquire( indexSpecificMutex );/* lock the index > reorg mutex */ > >

Re: [HACKERS] Brain dump: btree collapsing

2003-02-14 Thread Curtis Faith
tom lane wrote: > How does that help? The left-moving indexscan still has no > way to recover. It can't go back to the page it was on > before and try to determine which entries you added there, > because it has no reliable reference point to do so. The > entry it previously returned might n

Re: [HACKERS] Brain dump: btree collapsing

2003-02-14 Thread Tom Lane
Manfred Koizar <[EMAIL PROTECTED]> writes: > Given that we have a mostly empty metapage per index, and the metapage > is in memory most of the time, using it for the freelist looks almost > like a free lunch. No, because of locking. Every time you write-lock the metapage to add or remove freelist

Re: [HACKERS] Brain dump: btree collapsing

2003-02-14 Thread Tom Lane
"Curtis Faith" <[EMAIL PROTECTED]> writes: > Another way this could be handled is by not merging onto one of the > existing pages but to a brand new page, a kind of special case shadow > index page. Hmmm ... that might be made to work, but it would complicate inserts. By the time an insert navigat

Re: [HACKERS] Brain dump: btree collapsing

2003-02-14 Thread Manfred Koizar
On Wed, 12 Feb 2003 17:42:44 -0500, Tom Lane <[EMAIL PROTECTED]> wrote: >Instead of an actively maintained freelist on disk as per Alvaro Herrera's >patch, I plan to use the FSM to remember where recyclable pages are, much >as we do for tables. Given that we have a mostly empty metapage per index,

Re: [HACKERS] Brain dump: btree collapsing

2003-02-14 Thread Curtis Faith
tom lane wrote: > How does that help? The left-moving indexscan still has no > way to recover. It can't go back to the page it was on > before and try to determine which entries you added there, > because it has no reliable reference point to do so. The > entry it previously returned might

Re: [HACKERS] Brain dump: btree collapsing

2003-02-14 Thread Tom Lane
Hannu Krosing <[EMAIL PROTECTED]> writes: > Tom Lane kirjutas R, 14.02.2003 kell 01:13: >> How is returning the same data twice not an "ill effect"? > From earlier discussions I understood that there had been some work done > on using btrees for indexing arrays by storing each separate element in

Re: [HACKERS] Brain dump: btree collapsing

2003-02-14 Thread Michael Paesold
Hannu Krosing <[EMAIL PROTECTED]> wrote: > could we just not lock (for more than just to ensure atomic writes) the > page but instead increment a "page version" on each write to detect > changes? Sounds like Index MVCC..., very nice. ;-) (Of course I have no clue about feasibility, just liked the

Re: [HACKERS] Brain dump: btree collapsing

2003-02-13 Thread Hannu Krosing
Tom Lane kirjutas R, 14.02.2003 kell 01:13: > Hannu Krosing <[EMAIL PROTECTED]> writes: > > But if we would allow the scans to find the same keys twice without ill > > effects (as was suggested earlier, for using btrees to index arrays), > > How is returning the same data twice not an "ill effect"

Re: [HACKERS] Brain dump: btree collapsing

2003-02-13 Thread Bruce Momjian
Tom Lane wrote: > Hannu Krosing <[EMAIL PROTECTED]> writes: > > But if we would allow the scans to find the same keys twice without ill > > effects (as was suggested earlier, for using btrees to index arrays), > > How is returning the same data twice not an "ill effect"? > > > then we could possi

Re: [HACKERS] Brain dump: btree collapsing

2003-02-13 Thread Tom Lane
Hannu Krosing <[EMAIL PROTECTED]> writes: > But if we would allow the scans to find the same keys twice without ill > effects (as was suggested earlier, for using btrees to index arrays), How is returning the same data twice not an "ill effect"? > then we could possibly 1) copy the key to the rig

Re: [HACKERS] Brain dump: btree collapsing

2003-02-13 Thread Hannu Krosing
Tom Lane kirjutas N, 13.02.2003 kell 20:10: > "Curtis Faith" <[EMAIL PROTECTED]> writes: > > I don't dispute their conclusions in that context and under the > > circumstances they outline of random distribution of deletion and > > insertion values for the index keys. [But the random-distribution >

Re: [HACKERS] Brain dump: btree collapsing

2003-02-13 Thread Curtis Faith
tom lane wrote: > "Curtis Faith" <[EMAIL PROTECTED]> writes: > > I don't dispute their conclusions in that context and under the > > circumstances they outline of random distribution of deletion and > > insertion values for the index keys. [But the random-distribution > > assumption doesn't alwa

Re: [HACKERS] Brain dump: btree collapsing

2003-02-13 Thread Tom Lane
"Curtis Faith" <[EMAIL PROTECTED]> writes: > I don't dispute their conclusions in that context and under the > circumstances they outline of random distribution of deletion and > insertion values for the index keys. [But the random-distribution > assumption doesn't always hold.] That's a fair poi

Re: [HACKERS] Brain dump: btree collapsing

2003-02-13 Thread Curtis Faith
tom lane wrote: > Got any evidence to back that up? I was relying on > > [Johnson89] Johnson, T. and Shasha, D. Utilization of > B-trees with Inserts, Deletes and Modifies ACM Symp. on > PODS, 235-246, 1989. > > which provides a ton of math and simulations leading up to > the conclusion tha

Re: [HACKERS] Brain dump: btree collapsing

2003-02-13 Thread Tom Lane
"Curtis Faith" <[EMAIL PROTECTED]> writes: > ISTM, that a VACUUM that only reclaims empty pages will be helpful in > certain cases but won't help much at all in many other common "normal > operation" cases which would be helped by partial reorganization. Got any evidence to back that up? I was re

Re: [HACKERS] Brain dump: btree collapsing

2003-02-13 Thread Tom Lane
Daniel Kalchev <[EMAIL PROTECTED]> writes: > ... Thus we should 'free' the index > pages with one VACUUM run, instead of two. Bear in mind that the only thing that deletes index entries is ... VACUUM. Thus what we're really discussing is whether VACUUM should delete an index page at the same time

Re: [HACKERS] Brain dump: btree collapsing

2003-02-13 Thread Curtis Faith
tom lane initially wrote: > Restructuring the tree during page deletion > --- > > We will delete only completely-empty pages. If we were to > merge nearly-empty pages by moving data items from one page > to an adjacent one, this would imply changing the pa

Re: [HACKERS] Brain dump: btree collapsing

2003-02-13 Thread Daniel Kalchev
>>>Bruce Momjian said: > > This brings up one item it would be nice to address at the same time. > It would be nice if VACUUM FULL would be able to compress the actual > index file and return unused space to the operating system. REINDEX > does this, but I was thinking of something a little

Re: [HACKERS] Brain dump: btree collapsing

2003-02-13 Thread Daniel Kalchev
>>>Justin Clift said: > > > In theory, if we find recyclable page(s) at the physical end of the index, > > we could truncate the file (ie, give the space back to the filesystem) > > instead of reporting these pages to FSM. I am not sure if this is worth > > doing --- in most cases it's likel

Re: [HACKERS] Brain dump: btree collapsing

2003-02-13 Thread Daniel Kalchev
Tom, Sound excellent. Index growth has been something that always bothered me (not the disk space usage, but the slow searches). I believe it's best to have pages marked dead at the time the last key contained in the page is deleted (you didn't discuss how efficient this is), because this will

Re: [HACKERS] Brain dump: btree collapsing

2003-02-12 Thread Bruce Momjian
I think having VACUUM record free index pages just like free heap pages makes perfect sense, and is consistent. This brings up one item it would be nice to address at the same time. It would be nice if VACUUM FULL would be able to compress the actual index file and return unused space to the ope

Re: [HACKERS] Brain dump: btree collapsing

2003-02-12 Thread Bruce Momjian
Tom Lane wrote: > Bruce Momjian <[EMAIL PROTECTED]> writes: > > It would be nice if VACUUM FULL would be able to compress the actual > > index file and return unused space to the operating system. REINDEX > > does this, but I was thinking of something a little lighter that could > > be done automa

Re: [HACKERS] Brain dump: btree collapsing

2003-02-12 Thread Tom Lane
Bruce Momjian <[EMAIL PROTECTED]> writes: > It would be nice if VACUUM FULL would be able to compress the actual > index file and return unused space to the operating system. REINDEX > does this, but I was thinking of something a little lighter that could > be done automatically as part of VACUUM

Re: [HACKERS] Brain dump: btree collapsing

2003-02-12 Thread Tom Lane
Justin Clift <[EMAIL PROTECTED]> writes: > Tom Lane wrote: >> The deletion procedure could be triggered immediately upon removal of the >> last item in a page, or when the next VACUUM scan finds an empty page. >> Not sure yet which way is better. > Having it triggered immediately upon removal of t

Re: [HACKERS] Brain dump: btree collapsing

2003-02-12 Thread Justin Clift
Tom Lane wrote: The deletion procedure could be triggered immediately upon removal of the last item in a page, or when the next VACUUM scan finds an empty page. Not sure yet which way is better. Having it triggered immediately upon removal of the last item in a page would make for a more "self

[HACKERS] Brain dump: btree collapsing

2003-02-12 Thread Tom Lane
I've been thinking hard for the last few days about how to do space reclamation in b-tree indexes, i.e., recycle pages that are in no-longer-useful portions of the tree structure. We know we need this to solve the "index bloat" problem that occurs when the distribution of keys changes over time.