Re: MaxOffsetNumber for Table AMs

2021-05-18 Thread Peter Geoghegan
On Thu, May 6, 2021 at 4:10 AM Matthias van de Meent wrote: > See below. I'm not saying we need it _right now_, but at the very > least I'd like to argue that we should not close the door on varlena > TIDs, because there _are_ reasons for those TID types. See also below. Perhaps I was a bit too s

Re: MaxOffsetNumber for Table AMs

2021-05-06 Thread Matthias van de Meent
On Thu, 6 May 2021 at 01:22, Peter Geoghegan wrote: > > On Wed, May 5, 2021 at 3:18 PM Matthias van de Meent > wrote: > > I believe that the TID is the unique identifier of that tuple, within > > context. > > > > For normal indexes, the TID as supplied directly by the TableAM is > > sufficient,

Re: MaxOffsetNumber for Table AMs

2021-05-05 Thread Robert Haas
On Wed, May 5, 2021 at 10:53 PM Jeff Davis wrote: > On Thu, 2021-05-06 at 03:26 +0200, Hannu Krosing wrote: > > How hard would it be to declare TID as current ItemPointerData with > > some values prohibited (NULL, SpecTokenOffsetNumber = 0xfffe, > > MovedPartitionsOffsetNumber = 0xfffd, presumably

Re: MaxOffsetNumber for Table AMs

2021-05-05 Thread Hannu Krosing
- Hannu Krosing On Thu, May 6, 2021 at 4:53 AM Jeff Davis wrote: > > On Thu, 2021-05-06 at 03:26 +0200, Hannu Krosing wrote: > > How hard would it be to declare TID as current ItemPointerData with > > some values prohibited (NULL, SpecTokenOffsetNumber = 0xfffe, > > MovedPartitionsOffsetNumbe

Re: MaxOffsetNumber for Table AMs

2021-05-05 Thread Jeff Davis
On Thu, 2021-05-06 at 03:26 +0200, Hannu Krosing wrote: > How hard would it be to declare TID as current ItemPointerData with > some values prohibited (NULL, SpecTokenOffsetNumber = 0xfffe, > MovedPartitionsOffsetNumber = 0xfffd, presumably also 0x ?). I don't think there's consensus in this t

Re: MaxOffsetNumber for Table AMs

2021-05-05 Thread Hannu Krosing
On Thu, May 6, 2021 at 3:07 AM Robert Haas wrote: > > On Wed, May 5, 2021 at 3:43 PM Matthias van de Meent > > Storage gains for index-oriented tables can become as large as the > > size of the primary key by not having to store all primary key values > > in both the index and the table; which can

Re: MaxOffsetNumber for Table AMs

2021-05-05 Thread Hannu Krosing
How hard would it be to declare TID as current ItemPointerData with some values prohibited (NULL, SpecTokenOffsetNumber = 0xfffe, MovedPartitionsOffsetNumber = 0xfffd, presumably also 0x ?). And then commit to fixing usage outside access/heap/ which depend on small value for MaxHeapTuplesPerPa
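The proposal above amounts to treating a TID as valid unless its offset is one of a few reserved values. A minimal sketch of such a check follows. It is an illustration only, not code from PostgreSQL: the `SKETCH_` names are hypothetical, reading "NULL" as offset 0 is an assumption, and the further value that is cut off in the message is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; the two 0xfff* values are the ones quoted in the message. */
#define SKETCH_INVALID_OFFSET          0x0000  /* assumed meaning of "NULL" */
#define SKETCH_MOVED_PARTITIONS_OFFSET 0xfffd  /* MovedPartitionsOffsetNumber */
#define SKETCH_SPEC_TOKEN_OFFSET       0xfffe  /* SpecTokenOffsetNumber */

/* Return true if a table AM could hand out this offset under the proposal. */
static bool
sketch_offset_is_usable(uint16_t offset)
{
    return offset != SKETCH_INVALID_OFFSET &&
           offset != SKETCH_MOVED_PARTITIONS_OFFSET &&
           offset != SKETCH_SPEC_TOKEN_OFFSET;
}
```

Under this reading nearly the whole 16-bit offset space would be available to a table AM, which is far more than the small per-page limits discussed elsewhere in the thread.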

Re: MaxOffsetNumber for Table AMs

2021-05-05 Thread Robert Haas
On Wed, May 5, 2021 at 3:43 PM Matthias van de Meent wrote: > I believe that it cannot be "just" an additive thing, at least not > through a normal INCLUDEd column, as you'd get duplicate TIDs in the > index, with its related problems. You also cannot add it as a key > column, as this would disabl

Re: MaxOffsetNumber for Table AMs

2021-05-05 Thread Peter Geoghegan
On Wed, May 5, 2021 at 3:18 PM Matthias van de Meent wrote: > I believe that the TID is the unique identifier of that tuple, within context. > > For normal indexes, the TID as supplied directly by the TableAM is > sufficient, as the context is that table. > For global indexes, this TID must includ

Re: MaxOffsetNumber for Table AMs

2021-05-05 Thread Matthias van de Meent
On Wed, 5 May 2021 at 22:09, Peter Geoghegan wrote: > > On Wed, May 5, 2021 at 12:43 PM Matthias van de Meent > wrote: > > I believe that it cannot be "just" an additive thing, at least not > > through a normal INCLUDEd column, as you'd get duplicate TIDs in the > > index, with its related proble

Re: MaxOffsetNumber for Table AMs

2021-05-05 Thread Peter Geoghegan
On Wed, May 5, 2021 at 12:43 PM Matthias van de Meent wrote: > I believe that it cannot be "just" an additive thing, at least not > through a normal INCLUDEd column, as you'd get duplicate TIDs in the > index, with its related problems. You also cannot add it as a key > column, as this would disab

Re: MaxOffsetNumber for Table AMs

2021-05-05 Thread Matthias van de Meent
On Wed, 5 May 2021 at 19:15, Peter Geoghegan wrote: > > On Wed, May 5, 2021 at 9:42 AM Robert Haas wrote: > > On Wed, May 5, 2021 at 11:50 AM Peter Geoghegan wrote: > > > I'm being very vocal here because I'm concerned that we're going about > > > generalizing TIDs in the wrong way. To me it fee

Re: MaxOffsetNumber for Table AMs

2021-05-05 Thread Peter Geoghegan
On Wed, May 5, 2021 at 12:09 PM Jeff Davis wrote: > Like anything, we make the decision at the time we have a reason to > break something. But why are extensions disfavored in this > calculation vs. in-core? Isn't it a lot easier to update in-core code > to new APIs? We don't really have an AP

Re: MaxOffsetNumber for Table AMs

2021-05-05 Thread Jeff Davis
On Wed, 2021-05-05 at 11:22 -0700, Andres Freund wrote: > Yea. I think it would be actively *bad* if tableam were too > stable. tableam is at best an 80% solution to the abstraction needs > (those 80% were pretty painful to achieve already, I don't think we > could have gotten much more initially).

Re: MaxOffsetNumber for Table AMs

2021-05-05 Thread Peter Geoghegan
On Wed, May 5, 2021 at 11:25 AM Andres Freund wrote: > Agreed. And we can increase the fit a good bit without needing invasive > all-over changes. With those changes often even helping heap. > > E.g. tidbitmap.c's hardcoded use of MaxHeapTuplesPerPage is a problem > even for heap - we waste a lot o

Re: MaxOffsetNumber for Table AMs

2021-05-05 Thread Peter Geoghegan
On Wed, May 5, 2021 at 10:56 AM Jeff Davis wrote: > What has little chance of succeeding? Table AMs? > > And why isn't columnar an example of something that can "get by with > heapam's idea of TID"? I mean, it's not a perfect fit, but my primary > complaint this whole thread is that it's undefined,

Re: MaxOffsetNumber for Table AMs

2021-05-05 Thread Andres Freund
Hi, On 2021-05-05 10:56:56 -0700, Jeff Davis wrote: > On Wed, 2021-05-05 at 10:48 -0700, Peter Geoghegan wrote: > > What we have right now has little chance of failing. It also has > > little chance of succeeding (except for something like zheap, which > > can presumably get by with the heapam's i

Re: MaxOffsetNumber for Table AMs

2021-05-05 Thread Peter Geoghegan
On Wed, May 5, 2021 at 10:57 AM Robert Haas wrote: > One advantage of indirect indexes is that you can potentially avoid a > lot of writes to the index. If a non-HOT update is performed, but the > primary key is not updated, the index does not need to be touched. I > think that's a potentially sig

Re: MaxOffsetNumber for Table AMs

2021-05-05 Thread Andres Freund
Hi, On 2021-05-05 13:32:57 -0400, Robert Haas wrote: > I don't know what to say here. I think it's unrealistic to believe > that a very new API that has only 1 in-core user is going to be fully > stable, or that we can know how it might evolve. I can understand why > you and probably other people

Re: MaxOffsetNumber for Table AMs

2021-05-05 Thread Robert Haas
On Wed, May 5, 2021 at 1:15 PM Peter Geoghegan wrote: > > I don't think this is true at all. If you have a clustered index - > > i.e. the table is physically arranged according to the index ordering > > - then your secondary indexes all pretty much have to be what we're > > calling indirect indexe

Re: MaxOffsetNumber for Table AMs

2021-05-05 Thread Jeff Davis
On Wed, 2021-05-05 at 10:48 -0700, Peter Geoghegan wrote: > What we have right now has little chance of failing. It also has > little chance of succeeding (except for something like zheap, which > can presumably get by with the heapam's idea of TID). What has little chance of succeeding? Table AMs

Re: MaxOffsetNumber for Table AMs

2021-05-05 Thread Peter Geoghegan
On Wed, May 5, 2021 at 10:33 AM Robert Haas wrote: > I don't know what to say here. I think it's unrealistic to believe > that a very new API that has only 1 in-core user is going to be fully > stable, or that we can know how it might evolve. I can understand why > you and probably other people wa

Re: MaxOffsetNumber for Table AMs

2021-05-05 Thread Robert Haas
On Wed, May 5, 2021 at 1:13 PM Jeff Davis wrote: > "In core" shouldn't matter. In fact, if it's in core, stability of the > APIs is much less important. I don't know what to say here. I think it's unrealistic to believe that a very new API that has only 1 in-core user is going to be fully stable,

Re: MaxOffsetNumber for Table AMs

2021-05-05 Thread Jeff Davis
On Wed, 2021-05-05 at 08:50 -0700, Peter Geoghegan wrote: > There just isn't that > many table AM TID designs that could ever work, and even among those > schemes that could ever work there is a pretty clear hierarchy. This > blue sky thinking about generalizing TIDs 2 years in seems *weird* to > m

Re: MaxOffsetNumber for Table AMs

2021-05-05 Thread Peter Geoghegan
On Wed, May 5, 2021 at 9:42 AM Robert Haas wrote: > On Wed, May 5, 2021 at 11:50 AM Peter Geoghegan wrote: > > I'm being very vocal here because I'm concerned that we're going about > > generalizing TIDs in the wrong way. To me it feels like there is a > > loss of perspective about what really ma

Re: MaxOffsetNumber for Table AMs

2021-05-05 Thread Jeff Davis
On Wed, 2021-05-05 at 10:27 -0400, Robert Haas wrote: > It's too early for the project to commit to stability in > this area; we have not managed to get a single AM apart from heapam > into core "In core" shouldn't matter. In fact, if it's in core, stability of the APIs is much less important. >

Re: MaxOffsetNumber for Table AMs

2021-05-05 Thread Robert Haas
On Wed, May 5, 2021 at 11:50 AM Peter Geoghegan wrote: > I'm being very vocal here because I'm concerned that we're going about > generalizing TIDs in the wrong way. To me it feels like there is a > loss of perspective about what really matters. Well, which things matter is a question of opinion,

Re: MaxOffsetNumber for Table AMs

2021-05-05 Thread Peter Geoghegan
On Wed, May 5, 2021 at 7:27 AM Robert Haas wrote: > It seems to me that we're doing a lot of disagreeing given that, as I > see it, there are only relatively minor differences between the > positions of the various people here. I'm being very vocal here because I'm concerned that we're going abou

Re: MaxOffsetNumber for Table AMs

2021-05-05 Thread Robert Haas
On Tue, May 4, 2021 at 9:24 PM Peter Geoghegan wrote: > Here is my concern: I have an obligation to make it clear that I think > that you really ought to straighten out this business with > generalizing TIDs before too long. Not because I say so, but because > it's holding up progress in general.

Re: MaxOffsetNumber for Table AMs

2021-05-04 Thread Peter Geoghegan
On Tue, May 4, 2021 at 5:40 PM Andres Freund wrote: > What does the deduplication actually require from tids? Isn't it just > that you need to be able to compare tids? It's hard to know for sure what is essential to the design, and what can be discarded. Though I can say for sure that it depends

Re: MaxOffsetNumber for Table AMs

2021-05-04 Thread Jeff Davis
On Tue, 2021-05-04 at 13:51 -0700, Peter Geoghegan wrote: > I think maybe it is possible for GIN to work with your column store > table AM in particular. Why aren't we talking about that concrete > issue, or something like that? Happy to. At this point I'd rather obey the constraint that the offs

Re: MaxOffsetNumber for Table AMs

2021-05-04 Thread Andres Freund
Hi, On 2021-05-04 14:13:36 -0700, Peter Geoghegan wrote: > On Mon, May 3, 2021 at 10:01 PM Andres Freund wrote: > > > For example, the TIDs should always work like unsigned integers -- the > > > table AM must be willing to work with that restriction. > > > > Isn't that more a question of the enco

Re: MaxOffsetNumber for Table AMs

2021-05-04 Thread Peter Geoghegan
On Mon, May 3, 2021 at 10:01 PM Andres Freund wrote: > > For example, the TIDs should always work like unsigned integers -- the > > table AM must be willing to work with that restriction. > > Isn't that more a question of the encoding than the concrete representation? I don't think so, no. How do

Re: MaxOffsetNumber for Table AMs

2021-05-04 Thread Peter Geoghegan
On Tue, May 4, 2021 at 11:52 AM Jeff Davis wrote: > On Mon, 2021-05-03 at 15:07 -0700, Peter Geoghegan wrote: > > It seems senseless to *require* table AMs to support something like a > > bitmap scan. > > I thought about this some more, and this framing is backwards. > ItemPointers are fundamental

Re: MaxOffsetNumber for Table AMs

2021-05-04 Thread Jeff Davis
On Tue, 2021-05-04 at 12:56 -0400, Robert Haas wrote: > b. If you actually meant "less than or equal to MaxOffsetNumber", > > that will fail with the GIN posting list issue raised in my first > > email. Do you agree that's a bug? > > Given the above, yes. If we just subtracted one, it would fit

Re: MaxOffsetNumber for Table AMs

2021-05-04 Thread Jeff Davis
On Mon, 2021-05-03 at 15:07 -0700, Peter Geoghegan wrote: > It seems senseless to *require* table AMs to support something like a > bitmap scan. I thought about this some more, and this framing is backwards. ItemPointers are fundamental to the table AM API: they are passed in to required methods,

Re: MaxOffsetNumber for Table AMs

2021-05-04 Thread Robert Haas
On Mon, May 3, 2021 at 2:13 PM Jeff Davis wrote: > That's not clear to me at all, and is the whole reason I began this > thread. > > a. You say "smaller than MaxOffsetNumber", but that's a little weird. > If an offset can't be MaxOffsetNumber, it's not really the maximum, is > it? I wasn't tryi

Re: MaxOffsetNumber for Table AMs

2021-05-03 Thread Andres Freund
Hi, On 2021-04-30 11:51:07 -0700, Peter Geoghegan wrote: > I think that it's reasonable to impose some cost on index AMs here, > but that needs to be bounded sensibly and unambiguously. For example, > it would probably be okay if you had either 6 byte or 8 byte TIDs, but > no other variations. You

Re: MaxOffsetNumber for Table AMs

2021-05-03 Thread Jeff Davis
On Mon, 2021-05-03 at 18:12 -0700, Peter Geoghegan wrote: > But look at the details: tidbitmap.c uses MaxHeapTuplesPerPage as its > MAX_TUPLES_PER_PAGE, which seems like a problem -- that's 291 with > default BLCKSZ. I doubt that that restriction is something that you > can afford to live with, eve

Re: MaxOffsetNumber for Table AMs

2021-05-03 Thread Peter Geoghegan
On Mon, May 3, 2021 at 5:15 PM Jeff Davis wrote: > I don't see why in-core changes are a strict requirement. It doesn't > make too much difference if a lossy TID doesn't correspond exactly to > the columnar layout -- it should be fine as long as there's locality, > right? But look at the details:
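The figure of 291 can be reproduced with a little arithmetic. The sketch below assumes the usual sizes for a default build (8192-byte blocks, a 24-byte page header, a 23-byte heap tuple header rounded up to 8-byte alignment, and 4-byte line pointers); the function and macro names are hypothetical, and the constants should be checked against the headers of the PostgreSQL version in use.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed sizes for a default 64-bit build; all names here are hypothetical. */
#define SKETCH_PAGE_HEADER_SIZE  24  /* page header bytes */
#define SKETCH_HEAP_TUPLE_HEADER 23  /* heap tuple header bytes before alignment */
#define SKETCH_ITEM_ID_SIZE      4   /* one line pointer */
#define SKETCH_MAXALIGN          8   /* maximum alignment */

/* Round len up to the next multiple of SKETCH_MAXALIGN. */
static size_t
sketch_maxalign(size_t len)
{
    return (len + SKETCH_MAXALIGN - 1) & ~((size_t) SKETCH_MAXALIGN - 1);
}

/* Most heap tuples that fit on a page: each needs a header plus a line pointer. */
static size_t
sketch_max_heap_tuples_per_page(size_t blcksz)
{
    assert(blcksz > SKETCH_PAGE_HEADER_SIZE);
    return (blcksz - SKETCH_PAGE_HEADER_SIZE) /
           (sketch_maxalign(SKETCH_HEAP_TUPLE_HEADER) + SKETCH_ITEM_ID_SIZE);
}

/* Most line pointers a page could hold if it contained nothing else. */
static size_t
sketch_max_offset_number(size_t blcksz)
{
    return blcksz / SKETCH_ITEM_ID_SIZE;
}
```

With an 8192-byte block this gives (8192 - 24) / (24 + 4) = 291 for the heap limit and 8192 / 4 = 2048 for the line-pointer limit, which are the two numbers that recur in this thread.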

Re: MaxOffsetNumber for Table AMs

2021-05-03 Thread Jeff Davis
On Mon, 2021-05-03 at 15:07 -0700, Peter Geoghegan wrote: > Sure, but it either makes sense for the columnar table AM to support > bitmap scans (or some analogous type of scan that works only slightly > differently) or it doesn't. It's not at all clear which it is right > now. It makes sense for m

Re: MaxOffsetNumber for Table AMs

2021-05-03 Thread Peter Geoghegan
On Mon, May 3, 2021 at 2:03 PM Jeff Davis wrote: > Another point: the idea of supporting only some kinds of indexes > doesn't mix well with partitioning. If you declare an index on the > parent, we should do something reasonable if one partition's table AM > doesn't support that index AM. Sure, b

Re: MaxOffsetNumber for Table AMs

2021-05-03 Thread Jeff Davis
On Fri, 2021-04-30 at 10:55 -0700, Jeff Davis wrote: > On Fri, 2021-04-30 at 12:35 -0400, Tom Lane wrote: > > ISTM that would be up to the index AM. We'd need some interlocks > > on > > which index AMs could be used with which table AMs in any case, I > > think. > > I'm not sure why? It seems lik

Re: MaxOffsetNumber for Table AMs

2021-05-03 Thread Peter Geoghegan
On Mon, May 3, 2021 at 12:06 PM Matthias van de Meent wrote: > One could relatively easily disable bitmap scans on the table AM by > not installing the relevant bitmap support functions on the registered > TableAM structure, and thus not touch that problem. I have no idea how much it'll hurt thin

Re: MaxOffsetNumber for Table AMs

2021-05-03 Thread Matthias van de Meent
On Mon, 3 May 2021 at 20:43, Peter Geoghegan wrote: > > On Mon, May 3, 2021 at 10:57 AM Jeff Davis wrote: > > For the purposes of this discussion, what's making my life difficult is > > that we don't have a good definition for TID, leaving me with two > > options: > > > > 1. be overly conservat

Re: MaxOffsetNumber for Table AMs

2021-05-03 Thread Peter Geoghegan
On Mon, May 3, 2021 at 10:57 AM Jeff Davis wrote: > For the purposes of this discussion, what's making my life difficult is > that we don't have a good definition for TID, leaving me with two > options: > > 1. be overly conservative, accept MaxOffsetNumber=2048, wasting a > bunch of address spac

Re: MaxOffsetNumber for Table AMs

2021-05-03 Thread Jeff Davis
On Mon, 2021-05-03 at 10:38 -0700, Peter Geoghegan wrote: > I don't think it's much good to just do that. You probably need a > full > 64-bits for something like a column store. But that's all you need. I would definitely like that for citus columnar, and it would definitely make it easier to mana

Re: MaxOffsetNumber for Table AMs

2021-05-03 Thread Jeff Davis
On Mon, 2021-05-03 at 13:22 -0400, Robert Haas wrote: > to look and work like heap TIDs; that is, there had better be a block > number portion and an item number portion, Right (at least for now). > and the item number had > better be smaller than MaxOffsetNumber, That's not clear to me at all,

Re: MaxOffsetNumber for Table AMs

2021-05-03 Thread Peter Geoghegan
On Mon, May 3, 2021 at 10:22 AM Matthias van de Meent wrote: > For IoT, as far as I know, one of the constraints is that there exists > some unique constraint on the table, which also defines the ordering. > Assuming that that is the case, we can use + transaction id> to identify tuple versions.

Re: MaxOffsetNumber for Table AMs

2021-05-03 Thread Jeff Davis
On Mon, 2021-05-03 at 09:59 -0700, Peter Geoghegan wrote: > You don't accept any of that, though. Fair enough. I predict that > avoiding making a hard choice will make Jeff's work here a lot > harder, > though. For the purposes of this discussion, what's making my life difficult is that we don't h

Re: MaxOffsetNumber for Table AMs

2021-05-03 Thread Peter Geoghegan
On Mon, May 3, 2021 at 10:22 AM Robert Haas wrote: > I don't really think so, or at least I don't see a reason why it > should. As things stand today, I don't think it's possible for a table > AM author to make any other choice than to assume that their TIDs have > to look and work like heap TIDs;

Re: MaxOffsetNumber for Table AMs

2021-05-03 Thread Matthias van de Meent
On Mon, 3 May 2021 at 19:00, Peter Geoghegan wrote: > > On Mon, May 3, 2021 at 9:45 AM Robert Haas wrote: > > But if you're saying those identifiers have to be fixed-width and 48 > > (or even 64) bits, I disagree that we wish to have such a requirement > > in perpetuity. > > Once you require that

Re: MaxOffsetNumber for Table AMs

2021-05-03 Thread Robert Haas
On Mon, May 3, 2021 at 1:00 PM Peter Geoghegan wrote: > You don't accept any of that, though. Fair enough. I predict that > avoiding making a hard choice will make Jeff's work here a lot harder, > though. I don't really think so, or at least I don't see a reason why it should. As things stand tod

Re: MaxOffsetNumber for Table AMs

2021-05-03 Thread Peter Geoghegan
On Mon, May 3, 2021 at 9:45 AM Robert Haas wrote: > But if you're saying those identifiers have to be fixed-width and 48 > (or even 64) bits, I disagree that we wish to have such a requirement > in perpetuity. Once you require that TID-like identifiers must point to particular versions (as oppose

Re: MaxOffsetNumber for Table AMs

2021-05-03 Thread Robert Haas
On Mon, May 3, 2021 at 11:26 AM Peter Geoghegan wrote: > It just has to be able to accept the restriction that > indexes must have a unique TID-like identifier for each version (not > quite a version actually -- whatever the equivalent of a HOT chain > is). This is a restriction that Jeff had pret

Re: MaxOffsetNumber for Table AMs

2021-05-03 Thread Peter Geoghegan
On Mon, May 3, 2021 at 8:03 AM Robert Haas wrote: > It's reasonable to wonder. I think it depends on whether the problem > is bloat or just general slowness. To the extent that the problem is > bloat, bottom-index deletion will help a lot, but it's not going to > help with slowness because, as you

Re: MaxOffsetNumber for Table AMs

2021-05-03 Thread Peter Geoghegan
On Mon, May 3, 2021 at 7:41 AM Robert Haas wrote: > So here. The complexity of getting a table AM that does anything > non-trivial working is formidable, and I don't expect it to happen > right away. Picking one that is essentially block-based and can use > 48-bit TIDs is very likely the right ini

Re: MaxOffsetNumber for Table AMs

2021-05-03 Thread Robert Haas
On Fri, Apr 30, 2021 at 6:19 PM Peter Geoghegan wrote: > A remaining problem is that we must generate a new round of index > tuples for each and every index when only one indexed column is > logically modified by an UPDATE statement. I think that this is much > less of a problem now due to bottom-

Re: MaxOffsetNumber for Table AMs

2021-05-03 Thread Robert Haas
On Fri, Apr 30, 2021 at 5:22 PM Peter Geoghegan wrote: > I strongly suspect that index-organized tables (or indirect indexes, > or anything else that assumes that TID-like identifiers map directly > to logical rows as opposed to physical versions) are going to break > too many assumptions to ever

Re: MaxOffsetNumber for Table AMs

2021-04-30 Thread Peter Geoghegan
On Fri, Apr 30, 2021 at 2:07 PM Robert Haas wrote: > OK. I thought about this in regards to zheap, which has this exact > problem, because it wants to do so-called "in place" updates where the > new version of the row goes right on top of the old one in the table > page, and the old version of the

Re: MaxOffsetNumber for Table AMs

2021-04-30 Thread Peter Geoghegan
On Fri, Apr 30, 2021 at 2:13 PM Jeff Davis wrote: > FWIW, this is not a problem in my table AM. I am fine having different > TIDs for each version, just like heapam. This means that we are largely in agreement about the general nature of the problem. That seems like a good basis to redefine TID-l

Re: MaxOffsetNumber for Table AMs

2021-04-30 Thread Jeff Davis
On Fri, 2021-04-30 at 12:29 -0700, Peter Geoghegan wrote: > > Is the problem you're worried about here that, with something like > > an > > index-organized table, you can have multiple row versions that have > > the same logical tuple ID, i.e. primary key value? > > That's what I'm talking about.

Re: MaxOffsetNumber for Table AMs

2021-04-30 Thread Robert Haas
On Fri, Apr 30, 2021 at 3:30 PM Peter Geoghegan wrote: > > Is the problem you're worried about here that, with something like an > > index-organized table, you can have multiple row versions that have > > the same logical tuple ID, i.e. primary key value? And that the > > interfaces aren't well-su

Re: MaxOffsetNumber for Table AMs

2021-04-30 Thread Peter Geoghegan
On Fri, Apr 30, 2021 at 12:20 PM Robert Haas wrote: > Why can't it do what it does already? It's not broken for heap, so why > should it be broken for anything else? And why are non-HOT updates > specifically a problem? No reason. > > You obviously cannot have the equivalent of > > duplicate TID

Re: MaxOffsetNumber for Table AMs

2021-04-30 Thread Robert Haas
On Fri, Apr 30, 2021 at 2:23 PM Peter Geoghegan wrote: > I don't know how it's possible to do any of this without first > addressing what the table AM does in cases where heapam currently does > a non-HOT update. Why can't it do what it does already? It's not broken for heap, so why should it be

Re: MaxOffsetNumber for Table AMs

2021-04-30 Thread Peter Geoghegan
On Fri, Apr 30, 2021 at 11:23 AM Robert Haas wrote: > On Fri, Apr 30, 2021 at 2:05 PM Peter Geoghegan wrote: > > I agree in principle, but making that work well is very hard in > > practice because of the format of IndexTuple -- which bleeds into > > everything. That TID is special is probably a

Re: MaxOffsetNumber for Table AMs

2021-04-30 Thread Jeff Davis
On Fri, 2021-04-30 at 13:56 -0400, Robert Haas wrote: > I think that would be the best long-term plan. We should still have *some* answer in the short term for table AM authors like me. If I use offset numbers as high as MaxOffsetNumber, then itemptr_to_uint64 will fail. If I base my calculations
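The failure mentioned here can be illustrated with a simplified model of how GIN's posting-list code packs a TID into an integer. The sketch assumes that 11 bits are reserved for the offset, which is how the encoding is understood to work; the `sketch_` names are hypothetical and this is not the actual itemptr_to_uint64 source.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed number of bits reserved for the offset in the packed form. */
#define SKETCH_OFFSET_BITS 11

/* An offset fits only if it is strictly below 2^11 = 2048. */
static int
sketch_offset_fits(uint16_t offset)
{
    return offset < (1u << SKETCH_OFFSET_BITS);
}

/* Pack block and offset into one integer, block in the high bits. */
static uint64_t
sketch_pack_tid(uint32_t block, uint16_t offset)
{
    assert(sketch_offset_fits(offset));
    return ((uint64_t) block << SKETCH_OFFSET_BITS) | offset;
}
```

Offsets up to 2047 pack cleanly, but 2048 (MaxOffsetNumber with default block size) needs a twelfth bit: without the assertion, block 0 with offset 2048 would produce the same packed value as block 1 with offset 0.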

Re: MaxOffsetNumber for Table AMs

2021-04-30 Thread Jeff Davis
On Fri, 2021-04-30 at 10:50 -0700, Peter Geoghegan wrote: > I don't know. This conversation is still too abstract for me to be > able to take a firm position. ISTM that we tend to talk about the > table AM in excessively abstract terms. It would be a lot easier if > we > had clear fixed goals for a

Re: MaxOffsetNumber for Table AMs

2021-04-30 Thread Robert Haas
On Fri, Apr 30, 2021 at 2:05 PM Peter Geoghegan wrote: > I agree in principle, but making that work well is very hard in > practice because of the format of IndexTuple -- which bleeds into > everything. That TID is special is probably a natural consequence of > the fact that we don't have an offse

Re: MaxOffsetNumber for Table AMs

2021-04-30 Thread Peter Geoghegan
On Fri, Apr 30, 2021 at 10:56 AM Robert Haas wrote: > I think that would be the best long-term plan. I guess I have two > distinguishable concerns. One is that I want to be able to have > indexes with a payload that's not just a 6-byte TID; e.g. adding a > partition identifier to support global in

Re: MaxOffsetNumber for Table AMs

2021-04-30 Thread Peter Geoghegan
On Fri, Apr 30, 2021 at 10:39 AM Robert Haas wrote: > I agree up to a point but ... are you imagining that the TID continues > to have its own special place in the page, while the partition > identifier is stored more like a regular tuple column? Because it > seems to me that it would be better to

Re: MaxOffsetNumber for Table AMs

2021-04-30 Thread Robert Haas
On Fri, Apr 30, 2021 at 1:37 PM Jeff Davis wrote: > The particular problem I have now is that Table AMs seem to support > indexes just fine, but TIDs are under-specified so I don't know what I > really have to work with. BlockNumber seems well-specified as > 0..0XFFFE (inclusive), but I don't

Re: MaxOffsetNumber for Table AMs

2021-04-30 Thread Jeff Davis
On Fri, 2021-04-30 at 12:35 -0400, Tom Lane wrote: > ISTM that would be up to the index AM. We'd need some interlocks on > which index AMs could be used with which table AMs in any case, I > think. I'm not sure why? It seems like we should be able to come up with something that's generic enough.

Re: MaxOffsetNumber for Table AMs

2021-04-30 Thread Peter Geoghegan
On Fri, Apr 30, 2021 at 10:04 AM Jeff Davis wrote: > On Fri, 2021-04-30 at 08:36 -0700, Peter Geoghegan wrote: > > Compatibility with index AMs is more than a matter of switching out > > the tuple identifier -- if you invent something that has totally > > different performance characteristics for

Re: MaxOffsetNumber for Table AMs

2021-04-30 Thread Robert Haas
On Fri, Apr 30, 2021 at 1:28 PM Peter Geoghegan wrote: > Global indexes should work by adding an extra column that is somewhat > like a TID, that may even have its own pg_attribute entry. It's much > more natural to make the partition number a separate column IMV -- > nbtree suffix truncation and

Re: MaxOffsetNumber for Table AMs

2021-04-30 Thread Jeff Davis
On Fri, 2021-04-30 at 12:51 -0400, Robert Haas wrote: > There are two major reasons why I want variable-width tuple IDs. One > is global indexes, where you need as many bits as the AMs > implementing > the partitions need, plus some extra bits to identify which partition > is relevant for a particu

Re: MaxOffsetNumber for Table AMs

2021-04-30 Thread Robert Haas
On Fri, Apr 30, 2021 at 1:10 PM Tom Lane wrote: > I agree that global indexes need more bits, but it doesn't necessarily > follow that we must have variable-width TIDs. We could for example > say that "real" TIDs are only 48 bits and index AMs that want to be > usable as global indexes must be ca

Re: MaxOffsetNumber for Table AMs

2021-04-30 Thread Peter Geoghegan
On Fri, Apr 30, 2021 at 10:10 AM Tom Lane wrote: > > There are two major reasons why I want variable-width tuple IDs. One > > is global indexes, where you need as many bits as the AMs implementing > > the partitions need, plus some extra bits to identify which partition > > is relevant for a parti

Re: MaxOffsetNumber for Table AMs

2021-04-30 Thread Tom Lane
Robert Haas writes: > On Fri, Apr 30, 2021 at 11:06 AM Tom Lane wrote: >> Andres seems to feel that we should try to allow variable-width >> tupleids, but I'm afraid that the cost/benefit ratio for that >> would be pretty poor. > There are two major reasons why I want variable-width tuple IDs. O

Re: MaxOffsetNumber for Table AMs

2021-04-30 Thread Jeff Davis
On Fri, 2021-04-30 at 08:36 -0700, Peter Geoghegan wrote: > Compatibility with index AMs is more than a matter of switching out > the tuple identifier -- if you invent something that has totally > different performance characteristics for index AMs, then it's likely > to break tacit assumptions abo

Re: MaxOffsetNumber for Table AMs

2021-04-30 Thread Robert Haas
On Fri, Apr 30, 2021 at 11:06 AM Tom Lane wrote: > My thought at the moment is that all APIs above the AM level ought > to be redefined to use uint64 for tuple identifiers. heapam and > related index AMs could map block + offset into that in some > convenient way, and other AMs could do what they
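One "convenient way" for a block-based AM to map block + offset into a uint64 is to place the 32-bit block number above the 16-bit offset. The sketch below is only an illustration of that idea under that assumption; the function names are hypothetical and nothing in the thread settles on this layout.

```c
#include <assert.h>
#include <stdint.h>

/* Combine a 32-bit block number and 16-bit offset into a 48-bit value. */
static uint64_t
sketch_tid_to_u64(uint32_t block, uint16_t offset)
{
    return ((uint64_t) block << 16) | offset;
}

/* Split the 48-bit value back into its block and offset portions. */
static void
sketch_u64_to_tid(uint64_t id, uint32_t *block, uint16_t *offset)
{
    assert(id <= UINT64_C(0xFFFFFFFFFFFF));  /* only 48 bits are meaningful */
    *block = (uint32_t) (id >> 16);
    *offset = (uint16_t) (id & 0xFFFF);
}
```

A property of this layout is that comparing two identifiers as unsigned integers gives the same order as comparing (block, offset) pairs, so code that sorts or range-scans TIDs would see no change in ordering. The upper 16 bits of the uint64 stay free for AMs that want a wider identifier.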

Re: MaxOffsetNumber for Table AMs

2021-04-30 Thread Jeff Davis
On Fri, 2021-04-30 at 12:04 +0200, Matthias van de Meent wrote: > Other than that, I believe you've also missed the special offset > numbers specified in itemptr.h (SpecTokenOffsetNumber and > MovedPartitionsOffsetNumber). I am not well enough aware of the usage > of these OffsetNumber values, but

Re: MaxOffsetNumber for Table AMs

2021-04-30 Thread Tom Lane
Jeff Davis writes: > On Fri, 2021-04-30 at 11:06 -0400, Tom Lane wrote: >> My thought at the moment is that all APIs above the AM level ought >> to be redefined to use uint64 for tuple identifiers. > Do you mean that indexes would be expected to hold a uint64, a 48-bit > int (that directly maps t

Re: MaxOffsetNumber for Table AMs

2021-04-30 Thread Jeff Davis
On Fri, 2021-04-30 at 11:06 -0400, Tom Lane wrote: > My thought at the moment is that all APIs above the AM level ought > to be redefined to use uint64 for tuple identifiers. One challenge might be reliance on InvalidOffsetNumber as a special value in a number of places (e.g. bitmap index scans).

Re: MaxOffsetNumber for Table AMs

2021-04-30 Thread Peter Geoghegan
On Fri, Apr 30, 2021 at 8:06 AM Tom Lane wrote: > My thought at the moment is that all APIs above the AM level ought > to be redefined to use uint64 for tuple identifiers. heapam and > related index AMs could map block + offset into that in some > convenient way, and other AMs could do what they

Re: MaxOffsetNumber for Table AMs

2021-04-30 Thread Tom Lane
Jeff Davis writes: > The notion of TID is based on pages and line pointers, which makes > sense for heapam, but that's not likely to make sense for a custom > table AM. > The obvious answer is to make a simple mapping between a TID and > whatever makes sense to the AM (for the sake of discussion,

Re: MaxOffsetNumber for Table AMs

2021-04-30 Thread Matthias van de Meent
On Fri, 30 Apr 2021, 09:46 Jeff Davis, wrote: > > The notion of TID is based on pages and line pointers, which makes > sense for heapam, but that's not likely to make sense for a custom > table AM. > > The obvious answer is to make a simple mapping between a TID and > whatever makes sense to the

MaxOffsetNumber for Table AMs

2021-04-30 Thread Jeff Davis
The notion of TID is based on pages and line pointers, which makes sense for heapam, but that's not likely to make sense for a custom table AM. The obvious answer is to make a simple mapping between a TID and whatever makes sense to the AM (for the sake of discussion, let's say a plain row numbe
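The mapping described here, between a plain row number and a block/offset pair, can be sketched as follows. This is a hypothetical illustration, not code from any table AM: `SketchTid` stands in for ItemPointerData, and the number of offsets used per block (291 here, the conservative heap limit at default block size) is exactly the quantity the thread is trying to pin down.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a TID: block number plus offset number. */
typedef struct SketchTid
{
    uint32_t block;   /* block number portion */
    uint16_t offset;  /* offset number portion; 0 is treated as invalid */
} SketchTid;

/* Assumed number of usable offsets per block; the AM would choose this. */
#define SKETCH_OFFSETS_PER_BLOCK 291

/* Map a zero-based row number to a TID with offsets 1..SKETCH_OFFSETS_PER_BLOCK. */
static SketchTid
row_number_to_tid(uint64_t rownum)
{
    SketchTid tid;
    uint64_t  block = rownum / SKETCH_OFFSETS_PER_BLOCK;

    assert(block <= 0xFFFFFFFEu);  /* 0xFFFFFFFF is kept as the invalid block */
    tid.block = (uint32_t) block;
    tid.offset = (uint16_t) (rownum % SKETCH_OFFSETS_PER_BLOCK + 1);
    return tid;
}

/* Inverse mapping: recover the row number from a TID built above. */
static uint64_t
tid_to_row_number(SketchTid tid)
{
    assert(tid.offset >= 1 && tid.offset <= SKETCH_OFFSETS_PER_BLOCK);
    return (uint64_t) tid.block * SKETCH_OFFSETS_PER_BLOCK + (tid.offset - 1);
}
```

The choice of SKETCH_OFFSETS_PER_BLOCK determines how many rows are addressable: a larger value wastes less of the 48-bit space but risks tripping assumptions in code that expects heap-sized offsets.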