Thank you.

On Mon, 12 Feb 2024 at 03:52, Peter J. Holzer <[email protected]> wrote:
> On 2024-02-11 12:08:47 -0500, Ron Johnson wrote:
> > On Sun, Feb 11, 2024 at 11:54 AM veem v <[email protected]> wrote:
> > When you said "you would normally prefer those over numeric" I was
> > thinking the opposite. As you mentioned, integer is a fixed length data
> > type and will occupy 4 bytes whether you store 15 or 99999999. But in
> > case of a variable length type like Number or numeric, it will resize
> > itself based on the actual data. So is there any downside of going with
> > a variable length data type like Numeric?
> >
> > Consider a table with a bunch of NUMERIC fields. One of those records has
> > small values (aka three bytes). It fits neatly in 2KiB.
> >
> > And then you update all those NUMERIC fields to big numbers that take 15
> > bytes. Suddenly (or eventually, if you update them at different times),
> > the record does not fit in 2KiB, and so must be moved to its own page.
> > That causes extra IO.
>
> I think that's not much of a concern with PostgreSQL because you can't
> update a row in-place anyway because of MVCC. So in any case you're
> writing a new row. If you're lucky there is enough free space in the same
> page and you can do a HOT update, but that's quite independent of
> whether the row changes size.

Good to know. So it means that in Postgres there are no concerns such as "row chaining" or "row migration", which we normally have in a non-MVCC database (like Oracle, say). And there (in Oracle) it is not advisable to have more than ~255 columns in a table even though it is technically possible, and if such a requirement arises we normally break the table into two different tables with some columns in common to join them.
https://jonathanlewis.wordpress.com/2015/02/19/255-columns/

So we were thinking that adding many columns to a table should be fine in Postgres (here we have a use case in which the total number of columns may reach ~500+). But accessing columns towards the end of a row takes more time than accessing a column at the start of the row; as you mentioned, accessing the 100th column may cost 4 to 5 times as much as accessing the first column. So is it advisable here to take a similar approach of breaking the table into two once the total number of columns for a table reaches a certain threshold?

Regards
Veem
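PS: To make the split idea concrete, below is a minimal sketch of what we had in mind, i.e. two narrower tables sharing the same primary key and joined one-to-one when both halves are needed. The table and column names are only placeholders, not our real schema:

    -- Frequently accessed columns stay in the "main" table.
    CREATE TABLE txn_main (
        txn_id    bigint PRIMARY KEY,
        txn_date  date NOT NULL,
        amount    numeric(12,2),
        status    varchar(20)
        -- ... other hot columns
    );

    -- Rarely used / trailing columns (the long tail of the ~500) move to a
    -- companion table with the same key.
    CREATE TABLE txn_detail (
        txn_id    bigint PRIMARY KEY REFERENCES txn_main (txn_id),
        attr_001  text,
        attr_002  text
        -- ... remaining columns
    );

    -- Queries that need columns from both halves join on the shared key.
    SELECT m.txn_date, m.amount, d.attr_001
    FROM txn_main m
    JOIN txn_detail d USING (txn_id)
    WHERE m.txn_id = 12345;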
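PPS: On the integer vs. numeric storage discussion above, my understanding is that the per-value on-disk size can be compared with pg_column_size(). An illustrative query (exact byte counts depend on the stored values):

    SELECT
        pg_column_size(15::integer)              AS int_small,  -- integer is always 4 bytes
        pg_column_size(99999999::integer)        AS int_big,    -- still 4 bytes
        pg_column_size(15::numeric)              AS num_small,  -- numeric: a few bytes for small values
        pg_column_size(99999999999999::numeric)  AS num_big;    -- grows with the number of digits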
