Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-09 Thread Tom Lane
Simon Riggs <[EMAIL PROTECTED]> writes: > On Sun, 2005-11-06 at 11:26 -0500, Tom Lane wrote: >> Really? After I woke up a bit more I realized there was only one bit >> and change to spare, not two, so I don't see how it would work. > Not sure why you think that. Seems to fit [ counts on fing

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-09 Thread Simon Riggs
On Sun, 2005-11-06 at 11:26 -0500, Tom Lane wrote: > Simon Riggs <[EMAIL PROTECTED]> writes: > > On Thu, 2005-11-03 at 10:32 -0500, Tom Lane wrote: > >> I think we could make it go by cramming the sign and > >> the high-order dscale bit into the first NumericDigit --- the > >> digit itself can only

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-06 Thread Tom Lane
Simon Riggs <[EMAIL PROTECTED]> writes: > On Thu, 2005-11-03 at 10:32 -0500, Tom Lane wrote: >> I think we could make it go by cramming the sign and >> the high-order dscale bit into the first NumericDigit --- the >> digit itself can only be 0.. so there are a couple of bits >> to spare. > I'v

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-06 Thread Simon Riggs
On Thu, 2005-11-03 at 10:32 -0500, Tom Lane wrote: > I'd feel a lot happier about this if we could keep the dynamic range > up to, say, 10^512 so that it's still true that NUMERIC can be a > universal parse-time representation. That would also make it even > more unlikely that anyone would complai

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-05 Thread Harald Fuchs
In article <[EMAIL PROTECTED]>, Gregory Maxwell <[EMAIL PROTECTED]> writes: > On 11/4/05, Martijn van Oosterhout wrote: >> Yeah, and while one way of removing that dependence is to use ICU, that >> library wants everything in UTF-16. So we replace "copying to add NULL >> to string" with "converti

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-05 Thread Martijn van Oosterhout
On Fri, Nov 04, 2005 at 07:15:22PM -0500, Gregory Maxwell wrote: > Other lame aspects of using unicode encodings other than UTF-8 > internally is that it's harder to figure out what is text in GDB > output and such.. can make debugging more difficult. Yeah, that's one of the reasons I think UTF-16

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-04 Thread Gregory Maxwell
On 11/4/05, Martijn van Oosterhout wrote: [snip] > : ICU does not use UCS-2. UCS-2 is a subset of UTF-16. UCS-2 does not > : support surrogates, and UTF-16 does support surrogates. This means > : that UCS-2 only supports UTF-16's Base Multilingual Plane (BMP). The > : notion of UCS-2 is deprecated

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-04 Thread Martijn van Oosterhout
On Fri, Nov 04, 2005 at 02:58:05PM -0500, Gregory Maxwell wrote: > The correct question to ask is something like "Does it support non-bmp > characters?" or "Does it really support UTF-16 or just UCS2?" > > UTF-16 is (now) a variable width encoding which is a strict superset > of UCS2 which allows
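For readers following the UCS-2 versus UTF-16 distinction, here is a minimal C sketch of how UTF-16 represents a code point outside the BMP with a surrogate pair, which is the part UCS-2 lacks. The function name utf16_encode is hypothetical and not taken from ICU or PostgreSQL; the arithmetic is the standard UTF-16 encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Encode one Unicode code point as UTF-16 code units.
 * Returns the number of units written (1 for BMP characters, 2 for a
 * surrogate pair), or 0 if cp is not a valid Unicode scalar value.
 * utf16_encode is a hypothetical helper name used for illustration.
 */
size_t
utf16_encode(uint32_t cp, uint16_t out[2])
{
	assert(out != NULL);

	/* Reject values beyond U+10FFFF and the surrogate range itself. */
	if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
		return 0;

	/* BMP: one 16-bit unit, identical to what UCS-2 would store. */
	if (cp < 0x10000)
	{
		out[0] = (uint16_t) cp;
		return 1;
	}

	/* Non-BMP: split the 20 remaining bits across a high/low pair. */
	cp -= 0x10000;
	out[0] = (uint16_t) (0xD800 + (cp >> 10));
	out[1] = (uint16_t) (0xDC00 + (cp & 0x3FF));
	return 2;
}
```

A caller that only ever sees a return value of 1 is effectively working with UCS-2; the variable-width case appears as soon as 2 is returned.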

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-04 Thread Jim C. Nasby
On Fri, Nov 04, 2005 at 04:30:27PM -0500, Tom Lane wrote: > "Jim C. Nasby" <[EMAIL PROTECTED]> writes: > > On Thu, Nov 03, 2005 at 10:32:03AM -0500, Tom Lane wrote: > >> I'd feel a lot happier about this if we could keep the dynamic range > >> up to, say, 10^512 so that it's still true that NUMERIC

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-04 Thread Tom Lane
"Jim C. Nasby" <[EMAIL PROTECTED]> writes: > On Thu, Nov 03, 2005 at 10:32:03AM -0500, Tom Lane wrote: >> I'd feel a lot happier about this if we could keep the dynamic range >> up to, say, 10^512 so that it's still true that NUMERIC can be a >> universal parse-time representation. That would also

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-04 Thread Jim C. Nasby
On Thu, Nov 03, 2005 at 10:32:03AM -0500, Tom Lane wrote: > I'd feel a lot happier about this if we could keep the dynamic range > up to, say, 10^512 so that it's still true that NUMERIC can be a > universal parse-time representation. That would also make it even > more unlikely that anyone would

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-04 Thread Jim C. Nasby
On Thu, Nov 03, 2005 at 04:07:41PM +0100, Marcus Engene wrote: > Simon Riggs wrote: > >On Thu, 2005-11-03 at 11:13 -0300, Alvaro Herrera wrote: > > > >>Simon Riggs wrote: > >> > >>>On PostgreSQL, CHAR(12) is a bpchar datatype with all instantiations of > >>>that datatype having a 4 byte varlena hea

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-04 Thread Tom Lane
Martijn van Oosterhout writes: > Yeah, and while one way of removing that dependence is to use ICU, that > library wants everything in UTF-16. Really? Can't it do UCS4 (UTF-32)? There's a nontrivial population of our users that isn't satisfied with UTF-16 anyway, so if that really is a restrict

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-04 Thread Gregory Maxwell
On 11/4/05, Tom Lane <[EMAIL PROTECTED]> wrote: > Martijn van Oosterhout writes: > > Yeah, and while one way of removing that dependence is to use ICU, that > > library wants everything in UTF-16. > > Really? Can't it do UCS4 (UTF-32)? There's a nontrivial population > of our users that isn't sa

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-04 Thread Gregory Maxwell
On 11/4/05, Martijn van Oosterhout wrote: > Yeah, and while one way of removing that dependence is to use ICU, that > library wants everything in UTF-16. So we replace "copying to add NULL > to string" with "converting UTF-8 to UTF-16 on each call". Ugh! The > argument for UTF-16 is that if you're

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-04 Thread Martijn van Oosterhout
On Fri, Nov 04, 2005 at 01:54:04PM -0500, Tom Lane wrote: > [EMAIL PROTECTED] writes: > > I read "the backend is by and large an ASCII, null-terminated-string > > engine" with "we use UTF-8 [for varlena strings?]" as, a lot of the > > code assumes varlena strings are '\0' terminated, and an assumpt

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-04 Thread Tom Lane
[EMAIL PROTECTED] writes: > I read "the backend is by and large an ASCII, null-terminated-string > engine" with "we use UTF-8 [for varlena strings?]" as, a lot of the > code assumes varlena strings are '\0' terminated, and an assumption > on my part, that the varlena strings are not stored in the b

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-04 Thread mark
On Fri, Nov 04, 2005 at 04:13:29PM +0100, Martijn van Oosterhout wrote: > On Fri, Nov 04, 2005 at 08:38:38AM -0500, [EMAIL PROTECTED] wrote: > > On Thu, Nov 03, 2005 at 09:17:43PM -0500, Tom Lane wrote: > > > Actually, the real reason we use UTF-8 and not any of the > > > sorta-fixed-size represent

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-04 Thread Martijn van Oosterhout
On Fri, Nov 04, 2005 at 08:38:38AM -0500, [EMAIL PROTECTED] wrote: > On Thu, Nov 03, 2005 at 09:17:43PM -0500, Tom Lane wrote: > > Actually, the real reason we use UTF-8 and not any of the > > sorta-fixed-size representations of Unicode is that the backend is by > > and large an ASCII, null-termina

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-04 Thread mark
On Thu, Nov 03, 2005 at 09:17:43PM -0500, Tom Lane wrote: > Gregory Maxwell <[EMAIL PROTECTED]> writes: > > Another way to look at this is in the context of compression: With > > unicode, characters are really 32bit values... But only a small range > > of these values is common. So we store and wo

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-03 Thread Tom Lane
Gregory Maxwell <[EMAIL PROTECTED]> writes: > Another way to look at this is in the context of compression: With > unicode, characters are really 32bit values... But only a small range > of these values is common. So we store and work with them in a > compressed format, UTF-8. > As such it might
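The "UTF-8 as a compressed form of 32-bit values" view quoted above can be made concrete with a small C sketch. It encodes one code point and reports how many bytes it occupies; common ASCII values take one byte instead of four. The name utf8_encode is hypothetical and is not a PostgreSQL function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Encode one Unicode code point as UTF-8.
 * Returns the number of bytes written (1 to 4), or 0 for an invalid
 * code point. utf8_encode is a hypothetical name for illustration.
 */
size_t
utf8_encode(uint32_t cp, unsigned char out[4])
{
	assert(out != NULL);

	if (cp < 0x80)
	{
		out[0] = (unsigned char) cp;
		return 1;
	}
	if (cp < 0x800)
	{
		out[0] = (unsigned char) (0xC0 | (cp >> 6));
		out[1] = (unsigned char) (0x80 | (cp & 0x3F));
		return 2;
	}
	if (cp < 0x10000)
	{
		/* Surrogate values are not characters and are never encoded. */
		if (cp >= 0xD800 && cp <= 0xDFFF)
			return 0;
		out[0] = (unsigned char) (0xE0 | (cp >> 12));
		out[1] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
		out[2] = (unsigned char) (0x80 | (cp & 0x3F));
		return 3;
	}
	if (cp <= 0x10FFFF)
	{
		out[0] = (unsigned char) (0xF0 | (cp >> 18));
		out[1] = (unsigned char) (0x80 | ((cp >> 12) & 0x3F));
		out[2] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
		out[3] = (unsigned char) (0x80 | (cp & 0x3F));
		return 4;
	}
	return 0;
}
```

None of the bytes produced for a non-zero code point is zero, which is why UTF-8 coexists with null-terminated string handling.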

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-03 Thread Gregory Maxwell
On 11/3/05, Martijn van Oosterhout wrote: > That's called UTF-16 and is currently not supported by PostgreSQL at > all. That may change, since the locale library ICU requires UTF-16 for > everything. UTF-16 doesn't get us out of the variable length character game, for that we need UTF-32... Unles

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-03 Thread Simon Riggs
On Thu, 2005-11-03 at 10:32 -0500, Tom Lane wrote: > Bruce Momjian writes: > >> On Thu, 3 Nov 2005, Simon Riggs wrote: > >>> At the moment we've established we can do this fairly much for free. > > > Agreed. With the proposal, we are saving perhaps 5% storage space for > > numeric fields, but ar

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-03 Thread Andrew - Supernews
On 2005-11-03, Martijn van Oosterhout wrote: >> For "other databases", the column could be encoded as 2 byte characters >> or 4 byte characters, allowing it to be fixed. I find myself doubting >> that ASCII characters could be encoded more efficiently in such formats, >> than the inclusion of a le

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-03 Thread Martijn van Oosterhout
On Thu, Nov 03, 2005 at 12:28:02PM -0500, [EMAIL PROTECTED] wrote: > It's unfortunate that the length is encoded multiple times. In UTF-8, > for instance, each character has its length encoded in the most > significant bits. Complicated to extract, however, the data is encoded > twice. 1 in the hea
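The remark that each UTF-8 character carries its own length in the most significant bits can be illustrated with a short C sketch. It only inspects the lead byte; the name utf8_seq_len is hypothetical, and the sketch does not validate the continuation bytes that follow.

```c
#include <stddef.h>

/*
 * Return the length in bytes of the UTF-8 sequence that starts with
 * the given lead byte, or 0 if the byte cannot start a sequence
 * (a continuation byte 10xxxxxx or an invalid lead such as 0xF8).
 * utf8_seq_len is a hypothetical helper name.
 */
size_t
utf8_seq_len(unsigned char lead)
{
	if ((lead & 0x80) == 0x00)
		return 1;				/* 0xxxxxxx: ASCII */
	if ((lead & 0xE0) == 0xC0)
		return 2;				/* 110xxxxx */
	if ((lead & 0xF0) == 0xE0)
		return 3;				/* 1110xxxx */
	if ((lead & 0xF8) == 0xF0)
		return 4;				/* 11110xxx */
	return 0;
}
```

Extracting the length costs a few mask-and-compare steps per character, which is the "complicated to extract" cost the message refers to.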

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-03 Thread Simon Riggs
On Thu, 2005-11-03 at 11:36 -0500, Andrew Dunstan wrote: > Well, it could also be argued that DW apps could often get away with > using floating point types, even where the primary source needs to be in > fixed point for accuracy, and that could generate lots of savings in > speed and space. B

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-03 Thread mark
On Thu, Nov 03, 2005 at 03:09:26PM +0100, Martijn van Oosterhout wrote: > On Thu, Nov 03, 2005 at 01:49:46PM +, Simon Riggs wrote: > > In other databases, CHAR(12) and NUMERIC(12) are fixed length datatypes. > > In PostgreSQL, they are dynamically varying datatypes. > Please explain how a CHAR(

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-03 Thread Andrew Dunstan
Simon Riggs wrote: On Wed, 2005-11-02 at 19:12 -0500, Tom Lane wrote: Andrew Dunstan <[EMAIL PROTECTED]> writes: Could someone please quantify how much bang we might get for what seems like quite a lot of bucks? I appreciate the need for speed, but the saving here strikes me as marg

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-03 Thread Simon Riggs
On Thu, 2005-11-03 at 07:03 -0800, Stephan Szabo wrote: > I don't believe the above is safe to say, yet. AFAICS, this has been > discussed only on hackers (and patches) in this discussion, whereas this > sort of change should probably be brought up on general as well to get a > greater understandi

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-03 Thread Tom Lane
Bruce Momjian writes: >> On Thu, 3 Nov 2005, Simon Riggs wrote: >>> At the moment we've established we can do this fairly much for free. > Agreed. With the proposal, we are saving perhaps 5% storage space for > numeric fields, but are adding code complexity and reducing its possible > precision.

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-03 Thread Bruce Momjian
Stephan Szabo wrote: > > On Thu, 3 Nov 2005, Simon Riggs wrote: > > > On Wed, 2005-11-02 at 19:12 -0500, Tom Lane wrote: > > > Andrew Dunstan <[EMAIL PROTECTED]> writes: > > > > Could someone please quantify how much bang we might get for what seems > > > > like quite a lot of bucks? > > > > I ap

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-03 Thread Marcus Engene
Simon Riggs wrote: On Thu, 2005-11-03 at 11:13 -0300, Alvaro Herrera wrote: Simon Riggs wrote: On PostgreSQL, CHAR(12) is a bpchar datatype with all instantiations of that datatype having a 4 byte varlena header. In this example, all of those instantiations having the varlena header set to 12

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-03 Thread Stephan Szabo
On Thu, 3 Nov 2005, Simon Riggs wrote: > On Wed, 2005-11-02 at 19:12 -0500, Tom Lane wrote: > > Andrew Dunstan <[EMAIL PROTECTED]> writes: > > > Could someone please quantify how much bang we might get for what seems > > > like quite a lot of bucks? > > > I appreciate the need for speed, but the

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-03 Thread Simon Riggs
On Thu, 2005-11-03 at 11:13 -0300, Alvaro Herrera wrote: > Simon Riggs wrote: > > On PostgreSQL, CHAR(12) is a bpchar datatype with all instantiations of > > that datatype having a 4 byte varlena header. In this example, all of > > those instantiations having the varlena header set to 12, so essent

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-03 Thread Alvaro Herrera
Simon Riggs wrote: > On PostgreSQL, CHAR(12) is a bpchar datatype with all instantiations of > that datatype having a 4 byte varlena header. In this example, all of > those instantiations having the varlena header set to 12, so essentially > wasting the 4 byte header. We need the length word beca

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-03 Thread Martijn van Oosterhout
On Thu, Nov 03, 2005 at 01:49:46PM +, Simon Riggs wrote: > In other databases, CHAR(12) and NUMERIC(12) are fixed length datatypes. > In PostgreSQL, they are dynamically varying datatypes. Please explain how a CHAR(12) can store 12 UTF-8 characters when each character may be 1 to 4 bytes, unle
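To see why a CHAR(12) column in UTF-8 cannot have a single fixed byte width, the following C sketch counts characters in a byte string and computes the worst-case byte size for n characters. Both function names are hypothetical, and the counting assumes the input is already valid UTF-8.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Count characters in a valid UTF-8 byte string by counting every
 * byte that is not a continuation byte (10xxxxxx).
 * utf8_char_count is a hypothetical helper name.
 */
size_t
utf8_char_count(const char *s, size_t len)
{
	size_t		i;
	size_t		count = 0;

	assert(s != NULL);
	for (i = 0; i < len; i++)
	{
		if (((unsigned char) s[i] & 0xC0) != 0x80)
			count++;
	}
	return count;
}

/* Worst-case storage in bytes for n characters at up to 4 bytes each. */
size_t
utf8_max_bytes(size_t n_chars)
{
	return n_chars * 4;
}
```

Twelve characters can therefore occupy anywhere from 12 to 48 bytes, so a fixed-width layout would have to reserve the maximum or carry a length.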

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-03 Thread Simon Riggs
On Thu, 2005-11-03 at 08:27 +, Simon Riggs wrote: > On Wed, 2005-11-02 at 19:12 -0500, Tom Lane wrote: > > If we were willing to invent the "varlena2" datum format then we could > > save four bytes per numeric, plus reduce numeric's alignment requirement > > from int to short which would probab

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-03 Thread Simon Riggs
On Wed, 2005-11-02 at 19:12 -0500, Tom Lane wrote: > Andrew Dunstan <[EMAIL PROTECTED]> writes: > > Could someone please quantify how much bang we might get for what seems > > like quite a lot of bucks? > > I appreciate the need for speed, but the saving here strikes me as > > marginal at best, u

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-02 Thread Tom Lane
Andrew Dunstan <[EMAIL PROTECTED]> writes: > Could someone please quantify how much bang we might get for what seems > like quite a lot of bucks? > I appreciate the need for speed, but the saving here strikes me as > marginal at best, unless my instincts are all wrong (quite possible) Two bytes

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-02 Thread Andrew Dunstan
[patches removed] Tom Lane wrote: Simon Riggs <[EMAIL PROTECTED]> writes: It seems straightforward enough to put in an additional test, similar to the ones already there so that if it's too big for a decimal we make it a float straight away - only a float can be that big in that case. After

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-02 Thread Jim C. Nasby
On Wed, Nov 02, 2005 at 06:12:37PM -0500, Tom Lane wrote: > Simon Riggs <[EMAIL PROTECTED]> writes: > > It seems straightforward enough to put in an additional test, similar to > > the ones already there so that if it's too big for a decimal we make it a > > float straight away - only a float can be

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-02 Thread Tom Lane
Simon Riggs <[EMAIL PROTECTED]> writes: > It seems straightforward enough to put in an additional test, similar to > the ones already there so that if it's too big for a decimal we make it a > float straight away - only a float can be that big in that case. After > that I can't really see what the p

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-02 Thread Simon Riggs
On Wed, 2005-11-02 at 15:09 -0500, Tom Lane wrote: > [ thinks for a moment... ] Actually, neither proposal is going to get > off the ground, because the parser's handling of numeric constants is > predicated on the assumption that type NUMERIC can handle any valid > value of FLOAT8, and so we can

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-02 Thread Simon Riggs
On Wed, 2005-11-02 at 15:09 -0500, Tom Lane wrote: > Simon Riggs <[EMAIL PROTECTED]> writes: > > I wasn't trying to claim the bit assignment made sense. My point was > > that the work to mangle the two fields together to make it make sense > > looked like it would take more CPU (since the standard

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-02 Thread Martijn van Oosterhout
On Wed, Nov 02, 2005 at 12:53:07PM -0600, Jim C. Nasby wrote: > > This is one of those issues where we need to run tests and take input. > > We cannot decide this sort of thing just by debate alone. So, I'll leave > > this as a less potentially fruitful line of enquiry. > > Is it worth coming up

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-02 Thread Tom Lane
Simon Riggs <[EMAIL PROTECTED]> writes: > I wasn't trying to claim the bit assignment made sense. My point was > that the work to mangle the two fields together to make it make sense > looked like it would take more CPU (since the standard representation of > signed integers is different for +ve an

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-02 Thread Simon Riggs
On Wed, 2005-11-02 at 13:46 -0500, Tom Lane wrote: > Simon Riggs <[EMAIL PROTECTED]> writes: > > On Tue, 2005-11-01 at 17:55 -0500, Tom Lane wrote: > >> I don't think it'd be worth having 2 types. Remember that the weight is > >> measured in base-10k digits. Suppose for instance > >>sign

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-02 Thread Jim C. Nasby
On Wed, Nov 02, 2005 at 08:48:25AM +, Simon Riggs wrote: > On Tue, 2005-11-01 at 18:15 -0500, Tom Lane wrote: > > Simon Riggs <[EMAIL PROTECTED]> writes: > > > Anybody like to work out a piece of SQL to perform data profiling and > > > derive the distribution of values with trailing zeroes? > >

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-02 Thread Tom Lane
Simon Riggs <[EMAIL PROTECTED]> writes: > On Tue, 2005-11-01 at 17:55 -0500, Tom Lane wrote: >> I don't think it'd be worth having 2 types. Remember that the weight is >> measured in base-10k digits. Suppose for instance >> sign 1 bit >> weight 7 bits (-64 .. +63) >>
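The first two fields of the layout quoted above (a 1-bit sign and a 7-bit weight covering -64 .. +63) fit in a single byte. The C sketch below packs and unpacks just those two fields; the remaining fields of the proposal are cut off in this preview and are not modeled here. All function names are hypothetical. Since the weight counts base-10000 digits, a weight of +63 corresponds to magnitudes below 10000^64 = 10^256.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Pack a sign flag and a 7-bit two's-complement weight into one byte:
 * bit 7 holds the sign, bits 0..6 hold the weight (-64 .. +63).
 * These helpers are hypothetical and only illustrate the bit budget.
 */
uint8_t
pack_sign_weight(int negative, int weight)
{
	assert(weight >= -64 && weight <= 63);
	return (uint8_t) ((negative ? 0x80u : 0x00u) |
					  ((unsigned) weight & 0x7Fu));
}

/* Return 1 if the packed value is negative, 0 otherwise. */
int
unpack_sign(uint8_t b)
{
	return (b & 0x80) != 0;
}

/* Recover the weight, sign-extending from bit 6. */
int
unpack_weight(uint8_t b)
{
	int			w = b & 0x7F;

	if (w & 0x40)
		w -= 128;
	return w;
}
```

The unpacking needs one mask and one conditional subtract, which gives a rough sense of the extra CPU work debated elsewhere in the thread.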

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-02 Thread Simon Riggs
On Tue, 2005-11-01 at 17:55 -0500, Tom Lane wrote: > "Jim C. Nasby" <[EMAIL PROTECTED]> writes: > > FWIW, most databases I've used limit NUMERIC to 38 digits, presumably to > > fit length info into 1 or 2 bytes. So there's something to be said for a > > small numeric type that has less overhead and

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-02 Thread Pollard, Mike
I am not able to quickly find your numeric format, so I'll just throw this in. MaxDB (I only mention this because the format and algorithms are now under the GPL, so they can be reviewed by the public) uses a nifty number format that allows the use of memcmp to compare two numbers when they are in t

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-02 Thread Mike Rylander
On 11/2/05, Simon Riggs <[EMAIL PROTECTED]> wrote: > On Tue, 2005-11-01 at 18:15 -0500, Tom Lane wrote: > > Simon Riggs <[EMAIL PROTECTED]> writes: > > > Anybody like to work out a piece of SQL to perform data profiling and > > > derive the distribution of values with trailing zeroes? > > > > Don't

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-02 Thread Simon Riggs
On Tue, 2005-11-01 at 18:15 -0500, Tom Lane wrote: > Simon Riggs <[EMAIL PROTECTED]> writes: > > Anybody like to work out a piece of SQL to perform data profiling and > > derive the distribution of values with trailing zeroes? > > Don't forget leading zeroes. And all-zero (we omit digits entirely

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-01 Thread J. Andrew Rogers
On 11/1/05 2:38 PM, "Jim C. Nasby" <[EMAIL PROTECTED]> wrote: > > FWIW, most databases I've used limit NUMERIC to 38 digits, presumably to > fit length info into 1 or 2 bytes. So there's something to be said for a > small numeric type that has less overhead and a large numeric (what we > have toda

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-01 Thread Tom Lane
Simon Riggs <[EMAIL PROTECTED]> writes: > Anybody like to work out a piece of SQL to perform data profiling and > derive the distribution of values with trailing zeroes? Don't forget leading zeroes. And all-zero (we omit digits entirely in that case). I don't think you can claim that zero isn't

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-01 Thread Tom Lane
"Jim C. Nasby" <[EMAIL PROTECTED]> writes: > On Tue, Nov 01, 2005 at 05:40:35PM -0500, Tom Lane wrote: >> Maybe if we had a few other datatypes that could also use the feature. >> [ thinks... ] inet/cidr comes to mind but I don't see any others. >> The case seems a bit weak :-( > Would varchar(25

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-01 Thread Simon Riggs
On Tue, 2005-11-01 at 23:16 +0100, Martijn van Oosterhout wrote: lots of useful things, thank you. > > So, assuming I have this all correct, means we could reduce the on disk > > storage for NUMERIC datatypes to the following struct. This gives an > > overhead of just 2.5 bytes, plus the loss of

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-01 Thread Tom Lane
"Jim C. Nasby" <[EMAIL PROTECTED]> writes: > FWIW, most databases I've used limit NUMERIC to 38 digits, presumably to > fit length info into 1 or 2 bytes. So there's something to be said for a > small numeric type that has less overhead and a large numeric (what we > have today). I don't think it'

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-01 Thread Jim C. Nasby
On Tue, Nov 01, 2005 at 05:40:35PM -0500, Tom Lane wrote: > Martijn van Oosterhout writes: > > You are proposing a fourth type, say VARLENA2 which looks a lot like a > > varlena but it's not. I think the sheer volume of code that would need > > to be checked is huge. Also, things like pg_attribute

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-01 Thread Simon Riggs
On Tue, 2005-11-01 at 16:54 -0500, Tom Lane wrote: > Simon Riggs <[EMAIL PROTECTED]> writes: > > varlen is int32 to match the standard varlena header. However, the max > > number of digits of the datatype is less than the threshold at which > > values get toasted. So no NUMERIC values ever get toas

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-01 Thread Jim C. Nasby
On Tue, Nov 01, 2005 at 11:16:58PM +0100, Martijn van Oosterhout wrote: > Consider the algorithm: A number is stored as base + exponent. To > multiply two numbers you can multiply the bases and add the exponents. > OTOH, if you store the decimal inside the data, now you have to extract > it again b
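The algorithm described in the quoted text (multiply the stored values, add the exponents) can be sketched in a few lines of C. The struct and function names are hypothetical, the representation is a plain integer coefficient with a power-of-ten exponent rather than PostgreSQL's digit array, and overflow is not handled.

```c
#include <stdint.h>

/*
 * A toy decimal: value = coeff * 10^exp10.
 * DecSketch and dec_mul are hypothetical names; this is not the
 * NUMERIC implementation, only an illustration of the idea.
 */
typedef struct DecSketch
{
	int64_t		coeff;
	int32_t		exp10;
} DecSketch;

/* Multiply two values: coefficients multiply, exponents add. */
DecSketch
dec_mul(DecSketch a, DecSketch b)
{
	DecSketch	r;

	r.coeff = a.coeff * b.coeff;	/* no overflow check in this sketch */
	r.exp10 = a.exp10 + b.exp10;
	return r;
}
```

For example, 1.2 (12 with exponent -1) times 500 (5 with exponent 2) yields 60 with exponent 1, i.e. 600. Keeping the exponent in its own field is what makes this a single addition.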

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-01 Thread Tom Lane
Martijn van Oosterhout writes: > You are proposing a fourth type, say VARLENA2 which looks a lot like a > varlena but it's not. I think the sheer volume of code that would need > to be checked is huge. Also, things like pg_attribute would need > changing because you have to represent this new stat

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-01 Thread Jim C. Nasby
On Tue, Nov 01, 2005 at 04:54:11PM -0500, Tom Lane wrote: > It might be reasonable to restrict the range of NUMERIC to the point > that we could fit the weight/sign/dscale into 2 bytes instead of 4, > thereby saving 2 bytes per NUMERIC. I'm not excited about the other > aspects of this, though. F

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-01 Thread Martijn van Oosterhout
On Tue, Nov 01, 2005 at 09:22:17PM +, Simon Riggs wrote: > varlen is int32 to match the standard varlena header. However, the max > number of digits of the datatype is less than the threshold at which > values get toasted. So no NUMERIC values ever get toasted - in which > case, why worry about

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-01 Thread Tom Lane
Simon Riggs <[EMAIL PROTECTED]> writes: > varlen is int32 to match the standard varlena header. However, the max > number of digits of the datatype is less than the threshold at which > values get toasted. So no NUMERIC values ever get toasted - in which > case, why worry about matching the size of

[HACKERS] Reducing the overhead of NUMERIC data

2005-11-01 Thread Simon Riggs
Currently, the overhead of NUMERIC datatype is 8 bytes. Each value is stored on disk as typedef struct NumericData { int32 varlen; /* Variable size (std varlena header) */ int16 n_weight; /* Weight of 1st digit */ uint16 n_sign_dscale; /* Sign + display
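The 8-byte figure follows from the three header fields visible in the struct above: 4 + 2 + 2 bytes. The C sketch below restates only those fields with fixed-width types and reports their combined size. The struct name NumericHeaderSketch and the helper are hypothetical, and the digit payload that follows the header is omitted because the preview is cut off before it.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Header fields of the on-disk NUMERIC value as quoted in the message.
 * NumericHeaderSketch is a hypothetical stand-in; the digit data that
 * follows the header in a real value is deliberately left out.
 */
typedef struct NumericHeaderSketch
{
	int32_t		varlen;			/* variable size (std varlena header) */
	int16_t		n_weight;		/* weight of first digit */
	uint16_t	n_sign_dscale;	/* sign and display scale, packed */
} NumericHeaderSketch;

/* Per-value overhead in bytes before any digits are stored. */
size_t
numeric_header_overhead(void)
{
	return sizeof(NumericHeaderSketch);
}
```

Half of that overhead is the varlena length word, which is why the thread keeps returning to the idea of a shorter length header.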