Re: Non-decimal integer literals

2023-01-24 Thread Ranier Vilela
On Tue, 24 Jan 2023 at 07:24, Dean Rasheed wrote: > On Tue, 24 Jan 2023 at 00:47, Ranier Vilela wrote: > > > > On 13.01.23 11:01, Dean Rasheed wrote: > > > So I'm feeling quite good about the end result -- I set out hoping not > > > to make performance noticeably worse, but ended up m

Re: Non-decimal integer literals

2023-01-24 Thread Dean Rasheed
On Tue, 24 Jan 2023 at 00:47, Ranier Vilela wrote: > > On 13.01.23 11:01, Dean Rasheed wrote: > > So I'm feeling quite good about the end result -- I set out hoping not > > to make performance noticeably worse, but ended up making it > > significantly better. > Hi Dean, thanks for your work. > > B

Re: Non-decimal integer literals

2023-01-23 Thread Ranier Vilela
On 13.01.23 11:01, Dean Rasheed wrote: > So I'm feeling quite good about the end result -- I set out hoping not > to make performance noticeably worse, but ended up making it > significantly better. Hi Dean, thanks for your work. But since PG_RETURN_NULL is a simple return, now the "value" var i

Re: Non-decimal integer literals

2023-01-23 Thread Dean Rasheed
On Mon, 23 Jan 2023 at 20:00, Joel Jacobson wrote: > > Nice! This also simplifies when dealing with non-negative integers > represented as byte arrays, > common in e.g. cryptography code. > Ah, interesting. I hadn't thought of that use-case. > create function numeric_from_bytes(bytea) returns n

Re: Non-decimal integer literals

2023-01-23 Thread Joel Jacobson
On Fri, Jan 13, 2023, at 07:01, Dean Rasheed wrote: > Attachments: > * 0001-Add-non-decimal-integer-support-to-type-numeric.patch Nice! This also simplifies when dealing with non-negative integers represented as byte arrays, common in e.g. cryptography code. Before, one had to implement numeric_

Re: Non-decimal integer literals

2023-01-23 Thread Dean Rasheed
On Mon, 23 Jan 2023 at 15:55, Peter Eisentraut wrote: > > On 13.01.23 11:01, Dean Rasheed wrote: > > So I'm feeling quite good about the end result -- I set out hoping not > > to make performance noticeably worse, but ended up making it > > significantly better. > > This is great! How do you want

Re: Non-decimal integer literals

2023-01-23 Thread Peter Eisentraut
On 13.01.23 11:01, Dean Rasheed wrote: So I'm feeling quite good about the end result -- I set out hoping not to make performance noticeably worse, but ended up making it significantly better. This is great! How do you want to proceed? You also posted an updated patch in the "underscores" th

Re: Non-decimal integer literals

2023-01-13 Thread Dean Rasheed
On Wed, 14 Dec 2022 at 05:47, Peter Eisentraut wrote: > > committed Now that we have this for integer types, I think it's worth doing for numeric as well, since the parser will now pass such things through to numeric_in() when they don't fit in an int64, and it seems plausible that at least some

Re: Non-decimal integer literals

2022-12-13 Thread Peter Eisentraut
On 08.12.22 12:16, Peter Eisentraut wrote: On 29.11.22 21:22, David Rowley wrote: There seems to be a small bug in the pg_strtointXX functions in the code that checks that there's at least 1 digit. This causes 0x to be a valid representation of zero. That does not seem to be allowed by the par

Re: Non-decimal integer literals

2022-12-08 Thread Peter Eisentraut
On 29.11.22 21:22, David Rowley wrote: There seems to be a small bug in the pg_strtointXX functions in the code that checks that there's at least 1 digit. This causes 0x to be a valid representation of zero. That does not seem to be allowed by the parser, so I think we should likely reject it i

Re: Non-decimal integer literals

2022-11-30 Thread Tom Lane
David Rowley writes: > I agree that it should be a separate patch. But thinking about what > Tom mentioned in [1], I had in mind this patch would need to wait > until the new standard is out so that we have a more genuine reason > for breaking existing queries. Well, we already broke them in v15

Re: Non-decimal integer literals

2022-11-30 Thread David Rowley
On Thu, 1 Dec 2022 at 00:34, Dean Rasheed wrote: > So something > like: > > // Accumulate positive value using unsigned int, with approximate > // overflow check. If acc >= 1 - INT_MIN / 10, then acc * 10 is > // sure to exceed -INT_MIN. > unsigned int cutoff = 1 - INT_MIN / 10; >
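The quoted idea (accumulate in an unsigned int and use an approximate cutoff before each multiplication) can be sketched roughly as follows. This is only an illustration, not the code from the thread: the function name parse_int_with_cutoff is hypothetical, the exact range check after the loop is an assumption about how the approximate check would be completed, and a 32-bit two's-complement int is assumed.

```c
#include <ctype.h>
#include <limits.h>
#include <stdbool.h>

/*
 * Hypothetical sketch (not PostgreSQL code): parse a decimal int, accumulating
 * the magnitude in an unsigned int. Assumes 32-bit two's-complement int.
 */
bool
parse_int_with_cutoff(const char *s, int *result)
{
    /* If acc >= cutoff, then acc * 10 is sure to exceed -INT_MIN. */
    const unsigned int cutoff = 1 - INT_MIN / 10;
    unsigned int acc = 0;
    bool        neg = false;

    if (*s == '-')
    {
        neg = true;
        s++;
    }
    else if (*s == '+')
        s++;

    /* require at least one digit */
    if (!isdigit((unsigned char) *s))
        return false;

    while (isdigit((unsigned char) *s))
    {
        /* approximate overflow check: cheap, and never lets acc wrap */
        if (acc >= cutoff)
            return false;
        acc = acc * 10 + (unsigned int) (*s - '0');
        s++;
    }

    if (*s != '\0')
        return false;

    /* exact range check, done once after the loop */
    if (neg)
    {
        if (acc > (unsigned int) INT_MAX + 1u)
            return false;
        *result = (acc == (unsigned int) INT_MAX + 1u) ? INT_MIN : -(int) acc;
    }
    else
    {
        if (acc > (unsigned int) INT_MAX)
            return false;
        *result = (int) acc;
    }
    return true;
}
```

The point of the cutoff is that the loop body needs only one comparison per digit; values that slip past the approximate check by a small margin are caught by the exact comparison at the end.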

Re: Non-decimal integer literals

2022-11-30 Thread Dean Rasheed
On Wed, 30 Nov 2022 at 05:50, David Rowley wrote: > > I spent a bit more time trying to figure out why the compiler does > imul instead of bit shifting and it just seems to be down to a > combination of signed-ness plus the overflow check. See [1]. Neither > of the two compilers I tested could use

Re: Non-decimal integer literals

2022-11-29 Thread David Rowley
On Wed, 23 Nov 2022 at 22:19, John Naylor wrote: > > > On Wed, Nov 23, 2022 at 3:54 PM David Rowley wrote: > > > > Going by [1], clang will actually use multiplication by 16 to > > implement the former. gcc is better and shifts left by 4, so likely > > won't improve things for gcc. It seems wort

Re: Non-decimal integer literals

2022-11-29 Thread David Rowley
On Tue, 29 Nov 2022 at 03:00, Peter Eisentraut wrote: > Fixed in new patch. There seems to be a small bug in the pg_strtointXX functions in the code that checks that there's at least 1 digit. This causes 0x to be a valid representation of zero. That does not seem to be allowed by the parser, so

Re: Non-decimal integer literals

2022-11-29 Thread David Rowley
On Tue, 29 Nov 2022 at 23:11, Dean Rasheed wrote: > > On Wed, 23 Nov 2022 at 08:56, David Rowley wrote: > > > > On Wed, 23 Nov 2022 at 21:54, David Rowley wrote: > > > I wonder if you'd be better off with something like: > > > > > > while (*ptr && isxdigit((unsigned char) *ptr)) > > >

Re: Non-decimal integer literals

2022-11-29 Thread Dean Rasheed
On Wed, 23 Nov 2022 at 08:56, David Rowley wrote: > > On Wed, 23 Nov 2022 at 21:54, David Rowley wrote: > > I wonder if you'd be better off with something like: > > > > while (*ptr && isxdigit((unsigned char) *ptr)) > > { > > if (unlikely(tmp & UINT64CONST(0xF0

Re: Non-decimal integer literals

2022-11-28 Thread David Rowley
On Sat, 26 Nov 2022 at 05:13, Peter Eisentraut wrote: > > On 24.11.22 10:13, David Rowley wrote: > > I > > remember many years ago and several jobs ago when working with SQL > > Server being able to speed up importing data using hexadecimal > > DATETIMEs. I can't think why else you might want to r

Re: Non-decimal integer literals

2022-11-28 Thread Peter Eisentraut
On 23.11.22 17:25, Dean Rasheed wrote: Taking a quick look, I noticed that there are no tests for negative values handled in the parser. Giving that a spin shows that make_const() fails to correctly identify the base of negative non-decimal integers in the T_Float case, causing it to fall throug

Re: Non-decimal integer literals

2022-11-25 Thread Peter Eisentraut
On 24.11.22 10:13, David Rowley wrote: On Thu, 24 Nov 2022 at 21:35, Peter Eisentraut wrote: My code follows the style used for parsing the decimal integers. Keeping that consistent is valuable I think. I think the proposed change makes the code significantly harder to understand. Also, what

Re: Non-decimal integer literals

2022-11-24 Thread David Rowley
On Thu, 24 Nov 2022 at 21:35, Peter Eisentraut wrote: > My code follows the style used for parsing the decimal integers. > Keeping that consistent is valuable I think. I think the proposed > change makes the code significantly harder to understand. Also, what > you are suggesting here would amou

Re: Non-decimal integer literals

2022-11-24 Thread Peter Eisentraut
On 23.11.22 09:54, David Rowley wrote: On Wed, 23 Nov 2022 at 02:37, Peter Eisentraut wrote: Here is a new patch. This looks like quite an inefficient way to convert a hex string into an int64: while (*ptr && isxdigit((unsigned char) *ptr)) { int8 digit

Re: Non-decimal integer literals

2022-11-23 Thread Dean Rasheed
On Tue, 22 Nov 2022 at 13:37, Peter Eisentraut wrote: > > On 15.11.22 11:31, Peter Eisentraut wrote: > > On 14.11.22 08:25, John Naylor wrote: > >> Regarding the patch, it looks good overall. My only suggestion would > >> be to add a regression test for just below and just above overflow, at > >>

Re: Non-decimal integer literals

2022-11-23 Thread John Naylor
On Wed, Nov 23, 2022 at 3:54 PM David Rowley wrote: > > Going by [1], clang will actually use multiplication by 16 to > implement the former. gcc is better and shifts left by 4, so likely > won't improve things for gcc. It seems worth doing it this way for > anything that does not have HAVE__BUIL

Re: Non-decimal integer literals

2022-11-23 Thread David Rowley
On Wed, 23 Nov 2022 at 21:54, David Rowley wrote: > I wonder if you'd be better off with something like: > > while (*ptr && isxdigit((unsigned char) *ptr)) > { > if (unlikely(tmp & UINT64CONST(0xF000))) > goto out_of_range; > > tm
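The suggestion above (test the top bits before shifting, rather than doing an overflow-checked multiply by 16) might look roughly like this as a standalone function. The preview is truncated, so the full mask constant, the helper hex_digit_value, and the function name hex_to_uint64 are all assumptions made for illustration; this sketch produces an unsigned 64-bit result and does not attempt the signed conversion the patch needs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: value of a hex digit, or -1 if c is not one. */
static int
hex_digit_value(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    if (c >= 'A' && c <= 'F')
        return c - 'A' + 10;
    return -1;
}

/*
 * Hypothetical sketch: convert a string of hex digits (no "0x" prefix) to
 * uint64_t. Overflow is detected by checking whether the top nibble is
 * already occupied before shifting left by 4.
 */
bool
hex_to_uint64(const char *ptr, uint64_t *result)
{
    uint64_t    tmp = 0;

    if (*ptr == '\0')
        return false;           /* require at least one digit */

    for (; *ptr; ptr++)
    {
        int         digit = hex_digit_value(*ptr);

        if (digit < 0)
            return false;
        /* assumed mask: any set bit here would be shifted out below */
        if (tmp & UINT64_C(0xF000000000000000))
            return false;
        tmp = (tmp << 4) | (uint64_t) digit;
    }

    *result = tmp;
    return true;
}
```

Because the base is a power of two, the overflow test reduces to a single bit mask, which is what makes this form attractive for hexadecimal, octal, and binary input.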

Re: Non-decimal integer literals

2022-11-23 Thread David Rowley
On Wed, 23 Nov 2022 at 02:37, Peter Eisentraut wrote: > Here is a new patch. This looks like quite an inefficient way to convert a hex string into an int64: while (*ptr && isxdigit((unsigned char) *ptr)) { int8 digit = hexlookup[(unsigned char) *ptr];

Re: Non-decimal integer literals

2022-11-22 Thread John Naylor
On Tue, Nov 22, 2022 at 8:36 PM Peter Eisentraut < peter.eisentr...@enterprisedb.com> wrote: > > On 15.11.22 11:31, Peter Eisentraut wrote: > > On 14.11.22 08:25, John Naylor wrote: > >> Regarding the patch, it looks good overall. My only suggestion would > >> be to add a regression test for just b

Re: Non-decimal integer literals

2022-11-22 Thread Peter Eisentraut
On 15.11.22 11:31, Peter Eisentraut wrote: On 14.11.22 08:25, John Naylor wrote: Regarding the patch, it looks good overall. My only suggestion would be to add a regression test for just below and just above overflow, at least for int2. ok This was a valuable suggestion, because this found

Re: Non-decimal integer literals

2022-11-15 Thread Peter Eisentraut
On 14.11.22 08:25, John Naylor wrote: Regarding the patch, it looks good overall. My only suggestion would be to add a regression test for just below and just above overflow, at least for int2. ok Minor nits: - * Process {integer}.  Note this will also do the right thing with {decimal}, +

Re: Non-decimal integer literals

2022-11-13 Thread John Naylor
On Mon, Oct 10, 2022 at 9:17 PM Peter Eisentraut < peter.eisentr...@enterprisedb.com> wrote: > Taking another look around ecpg to see how this interacts with C-syntax > integer literals. I'm not aware of any particular issues, but it's > understandably tricky. I can find no discussion in the arc

Re: Non-decimal integer literals

2022-10-11 Thread Junwang Zhao
On Tue, Oct 11, 2022 at 4:59 PM Peter Eisentraut wrote: > > On 11.10.22 05:29, Junwang Zhao wrote: > > What do you think if we move this code into a static inline function? like: > > > > static inline char* > > process_digits(char *ptr, int32 *result) > > { > > ... > > } > > How would you handle

Re: Non-decimal integer literals

2022-10-11 Thread Peter Eisentraut
On 11.10.22 05:29, Junwang Zhao wrote: What do you think if we move this code into a static inline function? like: static inline char* process_digits(char *ptr, int32 *result) { ... } How would you handle the different ways each branch checks for valid digits and computes the value of each d

Re: Non-decimal integer literals

2022-10-10 Thread Junwang Zhao
Hi Peter, /* process digits */ + if (ptr[0] == '0' && (ptr[1] == 'x' || ptr[1] == 'X')) + { + ptr += 2; + while (*ptr && isxdigit((unsigned char) *ptr)) + { + int8 digit = hexlookup[(unsigned char) *ptr]; + + if (unlikely(pg_mul_s16_overflow(tmp, 16, &tmp)) || + unlikely(pg_sub_s16_overflow(tmp,

Re: Non-decimal integer literals

2022-10-10 Thread Peter Eisentraut
On 16.02.22 11:11, Peter Eisentraut wrote: The remaining patches are material for PG16 at this point, and I will set the commit fest item to returned with feedback in the meantime. Time to continue with this. Attached is a rebased and cleaned up patch for non-decimal integer literals. (I don

Re: Non-decimal integer literals

2022-02-16 Thread Peter Eisentraut
On 13.02.22 13:14, John Naylor wrote: On Wed, Jan 26, 2022 at 10:10 PM Peter Eisentraut wrote: [v8 patch] 0001-0004 seem pretty straightforward. These have been committed. 0005: {realfail1} { - /* - * throw back the [Ee], and figure out whether what - * remains is an {integer} or {d

Re: Non-decimal integer literals

2022-02-14 Thread Christoph Berg
Re: Peter Eisentraut > This adds support in the lexer as well as in the integer type input > functions. > > Those core parts are straightforward enough, but there are a bunch of other > places where integers are parsed, and one could consider in each case > whether they should get the same treatme

Re: Non-decimal integer literals

2022-02-13 Thread John Naylor
On Wed, Jan 26, 2022 at 10:10 PM Peter Eisentraut wrote: > [v8 patch] 0001-0004 seem pretty straightforward. 0005: {realfail1} { - /* - * throw back the [Ee], and figure out whether what - * remains is an {integer} or {decimal}. - */ - yyless(yyleng - 1); SET_YYLLOC(); - return process_integ

Re: Non-decimal integer literals

2022-01-26 Thread Andrew Dunstan
On 1/25/22 13:43, Alvaro Herrera wrote: > On 2022-Jan-24, Peter Eisentraut wrote: > >> +decinteger {decdigit}(_?{decdigit})* >> +hexinteger 0[xX](_?{hexdigit})+ >> +octinteger 0[oO](_?{octdigit})+ >> +bininteger 0[bB](_?{bindigit})+ > I think there should be te

Re: Non-decimal integer literals

2022-01-26 Thread Peter Eisentraut
On 26.01.22 01:02, Tom Lane wrote: Robert Haas writes: On Tue, Jan 25, 2022 at 5:34 AM Peter Eisentraut wrote: Which part exactly? There are several different changes proposed here. I was just going based on the description of the feature in your original post. If someone is hoping that i

Re: Non-decimal integer literals

2022-01-25 Thread Tom Lane
Robert Haas writes: > On Tue, Jan 25, 2022 at 5:34 AM Peter Eisentraut > wrote: >> Which part exactly? There are several different changes proposed here. > I was just going based on the description of the feature in your > original post. If someone is hoping that int4in() will accept only > ^\d

Re: Non-decimal integer literals

2022-01-25 Thread Alvaro Herrera
On 2022-Jan-24, Peter Eisentraut wrote: > +decinteger {decdigit}(_?{decdigit})* > +hexinteger 0[xX](_?{hexdigit})+ > +octinteger 0[oO](_?{octdigit})+ > +bininteger 0[bB](_?{bindigit})+ I think there should be test cases for literals that these seemingly str
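To make the edge cases of the quoted hexinteger rule, 0[xX](_?{hexdigit})+, easier to reason about when writing such test cases, here is a small hand-written matcher for that one pattern. It is a sketch for illustration only; the function name matches_hexinteger is hypothetical, and the real scanner is generated by flex rather than written this way.

```c
#include <ctype.h>
#include <stdbool.h>

/*
 * Hypothetical sketch: return true if s, taken as a whole string, matches
 * 0[xX](_?{hexdigit})+ , i.e. a prefix followed by one or more hex digits,
 * each optionally preceded by a single underscore.
 */
bool
matches_hexinteger(const char *s)
{
    int         ndigits = 0;

    if (s[0] != '0' || (s[1] != 'x' && s[1] != 'X'))
        return false;
    s += 2;

    while (*s)
    {
        if (*s == '_')
            s++;                /* at most one underscore before a digit */
        if (!isxdigit((unsigned char) *s))
            return false;       /* doubled or trailing underscore, or junk */
        s++;
        ndigits++;
    }

    return ndigits > 0;         /* "+" requires at least one digit */
}
```

Read this way, the rule as quoted accepts an underscore directly after the prefix (0x_1F) but rejects doubled underscores (0x1__F), a trailing underscore (0x1F_), and a bare prefix (0x).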

Re: Non-decimal integer literals

2022-01-25 Thread Robert Haas
On Tue, Jan 25, 2022 at 5:34 AM Peter Eisentraut wrote: > On 24.01.22 19:53, Robert Haas wrote: > > On Mon, Jan 24, 2022 at 3:41 AM Peter Eisentraut > > wrote: > >> Rebased patch set > > > > What if someone finds this new behavior too permissive? > > Which part exactly? There are several differe

Re: Non-decimal integer literals

2022-01-25 Thread Peter Eisentraut
On 24.01.22 19:53, Robert Haas wrote: On Mon, Jan 24, 2022 at 3:41 AM Peter Eisentraut wrote: Rebased patch set What if someone finds this new behavior too permissive? Which part exactly? There are several different changes proposed here.

Re: Non-decimal integer literals

2022-01-24 Thread Robert Haas
On Mon, Jan 24, 2022 at 3:41 AM Peter Eisentraut wrote: > Rebased patch set What if someone finds this new behavior too permissive? -- Robert Haas EDB: http://www.enterprisedb.com

Re: Non-decimal integer literals

2022-01-24 Thread Peter Eisentraut
Rebased patch set On 13.01.22 14:42, Peter Eisentraut wrote: Another modest update, because of the copyright year update preventing the previous patches from applying cleanly. I also did a bit of work on the ecpg scanner so that it also handles some errors on par with the main scanner. Ther

Re: Non-decimal integer literals

2022-01-13 Thread Peter Eisentraut
Another modest update, because of the copyright year update preventing the previous patches from applying cleanly. I also did a bit of work on the ecpg scanner so that it also handles some errors on par with the main scanner. There is still no automated testing of this in ecpg, but I have a b

Re: Non-decimal integer literals

2021-12-30 Thread Peter Eisentraut
There has been some other refactoring going on, which made this patch set out of date. So here is an update. The old pg_strtouint64() has been removed, so there is no longer a naming concern with patch 0001. That one should be good to go. I also found that yet another way to parse integers

Re: Non-decimal integer literals

2021-12-01 Thread Peter Eisentraut
On 25.11.21 18:51, John Naylor wrote: If we're going to change the comment anyway, "the parser" sounds more natural. Aside from that, 0001 and 0002 can probably be pushed now, if you like. done --- a/src/interfaces/ecpg/preproc/pgc.l +++ b/src/interfaces/ecpg/preproc/pgc.l @@ -365,6 +365,10

Re: Non-decimal integer literals

2021-12-01 Thread Peter Eisentraut
On 25.11.21 16:46, Zhihong Yu wrote: For patch 3, +int64 +pg_strtoint64(const char *s) How about naming the above function pg_scanint64()? pg_strtoint64xx() can be named pg_strtoint64() - this would align with existing function: pg_strtouint64(const char *str, char **endptr, int base) That

Re: Non-decimal integer literals

2021-11-25 Thread John Naylor
Hi Peter, 0001 -/* we no longer allow unary minus in numbers. - * instead we pass it separately to parser. there it gets - * coerced via doNegate() -- Leon aug 20 1999 +/* + * Numbers + * + * Unary minus is not part of a number here. Instead we pass it separately to + * parser, and there it gets

Re: Non-decimal integer literals

2021-11-25 Thread Zhihong Yu
On Thu, Nov 25, 2021 at 5:18 AM Peter Eisentraut < peter.eisentr...@enterprisedb.com> wrote: > On 01.11.21 07:09, Peter Eisentraut wrote: > > Here is an updated patch for this. It's the previous patch polished a > > bit more, and it contains changes so that numeric literals reject > > trailing id

Re: Non-decimal integer literals

2021-11-25 Thread Peter Eisentraut
On 01.11.21 07:09, Peter Eisentraut wrote: Here is an updated patch for this. It's the previous patch polished a bit more, and it contains changes so that numeric literals reject trailing identifier parts without whitespace in between, as discussed. Maybe I should split that into incremental p

Re: Non-decimal integer literals

2021-10-31 Thread Peter Eisentraut
On 28.09.21 17:30, Peter Eisentraut wrote: On 09.09.21 16:08, Vik Fearing wrote: Even without that point, this patch *is* going to break valid queries, because every one of those cases is a valid number-followed-by-identifier today, Ah, true that. So if this does go in, we may as well add the underscores at the same time. Yeah, loo

Re: Non-decimal integer literals

2021-09-28 Thread Peter Eisentraut
On 09.09.21 16:08, Vik Fearing wrote: Even without that point, this patch *is* going to break valid queries, because every one of those cases is a valid number-followed-by-identifier today, Ah, true that. So if this does go in, we may as well add the underscores at the same time. Yeah, loo

Re: Non-decimal integer literals

2021-09-28 Thread Peter Eisentraut
On 07.09.21 13:50, Zhihong Yu wrote: On 16.08.21 17:32, John Naylor wrote: > The one thing that jumped out at me on a cursory reading is > the {integer} rule, which seems to be used nowhere except to > call process_integer_literal, which must then inspect the token text to

Re: Non-decimal integer literals

2021-09-09 Thread Vik Fearing
On 9/8/21 3:14 PM, Tom Lane wrote: > Vik Fearing writes: > >> Is there any hope of adding the optional underscores? I see a potential >> problem there as SELECT 1_a; is currently parsed as SELECT 1 AS _a; when >> it should be parsed as SELECT 1_ AS a; or perhaps even as an error since >> 0x1_a w

Re: Non-decimal integer literals

2021-09-08 Thread Tom Lane
Vik Fearing writes: > On 8/16/21 11:51 AM, Peter Eisentraut wrote: >> Here is a patch to add support for hexadecimal, octal, and binary >> integer literals: >> >>     0x42E >>     0o112 >>     0b100101 >> >> per SQL:202x draft. > Is there any hope of adding the optional underscores? I see a po

Re: Non-decimal integer literals

2021-09-08 Thread Vik Fearing
On 8/16/21 11:51 AM, Peter Eisentraut wrote: > Here is a patch to add support for hexadecimal, octal, and binary > integer literals: > >     0x42E >     0o112 >     0b100101 > > per SQL:202x draft. Is there any hope of adding the optional underscores? I see a potential problem there as SELECT 1

Re: Non-decimal integer literals

2021-09-07 Thread Zhihong Yu
On Tue, Sep 7, 2021 at 4:13 AM Peter Eisentraut < peter.eisentr...@enterprisedb.com> wrote: > On 16.08.21 17:32, John Naylor wrote: > > The one thing that jumped out at me on a cursory reading is > > the {integer} rule, which seems to be used nowhere except to > > call process_integer_literal, whi

Re: Non-decimal integer literals

2021-09-07 Thread Peter Eisentraut
On 16.08.21 17:32, John Naylor wrote: The one thing that jumped out at me on a cursory reading is the {integer} rule, which seems to be used nowhere except to call process_integer_literal, which must then inspect the token text to figure out what type of integer it is. Maybe consider 4 separate

Re: Non-decimal integer literals

2021-08-16 Thread John Naylor
On Mon, Aug 16, 2021 at 5:52 AM Peter Eisentraut < peter.eisentr...@enterprisedb.com> wrote: > > Here is a patch to add support for hexadecimal, octal, and binary > integer literals: > > 0x42E > 0o112 > 0b100101 > > per SQL:202x draft. > > This adds support in the lexer as well as in
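As a rough illustration of what accepting the three quoted literal forms (0x42E, 0o112, 0b100101) involves on the input-function side, the sketch below picks the base from the prefix and accumulates the digits with an overflow check. It is not the patch's code: the names digit_value and parse_prefixed_uint are hypothetical, the result is an unsigned 64-bit value for simplicity, and signs, whitespace, and underscores are deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: value of digit c in the given base, or -1. */
static int
digit_value(char c, int base)
{
    int         v;

    if (c >= '0' && c <= '9')
        v = c - '0';
    else if (c >= 'a' && c <= 'f')
        v = c - 'a' + 10;
    else if (c >= 'A' && c <= 'F')
        v = c - 'A' + 10;
    else
        return -1;
    return (v < base) ? v : -1;
}

/*
 * Hypothetical sketch: parse a decimal literal or one prefixed with
 * 0x/0o/0b (either case) into a uint64_t.
 */
bool
parse_prefixed_uint(const char *s, uint64_t *result)
{
    int         base = 10;
    uint64_t    acc = 0;

    if (s[0] == '0' && s[1] != '\0')
    {
        switch (s[1])
        {
            case 'x': case 'X': base = 16; s += 2; break;
            case 'o': case 'O': base = 8;  s += 2; break;
            case 'b': case 'B': base = 2;  s += 2; break;
            default: break;     /* plain decimal with a leading zero */
        }
    }

    if (*s == '\0')
        return false;           /* a bare prefix such as "0x" is not a number */

    for (; *s; s++)
    {
        int         d = digit_value(*s, base);

        if (d < 0)
            return false;
        /* acc * base + d must not exceed UINT64_MAX */
        if (acc > (UINT64_MAX - (uint64_t) d) / (uint64_t) base)
            return false;
        acc = acc * (uint64_t) base + (uint64_t) d;
    }

    *result = acc;
    return true;
}
```

With this sketch, 0x42E yields 1070, 0o112 yields 74, and 0b100101 yields 37, while a bare prefix such as 0x is rejected because at least one digit is required.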