On 20.11.24 08:29, jian he wrote:
in match_pattern_prefix maybe change
if (expr_coll && !get_collation_isdeterministic(expr_coll))
return NIL;
to
if (OidIsValid(expr_coll) && !get_collation_isdeterministic(expr_coll))
return NIL;
I left it like it was, because this w
On Tue, Nov 19, 2024 at 9:51 PM Peter Eisentraut wrote:
>
> On 18.11.24 04:30, jian he wrote:
> > we can optimize when the trailing (last) character is not a wildcard.
> >
> > SELECT 'Ha12foo' LIKE '%foo' COLLATE ignore_accents;
> > within the for loop
> > for(;;)
> > {
> > int cmp;
> > CHE
On 18.11.24 04:30, jian he wrote:
we can optimize when the trailing (last) character is not a wildcard.
SELECT 'Ha12foo' LIKE '%foo' COLLATE ignore_accents;
within the for loop
for(;;)
{
int cmp;
CHECK_FOR_INTERRUPTS();
}
pg_strncoll comparison will become
Ha12foofoo
a12foo
On Fri, Nov 15, 2024 at 11:42 PM Peter Eisentraut wrote:
>
> On 15.11.24 05:26, jian he wrote:
> > /*
> > * Now build a substring of the text and try to match it against
> > * the subpattern. t is the start of the text, t1 is one past the
> > * last byte. We start with a zero-length string.
> >
On 15.11.24 05:26, jian he wrote:
/*
* Now build a substring of the text and try to match it against
* the subpattern. t is the start of the text, t1 is one past the
* last byte. We start with a zero-length string.
*/
t1 = t;
t1len = tlen;
for (;;)
{
int cmp;
CHECK_FOR_INTERRUPTS();
cmp = pg_str
On Tue, Nov 12, 2024 at 3:45 PM Peter Eisentraut wrote:
>
> On 11.11.24 14:25, Heikki Linnakangas wrote:
> > Sadly the algorithm is O(n^2) with non-deterministic collations. Is there
> > any way this could be optimized? We make no claims on how expensive any
> > functions or operators are, so I sup
On 11.11.24 14:25, Heikki Linnakangas wrote:
Sadly the algorithm is O(n^2) with non-deterministic collations. Is there
any way this could be optimized? We make no claims on how expensive any
functions or operators are, so I suppose a slow implementation is
nevertheless better than throwing an er
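
To make the quadratic cost concrete, here is a minimal sketch of the "grow the substring and compare" loop discussed above. It is not the patch's code: cmp_ignore_punct() is a hypothetical stand-in for pg_strncoll() that simply ignores ASCII punctuation, and shortest_matching_prefix() is an illustrative name. Each comparison is linear in the prefix length and up to tlen + 1 prefixes are tried, which is where the O(n^2) behaviour comes from.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for pg_strncoll(): compares two byte strings while
 * ignoring ASCII punctuation.  Returns 0 when equal, nonzero otherwise.
 */
static int
cmp_ignore_punct(const char *a, size_t alen, const char *b, size_t blen)
{
    size_t i = 0;
    size_t j = 0;

    for (;;)
    {
        /* Skip characters that this toy "collation" treats as ignorable. */
        while (i < alen && ispunct((unsigned char) a[i]))
            i++;
        while (j < blen && ispunct((unsigned char) b[j]))
            j++;

        /* Equal only if both strings are exhausted at the same time. */
        if (i == alen || j == blen)
            return (i == alen && j == blen) ? 0 : 1;

        if (a[i] != b[j])
            return 1;
        i++;
        j++;
    }
}

/*
 * Try prefixes of t of length 0, 1, ..., tlen and return the length of the
 * shortest one that compares equal to the literal subpattern p, or -1 if
 * none does.  One linear comparison per candidate length gives O(n^2).
 */
static long
shortest_matching_prefix(const char *t, size_t tlen,
                         const char *p, size_t plen)
{
    for (size_t len = 0; len <= tlen; len++)
    {
        if (cmp_ignore_punct(t, len, p, plen) == 0)
            return (long) len;
    }
    return -1;
}
```

With this stand-in comparison, the text '.foo.' matches the subpattern 'foo' at prefix length 4, which shows why the matching length cannot be derived from the pattern length alone.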
On 04/11/2024 10:26, Peter Eisentraut wrote:
On 29.10.24 18:15, Jacob Champion wrote:
libfuzzer is unhappy about the following code in MatchText:
+ while (p1len > 0)
+ {
+ if (*p1 == '\\')
+ {
+ found_escape = true;
+
On 29.10.24 18:15, Jacob Champion wrote:
libfuzzer is unhappy about the following code in MatchText:
+while (p1len > 0)
+{
+if (*p1 == '\\')
+{
+found_escape = true;
+NextByte(p1, p1len);
+
On Sun, Sep 15, 2024 at 11:26 PM Peter Eisentraut wrote:
>
> Here is an updated patch. It is rebased over the various recent changes
> in the locale APIs. No other changes.
libfuzzer is unhappy about the following code in MatchText:
> +while (p1len > 0)
> +{
> +
Here is an updated patch. It is rebased over the various recent changes
in the locale APIs. No other changes.
On 30.07.24 21:46, Peter Eisentraut wrote:
On 27.07.24 00:32, Paul A Jungwirth wrote:
On Thu, Jun 27, 2024 at 11:31 PM Peter Eisentraut wrote:
Here is an updated patch for this.
Jeff Davis wrote:
> > col LIKE 'smith%' collate "nd"
> >
> > is equivalent to:
> >
> > col >= 'smith' collate "nd" AND col < U&'smith\' collate "nd"
>
> That logic seems to assume something about the collation. If you have a
> collation that orders strings by their sha256 hash,
On Fri, 2024-05-03 at 16:58 +0200, Daniel Verite wrote:
> * Generating bounds for a sort key (prefix matching)
>
> Having sort keys for strings allows for easy creation of bounds -
> sort keys that are guaranteed to be smaller or larger than any sort
> key from a given range. For exam
On 27.07.24 00:32, Paul A Jungwirth wrote:
On Thu, Jun 27, 2024 at 11:31 PM Peter Eisentraut wrote:
Here is an updated patch for this.
I took a look at this. I added some tests and found a few that give
the wrong result (I believe). The new tests are included in the
attached patch, along with
On Thu, Jun 27, 2024 at 11:31 PM Peter Eisentraut wrote:
> Here is an updated patch for this.
I took a look at this. I added some tests and found a few that give
the wrong result (I believe). The new tests are included in the
attached patch, along with the results I expect. Here are the
failures:
Here is an updated patch for this.
I have added some more documentation based on the discussions, including
some examples taken directly from the emails here.
One thing I have been struggling with a bit is the correct use of
LIKE_FALSE versus LIKE_ABORT in the MatchText() code. I have made s
On 03.05.24 17:47, Daniel Verite wrote:
Peter Eisentraut wrote:
However, off the top of my head, this definition has three flaws: (1)
It would make the single-character wildcard effectively an
any-number-of-characters wildcard, but only in some circumstances, which
could be confusing,
On 03.05.24 16:58, Daniel Verite wrote:
* Generating bounds for a sort key (prefix matching)
Having sort keys for strings allows for easy creation of bounds -
sort keys that are guaranteed to be smaller or larger than any sort
key from a given range. For example, if bounds are pro
Peter Eisentraut wrote:
> However, off the top of my head, this definition has three flaws: (1)
> It would make the single-character wildcard effectively an
> any-number-of-characters wildcard, but only in some circumstances, which
> could be confusing, (2) it would be difficult to com
Peter Eisentraut wrote:
> Yes, certainly, and there is also no indexing support (other than for
> exact matches).
The ICU docs have this note about prefix matching:
https://unicode-org.github.io/icu/userguide/collation/architecture.html#generating-bounds-for-a-sort-key-prefix-matching
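
For comparison, the range rewrite of a prefix match is easy to sketch when strings are ordered byte by byte. The following assumes plain strcmp() ordering only; it does not use ICU and says nothing about nondeterministic collations, where (as noted in this thread) the same reasoning may not hold. The names prefix_upper_bound() and has_prefix_by_range() are illustrative, not from the patch.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Build an exclusive upper bound for "starts with prefix" under byte-wise
 * ordering: drop trailing 0xFF bytes, then increment the last remaining
 * byte.  Returns false if no bound exists (empty or all-0xFF prefix) or if
 * the output buffer is too small.
 */
static bool
prefix_upper_bound(const char *prefix, char *out, size_t outsize)
{
    size_t len = strlen(prefix);

    if (len + 1 > outsize)
        return false;
    memcpy(out, prefix, len + 1);

    while (len > 0)
    {
        unsigned char c = (unsigned char) out[len - 1];

        if (c < 0xFF)
        {
            out[len - 1] = (char) (c + 1);
            out[len] = '\0';
            return true;
        }
        len--;
    }
    return false;
}

/*
 * Test "s starts with prefix" as the range check
 *     s >= prefix AND s < upper_bound(prefix)
 * falling back to a direct prefix comparison when no bound can be built.
 */
static bool
has_prefix_by_range(const char *s, const char *prefix)
{
    char upper[256];

    if (!prefix_upper_bound(prefix, upper, sizeof upper))
        return strncmp(s, prefix, strlen(prefix)) == 0;
    return strcmp(s, prefix) >= 0 && strcmp(s, upper) < 0;
}
```

Under these assumptions the prefix 'smith' yields the half-open range from 'smith' up to, but not including, 'smiti'. A collation that does not sort all strings sharing a prefix contiguously would break this rewrite.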
On 03.05.24 15:20, Robert Haas wrote:
On Fri, May 3, 2024 at 4:52 AM Peter Eisentraut wrote:
What the implementation does is, it walks through the pattern. It sees
'_', so it steps over one character in the input string, which is '.'
here. Then we have 'foo.' left to match in the input string
On Fri, May 3, 2024 at 4:52 AM Peter Eisentraut wrote:
> What the implementation does is, it walks through the pattern. It sees
> '_', so it steps over one character in the input string, which is '.'
> here. Then we have 'foo.' left to match in the input string. Then it
> takes from the pattern
On 03.05.24 02:11, Robert Haas wrote:
On Thu, May 2, 2024 at 9:38 AM Peter Eisentraut wrote:
On 30.04.24 14:39, Daniel Verite wrote:
postgres=# SELECT '.foo.' like '_oo' COLLATE ign_punct;
?column?
--
f
(1 row)
The first two results look fine, but the next one is
On Thu, May 2, 2024 at 9:38 AM Peter Eisentraut wrote:
> On 30.04.24 14:39, Daniel Verite wrote:
> >postgres=# SELECT '.foo.' like '_oo' COLLATE ign_punct;
> > ?column?
> >--
> > f
> >(1 row)
> >
> > The first two results look fine, but the next one is inconsistent.
>
>
On 30.04.24 14:39, Daniel Verite wrote:
postgres=# SELECT '.foo.' like '_oo' COLLATE ign_punct;
?column?
--
f
(1 row)
The first two results look fine, but the next one is inconsistent.
This is correct, because '_' means "any single character". This is
independent of
Peter Eisentraut wrote:
> This patch adds support for using LIKE with nondeterministic
> collations. So you can do things such as
>
> col LIKE 'foo%' COLLATE case_insensitive
Nice!
> The pattern is partitioned into substrings at wildcard characters
> (so 'foo%bar' is partitioned i