On Wed, Jul 15, 2015 at 12:31 AM, Jeff Janes wrote:
> On Tue, Jul 7, 2015 at 6:33 AM, Alexander Korotkov <
> a.korot...@postgrespro.ru> wrote:
>
>>
>>
>> See Tom Lane's comment about downgrade scripts. I think just removing it is
>> the right solution.
>>
>
> The new patch removes the downgrade path
On Tue, Jul 7, 2015 at 6:33 AM, Alexander Korotkov <
a.korot...@postgrespro.ru> wrote:
>
>
> See Tom Lane's comment about downgrade scripts. I think just removing it is
> the right solution.
>
The new patch removes the downgrade path and the ability to install the old
version.
(If anyone wants an ea
On Tue, Jun 30, 2015 at 11:28 PM, Jeff Janes wrote:
> On Tue, Jun 30, 2015 at 2:46 AM, Alexander Korotkov <
> a.korot...@postgrespro.ru> wrote:
>
>> On Sun, Jun 28, 2015 at 1:17 AM, Jeff Janes wrote:
>>
>>> This patch implements version 1.2 of contrib module pg_trgm.
>>>
>>> This supports the tr
Jeff Janes writes:
> On Tue, Jun 30, 2015 at 2:46 AM, Alexander Korotkov <
> a.korot...@postgrespro.ru> wrote:
>> pg_trgm--1.1.sql and pg_trgm--1.1--1.2.sql are useful for debug, but do you
>> expect them in final commit? As I can see in other contribs we have only
>> last version and upgrade scrip
On Tue, Jun 30, 2015 at 2:46 AM, Alexander Korotkov <
a.korot...@postgrespro.ru> wrote:
> On Sun, Jun 28, 2015 at 1:17 AM, Jeff Janes wrote:
>
>> This patch implements version 1.2 of contrib module pg_trgm.
>>
>> This supports the triconsistent function, introduced in version 9.4 of
>> the server
On Sun, Jun 28, 2015 at 1:17 AM, Jeff Janes wrote:
> This patch implements version 1.2 of contrib module pg_trgm.
>
> This supports the triconsistent function, introduced in version 9.4 of the
> server, to make it faster to implement indexed queries where some keys are
> common and some are rare.
On Mon, Jun 29, 2015 at 7:23 AM, Merlin Moncure wrote:
> On Sat, Jun 27, 2015 at 5:17 PM, Jeff Janes wrote:
>> V1.1: Time: 1743.691 ms --- after repeated execution to warm the cache
>>
>> V1.2: Time: 2.839 ms --- after repeated execution to warm the cache
>
> Wow! I'm going to test this.
On Sat, Jun 27, 2015 at 5:17 PM, Jeff Janes wrote:
> This patch implements version 1.2 of contrib module pg_trgm.
>
> This supports the triconsistent function, introduced in version 9.4 of the
> server, to make it faster to implement indexed queries where some keys are
> common and some are rare.
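A rough sketch of what a three-valued consistency check for a similarity search could look like. The enum, the function name and the threshold parameter are hypothetical and are not taken from the patch; the only idea borrowed from the thread is that a key can be "maybe present" as well as present or absent. If even the optimistic count of matching trigrams cannot reach the threshold, the row can be rejected without ever looking up the remaining (presumably common) keys.

```c
#include <stddef.h>

/* Hypothetical three-valued result; names are illustrative only. */
typedef enum { TRI_FALSE = 0, TRI_TRUE = 1, TRI_MAYBE = 2 } TriValue;

/*
 * check[i] says whether query trigram i is known absent, known present,
 * or not yet looked up for the candidate row.
 */
TriValue
tri_consistent_sketch(const TriValue *check, size_t nkeys, double threshold)
{
    size_t possible = 0;    /* keys that are present or might be present */

    for (size_t i = 0; i < nkeys; i++)
        if (check[i] != TRI_FALSE)
            possible++;

    if (nkeys == 0)
        return TRI_FALSE;

    /* Even the optimistic count fails: reject the row early. */
    if ((double) possible / (double) nkeys < threshold)
        return TRI_FALSE;

    /* Otherwise the answer is still open and needs a recheck. */
    return TRI_MAYBE;
}
```

With a threshold of 0.3, a row where only one of ten keys might match is rejected, while a row where two of four keys are present or undecided stays a candidate.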
This patch implements version 1.2 of contrib module pg_trgm.
This supports the triconsistent function, introduced in version 9.4 of the
server, to make it faster to implement indexed queries where some keys are
common and some are rare.
I've included the paths to both upgrade and downgrade betwee
Hello,
> If you manually set RPADDING 2 in trgm.h, then it will, but the
> allocation probably should use LPADDING/RPADDING to get it right, rather
> than assume the max values.
Yes, you are right. For RPADDING = 2, the current formula is suitable, but for
RPADDING = 1, a lot of extra space is all
On 03/09/2015 03:33 PM, Tom Lane wrote:
Beena Emerson writes:
In the pg_trgm module, within function generate_trgm, the memory for trigrams
is allocated as follows:
trg = (TRGM *) palloc(TRGMHDRSIZE + sizeof(trgm) * (slen / 2 + 1) *3);
I have been trying to understand why this is so becau
Beena Emerson writes:
> In the pg_trgm module, within function generate_trgm, the memory for trigrams
> is allocated as follows:
> trg = (TRGM *) palloc(TRGMHDRSIZE + sizeof(trgm) * (slen / 2 + 1) *3);
> I have been trying to understand why this is so because it seems to be
> allocating more spa
On 03/09/2015 02:54 PM, Alvaro Herrera wrote:
Beena Emerson wrote:
In the pg_trgm module, within function generate_trgm, the memory for trigrams
is allocated as follows:
trg = (TRGM *) palloc(TRGMHDRSIZE + sizeof(trgm) * (slen / 2 + 1) *3);
I have been trying to understand why this is so becau
Beena Emerson wrote:
> In the pg_trgm module, within function generate_trgm, the memory for trigrams
> is allocated as follows:
>
> trg = (TRGM *) palloc(TRGMHDRSIZE + sizeof(trgm) * (slen / 2 + 1) *3);
>
> I have been trying to understand why this is so because it seems to be
> allocating more s
In the pg_trgm module, within function generate_trgm, the memory for trigrams
is allocated as follows:
trg = (TRGM *) palloc(TRGMHDRSIZE + sizeof(trgm) * (slen / 2 + 1) *3);
I have been trying to understand why this is so because it seems to be
allocating more space than is required.
The fo
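The allocation question can be checked with a small sketch. It assumes ASCII input, words made of alphanumerics, and the usual padding rule that a word of n characters padded with lpad and rpad blanks yields n + lpad + rpad - 2 trigrams. Both function names are made up for this illustration, and the counting is a simplification of what generate_trgm does.

```c
#include <ctype.h>
#include <stddef.h>

/* Trigram slots reserved by the quoted formula: (slen / 2 + 1) * 3. */
size_t
reserved_trigrams(size_t slen)
{
    return (slen / 2 + 1) * 3;
}

/* Trigrams actually produced for an ASCII string under the assumed padding. */
size_t
produced_trigrams(const char *s, size_t lpad, size_t rpad)
{
    size_t total = 0;
    size_t wordlen = 0;

    for (;; s++)
    {
        if (*s != '\0' && isalnum((unsigned char) *s))
            wordlen++;
        else
        {
            if (wordlen > 0)
            {
                size_t padded = wordlen + lpad + rpad;

                if (padded >= 3)
                    total += padded - 2;
                wordlen = 0;
            }
            if (*s == '\0')
                break;
        }
    }
    return total;
}
```

For the worst case of one-character words, "a b c" has slen = 5, so 9 slots are reserved. With padding 2 and 2 all 9 are used; with padding 2 and 1 only 6 are, which matches the observation that the formula over-allocates when RPADDING is 1.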
On Fri, Nov 23, 2012 at 2:11 AM, Fujii Masao wrote:
> On Mon, Nov 19, 2012 at 10:56 AM, Tomas Vondra wrote:
>> I've done a quick review of the current patch:
>
> Thanks for the commit!
>
> As Alexander pointed out upthread, another infrastructure patch is required
> before applying this patch. So
On Mon, Nov 19, 2012 at 10:56 AM, Tomas Vondra wrote:
> I've done a quick review of the current patch:
Thanks for the commit!
As Alexander pointed out upthread, another infrastructure patch is required
before applying this patch. So I will implement the infra patch first.
Regards,
--
Fujii Ma
On Mon, Nov 19, 2012 at 7:55 PM, Alexander Korotkov
wrote:
> On Mon, Nov 19, 2012 at 10:05 AM, Alexander Korotkov
> wrote:
>>
>> On Thu, Nov 15, 2012 at 11:39 PM, Fujii Masao
>> wrote:
>>>
>>> Note that we cannot do a partial-match if KEEPONLYALNUM is disabled,
>>> i.e., if query key contains mu
On Mon, Nov 19, 2012 at 10:05 AM, Alexander Korotkov
wrote:
> On Thu, Nov 15, 2012 at 11:39 PM, Fujii Masao wrote:
>
>> Note that we cannot do a partial-match if KEEPONLYALNUM is disabled,
>> i.e., if query key contains multibyte characters. In this case, byte
>> length of
>> the trigram string mi
Hi!
On Thu, Nov 15, 2012 at 11:39 PM, Fujii Masao wrote:
> Note that we cannot do a partial-match if KEEPONLYALNUM is disabled,
> i.e., if query key contains multibyte characters. In this case, byte
> length of
> the trigram string might be larger than three, and its CRC is used as a
> trigram k
On 15.11.2012 20:39, Fujii Masao wrote:
> Hi,
>
> I'd like to propose extending pg_trgm so that it can compare a partial-match
> query key to a GIN index. IOW, I'm thinking of implementing the 'comparePartial'
> GIN method for pg_trgm.
>
> Currently, when the query key is less than three characters,
Hi,
I'd like to propose extending pg_trgm so that it can compare a partial-match
query key to a GIN index. IOW, I'm thinking of implementing the 'comparePartial'
GIN method for pg_trgm.
Currently, when the query key is less than three characters, we cannot use
a GIN index (+ pg_trgm) efficiently, be
On Tue, Jun 14, 2011 at 1:15 AM, Tom Lane wrote:
> I'm not sure that pg_upgrade is a good vehicle for dispensing such
> advice, anyway. At least in the Red Hat packaging, end users will never
> read what it prints, unless maybe it fails outright and they're trying
> to debug why.
In my experienc
On Jun14, 2011, at 07:15 , Tom Lane wrote:
> Robert Haas writes:
>> On Mon, Jun 13, 2011 at 7:47 PM, Bruce Momjian wrote:
>>> No, it does not. Under what circumstances should I issue a suggestion
>>> to reindex, and what should the text be?
>
>> It sounds like GIN indexes need to be reindexed a
Robert Haas writes:
> On Mon, Jun 13, 2011 at 7:47 PM, Bruce Momjian wrote:
>> No, it does not. Under what circumstances should I issue a suggestion
>> to reindex, and what should the text be?
> It sounds like GIN indexes need to be reindexed after upgrading from <
> 9.1 to >= 9.1.
Only if you
Robert Haas wrote:
> On Mon, Jun 13, 2011 at 7:47 PM, Bruce Momjian wrote:
> > Robert Haas wrote:
> >> On Sun, Jun 12, 2011 at 8:40 AM, Florian Pflug wrote:
> >> > Note that this restriction was removed in postgres 9.1 which
> >> > is currently in beta. However, GIN indices must be re-created
> >
On Mon, Jun 13, 2011 at 7:47 PM, Bruce Momjian wrote:
> Robert Haas wrote:
>> On Sun, Jun 12, 2011 at 8:40 AM, Florian Pflug wrote:
>> > Note that this restriction was removed in postgres 9.1 which
>> > is currently in beta. However, GIN indices must be re-created
>> > with REINDEX after upgradin
Robert Haas wrote:
> On Sun, Jun 12, 2011 at 8:40 AM, Florian Pflug wrote:
> > Note that this restriction was removed in postgres 9.1 which
> > is currently in beta. However, GIN indices must be re-created
> > with REINDEX after upgrading from 9.0 to leverage that
> > improvement.
>
> Does pg_upg
On Sun, Jun 12, 2011 at 8:40 AM, Florian Pflug wrote:
> Note that this restriction was removed in postgres 9.1 which
> is currently in beta. However, GIN indices must be re-created
> with REINDEX after upgrading from 9.0 to leverage that
> improvement.
Does pg_upgrade know about this?
--
Robert
Hi
Next time, please post questions regarding the usage of postgres
to the -general list, not to -hackers. The purpose of -hackers is
to discuss the development of postgres proper, not the development
of applications using postgres.
On Jun12, 2011, at 13:33 , Sushant Sinha wrote:
> I am using pg_
I am using pg_trgm for spelling correction as prescribed in the
documentation. But I see that it does not work for Unicode strings. The
database was initialized with utf8 encoding and the C locale.
Here is the table:
\d words
Table "public.words"
Column | Type | Modifiers
+---
Greg Stark writes:
> There seem to be three behaviours on the table here:
You're neglecting
4) Let the user decide whether he wants pg_trgm to consider word
elements to be "alphanumerics" or "any non-space".
The main problem I have with Tatsuo's patch is that it forecloses any
reasonably upward
On Sun, May 30, 2010 at 3:41 PM, Tom Lane wrote:
> I don't think it's unreasonable to insist that behavioral changes be
> made in an upward compatible fashion ... especially ones that seem at
> least as likely to break some current usages as to enable new usages.
Fwiw I don't think we've traditio
> > > This is in 9.0, because 8.4 doesn't recognize the \u escape syntax. If
> > > you run this in 8.4, you're just comparing a sequence of ASCII letters
> > > and digits.
> >
> > Hum. Still I prefer 8.4's behavior since anything is better than
> > returning NaN. It seems 9.0 does not have any es
Tatsuo Ishii writes:
>> This is still ignoring the point: arbitrarily changing the module's
>> longstanding standard behavior isn't acceptable. You need to provide
>> a way for the user to control the behavior. (Once you've done that,
>> I think it can be just either "alnum" or "!isspace", but m
On sön, 2010-05-30 at 11:05 +0900, Tatsuo Ishii wrote:
> > > Wait. This works fine for me with stock pg_trgm. locale is C and
> > > encoding is UTF8. What version of PostgreSQL are you using? Mine is
> > > 8.4.4.
> >
> > This is in 9.0, because 8.4 doesn't recognize the \u escape syntax. If
> > yo
> > Wait. This works fine for me with stock pg_trgm. locale is C and
> > encoding is UTF8. What version of PostgreSQL are you using? Mine is
> > 8.4.4.
>
> This is in 9.0, because 8.4 doesn't recognize the \u escape syntax. If
> you run this in 8.4, you're just comparing a sequence of ASCII letter
> This is still ignoring the point: arbitrarily changing the module's
> longstanding standard behavior isn't acceptable. You need to provide
> a way for the user to control the behavior. (Once you've done that,
> I think it can be just either "alnum" or "!isspace", but maybe some
> other behavior
Tatsuo Ishii writes:
> After thinking a little bit more, I think the following patch would not
> break existing behavior and also handles the multibyte + C locale case. What
> do you think?
This is still ignoring the point: arbitrarily changing the module's
longstanding standard behavior isn't acceptable.
On Sat, May 29, 2010 at 9:13 AM, Tatsuo Ishii wrote:
> ! #define iswordchr(c) (lc_ctype_is_c()? \
> ! ((*(c) &
> 0x80)? !t_isspace(c) : (t_isalpha(c) || t_isdigit(c))) : \
>
Surely isspace(c) will always be false for non-ascii charac
> > It's not a practical solution for people working with prebuilt Postgres
> > versions, which is most people. I don't object to finding a way to
> > provide a "not-space" behavior instead of an "is-alnum" behavior,
> > but as noted upthread a GUC isn't the right way. How do you feel
> > about a
On fre, 2010-05-28 at 10:04 +0900, Tatsuo Ishii wrote:
> > I think the problem at hand has nothing at all to do with agglutination
> > or CJK-specific issues. You will get the same problem with other
> > languages *if* you set a locale that does not adequately support the
> > characters in use. E
> It's not a practical solution for people working with prebuilt Postgres
> versions, which is most people. I don't object to finding a way to
> provide a "not-space" behavior instead of an "is-alnum" behavior,
> but as noted upthread a GUC isn't the right way. How do you feel
> about a new set o
> I think the problem at hand has nothing at all to do with agglutination
> or CJK-specific issues. You will get the same problem with other
> languages *if* you set a locale that does not adequately support the
> characters in use. E.g., Russian with locale C and encoding UTF8:
>
> select simil
> Tatsuo Ishii writes:
> > similarity -> generate_trgm -> find_word -> iswordchr -> t_isalpha ->
> > isalpha
>
> > if locale is C and USE_WIDE_UPPER_LOWER defined which is the case in
> > most modern OSs.
>
> Quite. And *if locale is C then only standard ASCII letters are letters*.
> You may n
Tatsuo Ishii writes:
> Or you could just #undef KEEPONLYALNUM in trgm.h. But I'm not sure
> this is the right thing for you.
It's not a practical solution for people working with prebuilt Postgres
versions, which is most people. I don't object to finding a way to
provide a "not-space" behavior i
Tatsuo Ishii writes:
> similarity -> generate_trgm -> find_word -> iswordchr -> t_isalpha -> isalpha
> if locale is C and USE_WIDE_UPPER_LOWER defined which is the case in
> most modern OSs.
Quite. And *if locale is C then only standard ASCII letters are letters*.
You may not like that but it's
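The two word-character rules under discussion can be compared with a byte-level sketch. This is a simplification: pg_trgm works on whole multibyte characters through its own helpers, whereas the hypothetical functions below look at single bytes with the standard ctype calls. The second predicate follows the shape of the macro proposed earlier in the thread (non-space for high-bit bytes, alnum for ASCII).

```c
#include <ctype.h>
#include <locale.h>
#include <stddef.h>

/* "is-alnum" rule: a byte is a word byte only if the locale says so. */
int
is_word_alnum(unsigned char c)
{
    return isalpha(c) || isdigit(c);
}

/* "not-space" rule for high-bit bytes, alnum rule for ASCII bytes. */
int
is_word_nonspace_highbit(unsigned char c)
{
    if (c & 0x80)
        return !isspace(c);
    return isalpha(c) || isdigit(c);
}

/* Count the bytes of s that the given rule accepts, in the C locale. */
size_t
count_word_bytes(const char *s, int (*is_word) (unsigned char))
{
    size_t n = 0;

    setlocale(LC_CTYPE, "C");
    for (; *s != '\0'; s++)
        if (is_word((unsigned char) *s))
            n++;
    return n;
}
```

In the C locale the first rule accepts none of the nine UTF-8 bytes of '日本語', so no word is found and no trigrams can be extracted; the second rule accepts all nine. For plain ASCII input such as "abc 123" both rules agree.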
> > The problem with pg_trgm is that it uses isascii() etc. to recognize a letter,
> > which will skip any non-ASCII character in the C locale.
>
> The only place I see that is in those ISPRINTABLE macros, which are only
> used in show_trgm(), which is just a debugging function. It could stand
> to be
Tatsuo Ishii writes:
> The problem with pg_trgm is that it uses isascii() etc. to recognize a letter,
> which will skip any non-ASCII character in the C locale.
The only place I see that is in those ISPRINTABLE macros, which are only
used in show_trgm(), which is just a debugging function. It could st
> What I can't help wondering as I'm reading this discussion is -
> Tatsuo-san said upthread that he has a problem with pg_trgm that he
> does not have with full text search. So what is full text search
> doing differently than pg_trgm?
Problem with pg_trgm is, it uses isascii() etc. to recognize
> I think the problem at hand has nothing at all to do with agglutination
> or CJK-specific issues. You will get the same problem with other
> languages *if* you set a locale that does not adequately support the
> characters in use. E.g., Russian with locale C and encoding UTF8:
>
> select simil
On Thu, May 27, 2010 at 2:01 PM, Peter Eisentraut wrote:
> On fre, 2010-05-28 at 00:46 +0900, Tatsuo Ishii wrote:
>> > I don't know about Japanese, but the locale approach works just fine for
>> > other agglutinative languages. I would rather suspect that it is the
>> > trigram approach that migh
On fre, 2010-05-28 at 00:46 +0900, Tatsuo Ishii wrote:
> > I don't know about Japanese, but the locale approach works just fine for
> > other agglutinative languages. I would rather suspect that it is the
> > trigram approach that might be rather useless for such languages,
> > because you are goi
> So I think a GUC is broken because pg_trgm has index opclasses and
> any indexes built using one setting will be broken if the GUC is
> changed.
>
> Perhaps we need two sets of functions (which presumably call the same
> implementation with a flag to indicate which definition to use). Then
> y
> I don't know about Japanese, but the locale approach works just fine for
> other agglutinative languages. I would rather suspect that it is the
> trigram approach that might be rather useless for such languages,
> because you are going to get a lot of similarity hits for the affixes.
I'm not su
On tor, 2010-05-27 at 23:20 +0900, Tatsuo Ishii wrote:
> Anyway, locale is completely useless for finding word vs. non-word characters
> in an agglutinative language such as Japanese.
I don't know about Japanese, but the locale approach works just fine for
other agglutinative languages. I would rather susp
On Thu, May 27, 2010 at 3:52 PM, Tom Lane wrote:
> I think a more appropriate type of fix would be to expose the
> KEEPONLYALNUM option as a GUC, or some other way of letting the
> user decide what he wants.
>
So I think a GUC is broken because pg_trgm has index opclasses and
any indexes built
Tatsuo Ishii writes:
> ! #define iswordchr(c)(t_isalpha(c) || t_isdigit(c) ||
> (lc_ctype_is_c() && !t_isspace(c)))
This seems entirely arbitrary. It might "fix" things in your view
but it will break the longstanding behavior for other people.
I think a more appropriate type of fix wou
> Well, that doesn't mean that the answer is to use C locale ;-)
Of course it's up to the user whether to use the C locale or not. I just want
pg_trgm to work with the C locale as well.
> However, you could possibly think about making this bit of code
> more flexible:
>
> #ifdef KEEPONLYALNUM
> #define iswordc
Tatsuo Ishii writes:
> Anyway, locale is completely useless for finding word vs. non-word characters
> in an agglutinative language such as Japanese.
Well, that doesn't mean that the answer is to use C locale ;-)
However, you could possibly think about making this bit of code
more flexible:
#ifdef KEEPON
> Exactly what do you consider to be the missing functionality?
> You need a notion of word vs non-word character from somewhere,
> and the locale setting is the standard place to get that. The
> core text search functionality behaves the same way.
No. Text search works fine with multibyte + C lo
Tatsuo Ishii writes:
>> It's not a problem, it's just pilot error, or possibly inadequate
>> documentation. pg_trgm uses the locale's definition of "alpha",
>> "digit", etc. In C locale only basic ASCII letters and digits will be
>> recognized as word constituents.
> That means there is no chan
> > Yes, pg_trgm seems to have problems with multibyte + C locale.
>
> It's not a problem, it's just pilot error, or possibly inadequate
> documentation. pg_trgm uses the locale's definition of "alpha",
> "digit", etc. In C locale only basic ASCII letters and digits will be
> recognized as word
Tatsuo Ishii writes:
> What is your locale?
>> It was en_EN.UTF-8. Interesting. With C it fails...
> Yes, pg_trgm seems to have problems with multibyte + C locale.
It's not a problem, it's just pilot error, or possibly inadequate
documentation. pg_trgm uses the locale's definition of "alpha",
"
> > What is your locale?
> It was en_EN.UTF-8. Interesting. With C it fails...
Yes, pg_trgm seems to have problems with multibyte + C locale.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp
--
Sent via pgsql-hackers mailing list
On Thursday 27 May 2010 14:40:41 Tatsuo Ishii wrote:
> > > No, it isn't.
> > > Encoding is EUC_JP, locale is C. Included is the script to reproduce
> > > the problem.
> >
> > test=# select show_trgm('日本語');
> >
> > show_trgm
> >
> > ---
> >
> > No, it isn't.
> > Encoding is EUC_JP, locale is C. Included is the script to reproduce
> > the problem.
> test=# select show_trgm('日本語');
> show_trgm
> ---
> {0x8194c0,0x836e53,0x1dc363,0x1e22e9}
> (1 row)
>
> Time: 0.44
Hi,
On Thursday 27 May 2010 13:53:37 Tatsuo Ishii wrote:
> > It's already multibyte safe since 8.4
>
> No, it isn't.
> Encoding is EUC_JP, locale is C. Included is the script to reproduce
> the problem.
test=# select show_trgm('日本語');
show_trgm
--
> It's already multibyte safe since 8.4
No, it isn't.
$ psql test
Pager usage is off.
psql (8.4.4)
Type "help" for help.
test=# select similarity('abc', 'abd'); -- OK
similarity
0.33
(1 row)
test=# select similarity('日本語', '日本後'); -- NG
similarity
Is anyone working on making contrib/pg_trgm multibyte-encoding aware? If
not, I'm interested in the work.
It's already multibyte safe since 8.4
--
Teodor Sigaev E-mail: teo...@sigaev.ru
WWW: http://www.sigaev.ru/
--
Hi,
Is anyone working on making contrib/pg_trgm multibyte-encoding aware? If
not, I'm interested in the work.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To mak