> Well, when the preceding comment block contains five references to
> xemacs and the link for more information leads to www.xemacs.org,
> I don't think it's real helpful to add one sentence saying "oh
> by the way we're not actually following xemacs".
>
> I continue to think that we'd be better o
Tatsuo Ishii writes:
> Done along with comment that we follow emacs's implementation, not
> xemacs's.
Well, when the preceding comment block contains five references to
xemacs and the link for more information leads to www.xemacs.org,
I don't think it's real helpful to add one sentence saying "oh
Tatsuo Ishii writes:
>> So far as I can see, the only LCPRVn marker code that is actually in
>> use right now is 0x9d --- there are no instances of 9a, 9b, or 9c
>> that I can find.
>>
>> I also read in the xemacs internals doc, at
>> http://www.xemacs.org/Documentati
>>> Tatsuo Ishii writes:
> So far as I can see, the only LCPRVn marker code that is actually in
> use right now is 0x9d --- there are no instances of 9a, 9b, or 9c
> that I can find.
>
> I also read in the xemacs internals doc, at
> http://www.xemacs.org/Documentation/21.5
>> Tatsuo Ishii writes:
So far as I can see, the only LCPRVn marker code that is actually in
use right now is 0x9d --- there are no instances of 9a, 9b, or 9c
that I can find.
I also read in the xemacs internals doc, at
http://www.xemacs.org/Documentation/21.5/html/i
On Thu, Jul 5, 2012 at 8:46 PM, Tom Lane wrote:
> Robert Haas writes:
>> On Thu, Jul 5, 2012 at 7:11 PM, Tom Lane wrote:
>>> Hm, several of these routines seem to neglect to advance the "from"
>>> pointer?
>
>> Err... yeah. That's not a bug I introduced, but I should have caught
>> it... and it
> Tatsuo Ishii writes:
>>> So far as I can see, the only LCPRVn marker code that is actually in
>>> use right now is 0x9d --- there are no instances of 9a, 9b, or 9c
>>> that I can find.
>>>
>>> I also read in the xemacs internals doc, at
>>> http://www.xemacs.org/Documentation/21.5/html/internal
Robert Haas writes:
> On Thu, Jul 5, 2012 at 7:11 PM, Tom Lane wrote:
>> Hm, several of these routines seem to neglect to advance the "from"
>> pointer?
> Err... yeah. That's not a bug I introduced, but I should have caught
> it... and it does make me wonder how well this code was tested.
> Do
On Thu, Jul 5, 2012 at 7:11 PM, Tom Lane wrote:
> Robert Haas writes:
>> On Sun, Jul 1, 2012 at 5:11 AM, Alexander Korotkov
>> wrote:
>>> [ new patch ]
>
>> With the improved comments in pg_wchar.h, it seemed clear what needed
>> to be done here, so I fixed up the MULE conversion and committed
Tatsuo Ishii writes:
>> So far as I can see, the only LCPRVn marker code that is actually in
>> use right now is 0x9d --- there are no instances of 9a, 9b, or 9c
>> that I can find.
>>
>> I also read in the xemacs internals doc, at
>> http://www.xemacs.org/Documentation/21.5/html/internals_26.htm
Robert Haas writes:
> On Sun, Jul 1, 2012 at 5:11 AM, Alexander Korotkov
> wrote:
>> [ new patch ]
> With the improved comments in pg_wchar.h, it seemed clear what needed
> to be done here, so I fixed up the MULE conversion and committed this.
> I'd appreciate it if someone would check my work
> On Sun, Jul 1, 2012 at 5:11 AM, Alexander Korotkov
> wrote:
>> [ new patch ]
>
> With the improved comments in pg_wchar.h, it seemed clear what needed
> to be done here, so I fixed up the MULE conversion and committed this.
> I'd appreciate it if someone would check my work, but I think it's
On Sun, Jul 1, 2012 at 5:11 AM, Alexander Korotkov wrote:
> [ new patch ]
With the improved comments in pg_wchar.h, it seemed clear what needed
to be done here, so I fixed up the MULE conversion and committed this.
I'd appreciate it if someone would check my work, but I think it's
right.
--
Ro
> So far as I can see, the only LCPRVn marker code that is actually in
> use right now is 0x9d --- there are no instances of 9a, 9b, or 9c
> that I can find.
>
> I also read in the xemacs internals doc, at
> http://www.xemacs.org/Documentation/21.5/html/internals_26.html#SEC145
> that XEmacs think
I wrote:
> Tatsuo Ishii writes:
>>> I have added comments about mule internal encoding by refreshing my
>>> memory and from an old document found on the
>>> web (http://mibai.tec.u-ryukyu.ac.jp/cgi-bin/info2www?%28mule%29Buffer%20and%20string).
>> Any objection to applying my patch?
> It needs a bit of cop
Tatsuo Ishii writes:
>> I have added comments about mule internal encoding by refreshing my
>> memory and from an old document found on the
>> web (http://mibai.tec.u-ryukyu.ac.jp/cgi-bin/info2www?%28mule%29Buffer%20and%20string).
> Any objection to applying my patch?
It needs a bit of copy-editing, and I
> I have added comments about mule internal encoding by refreshing my
> memory and from an old document found on the
> web (http://mibai.tec.u-ryukyu.ac.jp/cgi-bin/info2www?%28mule%29Buffer%20and%20string).
Any objection to applying my patch?
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.
Alexander Korotkov writes:
> It's likely we also need to assign some names to all these numbers
> (0xf0, 0xf4, 0xfe, 0x9c, 0x9d). But it's hard for me to invent such names.
The encoding ID byte values already have names (see pg_wchar.h), but the
private prefix bytes don't. I griped about that up
On Tue, Jul 3, 2012 at 10:17 AM, Tatsuo Ishii wrote:
> > OK. So, in that case, I suggest that if the leading byte is non-zero,
> > we emit 0x9d followed by the three available bytes, instead of first
> > testing whether the first byte is >= 0xf0. That test seems to serve
> > no purpose but to c
> OK. So, in that case, I suggest that if the leading byte is non-zero,
> we emit 0x9d followed by the three available bytes, instead of first
> testing whether the first byte is >= 0xf0. That test seems to serve
> no purpose but to confuse the issue.
Probably the code should look like this (see b
On Mon, Jul 2, 2012 at 7:33 PM, Tatsuo Ishii wrote:
>> Yeah, I did. I think I may be a bit confused here, so let me try to
>> understand this a bit better. It seems like pg_mule2wchar_with_len
>> uses the following algorithm:
>>
>> - If the first character IS_LC1 (0x81-0x8d), decode two bytes, s
I wrote:
> Some inspection of pg_wchar.h suggests that the IS_LCPRV1 and IS_LCPRV2
> cases are unused: the file doesn't define any encoding labels that match
> the byte values they accept, nor do the comments suggest that Emacs has
> any such labels either.
Scratch that --- I was misled by the fon
Robert Haas writes:
> In the reverse transformation implemented by pg_wchar2mule_with_len,
> if the byte stored with shift 16 IS_LC1 or IS_LC2, then we decode 2 or
> 3 bytes, respectively, exactly as I would expect. ASCII decoding is
> also as I would expect. The case I don't understand is what
> Yeah, I did. I think I may be a bit confused here, so let me try to
> understand this a bit better. It seems like pg_mule2wchar_with_len
> uses the following algorithm:
>
> - If the first character IS_LC1 (0x81-0x8d), decode two bytes, stored
> with shifts of 16 and 0.
> - If the first charact
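The LC1 case described in the quoted list can be sketched in C as follows. This is only an illustration under stated assumptions: decode_lc1 and encode_lc1 are hypothetical names, uint32_t stands in for pg_wchar, and the IS_LC1 range is the one given in the quoted text (0x81-0x8d). It is not the code from the patch.

```c
#include <stdint.h>

/* Range taken from the quoted description of IS_LC1. */
#define IS_LC1(c) ((unsigned char)(c) >= 0x81 && (unsigned char)(c) <= 0x8d)

/*
 * Hypothetical helper: decode one LC1 character (leading byte plus one
 * data byte) into a wide character, leading byte at shift 16, data byte
 * at shift 0.  Returns the number of input bytes consumed, 0 on no match.
 */
static int decode_lc1(const unsigned char *from, int len, uint32_t *to)
{
    if (len >= 2 && IS_LC1(from[0]))
    {
        *to = ((uint32_t) from[0] << 16) | from[1];
        return 2;
    }
    return 0;
}

/*
 * Hypothetical mirror of decode_lc1: recover the two bytes from the wide
 * character.  Returns the number of bytes written, 0 if not an LC1 value.
 */
static int encode_lc1(uint32_t wc, unsigned char *to)
{
    unsigned char lb = (wc >> 16) & 0xff;

    if (!IS_LC1(lb))
        return 0;
    to[0] = lb;
    to[1] = wc & 0xff;
    return 2;
}
```

Writing the two directions side by side makes it easy to check that one undoes the other, which is the symmetry discussed in this thread.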
On Mon, Jul 2, 2012 at 4:46 PM, Alexander Korotkov wrote:
> So, I provided such a transformation in versions 0.3 and 0.4 based on the
> explanation from Tatsuo Ishii. The problem is that both conversions are
> nontrivial and it's not evident that they are mirrors (understanding that
> they are mirror req
On Tue, Jul 3, 2012 at 12:37 AM, Robert Haas wrote:
> On Mon, Jul 2, 2012 at 4:33 PM, Alexander Korotkov wrote:
> > On Mon, Jul 2, 2012 at 8:12 PM, Robert Haas wrote:
> >>
> >> On Sun, Jul 1, 2012 at 5:11 AM, Alexander Korotkov <aekorot...@gmail.com> wrote:
> >> >> MULE also looks p
On Mon, Jul 2, 2012 at 4:33 PM, Alexander Korotkov wrote:
> On Mon, Jul 2, 2012 at 8:12 PM, Robert Haas wrote:
>>
>> On Sun, Jul 1, 2012 at 5:11 AM, Alexander Korotkov
>> wrote:
>> >> MULE also looks problematic. The code that you've written isn't
>> >> symmetric with the opposite conversion, u
On Mon, Jul 2, 2012 at 8:12 PM, Robert Haas wrote:
> On Sun, Jul 1, 2012 at 5:11 AM, Alexander Korotkov
> wrote:
> >> MULE also looks problematic. The code that you've written isn't
> >> symmetric with the opposite conversion, unlike what you did in all
> >> other cases, and I don't understand
On Sun, Jul 1, 2012 at 5:11 AM, Alexander Korotkov wrote:
>> MULE also looks problematic. The code that you've written isn't
>> symmetric with the opposite conversion, unlike what you did in all
>> other cases, and I don't understand why. I'm also somewhat baffled by
>> the reverse conversion: i
On Wed, Jun 27, 2012 at 11:35 PM, Robert Haas wrote:
> It looks to me like pg_wchar2utf_with_len will not work, because
> unicode_to_utf8 returns its second argument unmodified - not, as your
> code seems to assume, the byte following what was already written.
>
Fixed.
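The pitfall in the review comment can be illustrated with a small C sketch. The names here are assumptions made for the example: encode_utf8 is a hypothetical stand-in that, like the unicode_to_utf8 behavior described above, returns its destination argument unchanged, and uint32_t stands in for pg_wchar. The point is only that the caller has to advance the output pointer by the encoded length itself.

```c
#include <stdint.h>

/* Number of UTF-8 bytes needed for code point c (no validity checks). */
static int utf8_len(uint32_t c)
{
    if (c <= 0x7F)   return 1;
    if (c <= 0x7FF)  return 2;
    if (c <= 0xFFFF) return 3;
    return 4;
}

/* Hypothetical encoder: writes c at dst and returns dst itself,
 * NOT the position after the bytes just written. */
static unsigned char *encode_utf8(uint32_t c, unsigned char *dst)
{
    if (c <= 0x7F)
        dst[0] = (unsigned char) c;
    else if (c <= 0x7FF)
    {
        dst[0] = 0xC0 | ((c >> 6) & 0x1F);
        dst[1] = 0x80 | (c & 0x3F);
    }
    else if (c <= 0xFFFF)
    {
        dst[0] = 0xE0 | ((c >> 12) & 0x0F);
        dst[1] = 0x80 | ((c >> 6) & 0x3F);
        dst[2] = 0x80 | (c & 0x3F);
    }
    else
    {
        dst[0] = 0xF0 | ((c >> 18) & 0x07);
        dst[1] = 0x80 | ((c >> 12) & 0x3F);
        dst[2] = 0x80 | ((c >> 6) & 0x3F);
        dst[3] = 0x80 | (c & 0x3F);
    }
    return dst;
}

/* Convert up to len wide characters; returns the number of bytes written
 * (not counting the terminating zero). */
static int wchars_to_utf8(const uint32_t *from, unsigned char *to, int len)
{
    int cnt = 0;

    while (len > 0 && *from)
    {
        int n = utf8_len(*from);

        encode_utf8(*from, to);   /* return value is just "to" again */
        to += n;                  /* so advance explicitly */
        cnt += n;
        from++;
        len--;
    }
    *to = 0;
    return cnt;
}
```

Assigning the encoder's return value back to the output pointer would overwrite the same position on every iteration, which is the failure mode the review describes.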
> MULE also looks proble
On Thu, May 24, 2012 at 12:04 AM, Alexander Korotkov
wrote:
> Thanks. I rewrote the inverse conversion from pg_wchar to mule. A new version
> of the patch is attached.
Review:
It looks to me like pg_wchar2utf_with_len will not work, because
unicode_to_utf8 returns its second argument unmodified - not, as
> On Tue, May 22, 2012 at 3:27 PM, Tatsuo Ishii wrote:
>
>> > Thanks for your comments. They clarify a lot.
>> > But I still don't see how we can distinguish IS_LCPRV2 from IS_LC2.
>> > Isn't it possible for them to produce the same pg_wchar?
>>
>> If LB is in 0x90 - 0x99 range, then they are LC2.
On Tue, May 22, 2012 at 3:27 PM, Tatsuo Ishii wrote:
> > Thanks for your comments. They clarify a lot.
> > But I still don't see how we can distinguish IS_LCPRV2 from IS_LC2.
> > Isn't it possible for them to produce the same pg_wchar?
>
> If LB is in 0x90 - 0x99 range, then they are LC2.
> If LB
> Thanks for your comments. They clarify a lot.
> But I still don't see how we can distinguish IS_LCPRV2 from IS_LC2.
> Isn't it possible for them to produce the same pg_wchar?
If LB is in 0x90 - 0x99 range, then they are LC2.
If LB is in 0xf0 - 0xff range, then they are LCPRV2.
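As a rough C sketch of the rule just stated: the function below looks at the LB stored in a wide character and applies the two ranges given above. The names classify_lb and lb_class are hypothetical, uint32_t stands in for pg_wchar, and the sketch assumes the LB sits at shift 16, as in the decoding code quoted elsewhere in this thread.

```c
#include <stdint.h>

typedef enum
{
    LB_OTHER,   /* neither of the two ranges below */
    LB_LC2,     /* LB in 0x90 - 0x99 */
    LB_LCPRV2   /* LB in 0xf0 - 0xff */
} lb_class;

/* Hypothetical helper: classify a wide character by the LB stored
 * at shift 16 (an assumption of this sketch). */
static lb_class classify_lb(uint32_t wc)
{
    unsigned int lb = (wc >> 16) & 0xff;

    if (lb >= 0x90 && lb <= 0x99)
        return LB_LC2;
    if (lb >= 0xf0)             /* lb is at most 0xff after masking */
        return LB_LCPRV2;
    return LB_OTHER;
}
```

Because the two LB ranges do not overlap, the same wide character cannot come from both an LC2 and an LCPRV2 sequence under this layout.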
--
Tatsuo Ishii
SRA
On Tue, May 22, 2012 at 11:50 AM, Tatsuo Ishii wrote:
>
> I think it's possible. The first characters are defined like this:
>
> #define IS_LCPRV1(c)((unsigned char)(c) == 0x9a || (unsigned char)(c)
> == 0x9b)
> #define IS_LCPRV2(c)((unsigned char)(c) == 0x9c || (unsigned char)(c)
> == 0x9
Hi Alexander,
It was good seeing you in Ottawa!
> Hello, Ishii-san!
>
> We talked at PGCon about my questions on the mule to wchar
> conversion. My questions about the pg_mule2wchar_with_len function are
> as follows. In these parts of the code:
> else if (IS_LCPRV1(*from) && len >= 3)
> {
>
Hello, Ishii-san!
We talked at PGCon about my questions on the mule to wchar
conversion. My questions about the pg_mule2wchar_with_len function are
as follows. In these parts of the code:
else if (IS_LCPRV1(*from) && len >= 3)
{
    from++;
    *to = *from++ << 16;
    *to |= *from++;
    len -= 3;
On Wed, May 2, 2012 at 5:48 PM, Robert Haas wrote:
> On Wed, May 2, 2012 at 9:35 AM, Alexander Korotkov
> wrote:
> > Imagine we have two queries:
> > 1) SELECT * FROM tbl WHERE col LIKE '%abcd%';
> > 2) SELECT * FROM tbl WHERE col LIKE '%abcdefghijk%';
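To make the difference between the two patterns concrete, here is a deliberately simplified C sketch that slides a three-character window over the literal part of each pattern. The function name extract_trigrams is hypothetical, and the sketch ignores the padding, case folding, and word splitting that pg_trgm itself performs, so the counts are only indicative.

```c
#include <string.h>

/*
 * Hypothetical helper: copy each 3-character window of s into out[],
 * up to max entries, and return how many were written.  Windows are
 * not deduplicated.
 */
static int extract_trigrams(const char *s, char out[][4], int max)
{
    size_t n = strlen(s);
    int    cnt = 0;

    for (size_t i = 0; i + 3 <= n && cnt < max; i++)
    {
        memcpy(out[cnt], s + i, 3);
        out[cnt][3] = '\0';
        cnt++;
    }
    return cnt;
}
```

Under this simplification, 'abcd' yields two windows (abc, bcd) and 'abcdefghijk' yields nine, so a heuristic based only on the number of trigrams would treat the two queries quite differently.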
> >
> > The first query requires reading post
On Wed, May 2, 2012 at 9:35 AM, Alexander Korotkov wrote:
>> I was thinking you could perhaps do it just based on the *number* of
>> trigrams, not necessarily their frequency.
>
> Imagine we have two queries:
> 1) SELECT * FROM tbl WHERE col LIKE '%abcd%';
> 2) SELECT * FROM tbl WHERE col LIKE '%abc
On Wed, May 2, 2012 at 4:50 PM, Robert Haas wrote:
> On Tue, May 1, 2012 at 6:02 PM, Alexander Korotkov
> wrote:
> > Right. When the number of trigrams is large, it is slow to scan the posting
> > lists of all of them. The solution in this case is to exclude the most
> > frequent trigrams from the index scan. B
On Tue, May 1, 2012 at 6:02 PM, Alexander Korotkov wrote:
> Right. When the number of trigrams is large, it is slow to scan the posting
> lists of all of them. The solution in this case is to exclude the most frequent
> trigrams from the index scan. But it requires some kind of statistics on trigram
> frequencies
On Tue, May 1, 2012 at 1:48 AM, Kevin Grittner
wrote:
> My biggest complaint is related to setting the threshold for the %
> operator. It seems to me that there should be a GUC to control the
> default, and that there should be a way to set the threshold for
> each % operator in a query (if there
On Mon, Apr 30, 2012 at 10:07 PM, Robert Haas wrote:
> On Sun, Apr 29, 2012 at 8:12 AM, Erik Rijkers wrote:
> > Perhaps I'm too early with these tests, but FWIW I reran my earlier test
> > program against three instances. (the patches compiled fine, and make check
> > was without problem).
>
>
Hi Erik
On Sun, Apr 29, 2012 at 4:12 PM, Erik Rijkers wrote:
> Perhaps I'm too early with these tests, but FWIW I reran my earlier test
> program against three
> instances. (the patches compiled fine, and make check was without
> problem).
>
> -- 3 instances:
> HEAD port 6542
>
Robert Haas wrote:
> Hopefully that's not too hard to fix; the basic approach seems
> quite promising.
After playing with trigram searches for name searches against copies
of production database with appropriate indexing, our shop has
chosen it as the new way to do name searches here. It's re
On Sun, Apr 29, 2012 at 8:12 AM, Erik Rijkers wrote:
> Perhaps I'm too early with these tests, but FWIW I reran my earlier test
> program against three
> instances. (the patches compiled fine, and make check was without problem).
These test results seem to be more about the pg_trgm changes tha
Hi Alexander,
Perhaps I'm too early with these tests, but FWIW I reran my earlier test
program against three
instances. (the patches compiled fine, and make check was without problem).
-- 3 instances:
HEAD port 6542
trgm_regex port 6547 HEAD + trgm-regexp patch (22 N
Hackers,
The attached patch adds conversion from a pg_wchar string to a multibyte string.
This functionality is needed for my patch on index support for regular
expression search
(http://archives.postgresql.org/pgsql-hackers/2011-11/msg01297.php).
Analyzing the conversion from multibyte to pg_wchar I found fol