On Fri, Jan 20, 2012 at 12:30 AM, Heikki Linnakangas <heikki.linnakan...@enterprisedb.com> wrote:

> The code badly needs comments. There is no explanation of how the trigram
> extraction code in trgm_regexp.c works.

Sure. I hoped to find time for comments before the commitfest started. Unfortunately I didn't, sorry.

> Guessing from the variable names, it seems to be some sort of a coloring
> algorithm that works on a graph, but that all needs to be explained. Can
> this algorithm be found somewhere in literature, perhaps? A link to a paper
> would be nice.

I hope it's truly novel, at least in its application to regular expressions. I'm going to write a paper about it.

> Apart from that, the multibyte issue seems like the big one. Any way
> around that?

Converting pg_wchar to a multibyte character is the only way I found to avoid serious hacking of the existing regexp code. Do you think an additional function in pg_wchar_tbl that converts pg_wchar back to a multibyte character is a possible solution?

------
With best regards,
Alexander Korotkov.