On 06/26/2018 08:20 AM, Arthur Zakirov wrote:
Hello hackers,

I'd like to propose a patch which syncs PostgreSQL's Snowball stemmers
with upstream. As Tom pointed out [1], the stemmers haven't been synced
for a very long time.

I copied all source files without changes, except replacing '#include
"../runtime/header.h"' with '#include "header.h"' and removing includes
of standard headers from utilities.c.
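
For illustration only, the two mechanical changes described above could be scripted roughly as follows. This is a minimal sketch, not the procedure actually used for the patch; the helper names fix_includes and strip_std_includes are hypothetical, and it assumes the standard headers in utilities.c are pulled in with angle-bracket includes.

```python
import re

# The include path used by upstream Snowball sources and its replacement.
OLD_INCLUDE = '#include "../runtime/header.h"'
NEW_INCLUDE = '#include "header.h"'

# Matches a whole line of the form: #include <something.h>
STD_INCLUDE = re.compile(r'^\s*#\s*include\s*<[^>]+>\s*$')


def fix_includes(source: str) -> str:
    """Point the header include at the local header.h (hypothetical helper)."""
    return source.replace(OLD_INCLUDE, NEW_INCLUDE)


def strip_std_includes(source: str) -> str:
    """Drop angle-bracket includes, as described for utilities.c (hypothetical helper)."""
    kept = [
        line
        for line in source.splitlines(keepends=True)
        if not STD_INCLUDE.match(line)
    ]
    return "".join(kept)


sample = '#include <stdio.h>\n#include "../runtime/header.h"\nint x;\n'
result = fix_includes(strip_std_includes(sample))
```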

The Hungarian language uses the ISO-8859-1 and UTF-8 charsets in Postgres
HEAD. But in Snowball HEAD it is ISO-8859-2 per commit [2]. This patch
changes Hungarian's charset from ISO-8859-1 to ISO-8859-2 too.
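
As a small aside on why the charset matters: the Hungarian letters with double acute accents exist in ISO-8859-2 but not in ISO-8859-1. The sketch below only demonstrates that with Python's standard codecs; the encodable helper is a hypothetical name and nothing here comes from the patch itself.

```python
# Letters specific to Hungarian: o and u with double acute accents.
HUNGARIAN_ONLY = "őűŐŰ"


def encodable(text: str, charset: str) -> bool:
    """Return True if every character of text exists in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


in_latin2 = encodable(HUNGARIAN_ONLY, "iso-8859-2")
in_latin1 = encodable(HUNGARIAN_ONLY, "iso-8859-1")

# The same byte means different letters in the two charsets.
latin2_bytes = "ő".encode("iso-8859-2")
misread_as_latin1 = latin2_bytes.decode("iso-8859-1")
```

Here in_latin2 is True and in_latin1 is False, and the byte that ISO-8859-2 assigns to "ő" reads as "õ" when decoded as ISO-8859-1.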

The other files updated in the patch are:
- utilities.c
- header.h

Will add to the next commitfest.

Any comments?

1 - https://www.postgresql.org/message-id/5689.1519054983%40sss.pgh.pa.us
2 - https://github.com/snowballstem/snowball/commit/4bcae97db044253ea2edae1dd3ca59f3cddd4b9d



I agree with Tom that we should sync with the upstream before we do anything else. This is a very large patch, but with fairly limited impact. I think now, at the start of a dev cycle, is the right time to apply it.

I don't know if we have a buildfarm animal testing Hungarian. Maybe we need a buildfarm animal or two testing a large number of locales.

cheers

andrew

--
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

