On 23.06.21 14:43, tanghy.f...@fujitsu.com wrote:
I've updated the patch to V8 since Tom, Kyotaro and Laurenz discussed the 
lower-case issue with German/Turkish at [1].

Differences from V7 are:
* Add a function valid_input_text which checks whether the input text contains 
only alphabet letters, numbers etc.
* Delete the flag setting of "completion_case_sensitive=false", which was 
introduced in the V1 patch and is no longer used.

As you can see, the patch now limits the lower-case transformation of the input 
to alphabet letters.
By doing that, languages like German/Turkish will not be affected by this patch.

Any comment or suggestion on this patch is very welcome.

The coding of valid_input_text() seems a bit bulky. I think you can do the same thing using strspn() without a loop.
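
To illustrate the strspn() idea, here is a minimal sketch. The character set and the function name text_has_only_allowed_chars() are placeholders of mine, not what the patch uses; the point is only that strspn() returns the length of the leading run of allowed characters, so the check reduces to seeing whether that run reaches the terminating NUL.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical allowed set, for illustration only.  The set actually used
 * by the patch may differ.
 */
static const char allowed_chars[] =
	"abcdefghijklmnopqrstuvwxyz"
	"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
	"0123456789_";

/*
 * Return true if every byte of text is in allowed_chars.
 * strspn() gives the length of the leading span made up of allowed bytes;
 * if that span ends at the terminating NUL, nothing else is in the string.
 */
static bool
text_has_only_allowed_chars(const char *text)
{
	return text[strspn(text, allowed_chars)] == '\0';
}
```

Note that an empty string passes this check, which may or may not be what is wanted.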

The name is also not great.  It's not like other strings are not "valid".

There is also no explanation why that specific set of characters is allowed and not others. Does it have something to do with identifier syntax? This needs to be explained.

Seeing that valid_input_text() is always called together with pg_string_tolower(), I think those could be combined into one function, like pg_string_tolower_if_ascii() or whatever. That would save a lot of repetition.
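
Roughly what I have in mind for the combined function, as a standalone sketch. The name string_tolower_if_ascii(), the allowed character set, and the use of plain malloc() are assumptions for illustration; it does not call the existing pg_string_tolower() and is not meant as the final form.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical allowed set, for illustration only. */
static const char allowed_chars[] =
	"abcdefghijklmnopqrstuvwxyz"
	"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
	"0123456789_";

/*
 * Return a newly allocated copy of text (caller must free), or NULL if
 * out of memory.  The copy is lower-cased only if the whole string
 * consists of characters from allowed_chars; otherwise it is returned
 * unchanged, so non-ASCII input is never case-folded.
 */
static char *
string_tolower_if_ascii(const char *text)
{
	size_t		len = strlen(text);
	char	   *result = malloc(len + 1);

	if (result == NULL)
		return NULL;
	memcpy(result, text, len + 1);

	if (strspn(text, allowed_chars) == len)
	{
		/* Fold ASCII upper-case letters only; no locale involved. */
		for (char *p = result; *p; p++)
		{
			if (*p >= 'A' && *p <= 'Z')
				*p += 'a' - 'A';
		}
	}
	return result;
}
```

Callers would then make a single call instead of a check followed by a conversion.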

There are a couple of queries where the result is *not* case-insensitive, namely

Query_for_list_of_enum_values
Query_for_list_of_available_extension_versions

(and their variants). These are cases where the query result is not used as an identifier but as a (single-quoted) string. So that needs to be handled somehow, perhaps by adding a COMPLETE_WITH_QUERY_CS() similar to COMPLETE_WITH_CS().

(A test case for the enum case should be doable easily.)

