> The patch introduces way to configure FTS based on CASE/WHEN/THEN/ELSE
> construction.

Interesting feature.  I needed this flexibility before when I was
implementing text search for a Turkish private listing application.
Aleksandr and Arthur were kind enough to discuss it with me off-list
today.

> 1) Multilingual search. Can be used for FTS on a set of documents in
> different languages (example for German and English languages).
>
> ALTER TEXT SEARCH CONFIGURATION multi
>   ALTER MAPPING FOR asciiword, asciihword, hword_asciipart,
>     word, hword, hword_part WITH CASE
>   WHEN english_hunspell AND german_hunspell THEN
>     english_hunspell UNION german_hunspell
>   WHEN english_hunspell THEN english_hunspell
>   WHEN german_hunspell THEN german_hunspell
>   ELSE german_stem UNION english_stem
> END;

I understand the need to support branching, but this syntax is overly
complicated.  I don't think there is any need to support a different set
of dictionaries as condition and action.  Something like this might
work better:

    ALTER TEXT SEARCH CONFIGURATION multi
        ALTER MAPPING FOR asciiword, asciihword, hword_asciipart,
            word, hword, hword_part WITH
        CASE english_hunspell UNION german_hunspell
            WHEN MATCH THEN KEEP
            ELSE german_stem UNION english_stem
        END;

To put it formally:

    ALTER TEXT SEARCH CONFIGURATION name
        ADD MAPPING FOR token_type [, ... ] WITH config

    where config is one of:

        dictionary_name
        config { UNION | INTERSECT | EXCEPT } config
        CASE config WHEN [ NO ] MATCH THEN [ KEEP ELSE ] config END

> 2) Combination of exact search with morphological one. This patch not
> fully solve the problem but it is a step toward solution. Currently, we
> should split exact and morphological search in query manually and use
> separate index for each part.
> With new way to configure FTS we can use
> following configuration:
>
> ALTER TEXT SEARCH CONFIGURATION exact_and_morph
>   ALTER MAPPING FOR asciiword, asciihword, hword_asciipart,
>     word, hword, hword_part WITH CASE
>   WHEN english_hunspell THEN english_hunspell UNION simple
>   ELSE english_stem UNION simple
> END

This could be:

    CASE english_hunspell WHEN MATCH THEN KEEP ELSE english_stem END
        UNION simple

> 3) Using different dictionaries for recognizing and output generation.
> As I mentioned before, in new syntax condition and command are separate
> and we can use it for some more complex text processing. Here an
> example for processing only nouns:
>
> ALTER TEXT SEARCH CONFIGURATION nouns_only
>   ALTER MAPPING FOR asciiword, asciihword, hword_asciipart,
>     word, hword, hword_part WITH CASE
>   WHEN english_noun THEN english_hunspell
> END

This would also still work with the simpler syntax because
"english_noun", still being a dictionary, would pass the tokens to the
next one.

> 4) Special stopword processing allows us to discard stopwords even if
> the main dictionary doesn't support such feature (in example pl_ispell
> dictionary keeps stopwords in text):
>
> ALTER TEXT SEARCH CONFIGURATION pl_without_stops
>   ALTER MAPPING FOR asciiword, asciihword, hword_asciipart,
>     word, hword, hword_part WITH CASE
>   WHEN simple_pl IS NOT STOPWORD THEN pl_ispell
> END

Instead of supporting the old way of putting stopwords on dictionaries,
we can make them dictionaries on their own.  This would then become
something like:

    CASE polish_stopword
        WHEN NO MATCH THEN polish_ispell
    END
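
To make the intended semantics of the simpler grammar concrete, here is a
rough Python model of it.  This is only a sketch under my own assumptions,
not code from the patch: a "config" is modelled as a function from a token
to a list of lexemes, None stands for "no match", and the helper names
(dictionary, union, case_keep_else, case_when) as well as the toy
dictionary contents are hypothetical.

```python
from typing import Callable, Dict, List, Optional

# Assumption: a config maps a token to its lexemes, or None for "no match".
Config = Callable[[str], Optional[List[str]]]


def dictionary(entries: Dict[str, List[str]]) -> Config:
    """A toy dictionary: matches only the tokens it has an entry for."""
    return lambda token: entries.get(token)


def union(a: Config, b: Config) -> Config:
    """config UNION config: matches if either side does; lexemes are merged."""
    def run(token: str) -> Optional[List[str]]:
        results = [r for r in (a(token), b(token)) if r is not None]
        if not results:
            return None
        merged: List[str] = []
        for lexemes in results:
            for lexeme in lexemes:
                if lexeme not in merged:
                    merged.append(lexeme)
        return merged
    return run


def case_keep_else(cond: Config, otherwise: Config) -> Config:
    """CASE cond WHEN MATCH THEN KEEP ELSE otherwise END."""
    def run(token: str) -> Optional[List[str]]:
        matched = cond(token)
        return matched if matched is not None else otherwise(token)
    return run


def case_when(cond: Config, then: Config, negate: bool = False) -> Config:
    """CASE cond WHEN [NO] MATCH THEN then END (negate=True means NO MATCH)."""
    def run(token: str) -> Optional[List[str]]:
        matched = cond(token) is not None
        return then(token) if matched != negate else None
    return run


# Toy stand-ins for the dictionaries named in the examples above.
english_hunspell = dictionary({"stars": ["star"]})
german_hunspell = dictionary({"sterne": ["stern"]})
english_stem: Config = lambda token: [token.lower()]
german_stem: Config = lambda token: [token.lower()]
polish_stopword = dictionary({"i": ["i"]})
polish_ispell = dictionary({"koty": ["kot"], "i": ["i"]})

# Example 1: hunspell output is kept when it matches, stemmers otherwise.
multi = case_keep_else(union(english_hunspell, german_hunspell),
                       union(german_stem, english_stem))

# Example 4: tokens matched by the stopword dictionary are discarded.
pl_without_stops = case_when(polish_stopword, polish_ispell, negate=True)
```

In this model a stopword list needs no special treatment: it is an
ordinary dictionary whose match is used only as the CASE condition.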