On Sun, 2011-10-02 at 01:25 +0200, Reuven M. Lerner wrote: > Hi, everyone. I'm working on a project on PostgreSQL 9.0 (soon to be > upgraded to 9.1, given that we haven't yet launched). The project > will involve numerous text fields containing English, Spanish, and > Portuguese. Some of those text fields will be searchable by the user. > That's easy enough to do; for our purposes, I was planning to use some > combination of LIKE searches; the database is small enough that this > doesn't take very much time, and we don't expect the number of > searchable records (or columns within those records) to be all that > large. > > The thing is, the people running the site want searches to work on > what I'm calling (for lack of a better term) "bare" letters. That is, > if the user searches for "n", then the search should also match > Spanish words containing "ñ". I'm told by Spanish-speaking members of > the team that this is how they would expect searches to work. > However, when I just did a quick test using a UTF-8 encoded 9.0 > database, I found that PostgreSQL didn't see the two characters as > identical. (I must say, this is the behavior that I would have > expected, had the Spanish-speaking team member not said anything on > the subject.) > > So my question is whether I can somehow wrangle PostgreSQL into > thinking that "n" and "ñ" are the same character for search purposes, > or if I need to do something else -- use regexps, keep a "naked," > searchable version of each column alongside the native one, or > something else entirely -- to get this to work. >
Could you parse the search string for the non-English characters and convert them to the appropriate English character? My skills are not that good or I would offer more details. > Any ideas? > > Thanks, > > Reuven > > > > -- > Reuven M. Lerner -- Web development, consulting, and training > Mobile: +972-54-496-8405 * US phone: 847-230-9795 > Skype/AIM: reuvenlerner -- Jay Lozier jsloz...@gmail.com