-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi,
I realized this also. Martin Schwarz off list suggested to me to use REGEXP for that purpose. That means that I'm rewriting the user input "übersee" behind the scenes to "^(ü|ue)bersee.*" which works quote well. The only gotcha is that the user may enter any special regexp characters also, so I had to make a list of them and escape them prior to modifying the string. thanks, - - Markus sheeri kritzer wrote: > MySQL doesn't have anything like that. You can use the wildcard > characters instead of the umlauts if you want, such as > > SELECT * from person where name like "%bersee" > which would get > "übersee" and "uebersee" > but also a whole lot more. > > But doing something like > SELECT * from person where name like "_bersee" or name like "__bersee" > might work -- the underscore means "1 of any character", so here the > only noise you'd get are other folks whose names are > _ _ bersee > So there's still a margin for error. > > Unfortunately, there's no special case for "hey, when you're looking > at LIKE, I want to define that x=y" -- particularly when x and y have > differing #'s of characters. > > -Sheeri > On 3/22/06, Markus Fischer <[EMAIL PROTECTED]> wrote: > Hi, > > what is the best way to match german umlauts like 'ä' also their > alternative writing 'ae'? > > For example I'm searching for "übersee" and I also want to find the word > "uebersee" in the database. The "words" are actually names of persons. > > One possibility is to dynamically expand the SQL statement if such > special characters are found. So the search term "übersee" will be > expanded to "SELECT * FROM person WHERE name LIKE 'übersee%' AND name > LIKE 'uebersee%'" but this is getting dirty and very very long if > multiple umlauts are used to cover all cases ... > > So the other idea is to have the name twice in the database for every > person and the second "version" of the name is a normalized for where > all special characters are replaced with their alternative writing. E.g. > I store the field name "übersee" and also name2 "uebersee" and when > matching I match against name2. If the field would container more > special characters it still would work without much more work, e.g. name > is "überseemöbel" then name2 would be "ueberseemoebel" and when the term > "überseemö" is entered it's also normalized to "ueberseemoe" and the > LIKE statement will still match. Basically this is some kind of > primitive stemming like lucene does it. > > Is there maybe some built-in support from MySQL for such special cases? > > thanks for any pointers, > - Markus >> - -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED] >> >> -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.2.2 (MingW32) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFEI4BC1nS0RcInK9ARAnMrAJ9jZ5LAxS1S2bjqrPvIUBSiTGsxxQCgv+5l xyxQhd7B9HTnc8sTa7Tsekk= =uycm -----END PGP SIGNATURE----- -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]