Maciej Bliziński wrote:
> On Sat, 2006-08-26 at 16:48 +0200, Michal wrote:
>> I also added a few of Slovak characters (Czech and Slovak was
>> brothers too, and they have similar alphabet).
>
> I looked at the Latin Unicode article in Wikipedia:
> http://en.wikipedia.org/wiki/Latin_Unicode
>
> There are characters with accents have I never seen before... Vietnamese
> alphabet, for instance, has glyphs which are Latin characters with
> unusual accents, for example: ã, or even with two accents: ặ
>
> For most of the characters, it's pretty easy to remove the accents.
> However, some characters are mysterious: should Ƨ be translated to S?
> I don't know. So I just deleted them from the accent removal list.
>
Nice work, Maciej :)
When I wrote my first post, I typed: "I will be glad, If some others of
you add your own national characters." Each nationality has its own
specific characters and rules for them, so I think that somebody from
each of these countries should check your version of the patch.

> I'm including a patch with "from" and "to" constants extended with all
> the characters I found on Wikipedia that seemed to be of any use. This
> should cover all the Slavic countries except those which use cyrylic
> alphabet.
>
> One thing... some characters want to be translated into _two_ ASCII
> characters, for example Æ to AE. This would require a different data
> structure. In present form, I just entered E. The same with ß which
> I replaced with single S.

Maybe we could try writing one new function that translates a single
Unicode character into the appropriate two ASCII characters? (Translating
accented characters would then be done in two steps: 1. replAccents,
2. the new function.)

>
> Regards,
> Maciej
>
>
>
> ------------------------------------------------------------------------
>
> Index: django/contrib/admin/media/js/urlify.js
> ===================================================================
> --- django/contrib/admin/media/js/urlify.js (revision 3618)
> +++ django/contrib/admin/media/js/urlify.js (working copy)
> @@ -1,4 +1,43 @@
> +function replAccents(s)
> +{
> +    // Replacement lists based on article in Wikipedia,
> +    // http://en.wikipedia.org/wiki/Latin_Unicode
> +    // from and to strings must have same number of characters
> +    var from = 'ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝßàáâãäåæçèéêëìíîï';
> +    var to = 'AAAAAAECEEEEIIIIDNOOOOOOUUUUYSaaaaaaaceeeeiiii';
> +    from += 'ñòóôõöøùúûüýÿĀāĂăĄąĆćĈĉĊċČčĎďĐđĒēĔĕĖėĘęĚěĜĝĞğĠġĢģ';
> +    to += 'noooooouuuuyyaaaaaaccccccccddddeeeeeeeeeegggggggg';
> +    from += 'ĤĥĦħĨĩĪīĬĭĮįİıĴĵĶķĸĹĺĻļĽľĿŀŁłŃńŅņŇňŉŊŋŌōŎŏŐőŒœŔŕŖŗŘř';
> +    to += 'hhhhiiiiiiiiiijjkkkllllllllllnnnnnnnnnoooooooorrrrrr';
> +    from += 'ŚśŜŝŞşŠšŢţŤťŦŧŨũŪūŬŭŮůŰűŲųŴŵŶŷŸŹźŻżŽžſƀƂƃƄƅƇƈƉƊƐƑƒƓƔ';
> +    to += 'ssssssssttttttuuuuuuuuuuuuwwyyyzzzzzzfbbbbbccddeffgv';
> +    from += 'ƖƗƘƙƚƝƞƟƠƤƦƫƬƭƮƯưƱƲƳƴƵƶǍǎǏǐǑǒǓǔǕǖǗǘǙǚǛǜǝǞǟǠǡǢǣǤǥǦǧǨǩ';
> +    to += 'likklnnoopettttuuuuyyzzaaiioouuuuuuuuuueaaaaeeggggkk';
> +    from += 'ǪǫǬǭǰǴǵǷǸǹǺǻǼǽǾǿȀȁȂȃȄȅȆȇȈȉȊȋȌȍȎȏȐȑȒȓȔȕȖȗȘșȚțȞȟȤȥȦȧȨȩ';
> +    to += 'oooojggpnnaaeeooaaaaeeeeiiiioooorrrruuuusstthhzzaaee';
> +    from += 'ȪȫȬȭȮȯȰȱȲȳḀḁḂḃḄḅḆḇḈḉḊḋḌḍḎḏḐḑḒḓḔḕḖḗḘḙḚḛḜḝḞḟḠḡḢḣḤḥḦḧḨḩḪḫ';
> +    to += 'ooooooooyyaabbbbbbccddddddddddeeeeeeeeeeffgghhhhhhhhhh';
> +    from += 'ḬḭḮḯḰḱḲḳḴḵḶḷḸḹḺḻḼḽḾḿṀṁṂṃṄṅṆṇṈṉṊṋṌṍṎṏṐṑṒṓṔṕṖṗṘṙṚṛṜṝṞṟ';
> +    to += 'iiiikkkkkkllllllllmmmmmmnnnnnnnnoooooooopppprrrrrrrr';
> +    from += 'ṠṡṢṣṤṥṦṧṨṩṪṫṬṭṮṯṰṱṲṳṴṵṶṷṸṹṺṻṼṽṾṿẀẁẂẃẄẅẆẇẈẉẊẋẌẍẎẏẐẑẒẓẔẕ';
> +    to += 'ssssssssssttttttttuuuuuuuuuuvvvvwwwwwwwwwwxxxxxyzzzzzz';
> +    from += 'ẖẗẘẙẚẛẠạẢảẤấẦầẨẩẪẫẬậẮắẰằẲẳẴẵẶặẸẹẺẻẼẽẾếỀềỂểỄễỆệỈỉỊị';
> +    to += 'htwyafaaaaaaaaaaaaaaaaaaaaaaaaeeeeeeeeeeeeeeeeiiii';
> +    from += 'ỌọỎỏỐốỒồỔổỖỗỘộỚớỜờỞởỠỡỢợỤụỦủỨứỪừỬửỮữỰựỲỳỴỵỶỷỸỹ';
> +    to += 'oooooooooooooooooooooooouuuuuuuuuuuuuuyyyyyyyy';
> +
> +    for (var i = 0; i != s.length; i++) {
> +        var x = from.indexOf(s[i]);
> +        if (x != -1) {
> +            r = new RegExp(from[x], 'g');
> +            s = s.replace(r, to[x]);
> +        }
> +    }
> +    return s;
> +}
> +
>  function URLify(s, num_chars) {
> +    s = replAccents(s);
>      // changes, e.g., "Petty theft" to "petty_theft"
>      // remove all these words from the string before urlifying
>      removelist = ["a", "an", "as", "at", "before", "but", "by", "for",
> "from",
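
To make the two-step idea a bit more concrete, here is a minimal sketch
of what such a second function could look like. The name replMultiChar
and the contents of its table are only my assumptions for illustration;
they are not part of Maciej's patch. One caveat: as long as replAccents
still maps Æ to E and ß to S, this function would have to run before
replAccents (or those characters would have to be dropped from the
from/to strings), otherwise it never sees them.

```javascript
// Hypothetical helper (not in the patch): expands characters that need
// more than one ASCII letter. The table below is an illustrative guess,
// not a complete or authoritative list.
function replMultiChar(s) {
    var map = {
        'Æ': 'AE', 'æ': 'ae',
        'Œ': 'OE', 'œ': 'oe',
        'ß': 'ss'
    };
    var out = '';
    for (var i = 0; i < s.length; i++) {
        var c = s.charAt(i);
        // Copy the character unchanged unless the table has an expansion.
        out += map.hasOwnProperty(c) ? map[c] : c;
    }
    return out;
}
```

With this table, replMultiChar('Æther straße') gives 'AEther strasse'.
A plain object is used instead of two parallel strings because the
replacements no longer have the same length as the originals.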