I'm 100% in favor of expanding Unicode until the sun goes dark. Doing so helps solve the problems affecting speakers of "underserved" languages--access and language preservation. Speakers of Mongolian, Cherokee, Georgian, etc. all deserve to be able to interact with technology in their native languages as much as we speakers of ASCII-friendly languages do. Unicode support also makes writing papers on, dictionaries of, and new texts in such languages much easier, which helps the fight against language extinction, a sadly pressing issue.
Also, like, computers are big. Get an external drive for your high-resolution PDF collection of Medieval manuscripts if you feel like you're running out of space. A few extra codepoints aren't going to be the straw that breaks the camel's back.

On Thursday, February 26, 2015 at 8:24:34 AM UTC-5, Chris Angelico wrote:
> On Thu, Feb 26, 2015 at 11:40 PM, Rustom Mody <rustompm...@gmail.com> wrote:
> > Wrote something up on why we should stop using ASCII:
> > http://blog.languager.org/2015/02/universal-unicode.html
>
> From that post:
>
> """
> 5.1 Gibberish
>
> When going from the original 2-byte unicode (around version 3?) to the
> one having supplemental planes, the unicode consortium added blocks
> such as
>
> * Egyptian hieroglyphs
> * Cuneiform
> * Shavian
> * Deseret
> * Mahjong
> * Klingon
>
> To me (a layman) it looks unprofessional - as though they are playing
> games - that billions of computing devices, each having billions of
> storage words should have their storage wasted on blocks such as
> these.
> """
>
> The shift from Unicode as a 16-bit code to having multiple planes came
> in with Unicode 2.0, but the various blocks were assigned separately:
> * Egyptian hieroglyphs: Unicode 5.2
> * Cuneiform: Unicode 5.0
> * Shavian: Unicode 4.0
> * Deseret: Unicode 3.1
> * Mahjong Tiles: Unicode 5.1
> * Klingon: Not part of any current standard
>
> However, I don't think historians will appreciate you calling all of
> these "gibberish". To adequately describe and discuss old texts
> without these Unicode blocks, we'd have to either do everything with
> images, or craft some kind of reversible transliteration system and
> have dedicated software to render the texts on screen. Instead, what
> we have is a well-known and standardized system for transliterating
> all of these into numbers (code points), and rendering them becomes a
> simple matter of installing an appropriate font.
>
> Also, how does assigning meanings to codepoints "waste storage"? As
> soon as Unicode 2.0 hit and 16-bit code units stopped being
> sufficient, everyone needed to allocate storage - either 32 bits per
> character, or some other system - and the fact that some codepoints
> were unassigned had absolutely no impact on that. This is decidedly
> NOT unprofessional, and it's not wasteful either.
>
> ChrisA
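
For anyone who wants to check the storage point at the interpreter, here is a minimal Python 3 sketch. The helper name `encoded_sizes` and the three sample characters are my own picks for illustration, not anything from the thread. It only measures how many bytes a given string takes in UTF-8, UTF-16 and UTF-32, which depends on the characters actually stored and the encoding chosen, not on how many blocks the standard has assigned.

```python
def encoded_sizes(text):
    """Return the byte length of `text` in three common Unicode encodings.

    The "-le" codec variants are used so that no byte-order mark is
    prepended and the counts reflect the characters alone.
    """
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
        "utf-32": len(text.encode("utf-32-le")),
    }


# Illustrative samples (chosen for this sketch, not taken from the thread):
samples = [
    "a",           # U+0061, plain ASCII
    "\u10d0",      # U+10D0, a Georgian letter in the Basic Multilingual Plane
    "\U00010400",  # U+10400, a Deseret letter in a supplementary plane
]

for ch in samples:
    print(f"U+{ord(ch):04X}", encoded_sizes(ch))
```

The ASCII letter costs 1, 2 and 4 bytes respectively; the Georgian letter 3, 2 and 4; the Deseret letter 4 bytes in all three. A text that never uses Deseret pays nothing for the block existing.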