In article <mailman.10673.1401853976.18130.python-l...@python.org>, Chris Angelico <ros...@gmail.com> wrote:
> You can't ignore those. You might be able to say "Well, my program
> will run slower if you throw these at it", but if you're going down
> that route, you probably want the full FSR and the advantages it
> confers on ASCII and Latin-1 strings. Binding your program to BMP-only
> is nearly as dangerous as binding it to ASCII-only; potentially worse,
> because you can run an awful lot of artificial tests without
> remembering to stick in some astral characters.

Yup. I wrote a while(*) back about the pain I was having importing some data into a MySQL(**) database which (unknown to me when I started) only handled BMP. It turns out that in the entire dataset of 20-odd million records, there were exactly four that had astral characters. All of my tests worked. I didn't discover the problem until it blew up many hours into the "final" production import run.

(*) Two years?

(**) This was not the only pain point with MySQL. We eventually switched to Postgres.
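A pre-flight scan of the input would have caught those four records before the long import started. What follows is only a minimal sketch, assuming Python 3.3 or later (where indexing a str yields whole code points) and assuming the records are already decoded to str. The names find_astral, records_with_astral and sample are hypothetical, made up for illustration; they are not from the original import code.

```python
def find_astral(text):
    """Return (index, character) pairs for code points above U+FFFF."""
    # Anything above U+FFFF lies outside the Basic Multilingual Plane.
    return [(i, ch) for i, ch in enumerate(text) if ord(ch) > 0xFFFF]


def records_with_astral(records):
    """Yield (record_number, hits) for each record with astral characters."""
    for n, record in enumerate(records):
        hits = find_astral(record)
        if hits:
            yield n, hits


# Hypothetical sample data: only the last record leaves the BMP.
sample = [
    "plain ASCII",
    "caf\u00e9",
    "BMP only: \u4e2d\u6587",
    "astral: \U0001F600",
]

for n, hits in records_with_astral(sample):
    print(n, [(i, hex(ord(ch))) for i, ch in hits])
```

Running something like this over the full dataset costs one extra pass, but it reports the offending records up front instead of hours into the load. It is also a cheap way to make sure the test data contains at least one astral character.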