> There are two showstoppers for me though:
>
> 1. Introducing a new programming language where the char type is a
> byte is anachronistic. You're saying that programmers don't have to
> care about the string encoding and can just treat them as an array of
> bytes. That is exactly what causes all the problems with applications
> that haven't been designed to handle anything but ASCII. If you're
> doing any logic with strings except dumb reading and writing, you
> typically *have to* know the encoding. Forcing programmers to be aware
> of this problem is a good idea IMO.
>
> You can certainly have a string type that uses byte arrays in UTF-8
> encoding internally, but your string functions should be aware of that
> and treat it as a Unicode string. The len function and index operators
> should count characters, not bytes. Add a byte array data type for
> byte arrays instead.
>

It's not easy. I think Python 3's byte arrays have an "upper" method (and a
literal syntax, b"abc"), which is quite alarming to me: it suggests they
chose the wrong default.
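
To make the byte-versus-character point concrete, here is a small Python 3 sketch. The sample word and the variable names are arbitrary choices for illustration only; it shows that len counts different things for str and bytes, and that the "upper" method on bytes only affects ASCII letters.

```python
# Illustrative sketch: the sample string and names are arbitrary.
text = "na\u00efve"            # str: a sequence of Unicode code points ("naive" with a diaeresis on the i)
data = text.encode("utf-8")    # bytes: the same text in its UTF-8 encoded form

# len() on str counts code points; len() on bytes counts bytes.
# The non-ASCII letter occupies two bytes in UTF-8, so the counts differ.
print(len(text), len(data))

# str.upper() is Unicode-aware and uppercases every letter.
print(text.upper())

# bytes.upper() only maps ASCII a-z to A-Z; the two bytes of the
# non-ASCII letter pass through unchanged, so it stays lowercase.
print(data.upper().decode("utf-8"))
```

So code that treats the byte form as "the string" gets both the length and the case conversion subtly wrong as soon as non-ASCII text shows up.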
Eventually the "rope" data structure (which the compiler uses heavily) will
become a proper part of the library. By "rope" I mean an immutable string
implemented as a tree, so concatenation is O(1). For immutable strings there
is no ``[]=`` operation, so using UTF-8 and converting it to a 32-bit char
works better.

> 2. The dynamic dispatch is messy. I agree that procedural is often
> simpler and more efficient than object-oriented programming, but
> object-oriented programming is useful just as often and should be made
> as simple as possible. Since Nimrod seems flexible, perhaps it would be
> possible to implement an object-orientation layer in Nimrod that hides
> the dynamic dispatch complexity?
>

Yes, I already have an idea how to improve it.

--
http://mail.python.org/mailman/listinfo/python-list