The first book I studied as a CS student was Structured Computer Organization by Tanenbaum.

Apart from its detailed descriptions of various machines like the PDP-11, IBM 360 etc., it suggested understanding the computer at 4 levels:

- Microprogramming level
- "Conventional" machine level (nowadays called the ISA)
- OS level -- where system calls become new "instructions"
- HLL level of languages (like PL/1!)

[The next edition would add the digital abstraction level below the microprogramming level]

For me, as for many in my generation, this book and this leveled view were an important component of my understanding of CS.

A few years later I took a course on something called "networks and networking". Again it talked of some 7 (OSI) layers. But it didn't make much sense to someone whose only idea of a network was the wire that connected the terminal to the (pretend) mainframe.

In a subsequent edition of his Networking book, I found that Tanenbaum had castigated the 7 OSI layers as useless and unnecessary, with the 3 TCP layers being more realistic. In still further(?) editions he would introduce 5 layers as a hybrid between the international but failed OSI standard and the ubiquitous but incomplete TCP standard.

Why am I saying all this? A layered understanding is the bedrock of our field. Except that sometimes it works. And sometimes it doesn't.

The 3 layers here are:

- UTF-8 layer
- Unicode codepoint layer
- Linguistically useful (grapheme) layer

Marko's statements like "UTF-8 is random access" are so obviously wrong that (my guess) he does not mean them literally but elliptically, as saying: "This excessive layering is not working."

OTOH, statements like "level 2 is 90% good enough for level 3" are in the same ludicrous class as "The world is as wide as the Atlantic ocean."

As pointed out above, agglutinating letters is the norm, not the exception, in the world's languages, up to and including (Latin in) English.
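To make the three layers concrete, here is a minimal Python sketch of my own (not from Marko's or Tanenbaum's text) that measures one letter-as-a-reader-sees-it at each layer. The grapheme count assumes the third-party regex module (pip install regex) and its \X grapheme-cluster pattern, since the stdlib has no grapheme segmentation:

    # A 'g' with a combining tilde: one letter to a reader, but the
    # three layers count it differently.
    import regex  # third-party; assumed available

    s = "g\u0303"   # 'g' followed by COMBINING TILDE

    print(len(s.encode("utf-8")))        # 3 -- layer 1: UTF-8 bytes
    print(len(s))                        # 2 -- layer 2: Unicode codepoints
    print(len(regex.findall(r"\X", s)))  # 1 -- layer 3: grapheme clusters

The mismatch between layers 2 and 3 is exactly the point above: indexing by codepoint can cut such a letter in half, combining accents and all.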