On 04-12-13 14:02, rusi wrote:
> On Wednesday, December 4, 2013 6:02:18 PM UTC+5:30, Antoon Pardon wrote:
>> On 04-12-13 13:01, rusi wrote:
>>> On Wednesday, December 4, 2013 3:59:06 PM UTC+5:30, Antoon Pardon wrote:
>>>> On 04-12-13 11:09, rusi wrote:
>>>>> I used the spaces case to indicate the limit of chaos.
>>>>> Other characters (that already have uses) are just as problematic.
>>>>
>>>> I don't agree with the latter. As it is now python can make the
>>>> distinction between
>>>>
>>>> from A import B and fromAimportB.
>>>>
>>>> I see no a priori reason why this should be limited to letters. A
>>>> language designer might choose to allow a bigger set of characters
>>>> in identifiers like '-', '+' and others. In that case a-b would be
>>>> an identifier and a - b would be the operation. Just as in python
>>>> fromAimportB is an identifier and from A import B is an import
>>>> statement.
>>>
>>> I'm not sure what you are saying.
>>> Sure a language designer can design a language differently from python.
>>> I mentioned lisp. Cobol is another behaving exactly as you describe.
>>>
>>> My point is that when you do (something like) that, you will need to change
>>> the lexical and grammatical structure of the language. And this will make
>>> for rather far-reaching changes ALL OVER the language not just in
>>> what-follows-dot.
>>
>> No you don't need to change the lexical and grammatical structure of
>> the language. Changing the characters allowed in identifiers is not a
>> change in lexical structure. The only difference in lexical structuring
>> would be that '-', '>=' and other similar symbols would have to be
>> treated like keywords such as 'from', 'as' etc. instead of being
>> recognizable by just being present.
>
> Well I am mystified…
> Consider the string a-b in a program text.
> A Cobol or Lisp system sees this as one identifier.
> Python, C (and most modern languages) see this as ident, operator, ident.
>
> As I understand it this IS the lexical structure of the language and the lexer
> is the part that implements this:
> - in cobol/lisp keeping it as one
> - in python/C breaking it into 3
>
> Maybe you understand in some other way the phrase "lexical structure"?

Yes I do. The fact that a certain string is lexically evaluated differently
is IMO not enough to conclude that the language has a different lexical
structure. It only means that the values allowed within the structure are
different. What I see here is that some languages have a different alphabet
over which identifiers are allowed.

>> And the grammatical structure of the language wouldn't change at all.
>> Sure a-b would now be an identifier and not an operation but that is
>> of no concern for the parser.
>
> About grammar maybe what you are saying will hold: presumably if the token-set
> is the same, one could keep the same grammar, with the differences being
> entirely inter-lexeme ones.

And the question is: if the token-set is the same, how then is the lexical
structure different, rather than just the possible values associated with
the tokens?

-- 
Antoon Pardon
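
To make the "same token-set, different identifier alphabet" point concrete,
here is a minimal sketch of a toy lexer. It is not how CPython, any Cobol
compiler or any Lisp reader actually tokenizes; make_lexer, python_like and
cobol_like are hypothetical names made up for this illustration. The token
kinds (IDENT, OP) are identical in both configurations. Only the set of
characters allowed inside an identifier differs.

```python
import re


def make_lexer(ident_chars):
    """Build a toy tokenizer; ident_chars is a regex character-class body
    listing the characters allowed after the first one in an identifier.
    (Hypothetical helper for illustration only.)"""
    # The token kinds are the same for every configuration.
    pattern = re.compile(
        r"(?P<IDENT>[A-Za-z_][%s]*)|(?P<OP>[-+*/])|(?P<WS>\s+)" % ident_chars
    )

    def lex(text):
        tokens = []
        pos = 0
        while pos < len(text):
            m = pattern.match(text, pos)
            if m is None:
                raise ValueError("unexpected character at position %d" % pos)
            if m.lastgroup != "WS":  # whitespace separates tokens, then is dropped
                tokens.append((m.lastgroup, m.group()))
            pos = m.end()
        return tokens

    return lex


# Identifiers over letters, digits and underscore.
python_like = make_lexer(r"A-Za-z0-9_")
# Same token kinds, but '-' is also allowed inside identifiers.
cobol_like = make_lexer(r"A-Za-z0-9_\-")

print(python_like("a-b"))   # three tokens: IDENT, OP, IDENT
print(cobol_like("a-b"))    # one token: IDENT
print(cobol_like("a - b"))  # three tokens again, since spaces delimit
```

In the second configuration '-' is only recognized as an operator when it is
delimited, much as 'from' is only a keyword in Python when it stands on its
own. The sketch ignores details such as a trailing '-' in an identifier.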