Thomas Wittek wrote:
chromatic wrote:
theproblemlinguisticallyspeakingisthatsometimes [snipped]

I can't remember that I said that you shouldn't separate your
expressions (by punctation/whitspaces),
$.but! (*adding$ %*characters _+that^# &$might) @#not_ !#be()
!&necessary_ *#$doesn't! *(make) [EMAIL PROTECTED] =_easier
to read and to type (in addition it was a torture to type that).

Forgive chromatic. Part of joining @Larry is undergoing a painful initiation process, which tends to inspire zealotry.

The point, though, is that there are three ways of handling the whole "part of speech" issue. One is with a dictionary (reserved words): in this method, every word is assigned a part of speech, usually with a "default." Any use of the word "FOR" must be a loop, any use of "INT" must be a typedef, etc.

Another is with context (and predeclaration). In this method, the surrounding context can be used to infer the part of speech of a word, with some sort of confirmation for 'new' words (user-defined variables, functions, etc.). Most present-day compiled languages use this one, although they frequently rely on the "reserved words" approach, too, for some words.

Finally, the approach Larry has chosen is to explicitly mark the part of speech. Perl up to version 5 used an approach that attempted to correlate the marker with the part of speech associated with the surrounding context: foo(@array) vs. foo($array[0])

This approach was criticized for providing relatively little value over the context+lookup approach. If the sigil has to correspond to the context, then only in rare cases (ambiguous context) is the sigil adding much value.

The new approach (@array[0]) ties the sigil to the declaration, serving to distinguish name collisions and of course to autovivify variables correctly.

Ultimately, it comes down to value added, and culture/custom. "Perl has always used sigils, so perl should continue to use sigils." That's a legitimate stand, in the absence of compelling arguments to the contrary. It "let's perl be perl."

As far as value goes, let's call the C/C++ approach the "nul" approach, since by default there is no sigil in front of words. (And I'm considering * and & to be sigils, rather than operators.)

The nul approach reduces typing. It relies on context to identify the part of speech, occasionally forces some look-ahead (a name followed by '(' is an invocation instead of a reference) and can't handle multiply typed (@foo vs. &foo vs. $foo vs. %foo) names.

The perl approach increases typing, by something less than 1 character per identifier. (This is a real cost, that Larry continues to elect to bear.) The p5 version imposed some disambiguation burden on the parser, since $foo[0] involved @foo, not $foo. Perl *can* handle *some* multiply typed names. There is a difference between $foo and @foo, but not between "my Cat $foo" and "my Dog $foo".

In addition, however, there is the whole *foo thing. Adding the sigil has encouraged people to think in weird ways, 'tied' variables and typeglobs not least among them. I don't know if a 'perl' that used the nul approach would ever have had those features. (Sapir-Whorf lives!)

The perl approach, then, opts to pay a significant penalty (0.9+ characters per variable) to allow access to the cool extra features that few other languages use, and none so compactly.

A similar trade-off exists with the statement terminating semicolon. In this case, it involves the number of statements per line:

A language that terminates statements can ignore whitespace, allowing multiple statements per line and statements that span multiple lines.

A language that associates line termination with statement termination must pay a separate cost (continuation marker) for a statement to span multiple lines. It will not, in general, support multiple statements per line. (Though it could make the terminator "optional" and then inject terminators between colinear statements.)

The vast majority of languages have opted to terminate statements. Perl is among them. Probably the best argument is that encountering a semicolon (or full stop, in COBOL) is a positive indicator rather than a negative one. "I see a semicolon. I know the statement is over." as opposed to "I don't see a continuation marker, so it's likely that the statement is over, although it could be tabbed way off to the right or something."

Also, there's the increasing size of words to consider. While $a = $b + $c is a great example of why line termination is not needed, the trend is for variable and function names, not to mention object and method dereferences, to grow longer.

From
http://www.oreillynet.com/pub/a/javascript/2003/03/18/movabletype.html I get:

|MT::Template::Context->add_tag(HelloWorld => sub { return 'Hello World.'; } );|


The MT::...add_tag method name alone is 30 characters. Jam a few long identifiers together and you're writing a lot of multi-line statements.

If the termination marker were optional, then the punctuation would still have to be reserved--it is unlikely there is another use for semicolon that is obviously exclusive from statement termination that needs to be filled.

And if the termination marker were not optional, but prohibited, then perl wouldn't have one-liners. That's DEFINITELY "unperlish," so we won't go there.

So line termination doesn't gain a punctuation character, and causes the ends of lines to be uncertain. It does reduce typing, for the small crowd of people that wouldn't just use them anyway because they use them in every other language.

I think the lack of value here outweights the "savings" of one character per line.

So semicolons don't seem to be the best invention since sliced bread.
There should be extra-syntax for the rare cases (multiline) and not for
the common ones.
Somehow English seems to get by with periods at the ends of statements, though almost no one pronounces them.

Oh, I thought Perl was a programming language. My fault.
Apples and oranges.
Most modern scripting languages don't need the semicolons. I think
there's no plausible reason for them.


Actually, perl is probably the most "linguistic" of programming languages. A lot of $Larry's concerns with perl syntax, and perl language issues, has historically been linguistic concern. The notion of "end weight," for example, was an important part of the restructuring of regexes in p6. (See http://en.wikipedia.org/wiki/Larry_Wall)

I'm going to forgive you the "no plausible reason for them" comment, since I listed some above. (And since chromatic got you riled up.) But please keep in mind that there are reasons for them, and some of those reasons are reasons of "custom" (i.e., "we always did it like this" or "everybody does it") and custom really is a good reason, although proven value can trump custom. Also, of course, remember that Larry's a pretty smart guy, particularly in the linguistics field. There are more people working in perl than work in Esperanto.

I agree. You need less ignorant colleagues. I'm not sure Perl 6 can fix that.

I don't think that it's a point of ignorance.
Especially as they (and enough other people on the web) only seem to be
ignorant regarding Perl. Strange, huh?


Regarding perl6, yes. Regarding all of perl, not so much. I think this goes back to "perl 6 is late," which is really just another way of saying "perl 6 has taken a long time." Since perl5 is a functioning, popular language, it's not like there's an "incredibly popular scripting language" gap...
By the way, I'm still waiting to meet your cadre of Dylan hackers.

This little snip is especially interesting since working with early versions of perl 6 required mastering Haskell, a language that doubled in popularity when the 2 computer scientists using it were joined by Luke Palmer and Autrijus Tang.

Keep yer stick on the ice.

=Austin


Reply via email to