On Thu, Nov 25, 2004 at 02:31:46PM +1100, Adam Kennedy wrote:
: >Let's say you want to write a yacc grammar to parse Perl 6, or
: >Parse::RecDescent, or whatever you're going to use. Yes, that will be
: >hard in Perl 6. Certainly harder than it was in Perl 5.
:
: In the end, I concluded there was _no_ way to write even a Perl 5 parser
: using any sort of pre-rolled grammar system, as the language does not
: have that sort of structure.

On that level you have to think of Perl as multiple languages, not a
single language. That in itself should not be a problem, though.

: PPI was done "the hard way". Manually stepping through line by line and
: using a variety of cruft (some stolen from the perl source, some my own)
: to make it "just work".
:
: I would envisage that the same would be true of writing a PPI6, except
: with a hell of a lot more operators :)

The number of operators is a bit of a red herring. What you really
don't like is that there isn't a fixed number of them. :-)

: >However, Perl 6 comes packaged with its own grammar, in Perl's own rule
: >format. So now the quote "only perl can parse Perl" may become "only
: >Perl can parse Perl" (And even "only Perl can parse perl", since it's
: >written in itself :-).
: >
: >Perl's contextual sensitivity is part of the language. So the best you
: >can do is to track everything like you mentioned. It's going to be
: >impossible to parse Perl without having perl around to do it for you.
: >
: >But using the built-in grammar, you can read in a program, macros and
: >all, and get an annotated source tree back, that you could rebuild the
: >source out of.
:
: Again, this is of very little use, effectively destroying the source
: code and replacing it with different source that is a serialised version
: of the tree.

And there you put your finger onto the real problem, which is not that
Perl is a mutating language or that it has a lot of operators, but that
in the process of getting from here to there, it *forgets* how it got
there, so there's no way of getting back to here.

: This is what I am talking about when I refer to the "Frontpage" effect,
: the habit Microsoft's HTML editor (especially the early versions) had of
: rebuilding your HTML document from scratch, deleting all your template
: variables and PHP code and generally making it impossible to write HTML
: by hand.
: For HTML where you aren't MEANT to be writing stuff by hand
: under normal circumstances that wasn't always a problem, but perl _is_
: meant to be written by hand.

But under another view, explosions of opcodes are just part of the
compilation process. Again, the real problem is the forgetting of both
the original structure and what it means in the context of the language
that was being parsed at the time.

There is no doubt that source filters are much too crude, and forget
way too much. That's why we're trying to kill them dead in Perl 6.

I think the real question is how far we can push Perl 6's macro system
without forgetting anything you want to know about the structure of the
program. Obviously AST macros will have an easier time of it than
textual macros. An AST macro can just automatically attach the original
parse and context as properties on the top of the new AST. To keep this
info around for textual macros will require a bit more trickery, but we
have to do it anyway for activities like debugging. So if we can see
that in the larger context of preserving the entire compilation audit
trail, all the better.

: > You could even grab the comments and do something sick
: >with them (see Damian :-). Or better yet, do something that PPI
: >doesn't, and add some sub call around all statements, or determine the
: >meaning of brackets in a particular context.
: >
: >The question of whether to execute BEGIN blocks is a tricky one.
: >Sometimes they change the parse of the program. Sometimes they do other
: >stuff. All you can hope for is that people understand the difference
: >between BEGIN (change parsing) and INIT (do before the program starts).
:
: Frankly that is a gaping security hole...
: not only do I have to still
: deal with the problem of loading every single dependency or having no
: parsing ability otherwise, but I am required to "trust" every perl
: programmer on the planet :(

Another red herring--we've always had fairly strict accountability on
the language-warping dependencies at the "use" level. We're improving
that in Perl 6 by requiring a decision on version at "use" time, and
making that version a part of the metadata.

But it's no accident that one of the places that Perl 5's B::Deparse
has troubles is right at the BEGIN boundaries. Wherever Deparse has
troubles, you can read that to mean I didn't understand that I should
put something into Perl 5 to remember something important.

The final metadata for the compiled program has to be able to tell you
which chunks of program were compiled under which language. That's just
as important as being able to track back to the source line number.

: >I love PPI, by the way :-)
:
: Thank you, I do too :)

I'll have to look at it more closely. It seems like PPI might be a good
foundation on which to construct a Perl 5-to-6 translator.

: But I'd like to still have something like it in perl6 :(

Obviously there will be some subset of Perl 6 dialects for which you
can do that. The only question is how we can maximize the set of
dialects you can track, and efficiently inform people when they are
transgressing that boundary. But we're not going to back off of the
dialectical approach.

One of those subsets of Perl 6 is going to be Standard Perl in 2020.
I'm not smart enough to know which one of those it'll be, or hubristic
enough to try to settle the question in advance by fiat. But I do think
we have to get sufficient control of the "genetics" that we can track
the very information that you desire. You can't do what you want
without knowing which dialect you have in front of you.
And I have a strong hope that Perl 6's approach of specifying dialects
within the language rather than through some external yacc grammar
will, in the long run, help you to track this information rather than
forget it.

Larry
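
As an illustration of the point above about AST macros attaching the
original parse to the new tree, here is a minimal sketch in Python. It
is not Perl 6 and not PPI: the `Node` class, the `expand_unless` macro,
and the `origin` and `dialect` fields are hypothetical names invented
for this example. It only shows that a rewrite which keeps the
pre-expansion node on the new tree can still get back to the text the
programmer wrote, and can still say which dialect parsed it.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    kind: str                         # e.g. "unless", "if", "not", "raw"
    text: str = ""                    # exact source text this node was parsed from
    children: list = field(default_factory=list)
    dialect: str = "standard"         # which language variant parsed this chunk
    origin: Optional["Node"] = None   # pre-expansion parse, if a macro rewrote it


def expand_unless(node: Node) -> Node:
    """Hypothetical AST macro: rewrite `unless C B` as `if (not C) B`."""
    cond, body = node.children
    negated = Node("not", children=[cond], dialect=node.dialect)
    # Attach the original parse to the top of the new tree instead of dropping it.
    return Node("if", children=[negated, body], dialect=node.dialect, origin=node)


def original_source(node: Node) -> str:
    """Follow the audit trail back to the text the programmer wrote."""
    while node.origin is not None:
        node = node.origin
    return node.text


src = "unless $done { work() }"
parsed = Node(
    "unless",
    text=src,
    children=[Node("raw", text="$done"), Node("raw", text="{ work() }")],
)
expanded = expand_unless(parsed)
```

Here `expanded` is an `if` node as far as later compilation stages are
concerned, while `original_source(expanded)` still returns the `unless`
statement exactly as written. A textual macro would need more
bookkeeping to keep the same trail, which is the "trickery" mentioned
above.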