Miko O'Sullivan wrote:
> We already have the ability to embed foreign languages (XML, HTML,
> whatever) using here docs:
>
>   $myml = MyXmlParser->new(<< '(MARKUP)');
>   <thingy>
>     <blah>blah blah</blah>
>   </thingy>
>   (MARKUP)

True, but what kind of magic is hiding inside MyXmlParser?

One problem is that writing MyXmlParser to parse and validate XML and
then generate some corresponding Perl data structure is difficult and
error-prone.

In the simple case, XML::Simple is your friend. But as Robin points
out, the simple approach falls down when you need finer control over
what you're doing.

You can use the XML::Schema modules (if you're feeling brave) and that
will generate a validating parser with control over the generated data
structure. But it's big and bulky and the complexities of XML Schema
itself make it a daunting task.

There are various other modules and techniques which can achieve the
desired result, but I've yet to find one that was both easy to use and
powerful (although I need to check out those links that Robin posted).

So I'm thinking that if the Perl 6 parser is as flexible and powerful
as promised, can we adapt it to simplify the task of parsing XML into
internal data structures?

One benefit of inlined XML over the example above is that it would be
parsed at compile time, not runtime. When our modified parser sees
this:

    use Perl6::XML;

    <thingy>
      <blah>blah blah</blah>
    </thingy>

It would effectively re-write it as if written:

    my $thingy = {
        blah => 'blah blah',
    }

and then generate the appropriate opcodes to implement it at runtime.

A further benefit would be that your parsed and validated XML markup
could then be stored as Parrot bytecode. You would effectively be
"compiling" XML into bytecode that you could load into other programs
with a simple "use". That would be neat.

As and when we need more control over the XML validation or code
generation, we would write our own modified XML grammar modules.
Apocalypse 5 suggests this would be a simple matter of defining a few
new 'rule' constructs.

For example, we might want to add a rule for matching thingy/blah that
constructs a list rather than a scalar.
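
To make the intended mapping concrete, here is a rough Perl 5 sketch of
the idea. It is only an illustration under heavy assumptions: it runs
at runtime rather than compile time, it uses a toy regex that copes
with nothing beyond a root element containing simple text children, and
both the xml_to_hash name and its "force list" argument are
hypothetical stand-ins for whatever a Perl 6 grammar rule would really
look like.

```perl
use strict;
use warnings;

# Toy converter, NOT a real XML parser: handles a single root element
# whose children are plain <name>text</name> elements (no attributes,
# no nesting, no entities).
# %force_list is a hypothetical stand-in for a custom grammar rule:
# child names listed there are always built as array refs.
sub xml_to_hash {
    my ($xml, %force_list) = @_;

    my ($root, $body) = $xml =~ m{\A\s*<(\w+)>(.*)</\1>\s*\z}s
        or die "unsupported markup\n";

    my %data;
    while ($body =~ m{<(\w+)>([^<]*)</\1>}g) {
        my ($name, $text) = ($1, $2);
        if ($force_list{$name} || ref($data{$name})) {
            # Forced list, or a list already started: append.
            push @{ $data{$name} }, $text;
        }
        elsif (exists $data{$name}) {
            # Second occurrence of a scalar: promote it to a list.
            $data{$name} = [ $data{$name}, $text ];
        }
        else {
            $data{$name} = $text;
        }
    }
    return ($root, \%data);
}

my $markup = "<thingy>\n  <blah>blah blah</blah>\n</thingy>\n";

# Default behaviour: a single child becomes a scalar.
my ($name, $plain) = xml_to_hash($markup);
printf "%s: blah is the scalar '%s'\n", $name, $plain->{blah};

# With the hypothetical "rule" for blah: the same child becomes a list.
my (undef, $listy) = xml_to_hash($markup, blah => 1);
printf "%s: blah is a list of %d item(s)\n",
    $name, scalar @{ $listy->{blah} };
```

The second call plays the part of the custom rule: the same markup
yields an array ref for blah instead of a scalar, similar in spirit to
XML::Simple's ForceArray option.
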
Thus, the XML would be parsed as if written:

    my $thingy = {
        blah => [ 'blah blah' ],
    }

This is all largely hypothetical, of course. Hence the continued hand
waving and general lack of detail. Consider it an open thought in
process. :-)

A