hi,
attached a patch for languages/PIR, fixing:
* changed tabs to spaces in pir.pg, and trimmed all trailing spaces
(this might look better on linux? not sure)
* minor changes in pir.pg
* added a docs directory
* added pirgrammar.pod file, a human-readable version (with some
changes) of pir.pg
This file may, in the end, be the basis for PDD19: PIR, if that's
desirable. Because this file is easier to read, it might also help
newcomers to PIR learn its syntax. Furthermore, it could be used to
discuss whether some current features of IMCC should be removed
(clean up the PIR grammar; now's the chance, after a 1.0 release it's fixed)
* HTML version of pirgrammar.pod: pirgrammar.html
* added a few more tests.
When I made the patch, *again* it contained the contents of the new
files twice. I manually removed the double contents from the patch file.
regards,
klaas-jan
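
P.S. For anyone who wants to sanity-check the lexical conventions in
pirgrammar.pod without building the PGE grammar, here is a rough Python
sketch. It is not part of the patch. The pattern names and the classify()
helper are made up for illustration, and it assumes that real registers
take one or two digits and that virtual register numbers may start at 0.

```python
import re

# Patterns transcribed from the LEXICAL CONVENTIONS section of pirgrammar.pod.
# All names here are hypothetical; pir.pg itself is a PGE grammar, not Python.
REAL_REGISTER = re.compile(r"[SNIP][0-9]{1,2}")    # assumed: n in 0..99
VIRTUAL_REGISTER = re.compile(r"\$[SNIP][0-9]+")   # assumed: n may be 0
FLOAT_CONSTANT = re.compile(r"[0-9]+\.[0-9]+")
INT_CONSTANT = re.compile(r"[0-9]+")
IDENTIFIER = re.compile(r"[_a-zA-Z][_a-zA-Z0-9]*")


def classify(token):
    """Return the lexical class of a token, or None if nothing matches."""
    # Order matters: a register such as P0 also has the shape of an identifier.
    for name, pattern in (
        ("real_register", REAL_REGISTER),
        ("virtual_register", VIRTUAL_REGISTER),
        ("float_constant", FLOAT_CONSTANT),
        ("int_constant", INT_CONSTANT),
        ("identifier", IDENTIFIER),
    ):
        if pattern.fullmatch(token):
            return name
    return None
```

With these assumptions, classify("$P0") is reported as a virtual register
and classify("x1") as an identifier. String constants are left out on
purpose, since quoting and charset prefixes need more than one regex.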
Index: languages/PIR/docs/pirgrammar.pod
===================================================================
--- languages/PIR/docs/pirgrammar.pod (revision 0)
+++ languages/PIR/docs/pirgrammar.pod (revision 0)
@@ -0,0 +1,579 @@
+=head1 NAME
+
+pirgrammar.pod - The Grammar of languages/PIR
+
+=head1 DESCRIPTION
+
+This document provides a more readable grammar of languages/PIR. The actual input
+for PGE is a bit more complex. This human-readable grammar omits error handling
+and some other details that are unimportant for the PIR reference.
+
+
+=head1 STATUS
+
+For a list of bugs and issues, see the section KNOWN ISSUES AND BUGS.
+
+The grammar includes some constructs that *are* in the IMCC parser,
+but are not implemented. An example of this is the ".global" directive.
+
+
+=head1 VERSION
+
+Version: Saturday Feb. 3rd 2007.
+(not a version number yet, as many improvements are to be expected at this point).
+
+
+=head1 LEXICAL CONVENTIONS
+
+
+=head2 PIR Directives
+
+PIR has a number of directives. All directives start with a dot. Macro identifiers
+(when using a macro, on expansion) also start with a dot (see below). Therefore,
+it is important not to use any of the PIR directives as a macro identifier. The
+PIR directives are:
+
+ .arg .invocant .pcc_call
+ .const .lex .pcc_end_return
+ .emit .line .pcc_end_yield
+ .end .loadlib .pcc_end
+ .endnamespace .local .pcc_sub
+ .eom .meth_call .pragma
+ .get_results .namespace .return
+ .global .nci_call .result
+ .HLL_map .param .sub
+ .HLL .pcc_begin_return .sym
+ .immediate .pcc_begin_yield .yield
+ .include .pcc_begin
+
+
+=head2 Registers
+
+PIR has two types of registers: real registers and virtual or temporary registers.
+Real registers are actual registers in the Parrot VM, and are written like:
+
+ [S|N|I|P]n, where n is an integer from 0 up to, but not including, 100.
+
+Virtual, or temporary registers are written like:
+
+ $[S|N|I|P]n, where n is a non-negative integer.
+
+
+
+
+=head2 Constants
+
+An integer constant is a string of one or more digits.
+Examples: 0, 42.
+
+A floating-point constant is a string of one or more digits, followed by a dot
+and one or more digits. Examples: 1.1, 42.567.
+
+A string constant is a single or double quoted series of characters.
+Examples: 'hello world', "Parrot".
+
+TODO: PMC constants.
+
+=head2 Identifiers
+
+An identifier starts with a character from [_a-zA-Z], followed by
+zero or more characters from [_a-zA-Z0-9].
+
+Examples: x, x1, _foo.
+
+
+=head2 Labels
+
+A label is an identifier with a colon attached to it.
+
+Examples: LABEL:
+
+
+=head2 Macro identifiers
+
+A macro identifier is an identifier prefixed with a dot. A macro
+identifier is used when I<expanding> the macro (on usage), not in
+the macro definition.
+
+Examples: .Macro
+
+
+
+=head1 GRAMMAR RULES
+
+A PIR program consists of one or more compilation units. A compilation unit
+is a global, sub, constant or macro definition, or a pragma or emit block.
+PIR is a line-oriented language, which means that each statement ends in a
+newline (indicated as "nl"). Moreover, compilation units are always separated
+by a newline.
+
+ program:
+ compilation_unit [ nl compilation_unit ]*
+
+ compilation_unit:
+ global_def
+ | sub_def
+ | const_def
+ | macro_def
+ | pragma
+ | emit
+
+ sub_def:
+ [ ".sub" | ".pcc_sub" ] sub_id sub_pragmas? nl body
+
+ sub_id:
+ identifier | string_constant
+
+NOTE: the subpragmas may or may not be separated by a comma.
+
+ sub_pragmas:
+ sub_pragma [ ","? sub_pragma ]*
+
+
+ sub_pragma:
+ ":load"
+ | ":init"
+ | ":immediate"
+ | ":main"
+ | ":anon"
+ | ":lex"
+ | wrap_pragma
+ | vtable_pragma
+ | multi_pragma
+ | outer_pragma
+
+ wrap_pragma:
+ ":wrap" parenthesized_string
+
+ vtable_pragma:
+ ":vtable" parenthesized_string?
+
+ parenthesized_string:
+ "(" string_constant ")"
+
+ multi_pragma:
+ ":multi" "(" multi_types? ")"
+
+ outer_pragma:
+ ":outer" "(" sub_id ")"
+
+ multi_types:
+ multi_type [ "," multi_type ]*
+
+ multi_type:
+ type
+ | "_"
+ | keylist
+ | identifier
+ | string_constant
+
+ body:
+ param_decl*
+ labeled_pir_instr*
+ ".end"
+
+ param_decl:
+ ".param" [ [ type identifier ] | register ] get_flags? nl
+
+ labeled_pir_instr:
+ label? instr nl
+
+ labeled_pasm_instr:
+ label? pasm_instr nl
+
+ instr:
+ pir_instr | pasm_instr
+
+NOTE: the rule 'pasm_instr' is not included in this reference grammar. pasm_instr
+defines the syntax for pure PASM instructions.
+
+ pir_instr:
+ local_decl
+ | lexical_decl
+ | const_def
+ | conditional_stat
+ | assignment_stat
+ | open_namespace
+ | close_namespace
+ | return_stat
+ | sub_invocation
+ | macro_invocation
+ | jump_stat
+ | source_info
+
+ macro_invocation:
+ macro_id parenthesized_args?
+
+ local_decl:
+ [ ".local" | ".sym" ] type local_id_list
+
+ local_id_list:
+ local_id [ "," local_id ]*
+
+ local_id:
+ identifier ":unique_reg"?
+
+ lexical_decl:
+ ".lex" string_constant "," target
+
+
+ global_def:
+ ".global" identifier
+
+ const_def:
+ ".const" type identifier "=" constant_expr
+
+
+ conditional_stat:
+ [ "if" | "unless" ]
+ [ [ "null" target ]
+ | [ simple_expr [ relational_op simple_expr ]? ]
+ ] "goto" identifier
+
+ jump_stat:
+ "goto" identifier
+
+ relational_op:
+ "=="
+ | "!="
+ | "<="
+ | "<"
+ | ">="
+ | ">"
+
+ binary_op:
+ "+"
+ | "-"
+ | "/"
+ | "**"
+ | "*"
+ | "%"
+ | "<<"
+ | ">>>"
+ | ">>"
+ | "&&"
+ | "||"
+ | "~~"
+ | "|"
+ | "&"
+ | "~"
+ | "."
+
+
+ assign_op:
+ "+="
+ | "-="
+ | "/="
+ | "%="
+ | "*="
+ | ".="
+ | "&="
+ | "|="
+ | "~="
+ | "<<="
+ | ">>="
+ | ">>>="
+
+ unary_op:
+ "!"
+ | "-"
+ | "~"
+
+ expression:
+ simple_expr
+ | simple_expr binary_op simple_expr
+ | unary_op simple_expr
+
+ simple_expr:
+ float_constant
+ | int_constant
+ | string_constant
+ | target
+
+
+ keylist:
+ "[" keys "]"
+
+ keys:
+ key [ sep key ]*
+
+ sep:
+ "," | ";"
+
+ key:
+ simple_expr
+ | simple_expr ".."
+ | ".." simple_expr
+ | simple_expr ".." simple_expr
+
+
+ assignment_stat:
+ target "=" short_sub_call
+ | target "=" target keylist
+ | target "=" expression
+ | target "=" "new" [ int_constant | string_constant | macro_id ]
+ | target "=" "new" keylist
+ | target "=" "find_type" [ string_constant | string_reg | id ]
+ | target "=" heredoc
+ | target "=" "global" string_constant
+ | target assign_op simple_expr
+ | target keylist "=" simple_expr
+ | "global" string_constant "=" target
+ | result_var_list "=" short_sub_call
+
+
+
+NOTE: the heredoc rules are not complete or tested. Some work is required here.
+
+ heredoc:
+ "<<" string_constant nl
+ heredoc_string
+ heredoc_label
+
+ heredoc_label:
+ ^^ identifier
+
+ heredoc_string:
+ [ \N | \n ]*
+
+
+ long_sub_call:
+ ".pcc_begin" nl
+ arguments
+ [ method_call | non_method_call] target nl
+ [ local_decl nl ]*
+ result_values
+ ".pcc_end"
+
+
+ non_method_call:
+ ".pcc_call" | ".nci_call"
+
+ method_call:
+ ".invocant" target nl
+ ".meth_call"
+
+ short_sub_call:
+ invocant? [ target | string_constant ] parenthesized_args
+
+ invocant:
+ [ target "." | target "->" ]
+
+ sub_invocation:
+ long_sub_call | short_sub_call
+
+ result_var_list:
+ "(" result_vars ")"
+
+ result_vars:
+ result_var [ "," result_var ]*
+
+ result_var:
+ target get_flags?
+
+
+ parenthesized_args:
+ "(" args? ")"
+
+ args:
+ arg [ "," arg ]*
+
+ arg:
+ [ float_constant
+ | int_constant
+ | string_constant [ "=>" target ]?
+ | target
+ ]
+ set_flags?
+
+
+ arguments:
+ [ ".arg" simple_expr set_flags? nl ]*
+
+ result_values:
+ [ ".result" target get_flags? nl ]*
+
+ set_flags:
+ [ ":flat"
+ | named_flag
+ ]+
+
+ get_flags:
+ [ ":slurpy"
+ | ":optional"
+ | ":opt_flag"
+ | named_flag
+ ]+
+
+
+ named_flag:
+ ":named" parenthesized_string?
+
+ return_stat:
+ long_return_stat
+ | short_return_stat
+ | long_yield_stat
+ | short_yield_stat
+ | tail_call
+
+
+ long_return_stat:
+ ".pcc_begin_return" nl
+ return_directive*
+ ".pcc_end_return"
+
+ short_return_stat:
+ ".return" parenthesized_args
+
+ long_yield_stat:
+ ".pcc_begin_yield" nl
+ return_directive*
+ ".pcc_end_yield"
+
+ return_directive:
+ ".return" simple_expr set_flags? nl
+
+ short_yield_stat:
+ ".yield" parenthesized_args
+
+ tail_call:
+ ".return" short_sub_call
+
+ open_namespace:
+ ".namespace" identifier
+
+ close_namespace:
+ ".endnamespace" identifier
+
+
+NOTE: an emit block only allows PASM instructions,
+not PIR instructions.
+
+
+ emit:
+ ".emit" nl
+ labeled_pasm_instr*
+ ".eom"
+
+NOTE: the macro definition is not complete, and untested.
+This should be fixed. For now, all characters up to but not
+including ".endm" are 'matched'.
+
+
+ macro_def:
+ ".macro" identifier macro_parameters? nl
+ macro_body
+
+ macro_parameters:
+ "(" id_list? ")"
+
+ macro_body:
+ .*?
+ ".endm" nl
+
+ pragma:
+ include
+ | new_operators
+ | loadlib
+ | namespace
+ | hll_mapping
+ | hll_specifier
+ | source_info
+
+
+ include:
+ ".include" string_constant
+
+ new_operators:
+ ".pragma" "n_operators" int_constant
+
+ loadlib:
+ ".loadlib" string_constant
+
+ namespace:
+ ".namespace" [ "[" namespace_id "]" ]?
+
+ hll_specifier:
+ ".HLL" string_constant "," string_constant
+
+ hll_mapping:
+ ".HLL_map" int_constant "," int_constant
+
+ namespace_id:
+ string_constant [ ";" string_constant ]*
+
+
+NOTE: currently, the line directive is implemented in IMCC as #line.
+See the PROPOSALS document for more information on this.
+
+ source_info:
+ ".line" int_constant [ "," string_constant ]?
+
+ id_list:
+ identifier [ "," identifier ]*
+
+ string_constant:
+ charset_specifier? quoted_string
+
+ charset_specifier:
+ "ascii:"
+ | "binary:"
+ | "unicode:"
+ | "iso-8859-1:"
+
+
+ type:
+ "int"
+ | "num"
+ | "pmc"
+ | "object"
+ | "string"
+ | "Array"
+ | "Hash"
+
+ target:
+ identifier | register
+
+
+
+=head1 AUTHOR
+
+
+Klaas-Jan Stol [EMAIL PROTECTED]
+
+
+=head1 KNOWN ISSUES AND BUGS
+
+Some work should be done on:
+
+=over 4
+
+=item *
+
+Macro parsing
+
+=item *
+
+Heredoc parsing
+
+=item *
+
+The rule 'type' does not currently include custom (user-defined) types.
+Probably it needs an alternative "identifier". Not sure yet at this point.
+
+=item *
+
+Clean up grammar, remove never-used features.
+
+=item *
+
+Test. A lot.
+
+=back
+
+Bug reports and improvements may be sent to the author, and are of course
+greatly appreciated. Moreover, if you find any constructs that are in IMCC
+but missing here, pointers to these would be appreciated as well.
+
+=cut
+
Index: languages/PIR/docs/pirgrammar.html
===================================================================
--- languages/PIR/docs/pirgrammar.html (revision 0)
+++ languages/PIR/docs/pirgrammar.html (revision 0)
@@ -0,0 +1,662 @@
+<?xml version="1.0" ?>
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
+<html xmlns="http://www.w3.org/1999/xhtml">
+<head>
+<title>pirgrammar.pod - The Grammar of languages/PIR</title>
+<meta http-equiv="content-type" content="text/html; charset=utf-8" />
+<link rev="made" href="mailto:" />
+</head>
+
+<body style="background-color: white">
+
+<p><a name="__index__"></a></p>
+<!-- INDEX BEGIN -->
+
+<ul>
+
+ <li><a href="#name">NAME</a></li>
+ <li><a href="#description">DESCRIPTION</a></li>
+ <li><a href="#status">STATUS</a></li>
+ <li><a href="#version">VERSION</a></li>
+ <li><a href="#lexical_conventions">LEXICAL CONVENTIONS</a></li>
+ <ul>
+
+ <li><a href="#pir_directives">PIR Directives</a></li>
+ <li><a href="#registers">Registers</a></li>
+ <li><a href="#constants">Constants</a></li>
+ <li><a href="#identifiers">Identifiers</a></li>
+ <li><a href="#labels">Labels</a></li>
+ </ul>
+
+ <li><a href="#grammar_rules">GRAMMAR RULES</a></li>
+ <li><a href="#author">AUTHOR</a></li>
+ <li><a href="#known_issues_and_bugs">KNOWN ISSUES AND BUGS</a></li>
+</ul>
+<!-- INDEX END -->
+
+<hr />
+<p>
+</p>
+<h1><a name="name">NAME</a></h1>
+<p>pirgrammar.pod - The Grammar of languages/PIR</p>
+<p>
+</p>
+<hr />
+<h1><a name="description">DESCRIPTION</a></h1>
+<p>This document provides a more readable grammar of languages/PIR. The actual input
+for PGE is a bit more complex. This human-readable grammar omits error handling
+and some other details that are unimportant for the PIR reference.</p>
+<p>
+</p>
+<hr />
+<h1><a name="status">STATUS</a></h1>
+<p>For a list of bugs and issues, see the section KNOWN ISSUES AND BUGS.</p>
+<p>The grammar includes some constructs that *are* in the IMCC parser,
+but are not implemented. An example of this is the ".global" directive.</p>
+<p>
+</p>
+<hr />
+<h1><a name="version">VERSION</a></h1>
+<p>Version: Saturday Feb. 3rd 2007.
+(not a version number yet, as many improvements are to be expected at this point).</p>
+<p>
+</p>
+<hr />
+<h1><a name="lexical_conventions">LEXICAL CONVENTIONS</a></h1>
+<p>
+</p>
+<h2><a name="pir_directives">PIR Directives</a></h2>
+<p>PIR has a number of directives. All directives start with a dot. Macro identifiers
+(when using a macro, on expansion) also start with a dot (see below). Therefore,
+it is important not to use any of the PIR directives as a macro identifier. The
+PIR directives are:</p>
+<pre>
+ .arg .invocant .pcc_call
+ .const .lex .pcc_end_return
+ .emit .line .pcc_end_yield
+ .end .loadlib .pcc_end
+ .endnamespace .local .pcc_sub
+ .eom .meth_call .pragma
+ .get_results .namespace .return
+ .global .nci_call .result
+ .HLL_map .param .sub
+ .HLL .pcc_begin_return .sym
+ .immediate .pcc_begin_yield .yield
+ .include .pcc_begin</pre>
+<p>
+</p>
+<h2><a name="registers">Registers</a></h2>
+<p>PIR has two types of registers: real registers and virtual or temporary registers.
+Real registers are actual registers in the Parrot VM, and are written like:
+</p>
+<pre>
+
+ [S|N|I|P]n, where n is an integer from 0 up to, but not including, 100.</pre>
+<p>
+Virtual, or temporary registers are written like:
+</p>
+<pre>
+ $[S|N|I|P]n, where n is a non-negative integer.</pre>
+<p>
+</p>
+<h2><a name="constants">Constants</a></h2>
+<p>An integer constant is a string of one or more digits.
+Examples: 0, 42.</p>
+<p>A floating-point constant is a string of one or more digits, followed by a dot
+and one or more digits. Examples: 1.1, 42.567.</p>
+<p>A string constant is a single or double quoted series of characters.
+Examples: 'hello world', "Parrot".</p>
+<p>TODO: PMC constants.</p>
+<p>
+</p>
+<h2><a name="identifiers">Identifiers</a></h2>
+<p>An identifier starts with a character from [_a-zA-Z], followed by
+zero or more characters from [_a-zA-Z0-9].</p>
+<p>Examples: x, x1, _foo.</p>
+<p>
+</p>
+<h2><a name="labels">Labels</a></h2>
+<p>A label is an identifier with a colon attached to it.</p>
+<p>Examples: LABEL:</p>
+<p>
+</p>
+<h2><a name="macro_identifiers">Macro identifiers</a></h2>
+<p>A macro identifier is an identifier prefixed with a dot. A macro
+identifier is used when <em>expanding</em> the macro (on usage), not in
+the macro definition.
+</p>
+<p>Examples: .Macro</p>
+<p>
+</p>
+<hr />
+<h1><a name="grammar_rules">GRAMMAR RULES</a></h1>
+<p>A PIR program consists of one or more compilation units. A compilation unit
+is a global, sub, constant or macro definition, or a pragma or emit block.
+PIR is a line-oriented language, which means that each statement ends in a
+newline (indicated as "nl"). Moreover, compilation units are always separated
+by a newline.</p>
+<pre>
+ program:
+ compilation_unit [ nl compilation_unit ]*
+
+ compilation_unit:
+ global_def
+ | sub_def
+ | const_def
+ | macro_def
+ | pragma
+ | emit
+
+ sub_def:
+ [ ".sub" | ".pcc_sub" ] sub_id sub_pragmas? nl body
+
+ sub_id:
+ identifier | string_constant</pre>
+<p>NOTE: the subpragmas may or may not be separated by a comma.
+</p>
+<pre>
+
+ sub_pragmas:
+ sub_pragma [ ","? sub_pragma ]*</pre>
+<pre>
+
+ sub_pragma:
+ ":load"
+ | ":init"
+ | ":immediate"
+ | ":main"
+ | ":anon"
+ | ":lex"
+ | wrap_pragma
+ | vtable_pragma
+ | multi_pragma
+ | outer_pragma</pre>
+<pre>
+
+ wrap_pragma:
+ ":wrap" parenthesized_string</pre>
+<pre>
+
+ vtable_pragma:
+ ":vtable" parenthesized_string?</pre>
+<pre>
+
+ parenthesized_string:
+ "(" string_constant ")"</pre>
+<pre>
+
+ multi_pragma:
+ ":multi" "(" multi_types? ")"</pre>
+<pre>
+
+ outer_pragma:
+ ":outer" "(" sub_id ")"</pre>
+<pre>
+
+ multi_types:
+ multi_type [ "," multi_type ]*</pre>
+<pre>
+
+ multi_type:
+ type
+ | "_"
+ | keylist
+ | identifier
+ | string_constant</pre>
+<pre>
+
+ body:
+ param_decl*
+ labeled_pir_instr*
+ ".end"</pre>
+<pre>
+
+ param_decl:
+ ".param" [ [ type identifier ] | register ] get_flags? nl</pre>
+<pre>
+
+ labeled_pir_instr:
+ label? instr nl</pre>
+<pre>
+
+ labeled_pasm_instr:
+ label? pasm_instr nl</pre>
+<pre>
+
+ instr:
+ pir_instr | pasm_instr</pre>
+<p>NOTE: the rule 'pasm_instr' is not included in this reference grammar. pasm_instr
+defines the syntax for pure PASM instructions.
+</p>
+<pre>
+
+ pir_instr:
+ local_decl
+ | lexical_decl
+ | const_def
+ | conditional_stat
+ | assignment_stat
+ | open_namespace
+ | close_namespace
+ | return_stat
+ | sub_invocation
+ | macro_invocation
+ | jump_stat
+ | source_info</pre>
+<pre>
+
+ macro_invocation:
+ macro_id parenthesized_args?</pre>
+<pre>
+
+ local_decl:
+ [ ".local" | ".sym" ] type local_id_list</pre>
+<pre>
+
+ local_id_list:
+ local_id [ "," local_id ]*</pre>
+<pre>
+
+ local_id:
+ identifier ":unique_reg"?</pre>
+<pre>
+
+ lexical_decl:
+ ".lex" string_constant "," target</pre>
+<pre>
+
+ global_def:
+ ".global" identifier</pre>
+<pre>
+
+ const_def:
+ ".const" type identifier "=" constant_expr
+</pre>
+<pre>
+
+ conditional_stat:
+ [ "if" | "unless" ]
+ [ [ "null" target ]
+ | [ simple_expr [ relational_op simple_expr ]? ]
+ ] "goto" identifier
+</pre>
+<pre>
+
+ jump_stat:
+ "goto" identifier
+</pre>
+<pre>
+
+ relational_op:
+ "=="
+ | "!="
+ | "&lt;="
+ | "&lt;"
+ | "&gt;="
+ | "&gt;"
+</pre>
+<pre>
+
+ binary_op:
+ "+"
+ | "-"
+ | "/"
+ | "**"
+ | "*"
+ | "%"
+ | "&lt;&lt;"
+ | "&gt;&gt;&gt;"
+ | "&gt;&gt;"
+ | "&amp;&amp;"
+ | "||"
+ | "~~"
+ | "|"
+ | "&amp;"
+ | "~"
+ | "."
+</pre>
+<pre>
+
+ assign_op:
+ "+="
+ | "-="
+ | "/="
+ | "%="
+ | "*="
+ | ".="
+ | "&amp;="
+ | "|="
+ | "~="
+ | "&lt;&lt;="
+ | "&gt;&gt;="
+ | "&gt;&gt;&gt;="
+</pre>
+<pre>
+
+ unary_op:
+ "!"
+ | "-"
+ | "~"
+</pre>
+<pre>
+
+ expression:
+ simple_expr
+ | simple_expr binary_op simple_expr
+ | unary_op simple_expr
+</pre>
+<pre>
+
+ simple_expr:
+ float_constant
+ | int_constant
+ | string_constant
+ | target
+</pre>
+<pre>
+
+ keylist:
+ "[" keys "]"
+</pre>
+<pre>
+
+ keys:
+ key [ sep key ]*
+</pre>
+<pre>
+
+ sep:
+ "," | ";"
+</pre>
+<pre>
+
+ key:
+ simple_expr
+ | simple_expr ".."
+ | ".." simple_expr
+ | simple_expr ".." simple_expr
+</pre>
+<pre>
+
+ assignment_stat:
+ target "=" short_sub_call
+ | target "=" target keylist
+ | target "=" expression
+ | target "=" "new" [ int_constant | string_constant | macro_id ]
+ | target "=" "new" keylist
+ | target "=" "find_type" [ string_constant | string_reg | id ]
+ | target "=" heredoc
+ | target "=" "global" string_constant
+ | target assign_op simple_expr
+ | target keylist "=" simple_expr
+ | "global" string_constant "=" target
+ | result_var_list "=" short_sub_call
+</pre>
+<p>
+NOTE: the heredoc rules are not complete or tested. Some work is required here.
+</p>
+<pre>
+ heredoc:
+ "&lt;&lt;" string_constant nl
+ heredoc_string
+ heredoc_label
+
+ heredoc_label:
+ ^^ identifier
+
+ heredoc_string:
+ [ \N | \n ]*
+
+
+ long_sub_call:
+ ".pcc_begin" nl
+ arguments
+ [ method_call | non_method_call] target nl
+ [ local_decl nl ]*
+ result_values
+ ".pcc_end"
+
+
+ non_method_call:
+ ".pcc_call" | ".nci_call"
+
+ method_call:
+ ".invocant" target nl
+ ".meth_call"
+
+ short_sub_call:
+ invocant? [ target | string_constant ] parenthesized_args
+
+ invocant:
+ [ target "." | target "->" ]
+
+ sub_invocation:
+ long_sub_call | short_sub_call
+
+ result_var_list:
+ "(" result_vars ")"
+
+ result_vars:
+ result_var [ "," result_var ]*
+
+ result_var:
+ target get_flags?
+
+
+ parenthesized_args:
+ "(" args? ")"
+
+ args:
+ arg [ "," arg ]*
+
+ arg:
+ [ float_constant
+ | int_constant
+ | string_constant [ "=>" target ]?
+ | target
+ ]
+ set_flags?
+
+
+ arguments:
+ [ ".arg" simple_expr set_flags? nl ]*
+
+ result_values:
+ [ ".result" target get_flags? nl ]*
+
+ set_flags:
+ [ ":flat"
+ | named_flag
+ ]+
+
+ get_flags:
+ [ ":slurpy"
+ | ":optional"
+ | ":opt_flag"
+ | named_flag
+ ]+
+
+
+ named_flag:
+ ":named" parenthesized_string?
+
+ return_stat:
+ long_return_stat
+ | short_return_stat
+ | long_yield_stat
+ | short_yield_stat
+ | tail_call
+
+
+ long_return_stat:
+ ".pcc_begin_return" nl
+ return_directive*
+ ".pcc_end_return"
+
+ short_return_stat:
+ ".return" parenthesized_args
+
+ long_yield_stat:
+ ".pcc_begin_yield" nl
+ return_directive*
+ ".pcc_end_yield"
+
+ return_directive:
+ ".return" simple_expr set_flags? nl
+
+ short_yield_stat:
+ ".yield" parenthesized_args
+
+ tail_call:
+ ".return" short_sub_call
+
+ open_namespace:
+ ".namespace" identifier
+
+ close_namespace:
+ ".endnamespace" identifier
+
+</pre>
+<p>NOTE: an emit block only allows PASM instructions,
+not PIR instructions.
+
+</p>
+<pre>
+
+ emit:
+ ".emit" nl
+ labeled_pasm_instr*
+ ".eom"
+
+</pre>
+<p>NOTE: the macro definition is not complete, and untested.
+This should be fixed. For now, all characters up to but not
+including ".endm" are 'matched'.
+
+</p>
+<pre>
+
+ macro_def:
+ ".macro" identifier macro_parameters? nl
+ macro_body
+
+ macro_parameters:
+ "(" id_list? ")"
+
+ macro_body:
+ .*?
+ ".endm" nl
+
+ pragma:
+ include
+ | new_operators
+ | loadlib
+ | namespace
+ | hll_mapping
+ | hll_specifier
+ | source_info
+
+
+ include:
+ ".include" string_constant
+
+ new_operators:
+ ".pragma" "n_operators" int_constant
+
+ loadlib:
+ ".loadlib" string_constant
+
+ namespace:
+ ".namespace" [ "[" namespace_id "]" ]?
+
+ hll_specifier:
+ ".HLL" string_constant "," string_constant
+
+ hll_mapping:
+ ".HLL_map" int_constant "," int_constant
+
+ namespace_id:
+ string_constant [ ";" string_constant ]*
+
+ source_info:
+ ".line" int_constant [ "," string_constant ]?
+
+ id_list:
+ identifier [ "," identifier ]*
+
+ string_constant:
+ charset_specifier? quoted_string
+
+ charset_specifier:
+ "ascii:"
+ | "binary:"
+ | "unicode:"
+ | "iso-8859-1:"
+
+
+ type:
+ "int"
+ | "num"
+ | "pmc"
+ | "object"
+ | "string"
+ | "Array"
+ | "Hash"
+
+ target:
+ identifier | register
+
+
+</pre>
+<p>
+</p>
+<hr />
+<h1><a name="author">AUTHOR</a></h1>
+<p>Klaas-Jan Stol <a href="mailto:[EMAIL PROTECTED]">[EMAIL PROTECTED]</a>
+
+
+</p>
+<p>
+</p>
+<hr />
+<h1><a name="known_issues_and_bugs">KNOWN ISSUES AND BUGS</a></h1>
+<p>Some work should be done on:
+
+</p>
+<ul>
+<li>
+<p>Macro parsing
+
+</p>
+</li>
+<li>
+<p>Heredoc parsing
+
+</p>
+</li>
+<li>
+<p>The rule 'type' does not currently include custom (user-defined) types.
+Probably it needs an alternative "identifier". Not sure yet at this point.
+
+</p>
+</li>
+<li>
+<p>Clean up grammar, remove never-used features.
+
+</p>
+</li>
+<li>
+<p>Test. A lot.
+
+</p>
+</li>
+</ul>
+<p>Bug reports and improvements may be sent to the author, and are of course
+greatly appreciated. Moreover, if you find any constructs that are in IMCC
+but missing here, pointers to these would be appreciated as well.
+
+</p>
+
+</body>
+
+</html>
Index: languages/PIR/lib/pir.pg
===================================================================
--- languages/PIR/lib/pir.pg (revision 16865)
+++ languages/PIR/lib/pir.pg (working copy)
@@ -33,7 +33,7 @@
[ <'.sub'> | <'.pcc_sub'> ]
[ <sub_id> | <syntax_error: sub identifier (id or string constant) expected> ]
<sub_pragmas>?
- [ <?nl> | <syntax_error: newline expected> ]
+ [ <?nl> | <syntax_error: newline expected> ]
<body>
}
@@ -140,15 +140,15 @@
}
rule labeled_pasm_instr {
- [ <label> <pasm_instr>?
+ [ <label> <pasm_instr>?
| <pasm_instr>
]
[ <?nl> | <syntax_error: newline expected after instruction> ]
}
rule instr {
- [ <pir_instr> | <pasm_instr> ]
-
+ [ <pir_instr> | <pasm_instr> ]
+
}
# this is a token, because no spaces are allowed between
@@ -294,18 +294,18 @@
}
rule assign_operator {
- <'+='>
- | <'-='>
- | <'/='>
- | <'%='>
- | <'*='>
- | <'.='>
- | <'&='>
- | <'|='>
- | <'~='>
- | <'<<='>
- | <'>>='>
- | <'>>>='>
+ <'+='>
+ | <'-='>
+ | <'/='>
+ | <'%='>
+ | <'*='>
+ | <'.='>
+ | <'&='>
+ | <'|='>
+ | <'~='>
+ | <'<<='>
+ | <'>>='>
+ | <'>>>='>
}
rule unary_operator {
@@ -336,8 +336,8 @@
}
rule key {
- [ <simple_expr> [ <'..'> [ <simple_expr> ]? ]? ]
- | [ <'..'> <simple_expr> ]
+ [ <simple_expr> [ <'..'> [ <simple_expr> ]? ]? ]
+ | [ <'..'> <simple_expr> ]
}
@@ -352,7 +352,7 @@
| <target> <'='> <'new'> [ <int_constant> | <string_constant> | <macro_id> ]
| <target> <'='> <'new'> <keylist>
| <target> <'='> <'find_type'> [ <string_constant> | <string_reg> | <id> ]
- | <target> <'='> <heredoc_string>
+ | <target> <'='> <heredoc>
| <target> <'='> <'global'> <string_constant> # deprecated?
| <target> <assign_operator> <simple_expr>
| <target> <keylist> <'='> <simple_expr>
@@ -402,7 +402,7 @@
| isntsame
}
-rule heredoc_string {
+rule heredoc {
<'<<'>
[ <string_constant> | <syntax_error: heredoc label identifier expected> ]
[ <?nl> | <syntax_error: newline expected after heredoc label> ]
@@ -434,21 +434,21 @@
]
[ <target> | <syntax_error: id or register expected that holds the sub object> ]
[ <?nl> | <syntax_error: newline after '.pcc_call sub' expected> ]
- [ <local_decl> <?nl> ]*
+ [ <local_decl> <?nl> ]*
<result_values>
[ <'.pcc_end'> | <syntax_error: '.pcc_end' expected> ]
}
rule invocant {
- <'.invocant'>
- <target>
- <?nl>
+ <'.invocant'>
+ <target>
+ <?nl>
}
rule short_sub_call {
[ [ <target>\. ]
| [ <target> <'->'> ]
- ]? # optional invocant
+ ]? # optional invocant
[ <target> | <string_constant> ] # method or sub name/id
<parenthesized_args> # sub args
}
@@ -609,7 +609,7 @@
#
rule macro_def {
<'.macro'>
- [ <id> | <syntax_error: macro identifier expected> ]
+ [ <id> | <syntax_error: macro identifier expected> ]
<macro_parameters>?
[ <?nl> | <syntax_error: newline expected after macro parameter list> ]
<macro_body>
@@ -657,8 +657,8 @@
rule namespace {
<'.namespace'>
[ <'['>
- [ <namespace_id> | <syntax_error: namespace identifier expected> ]
- [ <']'> | <syntax_error: ']' expected> ]
+ [ <namespace_id> | <syntax_error: namespace identifier expected> ]
+ [ <']'> | <syntax_error: ']' expected> ]
]?
}
Index: languages/PIR/t/assign.t
===================================================================
--- languages/PIR/t/assign.t (revision 16865)
+++ languages/PIR/t/assign.t (working copy)
@@ -3,7 +3,7 @@
use strict;
use warnings;
use lib qw(t . lib ../lib ../../lib ../../../lib);
-use Parrot::Test tests => 5;
+use Parrot::Test tests => 6;
use Test::More;
language_output_is( 'PIR_PGE', <<'CODE', <<'OUT', 'simple assignments' );
@@ -85,3 +85,17 @@
"parse" => PMC 'PIRGrammar' { ... }
Parse successful!
OUT
+
+language_output_is( 'PIR_PGE', <<'CODE', <<'OUT', 'string charset modifiers' );
+.sub main
+ .local string s
+ s = ascii:"Hello World"
+ s = binary:"Hello WOrld"
+ s = unicode:"Hello world"
+ s = iso-8859-1:"Hello world"
+.end
+CODE
+"parse" => PMC 'PIRGrammar' { ... }
+Parse successful!
+OUT
+
Index: languages/PIR/t/call.t
===================================================================
--- languages/PIR/t/call.t (revision 16865)
+++ languages/PIR/t/call.t (working copy)
@@ -3,7 +3,7 @@
use strict;
use warnings;
use lib qw(t . lib ../lib ../../lib ../../../lib);
-use Parrot::Test tests => 9;
+use Parrot::Test tests => 10;
use Test::More;
language_output_is( 'PIR_PGE', <<'CODE', <<'OUT', 'long sub invocation' );
@@ -167,3 +167,29 @@
"parse" => PMC 'PIRGrammar' { ... }
Parse successful!
OUT
+
+
+language_output_is( 'PIR_PGE', <<'CODE', <<'OUT', 'short sub call with flags' );
+
+# the sub body is taken from PDD03
+.sub main :main
+ .local pmc x, y
+ foo(x :flat)
+ foo(x, 'y' => y)
+ foo(x, y :named('y'))
+ foo(x :flat :named)
+ foo(a, b, c :flat, 'x' => 3, 'y' => 4, z :flat :named('z'))
+
+ x = foo() # single result
+ (i, j :optional, ar :slurpy, value :named('key') ) = foo()
+.end
+
+.sub foo
+ .return (i, ar :flat, value :named('key') )
+.end
+
+CODE
+"parse" => PMC 'PIRGrammar' { ... }
+Parse successful!
+OUT
+
Index: languages/PIR/t/sub.t
===================================================================
--- languages/PIR/t/sub.t (revision 16865)
+++ languages/PIR/t/sub.t (working copy)
@@ -63,8 +63,16 @@
OUT
language_output_is( 'PIR_PGE', <<'CODE', <<'OUT', 'outer flag' );
+
+.sub outer_sub
+.end
+
+.sub bar :outer(outer_sub)
+.end
+
.sub main :outer('outer_sub')
.end
+
CODE
"parse" => PMC 'PIRGrammar' { ... }
Parse successful!
@@ -179,7 +187,12 @@
language_output_is( 'PIR_PGE', <<'CODE', <<'OUT', 'pcc_sub' );
.pcc_sub x
-
+ .param int i # positional parameter
+ .param pmc argv :slurpy # slurpy array
+ .param pmc value :named('key') # named parameter
+ .param int x :optional # optional parameter
+ .param int has_x :opt_flag # flag 0/1 x was passed
+ .param pmc kw :slurpy :named # slurpy hash
.end
CODE
"parse" => PMC 'PIRGrammar' { ... }