Hello, this mail is an analysis of implementing the ternary operator "?? ::" in pugs. Suggestions are welcome.

I. ?? :: in pugs

The implementation of ?? :: in pugs is buggy. Below are two Perl 6 expressions which pugs does NOT handle correctly.

  1. '(1 ?? 2 :: 3)' should return 2.
     t/03operator.t line 26.
  2. '$t = 1 ?? "true" :: "false"' should make $t = "true".
     t/syntax//decl_vs_assign_prec line 8.

Pugs fails to parse the first one. In the second one, $t is assigned 1 and the whole expression returns "true".

The first error occurs because in Parser.hs, rulePostTernary appears only in ruleExpression, which is a top-level rule. This makes "?? ::" work at the same level as the postfix "if" condition. Inside parentheses, parseOp is used instead of ruleExpression, so "??" fails to parse.

The second error is an extension of the first. Since "?? ::" sits at too high a level in the grammar, the assignment expression is parsed first.

II. Implementations in other compilers.

I looked at several open source compilers that implement ternary operators: gcc-4.0, gcc-4.1, g++-3.3, g++-4.1 and perl-5.9.1.

  gcc-4.0:    yacc (bison) grammar
  gcc-4.1:    hand-written recursive-descent parser in C
  g++-3.3:    yacc (bison) grammar
  g++-4.1:    hand-written recursive-descent parser in C
  perl-5.9.1: yacc (bison) grammar

IIa. gcc-4.0 & g++-3.3, yacc grammar

These are almost the same. First of all, the priority is specified:

    %right <code> '?' ':'

Then the expression grammar is given:

    expr: expr_no_commas
        | expr ',' expr_no_commas
        ;
    expr_no_commas:
          cast_expr
        | ...    // binary operations
        | expr_no_commas '?' expr ':' expr_no_commas
        ;

IIb. gcc-4.1, C code

Here is a draft of the parser, omitting the AST building code and unnecessary parameters.

    c_parser_expression () {
        c_parser_expr_no_commas ();
        while (c_parser_next_token_is (CPP_COMMA)) {
            c_parser_consume_token ();    // consume ','
            c_parser_expr_no_commas ();
        }
    }

    c_parser_expr_no_commas () {
        c_parser_conditional_expression ();
    }

    c_parser_conditional_expression () {
        c_parser_binary_expression ();
        if (c_parser_next_token_is_not (CPP_QUERY))
            return;
        c_parser_consume_token ();    // consume '?'
        c_parser_expression ();
        c_parser_require (CPP_COLON);
        c_parser_conditional_expression ();
    }

    c_parser_binary_expression () {
        /* parse expression
           binary-expression:
               simple-cast-expression
             | binary-expression <token> binary-expression
         */
    }

IIc. g++-4.1, C code

    cp_parser_expression () {
        cp_parser_assignment_expression ();
    }

    cp_parser_assignment_expression () {
        cp_parser_binary_expression ();
        if (cp_lexer_next_token_is (CPP_QUERY))
            cp_parser_question_colon_clause ();
    }

    cp_parser_question_colon_clause () {
        cp_lexer_consume_token ();    // consume '?'
        cp_parser_expression ();
        cp_parser_require (CPP_COLON);
        cp_parser_assignment_expression ();
    }

    cp_parser_binary_expression () {
        // works similarly to c_parser_binary_expression
    }

IId. perl-5.9.1, yacc grammar.

Since Perl 5 has loose logical operators too, this one should be helpful.

    %left <ival> OROP DOROP
    %left ANDOP
    %right '?' ':'
    ...

    expr: expr ANDOP expr
        | expr OROP expr
        | expr DOROP expr
        | argexpr %prec PREC_LOW
        ;
    argexpr:
          argexpr ','
        | argexpr ',' term
        | term %prec PREC_LOW
    term: termbinop
        | termunop
        | ...
        | term '?' term ':' term

III. Solutions.

In Parser.hs, ternOps is defined but does not generate anything. I don't know whether that is because Parsec doesn't support such a grammar. If Parsec really lacks such support, I have two suggestions:

  1. Hack Parsec to make it work, but I don't know how difficult that is.
  2. Learn from the two hand-written parsers and add another level of parser.

I've tried writing a very small parser in the second way; see the attachment. It is somewhat dirty, but maybe acceptable.

Shu-Chun Weng

module Parser where

import Rule
import Rule.Expr

{-
usage:
Parser> run expr "10mul1-1??3::4"
40
Parser> run expr "(1add1-2)??56::78"
78
-}

binExpr :: Parser Integer
binExpr = buildExpressionParser binTable binFactor
      <?> "binary expression"

binTable = [[op "*" (*) AssocLeft, op "/" (div) AssocLeft]
           ,[op "+" (+) AssocLeft, op "-" (-) AssocLeft]]
    where
      op s f assoc = Infix (do{ string s; return f}) assoc

binFactor = do{ char '('
              ; x <- expr
              ; char ')'
              ; return x
              }
        <|> number
        <?> "simple expression"

expr :: Parser Integer
expr = buildExpressionParser looseTable looseFactor
   <?> "loose expression"

looseTable = [[op "mul" (*) AssocLeft]
             ,[op "add" (+) AssocLeft]
             ]
    where
      op s f assoc = Infix (do{ string s; return f}) assoc

looseFactor = try(do{ char '('; x <- expr; char ')'; eof; return x })
          <|> do{ a <- binExpr
                ; do{ try(string "??")
                    ; b <- expr
                    ; string "::"
                    ; c <- expr
                    ; if a == 0 then return c else return b
                    }
                  <|> return a
                }
          <?> "loose factor"

number :: Parser Integer
number = do{ ds <- many1 digit
           ; return $ read ds
           }
     <?> "number"

run :: Show a => Parser a -> String -> IO ()
run p str = case (parse p "" str) of
    Left err -> do{ putStr "parse error at "; print err }
    Right x  -> print x
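
For comparison, the layering from the hand-written gcc-4.1 parser (binary expression first, then an optional "?? ::" tail, with parentheses re-entering the conditional rule) can also be sketched without Parsec. This is only a minimal illustration using base: the names P, tok, chainl, factor, term, binary, conditional and eval are hypothetical and do not come from pugs' Parser.hs. It evaluates integers directly, treats any nonzero value as true, and leaves out assignment and the loose operators.

```haskell
import Data.Char (isDigit, isSpace)
import Data.List (stripPrefix)

-- A parser consumes a prefix of the input and returns a value plus the rest.
-- (Hypothetical helper type, not part of pugs or Parsec.)
type P a = String -> Maybe (a, String)

-- Match a literal token after optional whitespace.
tok :: String -> P ()
tok t s = fmap (\rest -> ((), rest)) (stripPrefix t (dropWhile isSpace s))

-- number ::= digit+
number :: P Integer
number s = case span isDigit (dropWhile isSpace s) of
  ("", _)    -> Nothing
  (ds, rest) -> Just (read ds, rest)

-- factor ::= number | "(" conditional ")"
-- Parentheses re-enter the conditional rule, so "??" is accepted inside them.
factor :: P Integer
factor s = case tok "(" s of
  Just ((), s1) -> do
    (x, s2)  <- conditional s1
    ((), s3) <- tok ")" s2
    return (x, s3)
  Nothing -> number s

-- Left-associative chain: operand (op operand)*
chainl :: P Integer -> [(String, Integer -> Integer -> Integer)] -> P Integer
chainl operand ops s = operand s >>= uncurry go
  where
    go acc rest =
      case [ (f, r) | (name, f) <- ops, Just ((), r) <- [tok name rest] ] of
        ((f, r) : _) -> case operand r of
          Just (y, r') -> go (f acc y) r'
          Nothing      -> Nothing
        []           -> Just (acc, rest)

term, binary, conditional :: P Integer
term   = chainl factor [("*", (*))]
binary = chainl term   [("+", (+)), ("-", (-))]

-- conditional ::= binary [ "??" conditional "::" conditional ]
-- The ternary sits one level above the binary operators.
conditional s = do
  (a, s1) <- binary s
  case tok "??" s1 of
    Nothing       -> return (a, s1)
    Just ((), s2) -> do
      (b, s3)  <- conditional s2
      ((), s4) <- tok "::" s3
      (c, s5)  <- conditional s4
      return (if a /= 0 then b else c, s5)

-- Evaluate a whole string; Nothing on a parse error or trailing input.
eval :: String -> Maybe Integer
eval s = case conditional s of
  Just (x, rest) | all isSpace rest -> Just x
  _                                 -> Nothing
```

Loaded into GHCi, eval "(1 ?? 2 :: 3)" gives Just 2 and eval "(1+1-2) ?? 56 :: 78" gives Just 78. An assignment level would go below conditional, in the same way cp_parser_assignment_expression wraps the binary expression in g++-4.1.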