Pyparsing 1.3.3 contains mostly bugfixes and minor enhancements over previous releases, including some improvement in Unicode support. Here are the change notes:
Version 1.3.3 - September 12, 2005
----------------------------------
- Improved support for Unicode strings that would be returned using
  srange. Added greetingInKorean.py example, for a Korean version of
  "Hello, World!" using Unicode. (Thanks, June Kim!)

- Added 'hexnums' string constant (nums+"ABCDEFabcdef") for defining
  hexadecimal value expressions.

- NOTE: ===THIS CHANGE MAY BREAK EXISTING CODE===
  Modified tag and results definitions returned by makeHTMLTags(),
  to better support the looseness of HTML parsing. Tags to be
  parsed are now caseless, and keys generated for tag attributes are
  now converted to lower case.

  Formerly, makeHTMLTags("XYZ") would return a tag with results name
  of "startXYZ"; this has been changed to "startXyz". If this tag is
  matched against '<XYZ Abc="1" DEF="2" ghi="3">', the matched keys
  formerly would be "Abc", "DEF", and "ghi"; keys are now converted
  to lower case, giving keys of "abc", "def", and "ghi". These
  changes were made to try to address the lax case sensitivity
  agreement between start and end tags in many HTML pages.

  No changes were made to makeXMLTags(), which assumes more rigorous
  parsing rules.

  Also, cleaned up case-sensitivity bugs in closing tags, and
  switched to using Keyword instead of Literal class for tags.
  (Thanks, Steve Young, for getting me to look at these in more
  detail!)

- Added two helper parse actions, upcaseTokens and downcaseTokens,
  which will convert matched text to all uppercase or lowercase,
  respectively.

- Deprecated Upcase class, to be replaced by upcaseTokens parse
  action.

- Converted messages sent to stderr to use the warnings module, such
  as the message issued when constructing a Literal with an empty
  string (one should use the Empty() class or the empty helper
  instead).

- Added ' ' (space) as an escapable character within a quoted
  string.

- Added helper expressions for common comment types, in addition to
  the existing cStyleComment (/*...*/) and htmlStyleComment
  (<!-- ... -->)
  . dblSlashComment = // ... (to end of line)
  . cppStyleComment = cStyleComment or dblSlashComment
  . javaStyleComment = cppStyleComment
  . pythonStyleComment = # ... (to end of line)

Download pyparsing at http://pyparsing.sourceforge.net.

-- Paul

========================================
Pyparsing is a pure-Python class library for quickly developing
recursive-descent parsers. Parser grammars are assembled directly in
the calling Python code, using classes such as Literal, Word,
OneOrMore, Optional, etc., combined with operators '+', '|', and '^'
for And, MatchFirst, and Or. No separate code-generation or external
files are required. Pyparsing can be used in many cases in place of
regular expressions, with a shorter learning curve and greater
readability and maintainability. Pyparsing comes with a number of
parsing examples, including:
- "Hello, World!" (English and Korean)
- chemical formulas
- configuration file parser
- web page URL extractor
- 5-function arithmetic expression parser
- subset of CORBA IDL
- chess portable game notation
- simple SQL parser
- Mozilla calendar file parser
- EBNF parser/compiler
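
As a rough illustration of assembling a grammar directly in Python code with the '+' operator, and of the hexnums constant from the change notes, here is a minimal sketch. It assumes pyparsing is installed; the names greeting and hex_literal, and the "0x" prefix, are made up for this example and are not part of the library.

```python
# Requires the third-party pyparsing package.
from pyparsing import Combine, Literal, Word, alphas, hexnums

# A greeting such as "Hello, World!": word, comma, word, exclamation mark.
# The '+' operator joins the pieces into an And expression.
greeting = Word(alphas) + Literal(",") + Word(alphas) + Literal("!")

# Hypothetical hex-literal expression: "0x" followed by one or more hex digits.
# Combine merges the matched pieces into a single token.
hex_literal = Combine(Literal("0x") + Word(hexnums))

greeting_tokens = greeting.parseString("Hello, World!").asList()
hex_tokens = hex_literal.parseString("0x1F").asList()

print(greeting_tokens)
print(hex_tokens)
```

Whitespace between tokens is skipped by default, so the greeting grammar does not need to mention the space after the comma.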