Hi!

I found (and fixed) a few bugs in the file local/bin/sage-preparse.

These are the things I fixed:

* Module docstrings disappeared when preparsing because the
preparse_file function inserted the factored-out numeric literal
definitions before the docstring.

* Unicode docstrings (e.g. u"""foo""") are now recognized as
docstrings. Raw docstrings may also use an upper-case R as the string
prefix (R"""foo""" now works), which Python allows.

* All forms of coding comment that Python accepts are now found and
excluded from preparsing.

* I did not fix a bug that occurs when a statement is on the same line
where the docstring ends (e.g. """foo"""; print 2^5). Such a statement
is not preparsed! I added a TODO comment on the corresponding line.
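
To illustrate the first two points, here is a minimal sketch that asks
Python's own parser which module docstring it sees. It is not taken from
sage-preparse: the helper module_docstring and the inserted line
"_sage_const_2 = Integer(2)" are only illustrative stand-ins, and the
sketch assumes Python 3 (where the ur prefix no longer exists).

```python
import ast

def module_docstring(source):
    """Return the module docstring Python would see for `source`, or None."""
    return ast.get_docstring(ast.parse(source))

# Hypothetical stand-in for a definition that the preparser inserts.
insert = "_sage_const_2 = Integer(2)\n"
doc = '"""Module docs."""\n'
body = "x = _sage_const_2\n"

before_docstring = insert + doc + body   # old order: docstring is lost
after_docstring = doc + insert + body    # patched order: docstring survives

print(module_docstring(before_docstring))   # None
print(module_docstring(after_docstring))    # Module docs.

# Prefixed string literals still count as docstrings.
print(module_docstring('u"""foo"""\n'))     # foo
print(module_docstring('R"""foo"""\n'))     # foo
```

A string literal is only a docstring if it is the first statement of the
module, which is why the inserts have to go after it.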


I don't know if I am allowed to attach files here, so I'm going to host
the diff and my fixed version of the file elsewhere:

http://spielwiese.hsg-kl.de/~cui/bug-fix-preparse/

Included are the original file from Sage 4.4 (openSUSE, 64-bit), the
file containing my bug fixes, and a diff of the two.

greetings,
David Poetzsch-Heffter.

PS: If I am allowed to attach files, the files named above are also
attached to this e-mail.


Attachment: sage-preparse.bugfix
Description: Binary data

8a9,12
>     -- David Poetzsch-Heffter (2010): fix trac #????
>        (Recognize unicode docstrings, factoring out numeric literals
>         won't destroy the docstring search, all allowed encoding
>         comments are found)
63a68,72
> # We use this regexp to check if a line is a coding-comment.
> # This regexp is officially specified by python for
> # recognizing coding-comments.
> coding_regexp = re.compile(r"coding[:=]\s*([-\w.]+)")
> 
106,113c115,149
< 
<     #Check to see if a coding is specified in the .sage file.
<     #If it is, then we want to copy it over to the new file
<     #and not include it in the preprocessing
<     if F.startswith('# -*- coding:'):
<         end = F.find('\n')
<         coding = F[:end+1] if end != -1 else F
<         F = F[len(coding):]
---
>     
>     # It is ** critical ** that we put all inserts made via preparsing
>     # after the module docstring, since otherwise the module docstring
>     # will disappear
>     i = find_position_right_after_module_docstring(F)
>     docstring = F[:i]
>     F = F[i:]
> 
>     # If a docstring was found the coding comment should be in the comments
>     # preceding the docstring and is therefore included in the
>     # docstring-variable. So we have to look for a coding-comment only if
>     # no docstring was found!
>     if (i == 0):
>         # A coding-comment may appear in the first or in the second
>         # line of a file. If there is such a comment, then we want to
>         # copy it over to the new file and not include it in the
>         # preprocessing.
>         lines = F.splitlines()
>         if coding_regexp.search(lines[0]):
>             coding = lines[0] + "\n"
>             F = F[len(coding):]
>         elif coding_regexp.search(lines[1]):
>             coding = "\n".join(lines[:2]) + "\n"
>             F = "\n".join(lines[2:])
>         else:
>             coding = ''
>     elif "\n" not in docstring.strip():
>         # Ok, there is one possibility: If the found docstring is only
>         # a single line the second line may be a coding comment.
>         lines = F.splitlines()
>         if coding_regexp.search(lines[0]):
>             coding = lines[0] + "\n"
>             F = F[len(coding):]
>         else:
>             coding = ''
116c152
< 
---
>     
124d159
< 
126,127d160
<     # It is ** critical ** that we put this after the mdoule docstring, since
<     # otherwise the module docstring will disappear. 
129,130c162
<     i = find_position_right_after_module_docstring(G)
<     G = coding + G[:i] + insert + G[i:]
---
>     G = docstring + coding + insert + G
163c195,196
<     if not (n[0] in ['"',"'"] or n[0:2] in ['r"',"r'"]):
---
>     possible_docstring_starts = ['"',"'", 'r"',"r'", 'u"', "u'", 'ur"', "ur'"]
>     if not any( map(n.lower().startswith, possible_docstring_starts) ):
171c204,205
<     n = n.lstrip('r')  # strip leading r if there is one
---
>     # strip leading u and/or r if there is one
>     n = n.lstrip('u').lstrip('U').lstrip('r').lstrip('R')
177a212,213
>         # TODO: If there are statements in the same line where the
>         # docstring ends they will not be preparsed!
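
A note on the coding-comment lookup in the diff above: it indexes
lines[0] and lines[1] directly, which would raise an IndexError for an
empty or one-line file. The following is only a sketch of a guarded
variant, not part of the patch. The function name split_coding_comment
is made up for illustration, and the extra check that the line is a
comment is my assumption about what is wanted; the regexp is the one
quoted in the diff.

```python
import re

# Regexp as quoted in the patch above.
coding_regexp = re.compile(r"coding[:=]\s*([-\w.]+)")

def split_coding_comment(source):
    """Split `source` into (header, rest).

    `header` holds everything up to and including the coding-comment
    line if one is found in the first two lines, else it is ''.
    """
    lines = source.splitlines(True)  # keep the line endings
    # Slicing never raises, so empty and one-line inputs are safe.
    for idx, line in enumerate(lines[:2]):
        if line.lstrip().startswith("#") and coding_regexp.search(line):
            return "".join(lines[:idx + 1]), "".join(lines[idx + 1:])
    return "", source

header, rest = split_coding_comment("# -*- coding: utf-8 -*-\nprint(1)\n")
print(repr(header))  # '# -*- coding: utf-8 -*-\n'
print(repr(rest))    # 'print(1)\n'
```

Keeping the line endings with splitlines(True) means header + rest
always reproduces the input unchanged.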

Attachment: sage-preparse.orig-4.4
Description: Binary data
