On 07/19/2011 01:14 PM, Xah Lee wrote:
I added other Unicode brackets to your list of brackets, but it seems
your code still fails to catch a file that has mismatched curly quotes.
(e.g. http://xahlee.org/p/time_machine/tm-ch04.html)

LOL Billy.

  Xah

I suspect it's due to the file being opened in 'rb' mode. Also, in the dictionary of characters at the top, the closing token is the key, while the opening one is the value. Not sure if that's obvious.
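
To illustrate the 'rb' suspicion, here is a minimal sketch, assuming Python 3 and assuming the checker looks up each item it reads in a table of string tokens. The names PAIRS, OPENERS and count_bracket_chars are made up for this example and are not from the code under discussion. The table uses the layout described above: closing token as key, opening token as value.

```python
# Hypothetical table: closing token -> opening token.
PAIRS = {")": "(", "]": "[", "}": "{", "\u201d": "\u201c"}
OPENERS = set(PAIRS.values())

def count_bracket_chars(data):
    """Count how many items in data are known opening or closing tokens."""
    return sum(1 for item in data if item in PAIRS or item in OPENERS)

text = "\u201cquoted\u201d (text)"   # curly quotes plus one pair of parentheses
raw = text.encode("utf-8")           # the kind of object a file opened with 'rb' returns

print(count_bracket_chars(text))  # 4
print(count_bracket_chars(raw))   # 0
```

Iterating over bytes in Python 3 yields integers, so none of them ever equals a string key and the curly quotes go unnoticed. Decoding first (or opening the file in text mode with an explicit encoding) avoids that.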

Also, returning the position of the first mismatched pair is somewhat ambiguous. File systems store files as streams of octets (mine do, anyway) rather than as characters. When you ask for the position of the first mismatched pair, do you mean the position as reported by file.tell(), or the index of the nth character in the decoded UTF-8 stream?
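
A small sketch of why the two answers differ, assuming the file is UTF-8 encoded. The sample string is invented for the example; each curly quote takes three octets in UTF-8, so the octet offset runs ahead of the character index.

```python
text = "\u201cab\u201d ("   # left curly quote, "ab", right curly quote, space, "("

# Position counted in characters of the decoded text.
char_index = text.index("(")

# Position counted in octets, i.e. roughly what an offset into the
# raw file contents would be.
byte_offset = len(text[:char_index].encode("utf-8"))

print(char_index, byte_offset)  # 5 9
```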

Also, you may have answered this earlier, but I'll ask again anyway: when you ask for the first mismatched pair, are you referring to the innermost mismatch or the outermost? For example, suppose you have this file:

foo[(])bar

Would the "(" be the first mismatched character or would the "]"?
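
For what it's worth, here is a minimal sketch of a stack-based scan that reports both candidates for that input. It is only an illustration of the ambiguity, not the code from this thread; first_mismatch and the PAIRS table are hypothetical names, and positions are character indexes.

```python
# Hypothetical table: closing token -> opening token.
PAIRS = {")": "(", "]": "[", "}": "{"}
OPENERS = set(PAIRS.values())

def first_mismatch(text):
    """Return (opener_index, closer_index) for the first mismatch, or None.

    opener_index is None when a closer has nothing to match;
    closer_index is None when an opener is never closed.
    """
    stack = []  # (opening token, index) for every opener still unclosed
    for i, ch in enumerate(text):
        if ch in OPENERS:
            stack.append((ch, i))
        elif ch in PAIRS:
            if not stack:
                return (None, i)
            opener, pos = stack.pop()
            if opener != PAIRS[ch]:
                return (pos, i)
    if stack:
        # Reports the innermost unclosed opener; the outermost would be stack[0].
        return (stack[-1][1], None)
    return None

print(first_mismatch("foo[(])bar"))  # (4, 5)
```

The scan notices the problem when it reaches the "]" at index 5, but the token it clashes with is the "(" at index 4, so either one could reasonably be called the first mismatched character.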

--
Bill
--
http://mail.python.org/mailman/listinfo/python-list
