On Monday, May 12, 2014 11:05:53 PM UTC+5:30, scott...@gmail.com wrote:
> On Friday, May 9, 2014 8:12:57 PM UTC-4, Steven D'Aprano wrote:
> > fStr = fStr.replace(b'‒', b'-')
>
> Still doesn't work
>
> > Best:
> >
> > # Untested
> > fStr = re.sub(b'&#x(201[2-5])|(2E3[AB])|(00[2A]D)', b'-', fStr)
>
> Still doesn't work.
>
> Guess whatever the code is for endash and mdash are not the ones I am
> using....

What happens if you divide two strings?

>>> 'a' / 'b'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for /: 'str' and 'str'

Or multiply two lists?

>>> [1,2]*[3,3]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can't multiply sequence by non-int of type 'list'

Trying to do a text operation like re.sub on a NON-text object like a doc file is the same. Python is not intelligent enough to give you such a useful error message outside its own territory, i.e. on the contents of arbitrary files, so the call runs without complaint and simply matches nothing. Logically, though, it's the same: an impossible operation.

The options you have:

1. Use doc-specific tools, e.g. MS Office or LibreOffice, to work on the doc files, i.e. don't use Python.
2. Follow Tim Golden's suggestion, i.e. use win32com, which is a Python API that talks to doc files. [BTW, thanks Tim for showing how easy it is.]
3. Get out of the doc format into txt (export as plain text) and then try what you are trying on the txt.
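
For option 3, something along these lines might do. It is only a sketch, and it assumes the document was exported as UTF-8 plain text. The names normalise_dashes and convert_file are made up for illustration, and the set of dashes covered (U+2012 to U+2015: figure dash, en dash, em dash, horizontal bar) is a guess at what the file contains.

```python
import re

# Typographic dashes to normalise: U+2012..U+2015 (figure dash, en dash,
# em dash, horizontal bar). This range is an assumption; widen it if the
# exported text turns out to use other dash characters.
DASHES = re.compile("[\u2012-\u2015]")


def normalise_dashes(text):
    """Replace typographic dashes in already-decoded text with '-'."""
    return DASHES.sub("-", text)


def convert_file(src, dst, encoding="utf-8"):
    """Read a plain-text export, normalise its dashes, write the result.

    The encoding is an assumption about how the text was exported.
    """
    with open(src, encoding=encoding) as infile:
        text = infile.read()
    with open(dst, "w", encoding=encoding) as outfile:
        outfile.write(normalise_dashes(text))
```

The point is that the substitution runs on decoded text (str), not on the raw bytes of a doc file, so the pattern can name the dash characters directly.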