On Monday, May 12, 2014 11:05:53 PM UTC+5:30, scott...@gmail.com wrote:
> On Friday, May 9, 2014 8:12:57 PM UTC-4, Steven D'Aprano wrote:
> >     fStr = fStr.replace(b'&#x2012', b'-')
> 
>    Still doesn't work
> 
> 
> > Best:
> > 
> > 
> >     # Untested
> > 
> >     fStr = re.sub(b'&#x(?:201[2-5]|2E3[AB]|00[2A]D)', b'-', fStr)
> 
>   Still doesn't work.
> 
>   Guess whatever the code is for endash and mdash are not the ones I am 
> using....

What happens if you divide two strings?

>>> 'a' / 'b'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for /: 'str' and 'str'

Or multiply 2 lists?

>>> [1,2]*[3,3]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can't multiply sequence by non-int of type 'list'

Trying to do a text operation like re.sub on a NON-text object like a doc file
is the same.

Yes, Python may not be intelligent enough to give you such a useful error message
outside its territory, i.e. on the contents of arbitrary files. re.sub runs
without complaint on the raw bytes and simply finds nothing to replace.
Logically, however, it's the same -- an impossible operation.
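
To make that concrete, here is a small sketch of what an en dash looks like as
raw bytes. It assumes the dash is stored in one of the encodings a word
processor commonly uses; the helper names dash_encodings and looks_like_ole2
are mine, made up for illustration. None of these byte sequences resembles
b'&#x2013', which is an HTML-style entity and would only appear in a file that
was saved as HTML.

```python
# Illustration only: how an en dash (U+2013) appears as raw bytes.
EN_DASH = "\u2013"

# Signature at the start of an OLE2 compound file (the container used by
# legacy binary .doc files).
OLE2_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"


def dash_encodings(ch=EN_DASH):
    """Return the bytes for `ch` in a few encodings (hypothetical helper)."""
    return {
        "utf-8": ch.encode("utf-8"),          # b'\xe2\x80\x93'
        "utf-16-le": ch.encode("utf-16-le"),  # two bytes: 0x13 0x20
        "cp1252": ch.encode("cp1252"),        # b'\x96'
    }


def looks_like_ole2(header):
    """True if `header` starts with the OLE2 signature (hypothetical helper)."""
    return header.startswith(OLE2_MAGIC)


if __name__ == "__main__":
    for name, raw in dash_encodings().items():
        print(name, raw)
```

Reading the first 8 bytes of the file in 'rb' mode and passing them to
looks_like_ole2 is a quick way to check whether you are holding a binary doc
file rather than text.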


The options you have:
1. Use doc-specific tools, e.g. MS Office or LibreOffice, to work on doc files,
   i.e. don't use Python.
2. Follow Tim Golden's suggestion, i.e. use win32com, a Python API that talks
   to Word documents [BTW thanks, Tim, for showing how easy it is].
3. Get out of the doc format: export as plain text, then try what you are
   trying on the txt file.
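
For option 3, a minimal sketch, untested against your actual file. It assumes
the export was saved as UTF-8 text; the function names and the encoding default
are my own choices, not anything from your code. It works on str rather than
bytes, and matches the dash characters themselves (the code points from the
pattern quoted above) instead of their '&#x....' spellings.

```python
import re

# Dash-like characters: U+2012..U+2015, U+2E3A, U+2E3B and the soft hyphen
# U+00AD. U+002D is already a plain '-', so it needs no replacement.
DASHES = re.compile("[\u2012-\u2015\u2e3a\u2e3b\u00ad]")


def normalize_dashes(text):
    """Replace every dash-like character in `text` with an ASCII hyphen."""
    return DASHES.sub("-", text)


def normalize_file(src, dst, encoding="utf-8"):
    """Read `src` as text, normalize its dashes, and write the result to `dst`.

    The encoding is an assumption; use whatever the export dialog was set to.
    """
    with open(src, encoding=encoding) as f:
        text = f.read()
    with open(dst, "w", encoding=encoding) as f:
        f.write(normalize_dashes(text))
```

If the export was saved in a Windows code page instead, passing
encoding="cp1252" to normalize_file would be the thing to try first.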
-- 
https://mail.python.org/mailman/listinfo/python-list
