Re: Problem Converting Word to UTF8 Text File

Gabriel Genellina Sun, 21 Oct 2007 10:07:39 -0700

On Sun, 21 Oct 2007 13:35:43 -0300, <[EMAIL PROTECTED]> wrote:

> Hi all,
>
> I'm trying to copy a bunch of microsoft word documents that have
> unicode characters into utf-8 text files.  Everything works fine at
> the beginning.  The word documents get converted and new utf-8 text
> files with the same name get created.  And then I try to copy the data
> and I keep on getting "TypeError: coercing to Unicode: need string or
> buffer, instance found".  I'm probably copying the word document
> wrong.  What can I do?


Always remember to provide the full traceback.
Where do you get the error? In the last line, shutil.copyfile?
If the file already contains the text in UTF-8 and you just want to make
a copy, use shutil.copy as before.
(Or why not tell Word to save the file with the .txt extension in the
first place?)

> for doc in glob.glob(input):
>     txt_split = os.path.splitext(doc)
>     txt_doc = txt_split[0] + '.txt'
>     txt_doc = codecs.open(txt_doc,'w','utf-8')
>     shutil.copyfile(doc,txt_doc)

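copyfile expects path names as arguments, not a codecs-wrapped file-like
object. In your loop, txt_doc is rebound from the path string to the object
returned by codecs.open, and that object is what gets passed to copyfile.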
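
A minimal sketch of the loop with that fixed, assuming the source files
really do contain UTF-8 text already (the bytes are copied unchanged, nothing
is decoded or re-encoded). The function name copy_to_txt and its pattern
argument are made up for illustration; they are not from your script.

```python
import glob
import os
import shutil


def copy_to_txt(pattern):
    """Copy each file matching `pattern` to a sibling file ending in .txt.

    Assumption: the source files already hold UTF-8 text, so a plain
    byte-for-byte copy is enough. Returns the list of paths written.
    """
    copied = []
    for doc in glob.glob(pattern):
        base, _ext = os.path.splitext(doc)
        txt_path = base + '.txt'        # keep this a plain path string
        if txt_path == doc:
            continue                    # source is already a .txt file
        # Both arguments are path names; no file object is opened here.
        shutil.copyfile(doc, txt_path)
        copied.append(txt_path)
    return copied
```

Called as copy_to_txt('*.doc'), it would write one .txt file next to each
matching document. The only real change from your loop is that the name
holding the destination path is never reassigned to an open file.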

-- 
Gabriel Genellina

-- 
http://mail.python.org/mailman/listinfo/python-list
