Re: A Unicode problem -HELP

2006-05-17 Thread Martin v. Löwis
manstey wrote: > Thanks very much. Your def comma_separated_utf8(items): approach raises > an exception in codecs.py, so I tried = u", ".join(word_info + parse + > gloss), which works perfectly. So I want to understand exactly why this > works. word_info and parse and gloss are all tuples. does st

Re: A Unicode problem -HELP

2006-05-17 Thread manstey
Hi Martin, Thanks very much. Your def comma_separated_utf8(items): approach raises an exception in codecs.py, so I tried = u", ".join(word_info + parse + gloss), which works perfectly. So I want to understand exactly why this works. word_info and parse and gloss are all tuples. does str convert t
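
A minimal sketch of why the join works, written in Python 3 syntax as an assumption (the thread itself uses Python 2, where the literals would carry a u prefix). Adding tuples together just yields one longer tuple, and join() places the separator between the elements themselves instead of rendering the tuple. The sample values are made up, and the body of comma_separated_utf8 here is a guess for illustration, not the function Martin posted:

```python
# Hypothetical sample data standing in for the three tuples in the thread.
word_info = ("gn", "1")
parse = ("F", "\u0254")
gloss = ("ncfsa",)


def comma_separated_utf8(items):
    """Join text items with ", " and encode the result as UTF-8 bytes.

    Illustrative body only; the original function is not shown in the thread.
    """
    return ", ".join(items).encode("utf-8")


# Tuple addition concatenates, so this is a single flat tuple of strings.
combined = word_info + parse + gloss

# join() works on the elements, so no quotes or parentheses appear.
joined = ", ".join(combined)

# Encoding is a separate, explicit step from joining.
encoded = comma_separated_utf8(combined)
```

Note that no call to str() is involved anywhere: the elements are already text, and only the final encode() turns them into bytes.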

Re: A Unicode problem -HELP

2006-05-16 Thread Ben Finney
"manstey" <[EMAIL PROTECTED]> writes: > 1. Here is my input data file, line 2: > gn1:1,1.2 R")$I73YT R")[EMAIL PROTECTED] Your program is reading this using the 'utf-8' encoding. When it does so, all the characters you show above will be read in happily as you see them (so long as you view them w

Re: A Unicode problem -HELP

2006-05-16 Thread Tim Roberts
"manstey" <[EMAIL PROTECTED]> wrote: > >I have done more reading on unicode and then tried my code in IDLE >rather than WING IDE, and discovered that it works fine in IDLE, so I >think WING has a problem with unicode. Rather, its output defaults to ASCII. >So, assuming I now work in IDLE, all I w
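
To illustrate the point about output defaulting to ASCII, here is a small sketch (Python 3 syntax assumed, sample text made up). It shows that the same text fails to encode as ASCII but encodes cleanly as UTF-8, which is roughly the difference between an environment whose output is ASCII and one that uses a wider encoding:

```python
# Sample text containing U+0254, one of the characters mentioned in the thread.
text = "F, \u0254"

# ASCII has no representation for U+0254, so encoding raises an error.
try:
    text.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# UTF-8 can represent every Unicode character, so this succeeds.
utf8_bytes = text.encode("utf-8")
```

The data is the same in both cases; only the encoding chosen at the output boundary differs.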

Re: A Unicode problem -HELP

2006-05-16 Thread Martin v. Löwis
manstey wrote: > a=str(word_info + parse + gloss).encode('utf-8') > a=a[1:len(a)-1] > > Is this clearer? Indeed. The problem is your usage of str() to "render" the output. As word_info+parse+gloss is a list (or is it a tuple?), str() will already produc
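
The difference between rendering with str() and joining can be sketched as follows. This assumes Python 3 syntax and made-up sample values; in the Python 2 of the thread, str() on the tuple would additionally show u prefixes and \u escapes, as in the output file quoted elsewhere in this thread:

```python
# Hypothetical sample data standing in for the three tuples in the thread.
word_info = ("gn", "1")
parse = ("F", "\u0254")
gloss = ("ncfsa",)
items = word_info + parse + gloss

# str() on a tuple renders the container: parentheses plus the repr of
# each element, so every string keeps its quotes.
rendered = str(items)

# Slicing off the first and last character removes the parentheses only.
stripped = rendered[1:len(rendered) - 1]

# join() never renders the container, so there is nothing to strip.
joined = ", ".join(items)
```

The quotes in the stripped version come from the repr of each element, which is why slicing the ends off the string cannot produce clean output.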

Re: A Unicode problem -HELP

2006-05-16 Thread manstey
OK, I apologise for not being clearer. 1. Here is my input data file, line 2: gn1:1,1.2 R")$I73YT R")[EMAIL PROTECTED] 2. Here is my output data file, line 2: u'gn', u'1', u'1', u'1', u'2', u'-', u'R")$I73YT', u'R")$IYT', u'R")$IYT', u'@', u'ncfsa', u'nc', '', '', '', u'f', u's', u'a', '', '', ''

Re: A Unicode problem -HELP

2006-05-16 Thread Martin v. Löwis
manstey wrote: > input_file = open(input_file_loc, 'r') > output_file = open(output_file_loc, 'w') > for line in input_file: > output_file.write(str(word_info + parse + gloss)) # = three > functions that return tuples > > (u'F', u'\u0254') are two of the many unicode tuple elements returned
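
One possible way to restructure the write loop, sketched in Python 3 syntax as an assumption. The file is opened with an explicit encoding so that text is encoded on the way out, and each row is joined instead of passed through str(). The helper name write_rows, the sample rows, and the temporary path are all hypothetical; only output_file_loc comes from the quoted code:

```python
import io
import os
import tempfile


def write_rows(path, rows):
    """Write each row of text items as one comma-separated UTF-8 line."""
    # newline="\n" keeps line endings identical on every platform.
    with io.open(path, "w", encoding="utf-8", newline="\n") as output_file:
        for row in rows:
            output_file.write(", ".join(row) + "\n")


# Made-up rows standing in for word_info + parse + gloss per input line.
rows = [("gn", "1", "F", "\u0254")]

# A throwaway location; the real output_file_loc is not given in the thread.
output_file_loc = os.path.join(tempfile.mkdtemp(), "out.txt")
write_rows(output_file_loc, rows)

# Read the raw bytes back to see what actually landed on disk.
with io.open(output_file_loc, "rb") as check:
    written = check.read()
```

In Python 2, codecs.open(path, 'w', 'utf-8') plays a similar role to the encoding argument used here.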

Re: A Unicode problem -HELP

2006-05-16 Thread Ben Finney
"manstey" <[EMAIL PROTECTED]> writes: > I'm a newbie at python, so I don't really understand how your answer > solves my unicode problem. Since your replies fail to give any context of the existing discussion, I could only go by the content of what you'd written in that message. I didn't see a pr

Re: A Unicode problem -HELP

2006-05-16 Thread manstey
I'm a newbie at python, so I don't really understand how your answer solves my unicode problem. I have done more reading on unicode and then tried my code in IDLE rather than WING IDE, and discovered that it works fine in IDLE, so I think WING has a problem with unicode. For example, in WING this

Re: A Unicode problem -HELP

2006-05-16 Thread Ben Finney
"manstey" <[EMAIL PROTECTED]> writes: > input_file = open(input_file_loc, 'r') > output_file = open(output_file_loc, 'w') > for line in input_file: > output_file.write(str(word_info + parse + gloss)) # = three functions > that return tuples If you mean that 'word_info', 'parse' and 'gloss'

Re: A Unicode problem -HELP

2006-05-16 Thread manstey
Hi Martin, Here is how I write: input_file = open(input_file_loc, 'r') output_file = open(output_file_loc, 'w') for line in input_file: output_file.write(str(word_info + parse + gloss)) # = three functions that return tuples (u'F', u'\u0254') are two of the many unicode tuple elements retu

Re: A Unicode problem -HELP

2006-05-11 Thread Martin v. Löwis
manstey wrote: > 1. I have # -*- coding: UTF-8 -*- as my first line. > 2. In Wing IDE I have set Default Encoding to UTF-8 > 3. I have imported codecs and opened and written my file, which doesn't > have a BOM, as encoding=UTF-8 > 4. I have written a dictionary for translation, with entries such as