Re: unicode "em space" in regex

2005-04-17 Thread "Martin v. Löwis"
Xah Lee wrote: > Thanks. Is it true that any unicode chars can also be used inside regex > literally? > > e.g. > re.search(ur' +',mystring,re.U) > > I tested this case and apparently i can. Yes. In fact, whether you write u"\u2003" or u" " doesn't matter to re.search. Either way you get a Unicode
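
The equivalence described in this reply can be sketched as follows. This is a minimal illustration written for Python 3, which is an assumption: the thread itself concerns Python 2 `u''` and `ur''` literals. The variable names are made up for the example.

```python
import re

# U+2003 EM SPACE written three ways; each yields the same one-character string.
from_escape = "\u2003"
from_codepoint = chr(0x2003)
from_name = "\N{EM SPACE}"

sample = "left" + from_codepoint * 2 + "right"

# Pattern built from the character itself.
match_char = re.search(from_escape + "+", sample)

# Raw pattern: here the re module (Python 3.3+), not the string literal,
# interprets the \u2003 escape.
match_raw = re.search(r"\u2003+", sample)

print(match_char.span(), match_raw.span())
```

Either way the regex engine receives a pattern that matches the same character, so both searches find the same run of em spaces.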

Re: unicode "em space" in regex

2005-04-16 Thread "Martin v. Löwis"
Xah Lee wrote: > how to represent the unicode "em space" in regex? You will have to pass a Unicode literal as the regular expression, e.g. fracture=re.split(u'\u2003*\\|\u2003*',myline,re.U) Notice that, in raw Unicode literals, you can still use \u to escape characters, e.g. fracture=re.spli
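
A hedged sketch of the split described in this reply, again assuming Python 3 rather than the Python 2 of the thread. Note that the third positional parameter of `re.split` is `maxsplit`, not `flags`, so this sketch compiles the pattern instead of passing `re.U` positionally; in Python 3, `str` patterns are Unicode-aware by default, so the flag is not needed at all. The sample line is invented for illustration.

```python
import re

EM = "\u2003"  # U+2003 EM SPACE

# Hypothetical input: fields separated by a pipe with optional em spaces around it.
line = "alpha" + EM + "|" + EM + EM + "beta|gamma"

# Zero or more em spaces, a literal pipe, zero or more em spaces.
pattern = re.compile(r"\u2003*\|\u2003*")

fields = pattern.split(line)
print(fields)
```

With this input the result is the three field names with the surrounding em spaces removed.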

Re: Unicode BOM marks

2005-03-09 Thread "Martin v. Löwis"
Steve Horsley wrote: It is my understanding that the BOM (U+feff) is actually the Unicode character "Non-breaking zero-width space". My understanding is that this used to be the case. According to http://www.unicode.org/faq/utf_bom.html#38 the application should now specify specific processing,
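
For reference, the character in question can be inspected with the standard `unicodedata` module. This is only a small sketch in Python 3; U+2060 is included on the assumption that it is the character the Unicode FAQ recommends for the non-breaking use.

```python
import unicodedata

BOM = "\ufeff"
WORD_JOINER = "\u2060"

# Formal character names and general categories from the Unicode database.
bom_name = unicodedata.name(BOM)
joiner_name = unicodedata.name(WORD_JOINER)
bom_category = unicodedata.category(BOM)

print(bom_name, bom_category)
print(joiner_name)
```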

Re: unicode surrogates in py2.2/win

2005-03-08 Thread "Martin v. Löwis"
Mike Brown wrote: Very strange how it only shows up after the 1st import attempt seems to succeed, and it doesn't ever show up if I run the code directly or run the code in the command-line interpreter. The reason for that is that the Python byte code stores the Unicode literal in UTF-8. The firs

Re: Unicode BOM marks

2005-03-07 Thread "Martin v. Löwis"
Francis Girard wrote: Well, no text files can't be concatenated ! Sooner or later, someone will use "cat" on the text files your application did generate. That will be a lot of fun for the new unicode aware "super-cat". Well, no. For example, Python source code is not typically concatenated, nor
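
The concatenation concern raised in the quoted text can be illustrated with a short Python 3 sketch. The two byte strings stand in for hypothetical files that each begin with a UTF-8 signature; joining them is what `cat` would do.

```python
import codecs

# Two hypothetical "files", each written with a leading UTF-8 BOM.
part_one = codecs.BOM_UTF8 + "first\n".encode("utf-8")
part_two = codecs.BOM_UTF8 + "second\n".encode("utf-8")

# Byte-level concatenation, as `cat` would produce.
combined = part_one + part_two

# The utf-8-sig codec strips only a BOM at the very start of the data.
text = combined.decode("utf-8-sig")
stray_boms = text.count("\ufeff")

print(repr(text), stray_boms)
```

The second signature survives as a U+FEFF character in the middle of the decoded text.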

Re: Unicode BOM marks

2005-03-07 Thread "Martin v. Löwis"
Francis Girard wrote: If I understand well, into the UTF-8 unicode binary representation, some systems add at the beginning of the file a BOM mark (Windows?), some don't. (Linux?). Therefore, the exact same text encoded in the same UTF-8 will result in two different binary files, and of a slightl
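
The situation described in the question, the same text producing two different byte sequences, can be reproduced with Python 3's `utf-8` and `utf-8-sig` codecs. This is a sketch only; the sample text is arbitrary.

```python
import codecs

text = "caf\u00e9"

plain = text.encode("utf-8")         # no signature
with_bom = text.encode("utf-8-sig")  # signature prepended

# The two encodings differ only by the three-byte signature.
size_difference = len(with_bom) - len(plain)

# Decoding with utf-8-sig accepts both forms and yields the same text.
decoded_plain = plain.decode("utf-8-sig")
decoded_bom = with_bom.decode("utf-8-sig")

print(plain, with_bom, size_difference)
```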

Re: unicode(obj, errors='foo') raises TypeError - bug?

2005-02-23 Thread "Martin v. Löwis"
Kent Johnson wrote: Could this be handled with a try / except in unicode()? Something like this: Perhaps. However, this would cause a significant performance hit, and possibly undesired side effects. So due process would require that the interface of __unicode__ first, and then change the actual

Re: unicode(obj, errors='foo') raises TypeError - bug?

2005-02-23 Thread "Martin v. Löwis"
Steven Bethard wrote: Yeah, I agree it's weird. I suspect if someone supplied a patch for this behavior it would be accepted -- I don't think this should break backwards compatibility (much). Notice that the "right" thing to do would be to pass encoding and errors to __unicode__. If the string o
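
The thread concerns Python 2's `unicode()`. As a rough analogue only, and not the code under discussion, Python 3's `str()` behaves similarly: passing `errors` requests decoding, which works for bytes-like objects but raises `TypeError` for an arbitrary object even when it defines `__str__`. The class below is hypothetical.

```python
class Greeting:
    """Hypothetical object with a custom text conversion."""

    def __str__(self):
        return "hello"


obj = Greeting()

# Without encoding/errors, str() calls __str__.
plain = str(obj)

# With errors given, str() tries to decode and rejects non-bytes objects.
try:
    str(obj, errors="strict")
    raised = False
except TypeError:
    raised = True

# Decoding arguments are accepted for bytes-like objects.
decoded = str(b"caf\xc3\xa9", encoding="utf-8", errors="strict")

print(plain, raised, decoded)
```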

Re: rotor replacement

2005-01-21 Thread "Martin v. Löwis"
[EMAIL PROTECTED] wrote: Do you know this for a fact? I'm going by newsgroup messages from around the time that I was proposing to put together a standard block cipher module for Python. Ah, newsgroup messages. Anybody could respond, whether they have insight or not. The PSF does comply with the

Re: xml parsing escape characters

2005-01-21 Thread "Martin v. Löwis"
Luis P. Mendes wrote: From your experience, do you think that if this wrong XML code could be meant to be read only by somekind of Microsoft parser, the error will not occur? This is very unlikely. MSXML would never do this incorrectly. Regards, Martin

Re: Unicode conversion in 'print'

2005-01-14 Thread "Martin v. Löwis"
Ricardo Bugalho wrote: thanks for the information. But what I was really looking for was information on when and why Python started doing it (previously, it always used sys.getdefaultencoding())) and why it was done only for 'print' when stdout is a terminal instead of always. It does that since 2.
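
The two encodings contrasted in the question can be inspected directly. The following is a minimal Python 3 sketch; the helper name `describe_stream` is invented for illustration, and the values it reports depend on how the interpreter was started.

```python
import io
import sys


def describe_stream(stream):
    """Report the encoding a text stream advertises and whether it is a terminal."""
    return {
        "encoding": getattr(stream, "encoding", None),
        "isatty": stream.isatty() if hasattr(stream, "isatty") else False,
    }


stdout_info = describe_stream(sys.stdout)        # varies with the environment
buffer_info = describe_stream(io.StringIO())     # an in-memory stream, never a terminal
default_encoding = sys.getdefaultencoding()

print(stdout_info, buffer_info, default_encoding)
```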

Re: Passing a reference to a variable to a function?

2005-01-10 Thread "Martin v. Löwis"
Torsten Mohr wrote: Can something like that be done in Python too? Not directly. It is customary for functions that deliver results (return values) to do so by means of return: def vokale(string): result = [c for c in string if c in "aeiou"] return "".join(result) x = "Hallo, Welt" x = vokale(x) If one
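
Laid out as a runnable block, the function from this reply looks as follows. The docstring, comments, and the extra name `original` are additions for illustration; the function body is the one quoted in the message.

```python
def vokale(string):
    """Return the lowercase vowels of `string`, in order, as a new string."""
    result = [c for c in string if c in "aeiou"]
    return "".join(result)


original = "Hallo, Welt"
x = original

# Assigning the return value rebinds x; the original string is left untouched.
x = vokale(x)

print(x, original)
```

The caller sees the result only because it assigns the return value; the function cannot rebind the caller's variable itself.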