Re: Python 2.6 StreamReader.readline()

2012-07-25 Thread Walter Dörwald
On 25.07.12 08:09, Ulrich Eckhardt wrote: Am 24.07.2012 17:01, schrieb cpppw...@gmail.com: reader = codecs.getreader(encoding) lines = [] with open(filename, 'rb') as f: lines = reader(f, 'strict').readlines(keepends=False) where encoding == 'utf-16-be' Everything wo
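The quoted snippet can be tried in isolation. The following is only a minimal sketch in Python 3 (the thread itself is about Python 2.6); it swaps the file for an in-memory `io.BytesIO` buffer, and the sample text is made up for illustration.

```python
import codecs
import io

encoding = "utf-16-be"

# Hypothetical sample data standing in for the file from the original post.
raw = "first line\nsecond line\n".encode(encoding)

reader = codecs.getreader(encoding)
with io.BytesIO(raw) as f:
    # StreamReader.readlines() decodes the whole stream and splits it into lines.
    lines = reader(f, "strict").readlines(keepends=False)

print(lines)
```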

Re: Issues with `codecs.register` and `codecs.CodecInfo` objects

2012-07-10 Thread Walter Dörwald
On 07.07.12 04:56, Steven D'Aprano wrote: On Fri, 06 Jul 2012 12:55:31 -0400, Karl Knechtel wrote: Hello all, While attempting to make a wrapper for opening multiple types of UTF-encoded files (more on that later, in a separate post, I guess), I ran into some oddities with the `codecs` module

Re: Why are some unicode error handlers "encode only"?

2012-03-11 Thread Walter Dörwald
On 11.03.12 15:37, Steven D'Aprano wrote: At least two standard error handlers are documented as working for encoding only: xmlcharrefreplace backslashreplace See http://docs.python.org/library/codecs.html#codec-base-classes and http://docs.python.org/py3k/library/codecs.html Why is this? I

Re: replacing words in HTML file

2010-04-30 Thread Walter Dörwald
On 28.04.10 15:02, james_027 wrote: > hi, > > Any idea how I can replace words in a html file? Meaning only the > content will get replace while the html tags, javascript, & css are > remain untouch. You could try XIST (http://www.livinglogic.de/Python/xist/): Example code: from ll.xist import

Re: how to write a unicode string to a file ?

2009-10-19 Thread Walter Dörwald
On 17.10.09 08:28, Mark Tolonen wrote: > > "Kee Nethery" wrote in message > news:aaab63c6-6e44-4c07-b119-972d4f49e...@kagi.com... >> >> On Oct 16, 2009, at 5:49 PM, Stephen Hansen wrote: >> >>> On Fri, Oct 16, 2009 at 5:07 PM, Stef Mientki >>> wrote: >> >> snip >> >>> The thing is, I'd be VERY

Re: HTMLgen???

2009-10-16 Thread Walter Dörwald
On 16.10.09 05:44, alex23 wrote: > On Oct 15, 6:58 pm, an...@vandervlies.xs4all.nl wrote: >> Does HTMLgen (Robin Friedrich's) still exsist?? And, if so, where can it >> be found? > > If you're after an easy to use html generator, I highly recommend > Richard Jones' html[1] lib. It's new, supported

Re: unicode issue

2009-10-01 Thread Walter Dörwald
On 01.10.09 17:50, Rami Chowdhury wrote: > On Thu, 01 Oct 2009 08:10:58 -0700, Walter Dörwald > wrote: > >> On 01.10.09 16:09, Hyuga wrote: >>> On Sep 30, 3:34 am, gentlestone wrote: >>>> Why don't work this code on Python 2.6? Or how can I do this

Re: unicode issue

2009-10-01 Thread Walter Dörwald
On 01.10.09 16:09, Hyuga wrote: > On Sep 30, 3:34 am, gentlestone wrote: >> Why don't work this code on Python 2.6? Or how can I do this job? >> >> _MAP = { >> # LATIN >> u'À': 'A', u'Á': 'A', u'Â': 'A', u'Ã': 'A', u'Ä': 'A', u'Å': 'A', >> u'Æ': 'AE', u'Ç':'C', >> u'È': 'E', u'É': 'E',

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-22 Thread Walter Dörwald
Martin v. Löwis wrote: >> "correct" -> "corrected" > > Thanks, fixed. > >>> To convert non-decodable bytes, a new error handler "python-escape" is >>> introduced, which decodes non-decodable bytes using into a private-use >>> character U+F01xx, which is believed to not conflict with private-use >

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-22 Thread Walter Dörwald
Martin v. Löwis wrote: > I'm proposing the following PEP for inclusion into Python 3.1. > Please comment. > > Regards, > Martin > > PEP: 383 > Title: Non-decodable Bytes in System Character Interfaces > Version: $Revision: 71793 $ > Last-Modified: $Date: 2009-04-22 08:42:06 +0200 (Mi, 22. Apr 20

Re: [2.5.1] ShiftJIS to Unicode?

2008-11-27 Thread Walter Dörwald
Gilles Ganault wrote: > Hello > > I'm trying to read pages from Amazon JP, whose web pages are > supposed to be encoded in ShiftJIS, and decode contents into Unicode > to keep Python happy: > > www.amazon.co.jp > /> > > But this doesn't work: > > == > m = try.search(the_page) > if m

Re: ANN: XML builder for Python

2008-07-03 Thread Walter Dörwald
Jonas Galvez wrote: Walter Dörwald wrote: XIST has been using with blocks since version 3.0. [...] with xsc.Frag() as node: +xml.XML() +html.DocTypeXHTML10transitional() with html.html(): [...] Sweet! I don't like having to use the unary operator tho, I wanted something as simp

Re: ANN: XML builder for Python

2008-07-03 Thread Walter Dörwald
Stefan Behnel wrote: Hi, Walter Dörwald wrote: XIST has been using with blocks since version 3.0. Take a look at: http://www.livinglogic.de/Python/xist/Examples.html from __future__ import with_statement from ll.xist import xsc from ll.xist.ns import html, xml, meta with xsc.Frag() as

Re: ANN: XML builder for Python

2008-07-03 Thread Walter Dörwald
Stefan Behnel wrote: Stefan Behnel wrote: Jonas Galvez wrote: Not sure if it's been done before, but still... Obviously ;) http://codespeak.net/lxml/tutorial.html#the-e-factory ... and tons of other tools that generate XML, check PyPI. Although it might be the first time I see the with sta

Re: convert xhtml back to html

2008-04-24 Thread Walter Dörwald
Arnaud Delobelle wrote: "Tim Arnold" <[EMAIL PROTECTED]> writes: hi, I've got lots of xhtml pages that need to be fed to MS HTML Workshop to create CHM files. That application really hates xhtml, so I need to convert self-ending tags (e.g. ) to plain html (e.g. ). Seems simple enough, but I

Re: Generating HTML

2007-09-12 Thread Walter Dörwald
Sebastian Bassi wrote: > Hello, > > What are people using these days to generate HTML? I still use > HTMLgen, but I want to know if there are new options. I don't > want/need a web-framework a la Zope, just want to produce valid HTML > from Python. If you want something that works similar to HTM

Re: Replacement for HTMLGen?

2007-05-04 Thread Walter Dörwald
Joshua J. Kugler wrote: > I realize that in today's MVC-everything world, the mere mention of > generating HTML in the script is near heresy, but for now, it's what I ened > to do. :) > > That said, can someone recommend a good replacement for HTMLGen? I've found > good words about it (http://www

Re: Unicode error handler

2007-01-31 Thread Walter Dörwald
[EMAIL PROTECTED] wrote: > On Jan 30, 11:28 pm, Walter Dörwald <[EMAIL PROTECTED]> wrote: > >> codecs.register_error("transliterate", transliterate) >> >>Walter > > Really, really slick solution. > Though, why was it [:1], not [0]? ;-)

Re: Unicode error handler

2007-01-31 Thread Walter Dörwald
Martin v. Löwis wrote: > Walter Dörwald schrieb: >> You might try the following: >> >> # -*- coding: iso-8859-1 -*- >> >> import unicodedata, codecs >> >> def transliterate(exc): >> if not isinstance(exc, UnicodeEncodeError): >>
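Since the quoted handler is cut off, here is a hedged sketch of what an error handler in this spirit could look like in Python 3. It is not the code from the post: the body below (NFD decomposition, then dropping combining marks) is an assumption made for illustration, and only the handler name "transliterate" comes from the thread.

```python
import codecs
import unicodedata


def transliterate(exc):
    """Replace unencodable characters with their base characters (sketch)."""
    if not isinstance(exc, UnicodeEncodeError):
        raise TypeError("don't know how to handle %r" % exc)
    bad = exc.object[exc.start:exc.end]
    # Decompose (e.g. U+00E9 -> "e" + U+0301) and drop the combining marks.
    replacement = "".join(
        c for c in unicodedata.normalize("NFD", bad)
        if not unicodedata.combining(c)
    )
    # Return the replacement and the position where encoding should resume.
    return replacement, exc.end


codecs.register_error("transliterate", transliterate)

result = "caf\u00e9".encode("ascii", "transliterate")
print(result)
```

Characters without an ASCII base character (for example U+00DF) are passed through unchanged by this sketch, so the encoder would still raise an error for them.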

Re: Unicode error handler

2007-01-30 Thread Walter Dörwald
Rares Vernica wrote: > Hi, > > Does anyone know of any Unicode encode/decode error handler that does a > better replace job than the default replace error handler? > > For example I have an iso-8859-1 string that has an 'e' with an accent > (you know, the French 'e's). When I use s.encode('asci

Re: urllib.unquote and unicode

2006-12-21 Thread Walter Dörwald
Martin v. Löwis wrote: > Duncan Booth schrieb: >> The way that uri encoding is supposed to work is that first the input >> string in unicode is encoded to UTF-8 and then each byte which is not in >> the permitted range for characters is encoded as % followed by two hex >> characters. > > Can you
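The two-step scheme described in the quote (encode to UTF-8, then percent-escape each byte outside the permitted range) can be sketched as follows. This uses Python 3's `urllib.parse` rather than the Python 2 `urllib.unquote` the thread is about, and the sample string is an assumption.

```python
from urllib.parse import quote, unquote

text = "caf\u00e9"

# Step by step: UTF-8 encode the non-ASCII character, then escape each byte.
manual = "".join("%%%02X" % byte for byte in "\u00e9".encode("utf-8"))

# quote() performs both steps at once (UTF-8 is its default encoding).
quoted = quote(text)
roundtrip = unquote(quoted)

print(manual, quoted, roundtrip)
```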

Re: Is htmlGen still alive?

2006-12-19 Thread Walter Dörwald
[EMAIL PROTECTED] wrote: > Does anybody know whether htmlGen, the Python-class library for > generating HTML, is still being maintained? Or from where it can be > downloaded? The Starship site where it used to be hosted is dead. I don't know if HTMLgen is still alive, but if you're looking for alt

Re: Python tools for managing static websites?

2006-10-31 Thread Walter Dörwald
site http://www.livinglogic.de/Python/ itself was generated with XIST. You can find the source for the website here: http://www.livinglogic.de/viewcvs/index.cgi/LivingLogic/WWW-Python/site/ Hope that helps! Bye, Walter Dörwald -- http://mail.python.org/mailman/listinfo/python-list

Re: unicode, bytes redux

2006-09-25 Thread Walter Dörwald
Steven D'Aprano wrote: > On Mon, 25 Sep 2006 00:45:29 -0700, Paul Rubin wrote: > >> willie <[EMAIL PROTECTED]> writes: >>> # U+270C >>> # 11100010 10011100 10001100 >>> buf = "\xE2\x9C\x8C" >>> u = buf.decode('UTF-8') >>> # ... later ... >>> u.bytes() -> 3 >>> >>> (goes through each code point and
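The byte count in the quoted example depends on the encoding, not on the string itself. A small Python 3 sketch, using the same code point U+270C as the quote, shows this; the choice of the two additional encodings is mine.

```python
buf = b"\xE2\x9C\x8C"          # the UTF-8 bytes from the quoted example
u = buf.decode("utf-8")        # a single code point, U+270C

# The same one-character string occupies a different number of bytes
# depending on the encoding chosen for storage or transmission.
sizes = {enc: len(u.encode(enc)) for enc in ("utf-8", "utf-16-le", "utf-32-le")}

print(len(u), sizes)
```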

Re: how to get size of unicode string/string in bytes ?

2006-08-02 Thread Walter Dörwald
Diez B. Roggisch wrote: >> So then the easiest thing to do is: take the maximum length of a unicode >> string you could possibly want to store, multiply it by 4 and make that >> the length of the DB field. > >> However, I'm pretty convinced it is a bad idea to store Python unicode >> strings dire

Re: Having problems with strings in HTML

2006-06-27 Thread Walter Dörwald
hey shouldn't. They part of the url, which is (IIRC) a CDATA >> attribute of the A element, not PCDATA. > > It is CDATA but ampersands still need to be escaped. Exactly. See http://www.w3.org/TR/html4/appendix/notes.html#ampersands-in-uris Bye, Walter Dörwald
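As an illustration of the point about ampersands in attribute values, the sketch below escapes a URL before placing it in an `href`. It uses Python 3's `html.escape`; the URL is a made-up example.

```python
import html

# Hypothetical URL with a query string containing a raw ampersand.
url = "http://example.com/search?q=python&lang=en"

# Inside an HTML attribute the ampersand has to be written as &amp;.
link = '<a href="%s">search</a>' % html.escape(url, quote=True)

print(link)
```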

Re: a good programming text editor (not IDE)

2006-06-16 Thread Walter Dörwald
[EMAIL PROTECTED] wrote: > John Salerno wrote: > [snip] >> Thanks for any suggestions, and again I'm sorry if this feels like the >> same question as usual (it's just that in my case, I'm not looking for >> something like SPE, Komodo, Eric3, etc. right now). > > I was taking a peek at c.l.py to c

Re: curses event handling

2006-06-07 Thread Walter Dörwald
John Hunter wrote: > I have a curses app that is displaying real time data. I would like > to bind certain keys to certain functions, but do not want to block > waiting for > > c = screen.getch() > > Is it possible to register callbacks with curses, something like > > screen.register('ke

Re: HTMLParser fragility

2006-04-06 Thread Walter Dörwald
r from libxml2 or any of the available wrappers for it. Bye, Walter Dörwald

Re: [ANN] markup.py - 1.2 - an HTML/XML generator

2006-04-04 Thread Walter Dörwald
ame conflicts with a keyword (and one can assume it means "for all > keywords other than class"). No, I think what it means is this: "Use cls as the name of the first argument in a classmethod. For anything else (i.e. name that are not the first argument in a classmethod) appen

Re: encoding problems (X and X)

2006-03-24 Thread Walter Dörwald
v.append(u"?") return (u"".join(v), uerr.end) codecs.register_error('replacelatscii', latscii_error) Bye, Walter Dörwald

Re: unicode question

2006-03-01 Thread Walter Dörwald
Edward Loper wrote: > Walter Dörwald wrote: >> Edward Loper wrote: >> >>> [...] >>> Surely there's a better way than converting back and forth 3 times? Is >>> there a reason that the 'backslashreplace' error mode can't be used

Re: unicode question

2006-02-27 Thread Walter Dörwald
text if an input character can't be encoded. But a backslash character in an 8bit string is no error, so it won't get replaced on decoding. What you want is a different codec (try e.g. "string-escape" or "unicode-escape"). Bye, Walter Dörwald
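The distinction between an error handler and a codec can be sketched in Python 3 as follows. Note the assumptions: the thread is about Python 2, where "string-escape" also existed; in Python 3 only the "unicode_escape" codec is available, and the sample strings are mine.

```python
# Encoding: "backslashreplace" kicks in because U+00E9 cannot be encoded in ASCII.
escaped = "caf\u00e9".encode("ascii", "backslashreplace")

# Decoding: a backslash is a perfectly valid ASCII byte, so no error occurs
# and no error handler is ever consulted; the escape sequence stays as text.
unchanged = escaped.decode("ascii", "replace")

# To interpret the escape sequence, a different codec is needed.
restored = escaped.decode("unicode_escape")

print(escaped, unchanged, restored)
```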

Re: print UTF-8 file with BOM

2005-12-23 Thread Walter Dörwald
encoding named utf-8-sig, that would output a leading BOM on writing and skip it on reading. Bye, Walter Dörwald
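Current Python versions ship a codec under that name, so the behaviour described here can be checked directly. A minimal Python 3 sketch, with a made-up sample string:

```python
import codecs

text = "hello"

# Writing: utf-8-sig prepends the UTF-8 encoded BOM.
with_bom = text.encode("utf-8-sig")

# Reading: utf-8-sig skips a leading BOM, plain utf-8 keeps it as U+FEFF.
skipped = with_bom.decode("utf-8-sig")
kept = with_bom.decode("utf-8")

print(with_bom, repr(skipped), repr(kept))
```

Data without a BOM decodes unchanged with utf-8-sig, so it is usable for input where the BOM is optional.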

Re: XML DOM: XML/XHTML inside a text node

2005-11-04 Thread Walter Dörwald
s: And now for something completely differentNumber 1 ... the larch I hope this is what you need. Bye, Walter Dörwald

Re: Need a spider library

2005-10-12 Thread Walter Dörwald
('http://www.python.org/search/', 'http://www.python.org/search/', u'Search') ('http://www.python.org/download/', 'http://www.python.org/download/', u'Download') ('http://www.python.org/doc/', 'http://www.python.org/doc/', u'Documentation') ... Hope that helps, Walter Dörwald

Re: cgi, reusing html. common problem?

2005-09-01 Thread Walter Dörwald
vinglogic.de/Python/xist). It was developed for exactly this purpose: You implement reusable HTML fragments in Python and you can use any kind of embedded dynamic language (PHP and JSP are supported out of the box). Bye, Walter Dörwald

Re: python html

2005-08-19 Thread Walter Dörwald
.de/Python/xist) Code might look like this: from ll.xist import xsc, parsers node = parsers.parseURL("http://www.python.org/", tidy=True) for link in node//xsc.URLAttr: link[:] = unicode(link).replace( "http://www.python.org/", "http://www.perl.org/"

Re: Syntax error after upgrading to Python 2.4

2005-08-10 Thread Walter Dörwald
ethod in the CharStyle class which returns a new modified > instance of CharStyle. > > I'm using Windows XP and Python 2.4.1 > > Any ideas? O:-) This is probably related to http://www.python.org/sf/1163244. Do you have a PEP 263 encoding declaration in your file? Can you try Lib/codecs.py from current CVS? Bye, Walter Dörwald

Re: Trimming X/HTML files

2005-07-28 Thread Walter Dörwald
rint "\t%s" % field["name"] --- This prints: Fields for http://www.google.com/search q domains sitesearch sourceid submit Hope that helps! Bye, Walter Dörwald

Re: what is __init__.py used for?

2005-07-05 Thread Walter Dörwald
ut this __init__.py inside the system2 directory you couldn't import other.py because Python doesn't know where the source code for system2 lives and refuses to treat system2 as a package. Hope that helps, Walter Dörwald

Re: MySQL: 'latin-1' codec can't encode character

2005-05-13 Thread Walter Dörwald
try: > return charmap[ord(s)], info.end This will fail if there's more than one consecutive unencodable character, better use return charmap[ord(s[0])], info.start+1 or return "".join(charmap.get(ord(c), u"" % ord(c)) for c in s), info.end (witho
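The point about consecutive unencodable characters can be sketched with a handler that processes the whole reported range and resumes at its end. This is a Python 3 sketch under stated assumptions: the `CHARMAP` contents and the handler name are hypothetical, and the numeric character reference used as a fallback is a guess, since the format string in the quoted post did not survive.

```python
import codecs

# Hypothetical mapping from code points to replacement text.
CHARMAP = {0x20AC: "EUR"}


def charmap_or_charref(exc):
    """Replace every character in the unencodable range, not just the first."""
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    bad = exc.object[exc.start:exc.end]
    # Assumed fallback: a numeric character reference for unmapped characters.
    replacement = "".join(CHARMAP.get(ord(c), "&#%d;" % ord(c)) for c in bad)
    # Resume after the whole range that was replaced.
    return replacement, exc.end


codecs.register_error("charmap-or-charref", charmap_or_charref)

out = "price: \u20ac\u2603".encode("latin-1", "charmap-or-charref")
print(out)
```

Because the handler maps each character of the range individually, it gives the same result whether the encoder reports the two characters as one range or as two separate errors.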

Re: HTML cleaner?

2005-04-25 Thread Walter Dörwald
Ivan Voras wrote: M.-A. Lemburg wrote: Not true: mxTidy integrates tidy as C lib. It's not an interface to the command line tool. Thanks, I'll look at it again! Another option might be the HTML parser (libxml2.htmlReadMemory()) from libxml2 (http://www.xmlsoft.org) Bye, Walter Dörwa

Re: xmlproc maintainer?

2005-03-18 Thread Walter Dörwald
e a little tricky, because the parser must determine which encoding to use before instantiating the decoder. Bye, Walter Dörwald

Re: unicode encoding usablilty problem

2005-02-18 Thread Walter Dörwald
ver there's an implicit conversion between str and unicode. HTH, Walter Dörwald

Re: Trouble with the encoding of os.getcwd() in Korean Windows

2005-02-09 Thread Walter Dörwald
. According to http://www.python.org/doc/2.4/lib/os-file-dir.html this has been added in Python 2.3 and should work on Windows. Bye, Walter Dörwald

Re: Unicode universe (was Re: Dr. Dobb's Python-URL! - weekly Python news and links (Dec 30))

2005-01-04 Thread Walter Dörwald
e("utf-16") is an abbreviation of u.encode().decode("utf-16") In the same way str has an encode method, so s.encode("utf-16") is an abbreviation of s.decode().encode("utf-16") Bye, Walter Dörwald

Re: Small Problem P 2.4 (line>2048 Bytes)

2004-12-15 Thread Walter Dörwald
"break" in multiple lines, the problem is solved. This sounds like bug http://www.python.org/sf/1076985 "Incorrect behaviour of StreamReader.readline leads to crash". Are you using a PEP 263 coding header for your script? Bye, Walter Dörwald