On 25.07.12 08:09, Ulrich Eckhardt wrote:
On 24.07.2012 17:01, cpppw...@gmail.com wrote:
reader = codecs.getreader(encoding)
lines = []
with open(filename, 'rb') as f:
    lines = reader(f, 'strict').readlines(keepends=False)
where encoding == 'utf-16-be'
Everything wo
On 07.07.12 04:56, Steven D'Aprano wrote:
On Fri, 06 Jul 2012 12:55:31 -0400, Karl Knechtel wrote:
Hello all,
While attempting to make a wrapper for opening multiple types of
UTF-encoded files (more on that later, in a separate post, I guess), I
ran into some oddities with the `codecs` module
On 11.03.12 15:37, Steven D'Aprano wrote:
At least two standard error handlers are documented as working for
encoding only:
xmlcharrefreplace
backslashreplace
See http://docs.python.org/library/codecs.html#codec-base-classes
and http://docs.python.org/py3k/library/codecs.html
Why is this? I
On 28.04.10 15:02, james_027 wrote:
> hi,
>
> Any idea how I can replace words in an HTML file? Meaning only the
> content gets replaced while the HTML tags, JavaScript, and CSS
> remain untouched.
You could try XIST (http://www.livinglogic.de/Python/xist/):
Example code:
from ll.xist import
On 17.10.09 08:28, Mark Tolonen wrote:
>
> "Kee Nethery" wrote in message
> news:aaab63c6-6e44-4c07-b119-972d4f49e...@kagi.com...
>>
>> On Oct 16, 2009, at 5:49 PM, Stephen Hansen wrote:
>>
>>> On Fri, Oct 16, 2009 at 5:07 PM, Stef Mientki
>>> wrote:
>>
>> snip
>>
>>> The thing is, I'd be VERY
On 16.10.09 05:44, alex23 wrote:
> On Oct 15, 6:58 pm, an...@vandervlies.xs4all.nl wrote:
>> Does HTMLgen (Robin Friedrich's) still exist? And, if so, where can it
>> be found?
>
> If you're after an easy to use html generator, I highly recommend
> Richard Jones' html[1] lib. It's new, supported
On 01.10.09 17:50, Rami Chowdhury wrote:
> On Thu, 01 Oct 2009 08:10:58 -0700, Walter Dörwald
> wrote:
>
>> On 01.10.09 16:09, Hyuga wrote:
>>> On Sep 30, 3:34 am, gentlestone wrote:
>>>> Why doesn't this code work on Python 2.6? Or how can I do this
On 01.10.09 16:09, Hyuga wrote:
> On Sep 30, 3:34 am, gentlestone wrote:
>> Why doesn't this code work on Python 2.6? Or how can I do this job?
>>
>> _MAP = {
>> # LATIN
>> u'À': 'A', u'Á': 'A', u'Â': 'A', u'Ã': 'A', u'Ä': 'A', u'Å': 'A',
>> u'Æ': 'AE', u'Ç':'C',
>> u'È': 'E', u'É': 'E',
Martin v. Löwis wrote:
>> "correct" -> "corrected"
>
> Thanks, fixed.
>
>>> To convert non-decodable bytes, a new error handler "python-escape" is
>>> introduced, which decodes non-decodable bytes using into a private-use
>>> character U+F01xx, which is believed to not conflict with private-use
>
Martin v. Löwis wrote:
> I'm proposing the following PEP for inclusion into Python 3.1.
> Please comment.
>
> Regards,
> Martin
>
> PEP: 383
> Title: Non-decodable Bytes in System Character Interfaces
> Version: $Revision: 71793 $
> Last-Modified: $Date: 2009-04-22 08:42:06 +0200 (Mi, 22. Apr 20
Gilles Ganault wrote:
> Hello
>
> I'm trying to read pages from Amazon JP, whose web pages are
> supposed to be encoded in ShiftJIS, and decode contents into Unicode
> to keep Python happy:
>
> www.amazon.co.jp
> />
>
> But this doesn't work:
>
> ==
> m = try.search(the_page)
> if m
Jonas Galvez wrote:
Walter Dörwald wrote:
XIST has been using with blocks since version 3.0.
[...]
with xsc.Frag() as node:
    +xml.XML()
    +html.DocTypeXHTML10transitional()
    with html.html():
[...]
Sweet! I don't like having to use the unary operator tho, I wanted
something as simp
Stefan Behnel wrote:
Hi,
Walter Dörwald wrote:
XIST has been using with blocks since version 3.0.
Take a look at:
http://www.livinglogic.de/Python/xist/Examples.html
from __future__ import with_statement
from ll.xist import xsc
from ll.xist.ns import html, xml, meta
with xsc.Frag() as
Stefan Behnel wrote:
Stefan Behnel wrote:
Jonas Galvez wrote:
Not sure if it's been done before, but still...
Obviously ;)
http://codespeak.net/lxml/tutorial.html#the-e-factory
... and tons of other tools that generate XML, check PyPI.
Although it might be the first time I see the with sta
Arnaud Delobelle wrote:
"Tim Arnold" <[EMAIL PROTECTED]> writes:
hi, I've got lots of xhtml pages that need to be fed to MS HTML Workshop to
create CHM files. That application really hates xhtml, so I need to convert
self-ending tags (e.g. ) to plain html (e.g. ).
Seems simple enough, but I
Sebastian Bassi wrote:
> Hello,
>
> What are people using these days to generate HTML? I still use
> HTMLgen, but I want to know if there are new options. I don't
> want/need a web-framework a la Zope, just want to produce valid HTML
> from Python.
If you want something that works similar to HTM
Joshua J. Kugler wrote:
> I realize that in today's MVC-everything world, the mere mention of
> generating HTML in the script is near heresy, but for now, it's what I need
> to do. :)
>
> That said, can someone recommend a good replacement for HTMLGen? I've found
> good words about it (http://www
[EMAIL PROTECTED] wrote:
> On Jan 30, 11:28 pm, Walter Dörwald <[EMAIL PROTECTED]> wrote:
>
>> codecs.register_error("transliterate", transliterate)
>>
>>Walter
>
> Really, really slick solution.
> Though, why was it [:1], not [0]? ;-)
Martin v. Löwis wrote:
> Walter Dörwald wrote:
>> You might try the following:
>>
>> # -*- coding: iso-8859-1 -*-
>>
>> import unicodedata, codecs
>>
>> def transliterate(exc):
>>     if not isinstance(exc, UnicodeEncodeError):
>>
Rares Vernica wrote:
> Hi,
>
> Does anyone know of any Unicode encode/decode error handler that does a
> better replace job than the default replace error handler?
>
> For example I have an iso-8859-1 string that has an 'e' with an accent
> (you know, the French 'e's). When I use s.encode('asci
Martin v. Löwis wrote:
> Duncan Booth wrote:
>> The way that uri encoding is supposed to work is that first the input
>> string in unicode is encoded to UTF-8 and then each byte which is not in
>> the permitted range for characters is encoded as % followed by two hex
>> characters.
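The two steps described in the quoted paragraph can be sketched as follows. This is only an illustration, assuming Python 3 (which the thread predates); the helper name `percent_encode` is made up for this example, and the standard `urllib.parse.quote` is shown alongside it for comparison.

```python
import string
from urllib.parse import quote

# Characters that RFC 3986 calls "unreserved" and that need no escaping.
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")

def percent_encode(text):
    """Hypothetical helper: UTF-8 encode, then %XX-escape other bytes."""
    utf8_bytes = text.encode("utf-8")          # step 1: Unicode -> UTF-8
    pieces = []
    for byte in utf8_bytes:                    # step 2: escape byte by byte
        char = chr(byte)
        if byte < 128 and char in UNRESERVED:
            pieces.append(char)
        else:
            pieces.append("%%%02X" % byte)
    return "".join(pieces)

text = "caf\u00e9"                             # 'café'
manual = percent_encode(text)
library = quote(text, safe="")                 # stdlib does the same two steps
print(manual, library)
```

The 'é' becomes two bytes in UTF-8, so it turns into two `%XX` groups rather than one.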
>
> Can you
[EMAIL PROTECTED] wrote:
> Does anybody know whether htmlGen, the Python-class library for
> generating HTML, is still being maintained? Or from where it can be
> downloaded? The Starship site where it used to be hosted is dead.
I don't know if HTMLgen is still alive, but if you're looking for
alt
site
http://www.livinglogic.de/Python/ itself was generated with XIST. You
can find the source for the website here:
http://www.livinglogic.de/viewcvs/index.cgi/LivingLogic/WWW-Python/site/
Hope that helps!
Bye,
Walter Dörwald
--
http://mail.python.org/mailman/listinfo/python-list
Steven D'Aprano wrote:
> On Mon, 25 Sep 2006 00:45:29 -0700, Paul Rubin wrote:
>
>> willie <[EMAIL PROTECTED]> writes:
>>> # U+270C
>>> # 11100010 10011100 10001100
>>> buf = "\xE2\x9C\x8C"
>>> u = buf.decode('UTF-8')
>>> # ... later ...
>>> u.bytes() -> 3
>>>
>>> (goes through each code point and
Diez B. Roggisch wrote:
>> So then the easiest thing to do is: take the maximum length of a unicode
>> string you could possibly want to store, multiply it by 4 and make that
>> the length of the DB field.
>
>> However, I'm pretty convinced it is a bad idea to store Python unicode
>> strings dire
hey shouldn't. They are part of the url, which is (IIRC) a CDATA
>> attribute of the A element, not PCDATA.
>
> It is CDATA but ampersands still need to be escaped.
Exactly. See
http://www.w3.org/TR/html4/appendix/notes.html#ampersands-in-uris
Bye,
Walter Dörwald
[EMAIL PROTECTED] wrote:
> John Salerno wrote:
> [snip]
>> Thanks for any suggestions, and again I'm sorry if this feels like the
>> same question as usual (it's just that in my case, I'm not looking for
>> something like SPE, Komodo, Eric3, etc. right now).
>
> I was taking a peek at c.l.py to c
John Hunter wrote:
> I have a curses app that is displaying real time data. I would like
> to bind certain keys to certain functions, but do not want to block
> waiting for
>
> c = screen.getch()
>
> Is it possible to register callbacks with curses, something like
>
> screen.register('ke
r from libxml2 or any of the available
wrappers for it.
Bye,
Walter Dörwald
ame conflicts with a keyword (and one can assume it means "for all
> keywords other than class").
No, I think what it means is this: "Use cls as the name of the first
argument in a classmethod. For anything else (i.e. names that are not the
first argument in a classmethod) appen
v.append(u"?")
return (u"".join(v), uerr.end)
codecs.register_error('replacelatscii', latscii_error)
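For readers who only see the tail of that handler, here is a complete, minimal sketch of the same idea: a custom encoding error handler registered with `codecs.register_error`. It assumes Python 3, and the handler name `ascii_fallback` and its NFKD-based replacement strategy are assumptions made for this example, not the code from the post.

```python
import codecs
import unicodedata

def ascii_fallback(exc):
    """Hypothetical handler: strip accents, fall back to '?'."""
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    replacement = []
    # exc.start:exc.end may cover several consecutive unencodable
    # characters, so handle each of them in turn.
    for char in exc.object[exc.start:exc.end]:
        decomposed = unicodedata.normalize("NFKD", char)
        base = "".join(c for c in decomposed if ord(c) < 128)
        replacement.append(base or "?")
    # Return the replacement text and the position to resume encoding at.
    return ("".join(replacement), exc.end)

codecs.register_error("ascii_fallback", ascii_fallback)

encoded = "caf\u00e9 d\u00e9j\u00e0 vu \u20ac".encode("ascii", "ascii_fallback")
print(encoded)
```

Returning `exc.end` as the resume position matters: it tells the codec that the whole unencodable run has been dealt with.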
Bye,
Walter Dörwald
Edward Loper wrote:
> Walter Dörwald wrote:
>> Edward Loper wrote:
>>
>>> [...]
>>> Surely there's a better way than converting back and forth 3 times? Is
>>> there a reason that the 'backslashreplace' error mode can't be used
text if an input character can't be encoded. But
a backslash character in an 8bit string is no error, so it won't get
replaced on decoding.
What you want is a different codec (try e.g. "string-escape" or
"unicode-escape").
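A short sketch of that distinction, assuming Python 3 (where the codec is spelled `unicode_escape` and "string-escape" no longer exists): `backslashreplace` kicks in when encoding hits a character it cannot represent, while turning the escapes back into characters is the job of a codec.

```python
text = "na\u00efve"                                  # 'naïve'

# Encoding: the error handler replaces the unencodable character.
escaped = text.encode("ascii", "backslashreplace")
print(escaped)

# Decoding as plain ASCII is not an error, so the backslash stays as is.
still_escaped = escaped.decode("ascii")

# Decoding with the escape codec interprets the backslash sequence.
restored = escaped.decode("unicode_escape")
print(still_escaped, restored)
```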
Bye,
Walter Dörwald
encoding named utf-8-sig, that would output a leading BOM on writing
and skip it on reading.
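A codec with this name and behaviour is available in current Python. A minimal sketch, assuming Python 3:

```python
import codecs

data = "hello".encode("utf-8-sig")        # writing: BOM is prepended
has_bom = data.startswith(codecs.BOM_UTF8)

without_bom = data.decode("utf-8-sig")    # reading: BOM is skipped
with_bom = data.decode("utf-8")           # plain UTF-8 keeps it as U+FEFF

print(data, has_bom, repr(without_bom), repr(with_bom))
```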
Bye,
Walter Dörwald
s:
And now for something completely differentNumber 1 ...
the larch
I hope this is what you need.
Bye,
Walter Dörwald
('http://www.python.org/search/', 'http://www.python.org/search/',
u'Search')
('http://www.python.org/download/', 'http://www.python.org/download/',
u'Download')
('http://www.python.org/doc/', 'http://www.python.org/doc/',
u'Documentation')
...
Hope that helps,
Walter Dörwald
vinglogic.de/Python/xist). It was developed for exactly
this purpose: You implement reusable HTML fragments in Python and you
can use any kind of embedded dynamic language (PHP and JSP are supported
out of the box).
Bye,
Walter Dörwald
.de/Python/xist)
Code might look like this:
from ll.xist import xsc, parsers
node = parsers.parseURL("http://www.python.org/", tidy=True)
for link in node//xsc.URLAttr:
    link[:] = unicode(link).replace(
        "http://www.python.org/",
        "http://www.perl.org/"
ethod in the CharStyle class which returns a new modified
> instance of CharStyle.
>
> I'm using Windows XP and Python 2.4.1
>
> Any ideas? O:-)
This is probably related to http://www.python.org/sf/1163244. Do you
have a PEP 263 encoding declaration in your file? Can you try
Lib/codecs.py from current CVS?
Bye,
Walter Dörwald
rint "\t%s" % field["name"]
---
This prints:
Fields for http://www.google.com/search
q
domains
sitesearch
sourceid
submit
Hope that helps!
Bye,
Walter Dörwald
ut this
__init__.py inside the system2 directory you couldn't import other.py
because Python doesn't know where the source code for system2 lives and
refuses to treat system2 as a package.
Hope that helps,
Walter Dörwald
try:
> return charmap[ord(s)], info.end
This will fail if there's more than one consecutive unencodable
character, better use
return charmap[ord(s[0])], info.start+1
or
return "".join(charmap.get(ord(c), u"" % ord(c)) for c in
s), info.end
(witho
Ivan Voras wrote:
M.-A. Lemburg wrote:
Not true: mxTidy integrates tidy as C lib. It's not an interface
to the command line tool.
Thanks, I'll look at it again!
Another option might be the HTML parser (libxml2.htmlReadMemory()) from
libxml2 (http://www.xmlsoft.org)
Bye,
Walter Dörwa
e a little tricky, because
the parser must determine which encoding to use before instantiating the
decoder.
Bye,
Walter Dörwald
ver there's an implicit conversion between
str and unicode.
HTH,
Walter Dörwald
. According to
http://www.python.org/doc/2.4/lib/os-file-dir.html
this has been added in Python 2.3 and should work on Windows.
Bye,
Walter Dörwald
e("utf-16")
is an abbreviation of
u.encode().decode("utf-16")
In the same way str has an encode method, so
s.encode("utf-16")
is an abbreviation of
s.decode().encode("utf-16")
Bye,
Walter Dörwald
"break" in multiple lines, the problem is solved.
This sounds like bug http://www.python.org/sf/1076985
"Incorrect behaviour of StreamReader.readline leads to crash".
Are you using a PEP 263 coding header for your script?
Bye,
Walter Dörwald
48 matches