...     def add(self, x, y):
...         return x+y
...
>>> class B:
...     pass
...
>>> B.add = A.add
>>>
>>> print(B().add(1, 2))
3
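The session above is cut off at the top, so here is a self-contained sketch of the same idea. It assumes Python 3, where a function looked up on its class is a plain function and can therefore be attached to an unrelated class; the body of class A is reconstructed from the visible lines and is an assumption.

```python
class A:
    def add(self, x, y):
        return x + y


class B:
    pass


# In Python 3, A.add is an ordinary function object, so assigning it to B
# makes it behave like a method that was defined on B directly.
B.add = A.add

result = B().add(1, 2)
print(result)
```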
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
things that
seem totally out of reach with your current approach.
http://wiki.python.org/moin/WebFrameworks
or, more generally:
http://wiki.python.org/moin/WebProgramming
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
abeen <[EMAIL PROTECTED]> wrote:
> I would want to know which could be the best programming language for
> developing web spider.
Since you ask in comp.lang.python: I'd suggest APL
--
Web (en): http://www.no-spoon.de/ -*- Web (de): http://www.frell.de/
--
http://mail.python.org/mailman/listinf
... root.clear() # one record done, clean up everything
http://effbot.org/zone/element-iterparse.htm
You can also do things like
... print element.findtext("r3/r4")
Read the ElementTree tutorial to learn how to extract your data:
http://effbot.org/zone/element.htm#searching-for-subelements
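As a rough illustration of the iterparse pattern referred to here, the following sketch uses the standard library's xml.etree.ElementTree under Python 3. The sample document and the "record" tag name are made up for the example; only "r3/r4" and root.clear() come from the post.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample data; the element names are assumptions.
xml_data = (
    b"<root>"
    b"<record><r3><r4>one</r4></r3></record>"
    b"<record><r3><r4>two</r4></r3></record>"
    b"</root>"
)

found = []
context = iter(ET.iterparse(io.BytesIO(xml_data), events=("start", "end")))
_, root = next(context)  # the first event is the start of the root element

for event, element in context:
    if event == "end" and element.tag == "record":
        found.append(element.findtext("r3/r4"))
        root.clear()  # one record done, drop it so memory use stays flat

print(found)
```

Clearing the root after each record is what keeps memory bounded on large files; without it the finished records stay attached to the tree.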
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
s h
tree = h.parse("somefile.html")
text = tree.xpath("string( some/[EMAIL PROTECTED] )")
http://codespeak.net/lxml
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
Here are some performance comparisons of HTML parsers:
http://blog.ianbicking.org/2008/03/30/python-html-parser-performance/
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
bijeshn wrote:
> the extracted files are to be XML too. I just need to extract it raw
> (tags and data just like it is in the parent XML file..)
Ah, so then replace the "print tostring()" line in my example by
ET.ElementTree(element).write("outputfile.xml")
and you
f will give you a
(possibly arbitrary) unique priority.
This (or a similar approach) may or may not solve your problem, depending on
how you determine the dependencies.
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
Michel Bouwmans wrote:
> I'm trying to strip all script-blocks from a HTML-file using regex.
You might want to take a look at lxml.html instead, which comes with an HTML
cleaner module:
http://codespeak.net/lxml/lxmlhtml.html#cleaning-up-html
Stefan
--
http://mail.python.org/mailman/
Stefan Behnel wrote:
> It's not as trivial as it sounds. Removing the CDATA sections in the parser is
> just for fun.
... *not* just for fun ...
obviously ...
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
a block when it isn't encapsulated with HTML-comment markers
> and it tries to parse the contents of the document.write's. ;)
At the risk of repeating myself: using the right tool for the job is generally a good
idea.
http://codespeak.net/lxml/lxmlhtml.html#cleaning-up-html
Stefan
--
http://mail
r is
just for fun. It simplifies the internal tree traversal and text aggregation,
so this would be affected if we allowed CDATA content in addition to normal
text content. It's not that hard, it's just that it hasn't been done so far.
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
Hi again,
Stefan Behnel wrote:
> Silfheed wrote:
>> So first off I know that CDATA is generally hated and just shouldn't
>> be done, but I'm simply required to parse it and spit it back out.
>> Parsing is pretty easy with lxml, but it's the spitting back out
> entity2cp('&euro;') -> (u'\u20ac',True)
> entity2cp('&foobar;') -> ('&foobar;',False)
> """
Is there a reason why you return a tuple instead of just returning the
converted result and raising an exception if the conversion fails?
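A minimal sketch of the alternative suggested here, written for Python 3: return the converted character and raise an exception when the conversion fails. The function name entity_to_char and the choice of ValueError are assumptions for illustration, not the original poster's code.

```python
from html.entities import name2codepoint


def entity_to_char(entity):
    """Convert an entity such as '&euro;' or '&#123;' to a single character.

    Raises ValueError if the input is not a recognised entity.
    """
    if not (entity.startswith("&") and entity.endswith(";")):
        raise ValueError("not an entity: %r" % entity)
    name = entity[1:-1]
    try:
        if name[:2] in ("#x", "#X"):
            return chr(int(name[2:], 16))  # hexadecimal character reference
        if name.startswith("#"):
            return chr(int(name[1:]))      # decimal character reference
        return chr(name2codepoint[name])   # named entity
    except (KeyError, ValueError, OverflowError):
        raise ValueError("unknown entity: %r" % entity) from None


print(repr(entity_to_char("&euro;")))
```

Callers that expect failures can wrap the call in try/except, while everyone else gets a plain string back instead of having to unpack a tuple.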
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
John Nagle wrote:
>easy_install usually seems to make things harder.
>
>BeautifulSoup is one single .py file. That's all you need.
> Everything else is excess baggage.
I wouldn't call the installation of a single module Python package a good
example for the "us
Martin Bless wrote:
> [Stefan Behnel] wrote & schrieb:
>>> def entity2uc(entity):
>>> """Convert entity like &#123; to unichr.
>>>
>>> Return (result,True) on success or (input string, False)
>>> otherwise. Example:
>
" # this line won't
[bad words stripped]
this should read
acceptable = u"abcdefghijklmnopqrstuvwxyzóíñú"
acceptable = u"abcdefghijklmnopqrstuvwxyzóíñúá"
Mind the little "u" before the string, which makes it a unicode string instead
of an encoded byte
kml.findall("{%s}Folder/{%s}Folder" % (ns, ns)):
>if folders["name"].text=='Routes':
>print folder.findall("{%s}LineString/{%s}coordinates" % (ns,
> ns))
What's "name" here? An attribute? Then this might work better:
if folders.get("name") == 'Routes':
or did you mean it to be a child node?
if folders.findtext("name") == 'Routes':
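The difference between the two suggestions can be seen in a small sketch using the standard library's ElementTree under Python 3. The two sample elements are made up; in a real KML file the child lookup would also need the namespace prefix, as in the quoted code.

```python
import xml.etree.ElementTree as ET

# Hypothetical samples: "name" once as an attribute, once as a child element.
folder_with_attr = ET.fromstring('<Folder name="Routes"/>')
folder_with_child = ET.fromstring("<Folder><name>Routes</name></Folder>")

attr_value = folder_with_attr.get("name")          # reads the attribute
child_text = folder_with_child.findtext("name")    # reads the child's text
missing_attr = folder_with_child.get("name")       # no such attribute

print(attr_value, child_text, missing_attr)
```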
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
Gabriel Genellina wrote:
> You have plenty of time to evaluate alternatives. Your code may become
> obsolete even before 3.3 is shipped.
Sure, and don't forget to save two bytes when storing the year. ;)
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
rt from two different places -
unless (I assume) you want to use 2to3, which might even break this approach.
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
hdante wrote:
> 6. If you just want to speed-up your python programs or offer some
> special, system-specific or optimized behavior to your python
> applications, or you just want to complement your python knowledge,
> learn C.
"Learn C", ok, but then go and use Cython in
GD wrote:
> Please remove ability to multiple inheritance in Python 3000.
I'm so happy *that's* a dead parrot, all right.
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
GD wrote:
> Please remove ability to multiple inheritance in Python 3000.
>
> Multiple inheritance is bad for design, rarely used and contains many
> problems for usual users.
Ah, one more:
"doctor, when I do this, it hurts!"
- "then don't do that!"
http://codespeak.net/lxml
http://codespeak.net/lxml/tutorial.html
http://codespeak.net/lxml/validation.html
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
from lxml import etree
tree = etree.parse("thefile.xhtml")
tree.write("thefile.html", method="html")
http://codespeak.net/lxml
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
bryan rasmussen top-posted:
> On Thu, Apr 24, 2008 at 9:55 PM, Stefan Behnel <[EMAIL PROTECTED]> wrote:
>> from lxml import etree
>>
>> tree = etree.parse("thefile.xhtml")
>> tree.write("thefile.html", method="html")
>
ith practical examples.
>
> Can you recommend one?
http://wiki.python.org/moin/PythonBooks
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
tree. If not, remove it.
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
Benjamin wrote:
> On Apr 6, 11:03 pm, Stefan Behnel <[EMAIL PROTECTED]> wrote:
>> Benjamin wrote:
>>> I'm trying to parse an HTML file. I want to retrieve all of the text
>>> inside a certain tag that I find with XPath. The DOM seems to make
>>> t
ave the two integrated so that you could generate API docs and written docs
in one step, with the same look-and-feel.
(BTW: yes, I'm advocating not to implement another API documentation tool, but
to integrate with an existing one. I would think epydoc matches the
requirements quite well her
ng a script. You might want to look into
functions.
http://docs.python.org/tut/node6.html#SECTION006600000
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
required to reject non
well-formed input.
In case it actually is well-formed XML and the problem is somewhere in your
code but you can't see it through the SAX haze, try lxml. It also allows you
to pass the expected encoding to the parser to override broken document
encodings.
http://co
ree has limited support for xpath. I think it
> should be documented what 'path' should look like to.
You mean like this?
http://effbot.org/zone/element-xpath.htm
Note that the method is called "find()", not "execute_xpath()" or something.
If you want full XPath sup
erage Python code would
not even change whitespace. ;)
> I just need it for python -- but it should not force me to use PEP8.
IMHO, a tool that automatically corrects code to comply with PEP-8 would
actually be more helpful, but I guess that's pretty far-fetched.
Stefan
--
http://m
> you need further information or clarification.
Generating XML from your data shouldn't be too hard once it's in a database.
The harder part is getting it in there through a web interface. I would look
at a dedicated web framework like Django first:
http://www.djangoproject.com/
Stefan
--
h
quite a bit of code in the official
libxml2 bindings.
http://codespeak.net/lxml
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
get you going:
http://codespeak.net/lxml/lxmlhtml.html#creating-html-with-the-e-factory
Try something like this:
tables = []
for row in rows_returned_by_the_query:
    tables.append(
        E.TABLE( ... )
    )
html_body = E.BODY( *tables )
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
with xpather on firefox..
>
> thoughts/comments/pointers...
Read a good XPath tutorial.
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
thon does in its module header, method
"generate_module_preamble". It has an (obviously incomplete) list of the most
important adaptations to write portable code for Py2.3 to 3.0.
http://hg.cython.org/cython-devel/file/tip/Cython/Compiler/ModuleNode.py
Stefan
--
http://mail.python.org/m
they are documented, both approaches are fine for different
> cases. Currently the only reference I found about unicode in
> ElementTree is "All strings can either be Unicode strings, or 8-bit
> strings containing US-ASCII only." [1], which is rather ambiguous
It's not
rces to provide
external encoding information. lxml supports the Python unicode type as a
transport and reads the internal byte sequence of the unicode string.
To be clear, this does not mean that the parsing happens at the unicode
character level. Parsing XML is about parsing bytes, not characters.
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
> AttributeError: 'module' object has no attribute 'BytesIO'
Do you have a module called "io" lying around in your Python path somewhere?
lxml.etree checks for io.BytesIO (Py2.6/3.0) being available when it starts
up, and only failing that, falls back to StringIO.StringIO (Py <= 2.5).
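The fallback described here follows a common import pattern. The sketch below is an illustration of that pattern, not lxml's actual start-up code. Printing the module's file location is one way to spot a stray local "io.py" that shadows the standard library module.

```python
import io

try:
    from io import BytesIO  # available from Python 2.6 / 3.0 onwards
except ImportError:
    # Only reached on Python <= 2.5, where the io module does not exist.
    from StringIO import StringIO as BytesIO

# If this prints a path inside your own project, a local io.py is shadowing
# the standard library module.
print(getattr(io, "__file__", "built-in module"))

buffer = BytesIO(b"<root/>")
print(buffer.read())
```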
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
Owen Zhang wrote:
> Can anyone recommend the best performance python xslt library?
lxml. It's based on libxml2/libxslt.
http://codespeak.net/lxml
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
bruce wrote:
> I'm using quick test with libxml2dom
>
> ===
> import libxml2dom
>
> aa=libxml2dom.parseString(foo)
> ff=libxml2dom.toString(aa)
>
> print ff
> ===
>
> --
> when i start, foo is:
>
>
>
>
>
>
>
> .
> .
> .
>
>
>
e text. I think that's what you were looking for.
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
get the number of total nodes...
> by subtracting, i can get the number of nodes, without text.. is there an
> easier way??!!
Yes, learn to use XPath, e.g.
//tr/td[not string()]
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
castironpi wrote:
> Any interest in pursuing/developing/working together on a mmaped-xml
> class? Faster, not readable in text editor.
Any hints on what you are talking about?
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
ng for a tool that can process HTML using XPath, try
lxml.html.
http://codespeak.net/lxml
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
Stefan Behnel wrote:
> Yes, learn to use XPath, e.g.
>
> //tr/td[not string()]
Oh, well...
//tr/td[not(string())]
as I said, wrong news group. ;-)
Try something like "gmane.text.xml.xpath.general", for example.
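For readers without lxml, roughly the same check as the corrected expression can be sketched with the standard library under Python 3. ElementTree's path language has no not() function, so the emptiness test is done in Python; the sample table is made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: one empty cell, one whitespace-only cell, two with text.
table = ET.fromstring(
    "<table>"
    "<tr><td>a</td><td/></tr>"
    "<tr><td> </td><td>b</td></tr>"
    "</table>"
)

# Mirrors //tr/td[not(string())]: keep cells whose string value is empty.
empty_cells = [
    td for td in table.findall(".//tr/td")
    if not "".join(td.itertext())
]

print(len(empty_cells))
```

Like the XPath version, this counts a whitespace-only cell as non-empty, because its string value is not the empty string.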
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
change.
> You get constant time updates to contents, and log-time searches.
Every XML tree structure gives you log-time searches. But how do you achieve
constant time updates in a sequential file?
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
ument in an iterparse-like fashion
(called iterwalk).
http://codespeak.net/lxml/
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
print( td.xpath("normalize-space()") )
Tweak as you see fit, tree iteration is at your service in case you need more.
http://codespeak.net/lxml/
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
[fixing the subject appropriately]
Jackie Wang wrote:
> How should I delete the 'font' tags while keeping the content inside?
Amongst many other goodies for working with HTML, the Elements in lxml.html
have a ".drop_tag()" method specifically for that purpose.
http://code
ind it very usable and from what I heard so
far, a couple of other people also like it a lot better than ZSI.
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
Owen Zhang wrote:
> I am trying to build lxml package in SunOS 5.10. I got the following
> errors.
Could you report this on the lxml mailing list?
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
ant to use XPath, try this:
print tree.xpath('string()')
or if you want to use it in real code:
get_tree_text = etree.XPath('string()')
print get_tree_text(tree)
or just use
print etree.tostring(tree, method="text")
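The standard library's ElementTree offers comparable ways to collect all text, shown here as a sketch under Python 3 with a made-up sample element. It does not support the XPath string() function, so itertext() takes its place.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("<p>Hello <b>big</b> world</p>")  # hypothetical sample

# Concatenate every text node below the element, in document order.
text_via_itertext = "".join(root.itertext())

# The serializer's "text" method produces the same result.
text_via_tostring = ET.tostring(root, method="text", encoding="unicode")

print(text_via_itertext)
print(text_via_tostring)
```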
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
Marco Bizzarri wrote:
> On Mon, Sep 15, 2008 at 8:15 PM, Stefan Behnel <[EMAIL PROTECTED]> wrote:
>> Mailing List SVR wrote:
>>> I have to implement a soap web services from wsdl, the server is
>>> developed using oracle, is zsi or some other python library for s
ate sample
> documents from a schema:
> http://help.eclipse.org/help32/index.jsp?topic=/org.eclipse.wst.xmleditor.doc.user/topics/tcrexxsd.html
>
> As can XML IDEs such as Stylus Studio and XML Spy.
There's also a Java tool called the "XML instance generator" by Sun tha
ight? libxml2 doesn't support that. All you get is the
list of errors in the error log of the validator, but when libxml2 decides to
bail out from the validation, that's it.
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
re, not a tag.
Use the .addprevious() method on the root Element with a PI object.
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
with "..." before
> feeding lxmls parser.
Yes, you can do that. To avoid creating an intermediate string, you can use
the feed parser and do something like this:
parser = etree.XMLParser()
parser.feed("")
parser.feed(your_xml_tag_sequence_data)
parser.feed(&quo
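The snippet above is cut off, so here is a complete sketch of the same feed-parser idea using the standard library's ElementTree under Python 3. The wrapper tag name "root" and the sample tag sequence are assumptions; lxml's etree.XMLParser offers the same feed() and close() interface.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: a sequence of sibling tags without a common root.
your_xml_tag_sequence_data = "<item>1</item><item>2</item>"

parser = ET.XMLParser()
parser.feed("<root>")                     # open an artificial root element
parser.feed(your_xml_tag_sequence_data)   # feed the data as it is
parser.feed("</root>")                    # close the artificial root
root = parser.close()                     # returns the root Element

items = [item.text for item in root]
print(items)
```

Feeding the pieces separately avoids building one large intermediate string just to add the wrapper element.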
correctly and what the differences would be if any
as for terminology:
http://www.csse.monash.edu.au/~lloyd/tildeMML/Structured/HMM.html
http://en.wikipedia.org/wiki/Hidden_Markov_model
Hope it helps,
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
ns.html
Installation has just become easier (at least on Linux):
http://codespeak.net/lxml/installation.html
http://codespeak.net/pipermail/lxml-dev/2006-March/001008.html
Give it a try and keep spreading the word!
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
you are using.
Normally, you can expect any decent multi-processor operating system to
distribute the load nicely over all available processors. Relying on that
might help you.
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
n established; it should not be called at
all if a host and user were given when the instance was
created. Most FTP commands are only allowed after the client
has logged in.
Perhaps the server wasn't satisfied with your credentials and
closed the connection between the .login() and .retrlines()
calls?
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
host is probably rather
impractical because the files sometimes change very frequently.
Developing only locally is impractical for some projects because
the remote development server has some infrastructure that I
can't reproduce locally or only with a lot of work.
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
erent timezones.
License
---
ftputil 2.1 is Open Source software, released under the revised BSD
license (see http://www.opensource.org/licenses/bsd-license.php ).
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
il is pure Python and should work on OS X without problems.
If you think that installing/using an additional library is
overkill, you can extract the necessary parser code from the file
ftp_stat.py or write your own parser.
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
def testfunc():
    print "hi there"
in the python file. Running PyRun_SimpleString("testfunc()\n") just
after PyEval_AcquireThread() works like a charm though. Any ideas? I
kinda get the feeling that I don't get the dict from
PyImport_GetModuleDict() that I'm expecting.
Thanks in advance..
/Stefan
--
http://mail.python.org/mailman/listinfo/python-list
ftputil mailing list (see
http://ftputil.sschwarzer.net/trac/wiki/MailingList ) or to me.
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
"None". If the list is longer, I'd like
> the excess elements to be ignored.
What about this:
fillUp = [None] * 6
(a, b, c, d, e, f) = (list + fillUp)[:6]
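The same padding trick can be wrapped in a small helper, sketched here for Python 3. The name take_six is made up, and the parameter is called "values" because naming a variable "list" hides the built-in type.

```python
def take_six(values, size=6):
    """Return exactly `size` items: pad with None, ignore any excess."""
    padded = list(values) + [None] * size
    return padded[:size]


a, b, c, d, e, f = take_six([1, 2, 3])
print(a, b, c, d, e, f)

too_long = take_six(range(10))
print(too_long)
```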
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
were. Deeper in, I agree; that stuff
> should have been dealt with at the gates.
But that may have a code smell on it, too. In most cases, when users
provide excessive arguments that the program would ignore, that's best
treated as an error.
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
k
with the rest (so-called "local name").
> Additionally, anyone know how ElementTree handle's XML elements that
> include Unicode?
It's an XML parser, so the answer is: without any difficulties.
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
he current namespace. Only the names
a,b,c are imported. If you do "import xyz", then "xyz" becomes a defined
name in your current namespace.
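A small sketch makes the difference visible, assuming Python 3. It runs both import forms in an explicit namespace dictionary and lists the names each one defines; the math module merely stands in for "xyz".

```python
namespace = {}

# "from ... import ..." binds only the listed names.
exec("from math import sqrt, pi", namespace)
names_after_from = sorted(n for n in namespace if not n.startswith("__"))
print(names_after_from)

# "import ..." binds the module name itself.
exec("import math", namespace)
names_after_import = sorted(n for n in namespace if not n.startswith("__"))
print(names_after_import)
```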
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
asier to use.
Hmm, I didn't see the original e-mail, but when it comes to "easier to
use", I think the answer is basically Cython.
http://cython.org
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
'with blank', 4)
> >>> getattr(c, 'with blank')
> 4
>
> getattr / setattr seems treat any string as attribute name.
Feature. We're all adults.
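The behaviour under discussion is easy to reproduce. This sketch assumes Python 3 and an ordinary class without __slots__, where attribute names are simply keys in the instance dictionary.

```python
class C:
    pass


c = C()
setattr(c, "with blank", 4)        # not a valid identifier, but accepted
value = getattr(c, "with blank")
stored_in_dict = "with blank" in vars(c)

print(value, stored_in_dict)
```

Such an attribute cannot be reached with dotted syntax, only through getattr() or the instance dictionary.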
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
tly good version numbers. If I don't
> change the number, I get a condescending message "Upload failed (400):
> A file named "Morelia-0.0.10.tar.gz" already exists for
> Morelia-0.0.10. To fix problems with that file you should create a new
> release."
It is qui
ithin an
application written in C++?
Since you likely want to establish some kind of Python level API that wraps
your C++ code (so that Python code can talk to it), you might want to take
a look at Cython:
http://cython.org
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
get the output right, and to keep the output correct when you
change minor stuff in your code.
Given that you seem to pull data from a database into a web page, you might
want to take a look at Django, for example.
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
Phlip, 29.12.2009 23:58:
And I hope you answered your questions here, if no one else did, to
avoid dead search trails in the archives.
You should have read the posting.
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
;,', ''))
if float_value:
    return float(float_value.replace(',', ''))
if date_value:
    return dateutil.parser.parse(date_value, dayfirst=True)
raise ...
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
http://codespeak.net/lxml/lxmlhtml.html#creating-html-with-the-e-factory
Note that there are tons of ways to generate HTML with Python. A quick web
search (or a quick read on PyPI or the Python Wiki) should get you started.
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
Phlip, 05.01.2010 18:00:
On Jan 5, 12:16 am, Stefan Behnel wrote:
Note that there are tons of ways to generate HTML with Python.
Forgot to note - I'm generating schematic XML, and I'm trying to find
a way better than the Django template I started with!
Well, then note that ther
thing it's made for on
the project homepage.
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
in one obvious code step,
and then delete any content you are done with to save memory.
It's also very fast, you will likely not lose much performance compared to
xml.sax.
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
ertainly going to be faster than reporting that bug, don't you think?
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
Alf P. Steinbach, 12.01.2010 13:10:
* Stefan Behnel:
Maybe you should just stop using the module. Writing the code yourself
is certainly going to be faster than reporting that bug, don't you think?
It's part of the standard Python distribution.
Don't you think bugs in the s
I think everyone's free to put resources into the creation of new
programming languages. Google has enough money to put it into all sorts of
things without the need to have them pay off.
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
have to
live with the fact that it was, and continues to be, practical for existing
code bases (and certainly for new code), so it clearly is not hopeless to
do so, not even "in general".
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
: it is not the case
that I have strong feelings or any feelings at all about that bug report
or any other.
Then why don't you just stop blaming the world for that terrible doom that
was laid upon you by running into a bug?
But you're starting to annoy me.
Funny that it's you
Feelings about me.
Sorry to disappoint you - I don't.
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
ch to Python. Some of it is already there. But not all.
Why don't you write up a proposal for the python-ideas list?
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
good idea to me. It could still send out a notification
to the relevant component maintainers, so that they can deal with the bug
(e.g. open it up manually or drop it as spam) even if the reporter takes a
day or two to respond to the confirmation e-mail.
Stefan
--
http://mail.python.org/mailman
ramming practice anyway), an unbound
method can simplify (and speed up) the above, e.g.
sorted(items, key=unicode.lower)
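In Python 3 the unicode type is called str, so the equivalent key is str.lower. The sketch below compares it with a lambda on a made-up list. Both give the same order; the method reference just avoids one extra Python-level function call per item.

```python
items = ["banana", "Apple", "cherry"]  # hypothetical sample data

sorted_with_lambda = sorted(items, key=lambda s: s.lower())
sorted_with_method = sorted(items, key=str.lower)

print(sorted_with_method)
```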
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
Phlip, 07.01.2010 17:44:
On Jan 7, 5:36 am, Stefan Behnel wrote:
Well, then note that there are tons of ways to generate XML with Python,
including the one I pointed you to.
from lxml.html import builder as E
xml = E.foo()
All I want is "", but I get "
.
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
nothing is lost.
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
bible'}, {'id': 1, 'title': 'the \xc4hnlich'}]
The entry with the umlaut is the last item, but according to German
umlaut rules it should be the first item in the result.
Do I have to set anything with the locale module?
http://wiki.python.org/moin/HowTo/Sorting#Topicstobecovered
Stefan
--
http://mail.python.org/mailman/listinfo/python-list
heir code to Py3 by producing underqualified statements like
the above.
Stefan
--
http://mail.python.org/mailman/listinfo/python-list