Stefan Behnel added the comment:
I gave the implementation a try and attached an incomplete patch. Some tests
are failing.
It turns out that it's not entirely easy to do this. As Antoine noticed, the
hack in the C implementation of the TreeBuilder makes it tricky to integrate
with
Stefan Behnel added the comment:
FWIW, lxml.etree supports wildcards like '{*}tag' in searches, and this is
otherwise quite rarely a problem in practice.
I'm -1 on the proposed feature and wouldn't mind rejecting this altogether.
(At least change the title to someth
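For illustration, a minimal sketch of the wildcard search mentioned above (assumes a recent lxml version; document and tag names are made up):

    from lxml import etree

    root = etree.fromstring(b'<root xmlns="http://example.com/ns"><item/></root>')
    # '{*}item' matches an 'item' element in any namespace
    print(root.findall('{*}item'))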
Stefan Behnel added the comment:
Rejecting this ticket was the right thing to do. It's not a bug but a feature.
In Python 2.x, ElementTree returns any text content that can correctly be
represented as an ASCII encoded string in the native Py2.x string type (i.e.
'str'). Only no
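A minimal Py2.x session illustrating the behaviour described above (hypothetical document; assumes the ASCII optimisation as described):

    import xml.etree.ElementTree as ET

    root = ET.fromstring('<a><b>ascii only</b><c>non-ascii \xc3\xa4</c></a>')
    type(root[0].text)   # <type 'str'>     - ASCII-only text stays a native str
    type(root[1].text)   # <type 'unicode'> - non-ASCII text comes back as unicode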
Stefan Behnel added the comment:
There's also the QName class which can be used to split qualified tag names.
And it's pretty trivial to pre-process the entire tree by stripping all
namespaces from it if the intention is really to do namespace-agnostic processing.
However, in my experi
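A minimal sketch of that pre-processing idea, stripping the '{uri}' prefix from every tag in a parsed tree (the helper name is made up):

    import xml.etree.ElementTree as ET

    def strip_namespaces(root):
        for el in root.iter():
            if '}' in el.tag:
                el.tag = el.tag.split('}', 1)[1]   # drop the leading '{uri}'
        return root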
Stefan Behnel added the comment:
Just to reiterate this point, lxml.etree supports a "pretty_print" flag in its
tostring() function and ElementTree.write(). It would thus make sense to
support the same thing in ET.
http://lxml.de/api.html#serialisation
For completeness, the current
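For reference, a minimal example of the lxml flag mentioned above (assumes lxml is installed; the document is made up):

    from lxml import etree

    root = etree.fromstring('<root><child>text</child></root>')
    print(etree.tostring(root, pretty_print=True).decode())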
Stefan Behnel added the comment:
Please leave the title as it is now.
--
title: ElementTree gets awkward to use if there is an xmlns -> ElementTree --
provide a way to ignore namespace in tags and searches
Stefan Behnel added the comment:
As I already suggested for lxml, you can use the QName class to process
qualified names, e.g.
QName(some_element.tag).localname
Or even just
QName(some_element).localname
It appears that ElementTree doesn't support this. It lists the QName ty
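A minimal lxml sketch of the QName usage referred to above (assumes lxml; the element content is made up):

    from lxml import etree

    el = etree.fromstring('<root xmlns="http://example.com/ns"/>')
    print(etree.QName(el).localname)      # 'root'
    print(etree.QName(el.tag).namespace)  # 'http://example.com/ns'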
New submission from Stefan Behnel:
The get/set/delitem slicing protocol has replaced the old Py2.x
get/set/delslice protocol in Py3.x. This change introduces a substantial
overhead due to processing indices as Python objects rather than plain
Py_ssize_t values. This overhead should be reduced
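A small pure-Python illustration of the point: in Py3.x a single __getitem__ call receives a slice object whose indices arrive as Python objects rather than plain Py_ssize_t values (class name is made up):

    class Seq:
        def __getitem__(self, index):
            if isinstance(index, slice):
                # start/stop/step arrive boxed; indices() unboxes them against a length
                return index.indices(10)
            return index

    print(Seq()[2:8:2])   # (2, 8, 2)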
Stefan Behnel added the comment:
This tiny patch adds a fast-path to _PyEval_SliceIndex() that speeds up the
slicing-heavy "fannkuch" benchmark by 8% for me.
--
keywords: +patch
Added file: http://bugs.python.org/file31421/faster_PyEval_SliceI
Stefan Behnel added the comment:
Sorry, broken patch. Here's a new one (same results).
--
Added file: http://bugs.python.org/file31422/faster_PyEval_SliceIndex.patch
Stefan Behnel added the comment:
Another patch, originally proposed in issue10227 by Kristján Valur Jónsson. It
uses local variables instead of pointers in PySlice_GetIndicesEx(). I see no
performance difference whatsoever (Linux/gcc 4.7), but I also can't see a
reason not to do this. I
Stefan Behnel added the comment:
And in fact, callgrind shows an *increase* in the number of instructions
executed for the "use locals" patch (898M with vs. 846M without that patch when
running the fannkuch benchmark twice). That speaks against making that change
Stefan Behnel added the comment:
Here is another patch that remembers the Py_ssize_t slice indices if they are
known at instantiation time. It only makes a very small difference for the
"fannkuch" benchmark, so that's no reason to add both the complexity and the
(IMHO ig
Stefan Behnel added the comment:
Ok, so what are we going to do for the next alpha?
Stefan Behnel added the comment:
I was asking for the current implementation to be removed until we have a
working implementation that hurts neither the API nor the module design.
Stefan Behnel added the comment:
I don't think I understand what you mean.
In any case, it's not too late to remove the implementation. There was only one
alpha release so far that included it, so it can't really break any existing
code that relies on it. The longer we wait,
Stefan Behnel added the comment:
Could we please keep the discussion on rational terms?
It's not just the method names. The problem is that you are duplicating an
existing class (the XMLParser) for no good reason, instead of putting the
feature where it belongs: *behind* the XMLParser.
Stefan Behnel added the comment:
Given that it seems to be hard to come to a consensus in this ticket, I've
asked for removal of the code on python-dev.
http://mail.python.org/pipermail/python-dev/2013-August/128095.html
Stefan Behnel added the comment:
"""
1. Why have the "event builder" wrap a tree builder? Can't it just be a
separate target?
"""
You need a TreeBuilder in order to build the tree for the events.
If you want to use a different target than a TreeBu
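A minimal sketch of what such a wrapper could look like, using the standard target callback protocol (class and attribute names are made up, not the proposed API):

    import xml.etree.ElementTree as ET

    class EventCollector:
        def __init__(self, target=None):
            self._target = target or ET.TreeBuilder()
            self.events = []
        def start(self, tag, attrs):
            elem = self._target.start(tag, attrs)
            self.events.append(('start', elem))
            return elem
        def end(self, tag):
            elem = self._target.end(tag)
            self.events.append(('end', elem))
            return elem
        def data(self, data):
            self._target.data(data)
        def close(self):
            return self._target.close()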
Stefan Behnel added the comment:
I attached a patch that removes the IncrementalParser class and merges its
functionality into the _IterParseIterator. It thus retains most of the
refactoring without adding new functionality and/or APIs.
I did not check whether anything else from later
Stefan Behnel added the comment:
"""
TreeBuilder has to support an explicit API for collecting and reporting events.
XMLParser has to call into this API and either not have _setevents at all or
have something public and documented. Note also that event hookup in the parser
m
Stefan Behnel added the comment:
> I still think IncrementalParser is worth keeping.
If you want to keep it at all costs, I think we should at least hide it behind a
function (as with iterparse()). If it's implemented as a class, chances are
that people will start relying on inte
Stefan Behnel added the comment:
Actually, let me revise my previous comment. I think we should fake the new
interface for now by adding a TreeEventBuilder that requires having its own
TreeBuilder internally, instead of wrapping an arbitrary target. That way, we
can avoid having to clean up
Stefan Behnel added the comment:
> fully working patches will be considered
Let me remind you that it's not me who wants this feature so badly.
> As for faking the new API, I don't know if that's a good idea because we're
> not yet sure what that new API is.
I
Stefan Behnel added the comment:
> Also, even if the new approach is implemented in the next release,
IncrementalParser can stay as a simple synonym to
XMLParser(target=EventBuilder(...)).
No it can't. According to your signature, it accepts a parser instance as
input. So it can'
Stefan Behnel added the comment:
Hmm, did you look at my last comment at all? It solves both the technical
issues and the API issues very nicely and avoids any problems of potential
future changes. Let me quickly explain why.
The feature in question depends on two existing parts of the API
Stefan Behnel added the comment:
BTW, I also like how short and clean iterparse() becomes when you move this
feature into the parser. It's basically just a convenience function that does
read(), feed(), and yield-from. Plus the usual bit of boilerplate code,
obvi
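Roughly, the shape would be the following (a sketch only; the XMLPullParser/read_events names follow the API discussed later in this thread, and root-element bookkeeping is left out):

    def iterparse(source, events=None):
        parser = XMLPullParser(events)
        while True:
            data = source.read(65536)
            if not data:
                break
            parser.feed(data)
            yield from parser.read_events()
        parser.close()
        yield from parser.read_events()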
Stefan Behnel added the comment:
> iterparse's "parser" argument will be deprecated
No need to do that. Update the docs, yes, but otherwise keep the possibility to
improve the implementation later on, without going through a deprecation +
dedeprecation cycle. That would
Stefan Behnel added the comment:
I don't see adding one method to XMLParser as a design problem. In fact, it's
even a good design on the technical side, because if ET ever gains an
HTMLParser, then the implementation of this feature would be highly dependent
on the underlying parse
Stefan Behnel added the comment:
> ... instead require passing in a callback that accepts the target ...
That could be the parser class then, for example, except that there may be
other options to set as well. Plus, it would not actually allow iterparse to
wrap a user-provided target. So,
Stefan Behnel added the comment:
> it's really about turning XMLParser's "push" API for events (where the events
> are pushed into the target object by the parser calling the appropriate
> methods), into an iterparse style pull API where the events can be retrieve
Stefan Behnel added the comment:
> in the long run we want the new class to just be a convenience API for
> combining XMLParser and a custom target object, even if it can't be
> implemented that way right now.
Just to be clear: I changed my opinion on this one and I no longer
Stefan Behnel added the comment:
> XMLParser knows nothing about Elements, at least in the direct API of today.
> The one constructing Elements is the target.
Absolutely. And I'm 100% for keeping that distinction exactly as it is.
> The "read_events" method pr
Stefan Behnel added the comment:
Here is a proof-of-concept patch that integrates the functionality of the
IncrementalParser into the XMLParser. I ended up reusing most of Antoine's
implementation and test suite. In case he'll look back into this ticket at some
point, I'll put a
Stefan Behnel added the comment:
(I still wonder why I'm the one writing all the patches here when Eli is the
one who actually wants this feature ...)
Stefan Behnel added the comment:
BTW, maybe "read_events()" still isn't the ideal method name to put on a parser.
Stefan Behnel added the comment:
> Putting _setevents aside for the moment,
Agreed, obviously.
> XMLParser is a clean and simple API. Its output is only "push" (by calling
> callbacks on the target). It doesn't deal with Elements at all.
We already agreed on that,
Stefan Behnel added the comment:
> The whole point of the new API is not to replace XMLParser, but to provide a
> convenience API to set up a particular combination of an XMLParser with a
> particular kind of custom target.
Ok, but I'm saying that we don't need that. It
Stefan Behnel added the comment:
This is a bit tricky in ET because it generally allows you to stick anything
into the Element properties (and that's a feature). So catching this at tree
building time (as lxml.etree does) isn't really possible.
However, at least catching it in the
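A small illustration of the point (behaviour as described in this thread; the exact serialiser behaviour may differ between versions):

    import xml.etree.ElementTree as ET

    el = ET.Element('data')
    el.text = 'bad \x01 control char'   # accepted when building the tree
    ET.tostring(el)                     # may pass through unchecked; a conforming
                                        # XML parser then rejects the output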
Stefan Behnel added the comment:
Go for it. That's usually the fastest way to get things done.
Stefan Behnel added the comment:
Eli, I agree that we've put way more than enough time into the discussion by
now. We all know each other's arguments and failed to convince each other.
Please come up with working code that shows that the approach you are
advocating for
Stefan Behnel added the comment:
Michele, could you elaborate how you would exploit this issue as a security
risk?
I mean, I can easily create a (non-)XML-document with control characters
manually, and the parser would reject it.
What part of the create-to-serialise process exactly is a
Stefan Behnel added the comment:
> The parser is *not* rejecting control chars.
The parser *is* rejecting control characters. It's an XML parser. See the
example in the link you posted.
> assume you have a script that simply stores each message it receives (from
> stdin, fro
Stefan Behnel added the comment:
Or maybe even to "enhancement". The behaviour that it writes out what you give
it isn't exactly wrong, it's just inconvenient that you have to take care
yourself that you pass it wel
Stefan Behnel added the comment:
> the push API is inactive and gets redirected to a pull API
Given that this is aimed to become a redundant 'convenience' wrapper around
something else at some point, I assume that you are aware that the above is just
an arbitrary restriction due to
Stefan Behnel added the comment:
> I think the point here is clarifying whether xml expect text or just a byte
> string. In case that's a stream of byte, I agree with you, is more a
> "behaviour" problem.
XML is *defined* as a stream of bytes.
Regarding the API
Stefan Behnel added the comment:
>> XML is *defined* as a stream of bytes.
> Can you *paste* the *source* proving what you are arguing, please?
http://www.w3.org/TR/REC-xml/
> python3 works with ElementTree(bytes(unicode))
What does this s
Stefan Behnel added the comment:
We are talking about two different things here.
I said that (serialised) XML is defined as a sequence of bytes. Read the spec
on that.
What you are talking about is the Infoset, or the parsed/generated in-memory
XML tree. That's obviously not bytes,
Stefan Behnel added the comment:
Any comments regarding my naming suggestion?
Calling it a "push" parser is just too ambiguous.
Stefan Behnel added the comment:
Erm, "pull" parser, but you see what I mean.
Stefan Behnel added the comment:
Is that your actual use case? That you *want* to store binary data in XML,
instead of getting it properly rejected as non-well-formed content?
Then I suggest going the canonical route of passing it through base64 first, or
any of the other binary-to-characters
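A minimal sketch of the base64 route (the element name is made up):

    import base64
    import xml.etree.ElementTree as ET

    el = ET.Element('payload')
    el.text = base64.b64encode(b'\x00\x01\xff raw bytes').decode('ascii')
    xml_text = ET.tostring(el, encoding='unicode')            # well-formed XML
    original = base64.b64decode(ET.fromstring(xml_text).text)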
Stefan Behnel added the comment:
> As an advice I hope you do not take as insult, saying
> "in section {section} the spec says {argument}"
> is much more constructive than
> "read the spec on that", "{extremely_obvious_link}",
> at least to people
Stefan Krah added the comment:
I understand that you are building Python using "emerge". I would try to get a
release from python.org, do a normal build ...
make distclean
./configure --prefix=/tmp --with-pydebug --with-valgrind
make
make install
...
then install matplotlib using
Stefan Krah added the comment:
I would probably work on it (it's basically implemented in _testbuffer.c),
but I'm not sure if the NumPy community will actually use the feature.
Stefan Krah added the comment:
The request is certainly valid, but the patch is tricky to review.
Stefan Krah added the comment:
A similar issue was closed, see msg157249. The error looks deliberate
to me, so let's close this, too.
--
nosy: +skrah
resolution: -> works for me
status: open -> closed
Stefan Krah added the comment:
Martin, msg196534 shows that you are building with -DPy_LIMITED_API=1.
You can either use the limited API or --with-pydebug, but not both.
[As I said in the other issue, IMHO it is better to use a minimal set
of build options when reporting bugs
Stefan Krah added the comment:
I think I understand now: If you used the strategy from msg196520,
of course you get the Gentoo flags.
What you really should do is download a release or get a checkout
from hg.python.org and build that _without_ using "e
Stefan Krah added the comment:
Well, these look like Gentoo build flags. Did you or "emerge" or
anything else export CFLAGS in the shell?
Stefan Krah added the comment:
Wikipedia sounds good. Let's avoid linking directly to "free" versions. :)
Stefan Krah added the comment:
BTW, in _decimal the feature can already be enabled with:
./configure CFLAGS=-DEXTRA_FUNCTIONALITY
>>> IEEEContext(DECIMAL128)
Context(prec=34, rounding=ROUND_HALF_EVEN, Emin=-6143, Emax=6144, capitals=1,
clamp=1, flags=[], traps=[])
>>> IEE
New submission from Stefan Behnel:
The exception handling clauses in framework_find() are weird.
def framework_find(fn, executable_path=None, env=None):
"""
Find a framework using dyld semantics in a very loose manner.
Will take input such as:
Stefan Behnel added the comment:
changing title as it doesn't really look like a typo, more a "converto"
--
title: typo in Lib/ctypes/macholib/dyld.py -> invalid exception handling in
Lib/ctypes/macholib/dyld.py
New submission from Stefan Behnel:
diff --git a/performance/pystone.py b/performance/pystone.py
--- a/performance/pystone.py
+++ b/performance/pystone.py
@@ -59,9 +59,9 @@
 def main(loops=LOOPS):
     benchtime, stones = pystones(loops)
-    print "Pystone(%s) time for %d passes
Stefan Behnel added the comment:
I can well imagine that the serialiser is broken for this in Py2.x, given that
the API accepts byte strings and stores them as such. The fix might be as
simple as decoding byte strings in the serialiser before writing them out.
Involves a pretty high
Stefan Krah added the comment:
Is the distutils freeze still in place? If not, I'll commit initfunc2.patch.
Stefan Behnel added the comment:
> This would make it possible to layer XMLPullParser on top of the stock
> XMLParser coupled with a special target that collects "events" from the
> callback calls.
Given that we have an XMLPullParser now, I think we should not clutter
Stefan Behnel added the comment:
(fixing subject to properly hit bug filters)
--
title: Make ET event handling more modular to allow custom targets for the
non-blocking parser -> Make ElementTree event handling more modular to allow
custom targets for the non-blocking par
New submission from Stefan Schukat:
If os.spawnv or os.spawnve is called with an empty second argument, the process
terminates in release builds under Windows. This is simple to reproduce:
>>> import os
>>> nPath = os.path.join(os.environ["windir"], "notepa
Stefan Behnel added the comment:
While refactoring the iterparse() implementation in lxml to support this new
interface, I noticed that the close() method of the XMLPullParser does not
behave like the close() method of the XMLParser. Instead of setting some .root
attribute on the parser
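For comparison, the existing XMLParser behaviour (a minimal sketch; the XMLPullParser behaviour described above is as reported, not verified here):

    import xml.etree.ElementTree as ET

    parser = ET.XMLParser()
    parser.feed('<root><child/></root>')
    root = parser.close()   # XMLParser.close() returns the root Element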
Changes by Stefan Krah :
--
nosy: +skrah
Stefan Behnel added the comment:
Looks like we missed the alpha2 release for the close() API fix. I recommend
not letting yet another deadline go by.
New submission from Stefan Behnel:
The .close() method of the new XMLPullParser (see issue17741) in Py3.4 shows an
unnecessarily complicated behaviour that is inconsistent with the .close()
method of the existing XMLParser.
The attached patch removes some code to fix this
Changes by Stefan Behnel :
--
nosy: +eli.bendersky
Stefan Behnel added the comment:
Created separate issue18990 to keep this one closed as is.
Stefan Krah added the comment:
Hmm, I've managed to produce the error with 3.1:
$ python3.1
Fatal Python error: Py_Initialize: can't initialize sys standard streams
Traceback (most recent call last):
File "/usr/lib/python3.1/io.py", line 60, in
import _io
File
Stefan Krah added the comment:
The funny thing is, in 3.3 I can't reproduce it (i.e. I only get
the KeyboardInterrupt). So I'm not sure if this happens more often
now.
^CTraceback (most recent call last):
File "", line 989, in _find_and_load
File "", l
Stefan Krah added the comment:
The patch looks good to me.
Stefan Krah added the comment:
I'm not sure what to do. Martin's opinion was that the change should
be reverted:
http://mail.python.org/pipermail/python-dev/2012-March/117390.html
Stefan Behnel added the comment:
Could the
    while thread._count() > c:
        pass
in test_thread.py be changed to this? (as used in other places)
    while thread._count() > c:
        time.sleep(0.01)
It currently hangs in Cython because it doesn't free the GIL during
New submission from Stefan Krah :
The build --without-threads segfaults in run_tests.py:
http://www.python.org/dev/buildbot/all/builders/AMD64%20Fedora%20without%20threads%203.x/builds/2123/steps/test/logs/stdio
--
components: Tests
messages: 159496
nosy: skrah
priority: release
Stefan Krah added the comment:
This issue is now apparently causing a segfault:
(gdb) r ./Tools/scripts/run_tests.py -j 1 -u all -W --timeout=3600
Starting program: /home/stefan/pydev/cpython/python
./Tools/scripts/run_tests.py -j 1 -u all -W --timeout=3600
[...]
Program received signal
Stefan Krah added the comment:
Appears to be related to #14583.
--
keywords: +buildbot
resolution: -> duplicate
stage: needs patch -> committed/rejected
status: open -> closed
superseder: -> try/except import fails --without-threads
New submission from Stefan Krah :
Seen on the Gentoo buildbot:
http://www.python.org/dev/buildbot/all/builders/x86%20Gentoo%20Non-Debug%203.x/builds/2154/steps/test/logs/stdio

ERROR: test_format (test.test_bool.BoolTest
Changes by Stefan Krah :
--
type: crash -> behavior
Stefan Krah added the comment:
On another bot:
http://www.python.org/dev/buildbot/all/builders/x86%20Gentoo%203.x/builds/2205/steps/test/logs/stdio
[250/364] test_bool
python: Objects/unicodeobject.c:13501: formatlong: Assertion
`unicode_modifiable(result)' failed.
Fatal Python
Stefan Krah added the comment:
I think the issue is fixed in all affected branches. Georg, can we close it?
--
resolution: -> fixed
stage: -> committed/rejected
status: open -> pending
Changes by Stefan Krah :
--
nosy: +skrah
Stefan Krah added the comment:
Vinay's patch solves the problem. +1 for committing.
--
nosy: +skrah
Stefan Krah added the comment:
I tested the patch, it works fine. I can't test the popup situation
since I currently only have ssh access.
Stefan Krah added the comment:
The proposal makes sense at first glance, but I agree with Mark that it is
not clear what should be done. For example, all arrays in Python silently
convert to inf:
>>> from numpy import array
>>> x = array([1,2,3], 'f')
>>> x
Stefan Krah added the comment:
I agree. Fixing all this would probably require a PEP. It looks like the
original plan was to provide a facility to turn off the Overflow exception:
http://mail.python.org/pipermail/python-dev/2000-May/003990.html
New submission from stefan brunthaler :
The attached patch adds quickening based inline caching (INCA) to the CPython
3.3 interpreter. It uses a code generator to generate instruction derivatives
using the mako template engine, and a set of utility functions to enable
automatic and safe
Stefan Krah added the comment:
This looks quite impressive, so sorry for immediately jumping in with
criticism. -- I've benchmarked the things I worked on, and I can't see
any speedups but some significant slowdowns. This is on 64-bit Linux
with a Core 2 Duo, both versions compiled
stefan brunthaler added the comment:
> This looks quite impressive, so sorry for immediately jumping in with
> criticism. -- I've benchmarked the things I worked on, and I can't see
> any speedups but some significant slowdowns. This is on 64-bit Linux
> with a Cor
Stefan Krah added the comment:
> > Modules/_decimal/tests/bench.py:
> >
> >
> > Not much change for floats and decimal.py, 8-10% slowdown for _decimal!
>
> This result is not unexpected, as I have no inline cached versions of
>
Stefan Krah added the comment:
The sysconfig docs say: "configuration variables relevant for the
current platform"
If get_config_var('SIZEOF_VOID_P') is meaningless for universal builds,
then IMO it should return None. None would then mean either "
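A minimal sketch of how a caller could handle that None (the struct-based fallback is just one possibility):

    import struct
    import sysconfig

    size = sysconfig.get_config_var('SIZEOF_VOID_P')
    if size is None:                  # e.g. meaningless for universal builds
        size = struct.calcsize('P')   # pointer size of the running interpreter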
Stefan Krah added the comment:
The tests for arrays with suboffsets literally need sizeof(void *).
I don't think C guarantees SIZEOF_VOID_P == SIZEOF_SIZE_T. If HAVE_SSIZE_T
is defined in pyport.h, AFAICS no such check is made.
Of course these concerns may be entirely theore
stefan brunthaler added the comment:
This is the updated patch file that fixes the performance issues that were
measurable with the official perf.py py3k test suite.
--
Added file: http://bugs.python.org/file25541/20120511-inca.patch
Changes by stefan brunthaler :
Added file: http://bugs.python.org/file25542/20120511-inca-perf.txt