Is there any way to configure cElementTree to ignore the XML root
namespace? Default cElementTree (Python 2.6.4) appears to add the XML
root namespace URI to _every_ single tag. I know that I can strip
URIs manually from every tag, but that is a rather idiotic thing to do
(performance-wise).
I'm referring to xmlns/URI prefixes. Here's a code example:
from xml.etree.cElementTree import iterparse
from cStringIO import StringIO
xml = """<child xmlns="http://www.very_long_url.com"/>"""
for event, elem in iterparse(StringIO(xml)): print event, elem
The output is:
end <Element '{http://www.very_long_url.com}child' at 0x...>
> I think that's your main mistake: don't remove them. Instead, use the fully
> qualified names when comparing.
>
> Stefan
Yes. That's what I'm forced to do: pre-calculating tags like tagChild
= "{%s}child" % uri and using them instead of "child". As a result the
code looks ugly and there is extra boilerplate.
Here's a link to the patch exposing this parameter:
http://bugs.python.org/issue8583
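For what it's worth, here is a minimal sketch of stripping the namespace prefix in place during parsing (written in Python 3 ElementTree syntax; the thread itself uses Python 2's cElementTree, where the same `elem.tag` rewrite applies):

```python
import io
from xml.etree.ElementTree import iterparse

xml = b'<root xmlns="http://www.very_long_url.com"><child/></root>'

events = []
for event, elem in iterparse(io.BytesIO(xml)):
    # ElementTree prepends "{uri}" to every tag; strip it once, in place,
    # so later comparisons can use the bare local name.
    if elem.tag.startswith('{'):
        elem.tag = elem.tag.split('}', 1)[1]
    events.append((event, elem.tag))

# events is now [('end', 'child'), ('end', 'root')]
```

This still touches every tag once, so it doesn't remove the overhead the original poster is complaining about; it only confines the ugliness to one place.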
--
http://mail.python.org/mailman/listinfo/python-list
> Unless you have multiple namespaces or are working with defined schema
> or something, it's useless boilerplate.
>
> It'd be a nice feature if ElementTree could let users optionally
> ignore a namespace, unfortunately it doesn't have it.
Yep. Exactly my point. Here's a link to the patch addressing this:
http://bugs.python.org/issue8583
On May 2, 12:54 pm, Andreas Löscher wrote:
> Hi,
> I am looking for an easy-to-use parser. I want to get an overview
> of parsing, and want to try to get some information out of a C header
> file. Which parser would you recommend?
ANTLR
>
> > ANTLR
>
> I don't know if it's that easy to get started with though. The
> companion for-pay book is *most excellent*, but it seems to have been
> written to the detriment of the normal online docs.
>
> Cheers,
> Chris
> --http://blog.rebertia.com
IMO ANTLR is much easier to use compared to
Anybody knows if a python sparsehash module is there in the wild?
How can I create an empty object with dynamic attributes? It should be
something like:
>>> m = object()
>>> m.myattr = 1
But this doesn't work. And I have to resort to:
>>> class expando(object): pass
>>> m = expando()
>>> m.myattr = 1
Is there a one-liner that would do the thing?
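There are one-liners, as it turns out. A hedged sketch of two options (the second requires Python 3.3+, later than the Python 2.x used in this thread):

```python
import types

# One-liner: build an anonymous class and instantiate it in one expression.
m = type('expando', (), {})()
m.myattr = 1

# Python 3.3+ ships a ready-made class for exactly this purpose.
n = types.SimpleNamespace(myattr=1)
```

Plain `object()` instances fail because `object` has no `__dict__`; any subclass, however empty, gets one.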
-- Cheers, D
On Jun 3, 3:43 pm, "Emin.shopper Martinian.shopper"
wrote:
> Dear Experts,
>
> I am getting a MemoryError when creating a dict in a long running
> process and suspect this is due to memory fragmentation. Any
> suggestions would be welcome. Full details of the problem are below.
>
> I have a long running processing which eventually dies to a
> MemoryError exception. When it dies, it is using roughly 900 MB on a 4
> GB Windows XP machine running Python 2.5.4. If I do "import pdb;
BTW, have you tried the same code with Python 2.6.5?
-- Dmitry
I'm still unconvinced that it is a memory fragmentation problem. It's
very rare.
Can you give more concrete example that one can actually try to
execute? Like:
python -c "list([list([0]*xxx)+list([1]*xxx)+list([2]*xxx)+list([3]*xxx) for xxx in range(10)])" &
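A hypothetical sketch of the allocation pattern such a test would exercise, written out as a script rather than a one-liner: interleaved allocations with every other block freed, followed by one large request, which is where fragmentation failures would show up.

```python
# Allocate many mid-sized blocks, free every other one to leave holes,
# then make a large allocation after the holes appear.
data = [[0] * 1000 for _ in range(1000)]
del data[::2]          # free every other block
big = [0] * 10**6      # large allocation on the (possibly) fragmented heap
```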
-- Dmitry
> Why does it have to be a one-liner? Is the Enter key on your keyboard
> broken?
Nah. I was simply looking for something natural and intuitive, like:
m = object(); m.a = 1
Usually Python is pretty good at providing these natural and intuitive
solutions.
> You have a perfectly good solution: defining a class.
Right.
>>> m = lambda:expando
>>> m.myattr = 1
>>> print m.myattr
1
-- Cheers, Dmitry
On Jun 9, 7:31 pm, a...@pythoncraft.com (Aahz) wrote:
> dmtr wrote:
>
> >>>> m = lambda:expando
> >>>> m.myattr = 1
> >>>> print m.myattr
> >1
>
> That's a *great* technique if your goal is to confuse people.
I need to print the regexp pattern text (SRE_Pattern object ) for
debugging purposes; is there any way to do it gracefully? I've come up
with the following hack, but it is rather crude... Is there an
official way to get the regexp pattern text?
>>> import re, pickle
>>> r = re.compile('^abc$', re.I)
On Jun 17, 3:35 pm, MRAB wrote:
>
> >>> import re
> >>> r = re.compile('^abc$', re.I)
> >>> r.pattern
> '^abc$'
> >>> r.flags
> 2
Hey, thanks. It works.
Couldn't find it in the reference somehow.
And it's not in inspect.getmembers(r).
I must be doing something wrong.
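To summarize the answer: `pattern` and `flags` are plain attributes on the compiled object. One caveat worth hedging on: `flags` is a bit mask, and while it prints as 2 for `re.I` on the Python 2 used in this thread, later versions OR in additional bits, so testing a specific flag with `&` is more portable than comparing the raw number.

```python
import re

r = re.compile('^abc$', re.I)

# The source text survives compilation as an attribute.
assert r.pattern == '^abc$'

# flags is a bit mask; test the bit rather than the raw integer value.
assert r.flags & re.I
```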
-- Cheers, Dmitry
I'm running into some performance / memory bottlenecks on large lists.
Is there any easy way to minimize/optimize memory usage?
Simple str() and unicode() objects [Python 2.6.4/Linux/x86]:
>>> sys.getsizeof('')     24 bytes
>>> sys.getsizeof('0')    25 bytes
>>> sys.getsizeof(u'')    28 bytes
Steven, thank you for answering. See my comments inline. Perhaps I
should have formulated my question a bit differently: Are there any
*compact* high performance containers for unicode()/str() objects in
Python? By *compact* I don't mean compression. Just optimized for
memory usage, rather than performance.
> > Well... 63 bytes per item for very short unicode strings... Is there
> > any way to do better than that? Perhaps some compact unicode objects?
>
> There is a certain price you pay for having full-feature Python objects.
Are there any *compact* Python objects? Optimized for compactness?
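One hedged sketch of what "compact" can mean in the stdlib: array.array packs machine-width values into a single buffer instead of boxing each one as a full Python object, which is the closest thing to a compact built-in container (for numbers, at least; it doesn't help with the unicode keys asked about above).

```python
import sys
from array import array

n = 1000
as_list = list(range(n))         # n boxed int objects plus the list itself
as_array = array('i', range(n))  # n packed 4-byte C ints in one buffer

# Total footprint of the list is the list plus every boxed int it holds.
list_bytes = sys.getsizeof(as_list) + sum(sys.getsizeof(i) for i in as_list)
array_bytes = sys.getsizeof(as_array)
# The packed array comes out several times smaller than the boxed list.
```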
> Wha
On Aug 6, 10:56 pm, Michael Torrie wrote:
> On 08/06/2010 07:56 PM, dmtr wrote:
>
> > Ultimately a dict that can store ~20,000,000 entries: (u'short
> > string' : (int, int, int, int, int, int, int)).
>
> I think you really need a real database engine. With th
On Aug 6, 11:50 pm, Peter Otten <__pete...@web.de> wrote:
> I don't know to what extent it still applies, but switching off cyclic garbage
> collection with
>
> import gc
> gc.disable()
Haven't tried it on the real dataset. On the synthetic test it (and
sys.setcheckinterval(10)) gave a ~2% speedup.
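The pattern Peter suggests, sketched for a bulk dict build (Python 3 syntax here; the thread uses Python 2, where only the `range`/`str` spellings differ):

```python
import gc

gc.disable()               # skip cyclic-GC passes during the bulk build
d = {}
for i in range(100000):
    d[str(i)] = (i, i + 1, i + 2)
gc.enable()                # restore normal collection afterwards
```

Disabling the collector only skips cycle detection; reference counting still frees objects as usual, which is why the build itself stays safe.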
Correction. I've copy-pasted it wrong! array.array('i', (i, i+1, i+2,
i+3, i+4, i+5, i+6)) was the best.
>>> for i in xrange(0, 1000000): d[unicode(i)] = (i, i+1, i+2, i+3, i+4, i+5, i+6)
1000000 keys, ['VmPeak:\t 224704 kB', 'VmSize:\t 224704 kB'],
4.079240 seconds, 245143.698209 keys per second
> Looking at your benchmark, random.choice(letters) has probably less overhead
> than letters[random.randint(...)]. You might even try to inline it as
Right... random.choice()... I'm a bit new to Python, always something
to learn. But anyway, in that benchmark (from http://bugs.python.org/issue952
I guess with the actual dataset I'll be able to improve the memory
usage a bit, with BioPython::trie. That would probably be enough
optimization to continue working with some comfort. On this test code
BioPython::trie gives a bit of improvement in terms of memory. Not
much though...
>>> d = dict()