Re: [BangPypers] parsing xml

2011-09-30 Thread Dhananjay Nene
On Fri, Jul 29, 2011 at 10:47 AM, Anand Chitipothu wrote: > 2011/7/28 Venkatraman S : > > parsing using minidom is one of the slowest. if you just want to extract > the > > distance and assuming that it(the tag) will always be consistent, then i > > would always suggest regexp. xml parsing is a pa

Re: [BangPypers] parsing xml

2011-08-01 Thread Dhananjay Nene
On Mon, Aug 1, 2011 at 7:51 PM, Noufal Ibrahim wrote: > Anand Balachandran Pillai writes: > >> On Mon, Aug 1, 2011 at 6:08 AM, Anand Chitipothu wrote: > > [...] > >> It is subtler than that. >> >> List comprehensions are faster than map functions when >> the latter needs to invoke a user-def

Re: [BangPypers] parsing xml

2011-08-01 Thread Noufal Ibrahim
Anand Balachandran Pillai writes: > On Mon, Aug 1, 2011 at 6:08 AM, Anand Chitipothu wrote: [...] > It is subtler than that. > > List comprehensions are faster than map functions when > the latter needs to invoke a user-defined function call or a lambda. > > Maps score over list comprehens
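The map versus list comprehension claim is easy to measure rather than argue about. The following is only a rough sketch, assuming Python 3.5 or later (where map is lazy, so it is wrapped in list() to force the work). The helper double, the DATA list and the bench function are made up for illustration, and the timings will differ between machines and interpreter versions, so no result is claimed here.

```python
import timeit

def double(x):
    """Hypothetical user-defined function used for the comparison."""
    return x * 2

DATA = list(range(1000))

# Statement strings to time; each builds a full list from DATA.
CASES = {
    "listcomp, inline expression": "[x * 2 for x in DATA]",
    "map + lambda": "list(map(lambda x: x * 2, DATA))",
    "listcomp, calling double()": "[double(x) for x in DATA]",
    "map + double": "list(map(double, DATA))",
    "listcomp, calling builtin str": "[str(x) for x in DATA]",
    "map + builtin str": "list(map(str, DATA))",
}

def bench(stmt, number=200):
    """Return the best of three timings for `stmt`, in seconds."""
    namespace = {"DATA": DATA, "double": double}
    return min(timeit.repeat(stmt, globals=namespace, repeat=3, number=number))

results = {name: bench(stmt) for name, stmt in CASES.items()}
for name, seconds in results.items():
    print("%-32s %.4f s" % (name, seconds))
```

Taking the minimum of several repeats is the usual way to reduce noise from other processes on the machine.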

Re: [BangPypers] parsing xml

2011-08-01 Thread Dhananjay Nene
On Mon, Aug 1, 2011 at 12:46 AM, Noufal Ibrahim wrote: > Venkatraman S writes: > > >> Hang around in #django or #python. The most elegant code that you >> *should* write would invariably be pretty fast (am not ref to asm). > > I agree with you here. Pythonicity is best defined as what the > exper

Re: [BangPypers] parsing xml

2011-08-01 Thread Smrutilekha Swain
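By using lxml, for example:

    from lxml import etree

    # Listen only for 'end' events: by then the element's text has been read.
    # With both 'start' and 'end', every match would be printed twice.
    content = etree.iterparse(*name of the xml file*, events=('end',))
    for event, elem in content:
        if elem.tag == 'distance':
            print(elem.text)

Hope it will work.. On Mon, Aug 1, 2011 at 1:43 PM,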

Re: [BangPypers] parsing xml

2011-08-01 Thread Anand Balachandran Pillai
On Mon, Aug 1, 2011 at 1:25 PM, Kiran Jonnalagadda wrote: > On 31-Jul-2011, at 11:33 PM, Venkatraman S wrote: > > > A regex is the simplest IMHO, because you need not know the syntax of the > > minidom parser. > > But, again i have seen this quite often that lack of knowledge of regexp > has > >

Re: [BangPypers] parsing xml

2011-08-01 Thread Kiran Jonnalagadda
On 31-Jul-2011, at 11:33 PM, Venkatraman S wrote: > A regex is the simplest IMHO, because you need not know the syntax of the > minidom parser. > But, again i have seen this quite often that lack of knowledge of regexp has > led people to other solutions (the grapes are sour!) In the eternal word

Re: [BangPypers] parsing xml

2011-08-01 Thread Dhananjay Nene
On Mon, Aug 1, 2011 at 12:43 AM, Noufal Ibrahim wrote: > Dhananjay Nene writes: > > > [...] > > > re.search("\s*(\d+)\s*",data).group(1) > > > > would appear to be the most succinct and quite fast. Adjust for > whitespace > > as and if necessary. > > Whitespace (including newlines), mixed cases

Re: [BangPypers] parsing xml

2011-07-31 Thread Kenneth Gonsalves
On Sun, 2011-07-31 at 19:57 +0530, Anand Balachandran Pillai wrote: > > xml parsing in the case when all that you need from the string is a > simple > > numeric value(not a string), then good luck; unlike esr i will not > use > > adjectives; but i would not use your code either. > > > > To be fair

Re: [BangPypers] parsing xml

2011-07-31 Thread Anand Balachandran Pillai
On Mon, Aug 1, 2011 at 6:08 AM, Anand Chitipothu wrote: > > Hang around in #django or #python. The most elegant code that you > *should* > > write would invariably be pretty fast (am not ref to asm). > > That doesn't mean that any code that is faster is elegant. > > IIRC, in python, map function r

Re: [BangPypers] parsing xml

2011-07-31 Thread Anand Chitipothu
> Hang around in #django or #python. The most elegant code that you *should* > write would invariably be pretty fast (am not ref to asm). That doesn't mean that any code that is faster is elegant. IIRC, in python, map function runs slightly faster than list comprehensions, but list comprehensions

Re: [BangPypers] parsing xml

2011-07-31 Thread Noufal Ibrahim
Venkatraman S writes: [...] > A regex is the simplest IMHO, because you need not know the syntax of the > minidom parser. Oh come on. This sounds like doing it the wrong way because you're not going to spend time reading the docs and then using performance as a cover for the laziness. [...]

Re: [BangPypers] parsing xml

2011-07-31 Thread Noufal Ibrahim
Dhananjay Nene writes: [...] > re.search("\s*(\d+)\s*",data).group(1) > > would appear to be the most succinct and quite fast. Adjust for whitespace > as and if necessary. Whitespace (including newlines), mixed cases etc. [...] > As far as optimisation goes - I can see at least 3 options >

Re: [BangPypers] parsing xml

2011-07-31 Thread Venkatraman S
On Sun, Jul 31, 2011 at 10:58 PM, Dhananjay Nene wrote: > a. the minidom performance is acceptable - no further optimisation required > b. minidom performance is not acceptable - try the regex one > c. python library performance is not acceptable - switch to 'c' > > I can imagine people starting w

Re: [BangPypers] parsing xml

2011-07-31 Thread Dhananjay Nene
On Thu, Jul 28, 2011 at 3:18 PM, Kenneth Gonsalves wrote: > hi, > > here is a simplified version of an xml file: > > > > > >CloudMade > >http://maps.cloudmade.com";> > > >h

Re: [BangPypers] parsing xml

2011-07-31 Thread Noufal Ibrahim
Anand Balachandran Pillai writes: > On Fri, Jul 29, 2011 at 4:41 PM, Venkatraman S wrote: [...] > To be fair here, I think what he is saying is that Kenneth's problem > (getting at the particular value) can be solved by using an aptly > written regular expression which might be the fastest - n

Re: [BangPypers] parsing xml

2011-07-31 Thread Anand Balachandran Pillai
On Fri, Jul 29, 2011 at 4:41 PM, Venkatraman S wrote: > Noufal, > > I have nothing more to say than this(as i see some tangential replies which > i am not interested in substantiating - for eg, i never suggested to use a > regexp based parser - a regexp based xml parser is different from using 'a

Re: [BangPypers] parsing xml

2011-07-29 Thread Noufal Ibrahim
Venkatraman S writes: [...] > Read my replies properly. Read my assumptions properly w.r.t the xml > structure and the requested value in the xml. Read the link that you > have pasted again. If possible, read the comments in the link > shared(from esr) again. Once done, think twice and tell me

Re: [BangPypers] parsing xml

2011-07-29 Thread Venkatraman S
Noufal, I have nothing more to say than this(as i see some tangential replies which i am not interested in substantiating - for eg, i never suggested to use a regexp based parser - a regexp based xml parser is different from using 'a' regexp on a string!) : Read my replies properly. Read my a

Re: [BangPypers] parsing xml

2011-07-29 Thread Sidu Ponnappa
> Along the same lines...the problems are more when xml is concerned, for even > if some other tag is malformed, then the > whole document is 'gone'. Well, then - the API is broken and is basically violating the TOS for the API (which I would at a minimum expect to return valid output or the approp

Re: [BangPypers] parsing xml

2011-07-29 Thread Gora Mohanty
On Fri, Jul 29, 2011 at 10:55 AM, Baishampayan Ghose wrote: >> minidom is the fastest solution if you consider the programmer time >> instead of developer time.  Minidom is available in standard library, >> you don't have to add another dependency and worry about PyPI >> downtimes and lxml compila

Re: [BangPypers] parsing xml

2011-07-29 Thread Baishampayan Ghose
On Fri, Jul 29, 2011 at 1:09 PM, Venkatraman S wrote: > IMHO, regexps are much more powerful and fault tolerant than XML parsing. > XMLs are brittle. Did you mean parsing XML using Regular Expressions is "more powerful and fault tolerant" than using a XML parser? Regards, BG -- Baishampayan Gh

Re: [BangPypers] parsing xml

2011-07-29 Thread Noufal Ibrahim
Venkatraman S writes: > On Fri, Jul 29, 2011 at 12:20 PM, Noufal Ibrahim wrote: > >> I agree and I try my best to do the same thing. However, I differentiate >> between micro optimisations like rewriting parts in C and XML and top >> level optimisations like good design and the right data structur

Re: [BangPypers] parsing xml

2011-07-29 Thread Venkatraman S
On Fri, Jul 29, 2011 at 12:20 PM, Noufal Ibrahim wrote: > I agree and I try my best to do the same thing. However, I differentiate > between micro optimisations like rewriting parts in C and XML and top > level optimisations like good design and the right data structures. > > Using regexp is micro

Re: [BangPypers] parsing xml

2011-07-29 Thread Umar Shah
+1 On Fri, Jul 29, 2011 at 12:50 PM, Sidu Ponnappa wrote: > +1. > > On Fri, Jul 29, 2011 at 12:20 PM, Noufal Ibrahim wrote: > > Venkatraman S writes: > > > >> On Fri, Jul 29, 2011 at 11:31 AM, Noufal Ibrahim > wrote: > >> > >>> > I am a speed-maniac and crave for speed; so if the assumption is

Re: [BangPypers] parsing xml

2011-07-29 Thread Sidu Ponnappa
+1. On Fri, Jul 29, 2011 at 12:20 PM, Noufal Ibrahim wrote: > Venkatraman S writes: > >> On Fri, Jul 29, 2011 at 11:31 AM, Noufal Ibrahim wrote: >> >>> > I am a speed-maniac and crave for speed; so if the assumption is >>> > valid, i can vouch for the fact that regexp would be faster and neater

Re: [BangPypers] parsing xml

2011-07-28 Thread Noufal Ibrahim
Venkatraman S writes: > On Fri, Jul 29, 2011 at 11:31 AM, Noufal Ibrahim wrote: > >> > I am a speed-maniac and crave for speed; so if the assumption is >> > valid, i can vouch for the fact that regexp would be faster and neater >> > solution. I have done some speed experiments in past on this (r

Re: [BangPypers] parsing xml

2011-07-28 Thread Venkatraman S
On Fri, Jul 29, 2011 at 11:31 AM, Noufal Ibrahim wrote: > > I am a speed-maniac and crave for speed; so if the assumption is > > valid, i can vouch for the fact that regexp would be faster and neater > > solution. I have done some speed experiments in past on this (results > > of which i do not h

Re: [BangPypers] parsing xml

2011-07-28 Thread Venkatraman S
On Fri, Jul 29, 2011 at 11:44 AM, Noufal Ibrahim wrote: > > And I'm telling you that even a slight change to the tag - an extra > space, a newline, a new attribute, a change in case or any such thing > which doesn't modify its meaning as far as the XML snippet is concerned > will break your rege

Re: [BangPypers] parsing xml

2011-07-28 Thread Noufal Ibrahim
Venkatraman S writes: [...] > Sigh! Again, guys, i am referring to regexp when all you need is some > number within a tag! If the content of that tag was text, i would > have never suggested this solution. [...] And I'm telling you that even a slight change to the tag - an extra space, a newli
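The fragility argument can be illustrated with a small sketch. The strict pattern below and the three documents in VARIANTS are assumptions made up for this example, not the exact regex or data from the thread. Each variant is well-formed XML carrying the same distance value, yet only the first matches the fixed pattern, while the standard-library ElementTree parser reads all three.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical strict pattern: assumes the tag is always written exactly like this.
PATTERN = re.compile(r"<distance>(\d+)</distance>")

# Three equivalent ways a producer might serialise the same value (made-up data).
VARIANTS = {
    "plain": "<gpx><distance>1489</distance></gpx>",
    "attribute": '<gpx><distance unit="m">1489</distance></gpx>',
    "whitespace": "<gpx><distance >\n  1489\n</distance></gpx>",
}

def by_regex(xml_text):
    """Return the distance as an int, or None if the pattern does not match."""
    match = PATTERN.search(xml_text)
    return int(match.group(1)) if match else None

def by_parser(xml_text):
    """Return the distance using a real XML parser."""
    root = ET.fromstring(xml_text)
    # int() tolerates the surrounding whitespace in the element text.
    return int(root.findtext("distance"))

for name, doc in VARIANTS.items():
    print(name, by_regex(doc), by_parser(doc))
```

A looser regex can of course be written to absorb these cases, but each new variation means another adjustment to the pattern, which is the point being made in the message above.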

Re: [BangPypers] parsing xml

2011-07-28 Thread Venkatraman S
On Fri, Jul 29, 2011 at 11:31 AM, Noufal Ibrahim wrote: > > > Well, i have clearly mentioned my assumptions - i.e, when you treat > > the XML as a 'string' and do not want to retrieve anything else in a > > 'structured manner'. > > > If the data is structured, it makes sense to exploit that struc

Re: [BangPypers] parsing xml

2011-07-28 Thread Venkatraman S
On Fri, Jul 29, 2011 at 11:15 AM, Anand Chitipothu wrote: > 2011/7/29 Venkatraman S : > > On Fri, Jul 29, 2011 at 10:47 AM, Anand Chitipothu >wrote: > > > >> 2011/7/28 Venkatraman S : > >> > parsing using minidom is one of the slowest. if you just want to > extract > >> the > >> > distance and as

Re: [BangPypers] parsing xml

2011-07-28 Thread Noufal Ibrahim
Venkatraman S writes: [...] > Well, i have clearly mentioned my assumptions - i.e, when you treat > the XML as a 'string' and do not want to retrieve anything else in a > 'structured manner'. If the data is structured, it makes sense to exploit that structure and use a proper solution. > I a

Re: [BangPypers] parsing xml

2011-07-28 Thread Anand Chitipothu
2011/7/29 Baishampayan Ghose : >> minidom is the fastest solution if you consider the programmer time >> instead of developer time.  Minidom is available in standard library, >> you don't have to add another dependency and worry about PyPI >> downtimes and lxml compilations failures. > > FWIW, Elem

Re: [BangPypers] parsing xml

2011-07-28 Thread Anand Chitipothu
2011/7/29 Venkatraman S : > On Fri, Jul 29, 2011 at 10:47 AM, Anand Chitipothu > wrote: > >> 2011/7/28 Venkatraman S : >> > parsing using minidom is one of the slowest. if you just want to extract >> the >> > distance and assuming that it(the tag) will always be consistent, then i >> > would alway

Re: [BangPypers] parsing xml

2011-07-28 Thread Venkatraman S
On Fri, Jul 29, 2011 at 10:47 AM, Anand Chitipothu wrote: > 2011/7/28 Venkatraman S : > > parsing using minidom is one of the slowest. if you just want to extract > the > > distance and assuming that it(the tag) will always be consistent, then i > > would always suggest regexp. xml parsing is a pa

Re: [BangPypers] parsing xml

2011-07-28 Thread Baishampayan Ghose
> minidom is the fastest solution if you consider the programmer time > instead of developer time.  Minidom is available in standard library, > you don't have to add another dependency and worry about PyPI > downtimes and lxml compilations failures. FWIW, ElementTree is a part of the standard libr
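For reference, a minimal ElementTree sketch using only the standard library. The SAMPLE string is a cut-down, hypothetical stand-in for the file from the original question (which asks for the distance value 1489); the real file has more elements and may declare a namespace, in which case the tag name would need the namespace prefix.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for the XML in the original question.
SAMPLE = """<gpx>
  <extensions>
    <distance>1489</distance>
    <time>343</time>
  </extensions>
</gpx>"""

def distance_with_elementtree(xml_text):
    """Return the text of the first <distance> element as an int."""
    root = ET.fromstring(xml_text)
    # iter() walks the whole tree, so the nesting depth does not matter.
    elem = next(root.iter("distance"))
    return int(elem.text)

print(distance_with_elementtree(SAMPLE))
```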

Re: [BangPypers] parsing xml

2011-07-28 Thread Anand Chitipothu
2011/7/28 Venkatraman S : > parsing using minidom is one of the slowest. if you just want to extract the > distance and assuming that it(the tag) will always be consistent, then i > would always suggest regexp. xml parsing is a pain. regexp is a bad solution to parse xml. minidom is the fastest s

Re: [BangPypers] parsing xml

2011-07-28 Thread Joseph Gladson
Hi, check and try pyparsing module... U could do it so simple:) regards, joseph On 7/29/11, Ramdas S wrote: > On Fri, Jul 29, 2011 at 1:23 AM, Gora Mohanty wrote: > >> On Thu, Jul 28, 2011 at 10:37 PM, Venkatraman S >> wrote: >> > parsing using minidom is one of the slowest. if you just w

Re: [BangPypers] parsing xml

2011-07-28 Thread Ramdas S
On Fri, Jul 29, 2011 at 1:23 AM, Gora Mohanty wrote: > On Thu, Jul 28, 2011 at 10:37 PM, Venkatraman S > wrote: > > parsing using minidom is one of the slowest. if you just want to extract > the > > distance and assuming that it(the tag) will always be consistent, then i > > would always suggest

Re: [BangPypers] parsing xml

2011-07-28 Thread Gora Mohanty
On Thu, Jul 28, 2011 at 10:37 PM, Venkatraman S wrote: > parsing using minidom is one of the slowest. if you just want to extract the > distance and assuming that it(the tag) will always be consistent, then i > would always suggest regexp. xml parsing is a pain. [...] Strongly disagree. IMHO, reg

Re: [BangPypers] parsing xml

2011-07-28 Thread Venkatraman S
parsing using minidom is one of the slowest. if you just want to extract the distance and assuming that it(the tag) will always be consistent, then i would always suggest regexp. xml parsing is a pain. ___ BangPypers mailing list BangPypers@python.org htt

Re: [BangPypers] parsing xml

2011-07-28 Thread Sidu Ponnappa
If you're doing this repeatedly, you may want to just delegate to a native XPath implementation. I haven't done much Python, so I can't comment on your choices, but in Ruby I'd simply hand off to libXML using Nokogiri. This approach should be a whole lot faster, but I'd advise benchmarking first be
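On the benchmarking advice: a rough sketch of how the candidate approaches discussed in this thread could be timed against each other with timeit. Everything here is an assumption for illustration: the SAMPLE document is a made-up miniature, the function names are hypothetical, and the regex is one possible pattern rather than the one posted. Results on a real file of realistic size may rank the approaches differently, so no numbers are claimed.

```python
import re
import timeit
import xml.etree.ElementTree as ET
from xml.dom import minidom

# Hypothetical miniature document; benchmark with real data before deciding.
SAMPLE = """<gpx>
  <extensions>
    <distance>1489</distance>
  </extensions>
</gpx>"""

DISTANCE_RE = re.compile(r"<distance>\s*(\d+)\s*</distance>")

def with_minidom(text):
    dom = minidom.parseString(text)
    return int(dom.getElementsByTagName("distance")[0].firstChild.nodeValue)

def with_elementtree(text):
    return int(next(ET.fromstring(text).iter("distance")).text)

def with_regex(text):
    return int(DISTANCE_RE.search(text).group(1))

CANDIDATES = [with_minidom, with_elementtree, with_regex]

def benchmark(func, number=1000):
    """Best of three runs of `number` calls each, in seconds."""
    return min(timeit.repeat(lambda: func(SAMPLE), repeat=3, number=number))

for func in CANDIDATES:
    # All candidates must agree on the answer before speed is compared.
    assert func(SAMPLE) == 1489
    print("%-18s %.4f s" % (func.__name__, benchmark(func)))
```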

Re: [BangPypers] parsing xml

2011-07-28 Thread kracekumar ramaraju
You can try beautifulsoup, recommended for python/XML Parsing.

Re: [BangPypers] parsing xml

2011-07-28 Thread Kenneth Gonsalves
On Thu, 2011-07-28 at 15:33 +0530, Anand Chitipothu wrote: > > I want to get the value of the distance element - 1489. What is the > > simplest way of doing this? > > >>> from xml.dom import minidom > >>> dom = minidom.parseString(x) > >>> dom.getElementsByTagName("distance")[0].childNodes[0].node
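The minidom approach quoted above can be written out as a self-contained sketch. The SAMPLE string is a hypothetical, cut-down stand-in for the file in the original question, and the function name is made up for illustration.

```python
from xml.dom import minidom

# Hypothetical, simplified stand-in for the XML in the original question.
SAMPLE = """<gpx>
  <extensions>
    <distance>1489</distance>
  </extensions>
</gpx>"""

def distance_with_minidom(xml_text):
    """Return the text of the first <distance> element as an int."""
    dom = minidom.parseString(xml_text)
    node = dom.getElementsByTagName("distance")[0]
    # firstChild is the text node inside <distance>...</distance>.
    return int(node.firstChild.nodeValue.strip())

print(distance_with_minidom(SAMPLE))
```

If the document has no distance element, the [0] index raises IndexError, which is arguably a clearer failure than a silent wrong match.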

Re: [BangPypers] parsing xml

2011-07-28 Thread Baishampayan Ghose
> here is a simplified version of an xml file: > > >     >         >                 >                CloudMade >                 >                http://maps.cloudmade.com";> >                 >                 >                http://cloudmade.com/faq#license >                 >                2

Re: [BangPypers] parsing xml

2011-07-28 Thread Anand Chitipothu
2011/7/28 Kenneth Gonsalves : > hi, > > here is a simplified version of an xml file: > > >     >         >                 >                CloudMade >                 >                http://maps.cloudmade.com";> >                 >                 >                http://cloudmade.com/faq#licens

Re: [BangPypers] parsing xml

2011-07-28 Thread Ramdas S
On Thu, Jul 28, 2011 at 3:20 PM, Venkatraman S wrote: > grep or regexp? > > -V > can write an Xml parsing query -- Ramdas S +91 9342 583 065

Re: [BangPypers] parsing xml

2011-07-28 Thread hemant
Using xpath such as: /gpx/extensions/distance(:text) ? On Thu, Jul 28, 2011 at 3:20 PM, Venkatraman S wrote: > grep or regexp? > > -V
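A path lookup along these lines is possible with the standard library, with some caveats. ElementTree supports only a limited XPath subset, and paths are relative to the element they are called on, so from the gpx root the path becomes extensions/distance. The sketch below is an assumption-laden illustration: both sample strings and the namespace URI http://example.com/gpx are made up, and the ns_uri parameter is a hypothetical convenience for files that declare a default namespace.

```python
import xml.etree.ElementTree as ET

# Hypothetical samples; the namespace URI is invented for illustration.
PLAIN = "<gpx><extensions><distance>1489</distance></extensions></gpx>"
NAMESPACED = (
    '<gpx xmlns="http://example.com/gpx">'
    "<extensions><distance>1489</distance></extensions></gpx>"
)

def distance_by_path(xml_text, ns_uri=None):
    """Look up gpx/extensions/distance relative to the root element."""
    root = ET.fromstring(xml_text)  # root is the <gpx> element itself
    if ns_uri is None:
        path, namespaces = "extensions/distance", None
    else:
        # Map a local prefix "g" onto the document's default namespace.
        path, namespaces = "g:extensions/g:distance", {"g": ns_uri}
    text = root.findtext(path, namespaces=namespaces)
    if text is None:
        raise LookupError("no distance element at " + path)
    return int(text)

print(distance_by_path(PLAIN))
print(distance_by_path(NAMESPACED, ns_uri="http://example.com/gpx"))
```

For full XPath 1.0 expressions, a library such as lxml would be needed instead.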

Re: [BangPypers] parsing xml

2011-07-28 Thread Venkatraman S
grep or regexp? -V