On Tue, Sep 24, 2013 at 6:11 PM, Dhananjay Nene <dhananjay.n...@gmail.com> wrote:
> On Tue, Sep 24, 2013 at 6:04 PM, Dhananjay Nene
> <dhananjay.n...@gmail.com> wrote:
>> On Tue, Sep 24, 2013 at 5:48 PM, Vineet Naik <naik...@gmail.com> wrote:
>>> Hi,
>>>
>>> On Tue, Sep 24, 2013 at 10:38 AM, bab mis <bab...@outlook.com> wrote:
>>>
>>>> Hi, is there any XML parser which gives the same kind of data structure
>>>> as the YAML parser gives in Python? I tried xml.dom.minidom but it does
>>>> not give a proper data structure; every time I need to read element by
>>>> element and create the dict.
>>>>
>>>
>>> You can try xmltodict[1]. It also retains the node attributes and makes
>>> them accessible using the '@' prefix (see the example in the README of
>>> the repo).
>>>
>>> [1]: https://github.com/martinblech/xmltodict
>>
>> Being curious, I immediately took a look and tried the following:
>>
>> import xmltodict
>>
>> doc1 = xmltodict.parse("""
>> <mydocument has="an attribute">
>>   <and>
>>     <many>elements</many>
>>     <many>more elements</many>
>>   </and>
>>   <plus a="complex">
>>     element as well
>>   </plus>
>> </mydocument>
>> """)
>>
>> doc2 = xmltodict.parse("""
>> <mydocument has="an attribute">
>>   <and>
>>     <many>more elements</many>
>>   </and>
>>   <plus a="complex">
>>     element as well
>>   </plus>
>> </mydocument>
>> """)
>> print(doc1['mydocument']['and'])
>> print(doc2['mydocument']['and'])
>>
>> The output was:
>> OrderedDict([(u'many', [u'elements', u'more elements'])])
>> OrderedDict([(u'many', u'more elements')])
>>
>> The only difference is that doc2 has only one "many" node inside the
>> "and" node. Do you see an issue here? (At least I do.) The output
>> structure is a function of the cardinality of the inner nodes: it
>> changes shape from a list of many elements to, not a list of one
>> element, but just the bare element (throwing away the list). That
>> makes things rather unpredictable, since you cannot tell upfront
>> whether a single node inside a parent node is what the XML schema
>> mandates or just happens to be the case in that particular instance.
>>
>> I do think the problem is tractable so long as one clearly documents
>> the specific constraints which the underlying XML will satisfy,
>> constraints which make transformations to lists or dicts safe.
>> Trying to make it easy without clearly documenting the constraints
>> could lead to violations of the principle of least surprise like
>> the one above.
>>
> It gets even more interesting, e.g. below:
>
> doc3 = xmltodict.parse("""
> <mydocument has="an attribute">
>   <and>
>     <many>elements</many>
>   </and>
>   <plus a="complex">
>     element as well
>   </plus>
>   <and>
>     <many>more elements</many>
>   </and>
> </mydocument>
> """)
>
> print(doc3['mydocument']['and'])
>
> leads to the output:
>
> [OrderedDict([(u'many', u'elements')]), OrderedDict([(u'many', u'more elements')])]
>
> Definitely not what would be naively expected.

Correction:

print(doc3['mydocument'])

prints

OrderedDict([(u'@has', u'an attribute'), (u'and', [OrderedDict([(u'many', u'elements')]), OrderedDict([(u'many', u'more elements')])]), (u'plus', OrderedDict([(u'@a', u'complex'), ('#text', u'element as well')]))])

which just trashed the ordering: the document has an "and" followed by a "plus" followed by another "and", but both "and" nodes end up grouped under a single key.

Dhananjay
--
http://blog.dhananjaynene.com
twitter: @dnene
google plus: http://gplus.to/dhananjaynene
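
For comparison, here is a minimal sketch of a conversion whose shape does not depend on how many children a node has, and which keeps siblings in document order. It uses only the standard library's xml.etree.ElementTree; the helper names element_to_dict and parse_xml, and the "tag"/"attrs"/"text"/"children" layout, are assumptions made up for this illustration and are not part of xmltodict. Text that follows a child element (mixed content) is ignored here.

```python
import xml.etree.ElementTree as ET


def element_to_dict(elem):
    """Convert an Element into a dict whose shape never depends on child count."""
    return {
        "tag": elem.tag,
        "attrs": dict(elem.attrib),
        "text": (elem.text or "").strip(),
        # Always a list, in document order, even for zero or one child.
        "children": [element_to_dict(child) for child in elem],
    }


def parse_xml(source):
    """Parse an XML string into the nested structure built by element_to_dict."""
    return element_to_dict(ET.fromstring(source))


doc3 = parse_xml("""
<mydocument has="an attribute">
  <and>
    <many>elements</many>
  </and>
  <plus a="complex">
    element as well
  </plus>
  <and>
    <many>more elements</many>
  </and>
</mydocument>
""")

# Sibling order is kept as written in the document.
print([child["tag"] for child in doc3["children"]])
# A lone <many> child is still wrapped in a list.
print(doc3["children"][0]["children"])
```

The trade-off is that access becomes more verbose than doc['mydocument']['and'], but the caller never has to check whether a value is a list or a single item.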