On Saturday, 10 May 2014 22:10:14 UTC+10, Peter Otten wrote:
> flebber wrote:
>
> > I am using xmltodict.
> >
> > This is how I have accessed and loaded my file.
> >
> > import xmltodict
> > document = open("/home/sayth/Scripts/va_benefits/20140508GOSF0.xml", "r")
> > read_doc = document.read()
> > xml_doc = xmltodict.parse(read_doc)
> >
> > The start of the file I am trying to get data out of is.
> >
> > <meeting id="35483" barriertrial="0" venue="Gosford"
> > date="2014-05-08T00:00:00" gearchanges="-1" stewardsreport="-1"
> > gearlist="-1" racebook="0" postracestewards="0" meetingtype="TAB"
> > rail="True" weather="Fine " trackcondition="Dead "
> > nomsdeadline="2014-05-02T11:00:00" weightsdeadline="2014-05-05T16:00:00"
> > acceptdeadline="2014-05-06T09:00:00" jockeydeadline="2014-05-06T12:00:00">
> > <club abbrevname="Gosford Race Club" code="49" associationclass="2"
> > website="http://" />
> > <race id="185273" number="1" nomnumber="7" division="0" name="GOSFORD
> > ROTARY MAIDEN HANDICAP" mediumname="MDN" shortname="MDN"
> > stage="Acceptances" distance="1600" minweight="55" raisedweight="0"
> > class="MDN " age="~ " grade="0" weightcondition="HCP
> > " trophy="0" owner="0" trainer="0" jockey="0" strapper="0"
> > totalprize="22000" first="12250" second="4250" third="2100"
> > fourth="1000" fifth="525" time="2014-05-08T12:30:00" bonustype="BX02
> > " nomsfee="0" acceptfee="0" trackcondition=" " timingmethod="
> > " fastesttime=" " sectionaltime=" "
> > formavailable="0" racebookprize="Of $22000. First $12250, second $4250,
> > third $2100, fourth $1000, fifth $525, sixth $375, seventh $375, eighth
> > $375, ninth $375, tenth $375">
> > <condition line="1">
> >
> > So thought I had it figured. Can access the elements of meeting and the
> > elements of club such as by doing this.
> >
> > In [5]: xml_doc['meeting']['club']['@abbrevname']
> > Out[5]: u'Gosford Race Club'
> >
> > However whenever I try and access race in the same manner I get errors.
> >
> > In [11]: xml_doc['meeting']['club']['race']['@id']
> > ---------------------------------------------------------------------------
> > KeyError                                  Traceback (most recent call last)
> > <ipython-input-11-cce362d7e6fc> in <module>()
> > ----> 1 xml_doc['meeting']['club']['race']['@id']
> >
> > KeyError: 'race'
> >
> > In [12]: xml_doc['meeting']['race']['@id']
> > ---------------------------------------------------------------------------
> > TypeError                                 Traceback (most recent call last)
> > <ipython-input-12-c304e2b8f9be> in <module>()
> > ----> 1 xml_doc['meeting']['race']['@id']
> >
> > TypeError: list indices must be integers, not str
> >
> > why is accessing race @id any different to the access of club @abbrevname
> > and how do I get it for race?
>
> If I were to guess: there are multiple races per meeting, xmltodict puts
> them into a list under the "race" key, and you have to pick one:
>
> >>> doc = xmltodict.parse("""\
> ... <meeting>
> ... <race id="first race">...</race>
> ... <race id="second race">...</race>
> ... </meeting>
> ... """)
> >>> type(doc["meeting"]["race"])
> <class 'list'>
> >>> doc["meeting"]["race"][0]["@id"]
> 'first race'
> >>> doc["meeting"]["race"][1]["@id"]
> 'second race'
>
> So
>
> xml_doc['meeting']['race'][0]['@id']
>
> or
>
> for race in xml_doc["meeting"]["race"]:
>     print(race["@id"])
>
> might work for you.

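For anyone finding this in the archive: the two lookups differ because the parsed result holds a plain dict for an element that appears once (club) and a list of dicts for an element that repeats (race). Below is a minimal sketch of a helper that copes with both shapes. The names as_list and race_ids are made up for illustration, and the two dicts are hand-written stand-ins for what xmltodict.parse() would return, not real output from the Gosford file.

```python
def as_list(node):
    """Wrap node in a list unless it already is one; a missing node gives []."""
    if node is None:
        return []
    return node if isinstance(node, list) else [node]


def race_ids(doc):
    """Collect the id attribute of every race, whether there is one or many."""
    return [race["@id"] for race in as_list(doc["meeting"].get("race"))]


# Hand-written stand-ins (assumption): a meeting with one race parses to a
# dict under "race", a meeting with several races parses to a list of dicts.
one_race = {"meeting": {"race": {"@id": "first race"}}}
two_races = {"meeting": {"race": [{"@id": "first race"},
                                  {"@id": "second race"}]}}

print(race_ids(one_race))
print(race_ids(two_races))
```

As far as I know, newer xmltodict releases also accept a force_list argument to parse() that always returns a list for the named elements, which would make the helper unnecessary; check the README of the version you have installed before relying on it.
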
Thanks so much Peter, yes, both worked indeed.

Sayth