Hello, I've spent the morning trying to parse a simple XML file and have the
following:
import sys
from xml.dom import minidom

doc=minidom.parse('topstories.xml')

items = doc.getElementsByTagName("item")
text=''
for i in items:
    t = i.firstChild
    print t.nodeName
    if t.nodeType == t.TEXT_NODE:
        print "TEXT_NODE"
        print t.nodeValue
        text += t.data

print text

I can't figure out how to print the text value of a text node. There must be
something obvious I'm missing. Any suggestions?

Thanks.

XML is as follows:

<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Stuff.co.nz - Top Stories</title>
    <link>http://www.stuff.co.nz</link>
    <description>Top Stories from Stuff.co.nz. New Zealand, world, sport,
business &amp; entertainment news on Stuff.co.nz. </description>
    <language>en-nz</language>
    <copyright>Fairfax New Zealand Ltd.</copyright>
    <ttl>30</ttl>
    <image>
      <url>/static/images/logo.gif</url>
      <title>Stuff News</title>
      <link>http://www.stuff.co.nz</link>
    </image>

<item id="4423924" count="1">
<title>Prince Harry &apos;wants to live in Africa&apos;</title>
<link>http://www.stuff.co.nz/4423924a10.html?source=RSStopstories_20080303
</link>
<description>For Prince Harry it must be the ultimate dark irony: to be in
such a privileged position and have so much opportunity, and yet be unable
to fulfil a dream of fighting for the motherland.</description>
<author>EDMUND TADROS</author>
<guid isPermaLink="false">stuff.co.nz/4423924</guid>
<pubDate>Mon, 03 Mar 2008 00:44:00 GMT</pubDate>
</item>

  </channel>
</rss>
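
A possible explanation, offered as a sketch rather than a definitive answer:
in the XML above, the first child of each <item> is most likely the
whitespace text node (the newline before <title>), so t.nodeValue prints as a
blank line. The visible text lives one level down, in the text nodes inside
<title>, <description> and so on. The helper names node_text and item_fields
and the shortened inline SAMPLE document are assumptions made for this
illustration and are not part of the original script. print() is called with
a single argument so the same lines run on Python 2 and Python 3.

```python
from xml.dom import minidom

# Shortened copy of the feed above, inlined so the sketch is self-contained.
SAMPLE = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <item id="4423924" count="1">
      <title>Prince Harry &apos;wants to live in Africa&apos;</title>
      <author>EDMUND TADROS</author>
    </item>
  </channel>
</rss>"""


def node_text(node):
    """Join the data of the text nodes that are direct children of node."""
    return ''.join(child.data for child in node.childNodes
                   if child.nodeType == child.TEXT_NODE)


def item_fields(item):
    """Map each child element's tag name to its stripped text content."""
    fields = {}
    for child in item.childNodes:
        # Skip the whitespace text nodes that sit between the elements.
        if child.nodeType == child.ELEMENT_NODE:
            fields[child.tagName] = node_text(child).strip()
    return fields


doc = minidom.parseString(SAMPLE)
for item in doc.getElementsByTagName("item"):
    fields = item_fields(item)
    print(fields["title"])
```

Against the real file, minidom.parse('topstories.xml') would replace the
parseString call. Looking fields up by tag name avoids depending on
firstChild, which changes with the whitespace in the file.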
-- 
http://mail.python.org/mailman/listinfo/python-list
