On 22 Jan, 15:11, John Carlyle-Clarke <[EMAIL PROTECTED]> wrote:
>
> I wrote some code that works on my Linux box using xml.dom.minidom, but
> it will not run on the windows box that I really need it on. Python
> 2.5.1 on both.
>
> On the windows machine, it's a clean install of the Python .msi from
> python.org. The linux box is Ubuntu 7.10, which has some Python XML
> packages installed which can't easily be removed (namely python-libxml2
> and python-xml).

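One way to compare the two machines is to print which minidom module and which Expat build each interpreter actually loads. This is only a sketch: the function name describe_xml_environment is made up for illustration, and it was written with a current Python in mind rather than tested on 2.5.1.

```python
import sys
import xml
import xml.dom.minidom
import xml.parsers.expat


def describe_xml_environment():
    """Collect details that may differ between two Python installs.

    Hypothetical helper, not part of any library.
    """
    return {
        "python_version": sys.version.split()[0],
        "platform": sys.platform,
        # Where the top-level xml package was loaded from.
        "xml_package_file": getattr(xml, "__file__", None),
        # Where minidom itself was loaded from.
        "minidom_file": getattr(xml.dom.minidom, "__file__", None),
        # Version string of the Expat library behind pyexpat.
        "expat_version": xml.parsers.expat.EXPAT_VERSION,
    }


if __name__ == "__main__":
    for key, value in sorted(describe_xml_environment().items()):
        print("%s: %s" % (key, value))
```

Running it on both boxes and comparing the output would show whether the same minidom.py and the same Expat version are in play.
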
I don't think you're straying into libxml2 or PyXML territory here...

> I have boiled the code down to its simplest form which shows the problem:-
>
> import xml.dom.minidom
> import sys
>
> input_file = sys.argv[1];
> output_file = sys.argv[2];
>
> doc = xml.dom.minidom.parse(input_file)
> file = open(output_file, "w")

On Windows, shouldn't this be the following...?

  file = open(output_file, "wb")

> doc.writexml(file)
>
> The error is:-
>
> $ python test2.py input2.xml output.xml
> Traceback (most recent call last):
>   File "test2.py", line 9, in <module>
>     doc.writexml(file)
>   File "c:\Python25\lib\xml\dom\minidom.py", line 1744, in writexml
>     node.writexml(writer, indent, addindent, newl)
>   File "c:\Python25\lib\xml\dom\minidom.py", line 814, in writexml
>     node.writexml(writer,indent+addindent,addindent,newl)
>   File "c:\Python25\lib\xml\dom\minidom.py", line 809, in writexml
>     _write_data(writer, attrs[a_name].value)
>   File "c:\Python25\lib\xml\dom\minidom.py", line 299, in _write_data
>     data = data.replace("&", "&amp;").replace("<", "&lt;")
> AttributeError: 'NoneType' object has no attribute 'replace'
>
> As I said, this code runs fine on the Ubuntu box. If I could work out
> why the code runs on this box, that would help because then I can set
> up the windows box the same way.

If I encountered the same issue, I'd have to inspect the goings-on
inside minidom, possibly using judicious trace statements in the
minidom.py file. Either way, the traceback above suggests that an
attribute node is producing a value of None rather than any kind of
character string.

> The input file contains an <xsd:schema> block which is what actually
> causes the problem. If you remove that node and subnodes, it works
> fine. For a while at least, you can view the input file at
> http://rafb.net/p/5R1JlW12.html

The horror! ;-)

> Someone suggested that I should try xml.etree.ElementTree, however
> writing the same type of simple code to import and then write the file
> mangles the xsd:schema stuff because ElementTree does not understand
> namespaces.

I'll leave this to others: I don't use ElementTree.

> By the way, is pyxml a live project or not? Should it still be used?
> It's odd that if you go to http://www.python.org/ and click the link
> "Using python for..." XML, it leads you to
> http://pyxml.sourceforge.net/topics/
>
> If you then follow the download links to
> http://sourceforge.net/project/showfiles.php?group_id=6473 you see that
> the latest file is 2004, and there are no versions for newer pythons.
> It also says "PyXML is no longer maintained". Shouldn't the link be
> removed from python.org?

The XML situation in Python's standard library is controversial and can
be summarised, probably inaccurately, by the following chronology:

 1. XML is born, various efforts start up (see the qp_xml and xmllib
    modules).
 2. Various people organise themselves, contributing software to the
    PyXML project (4Suite, xmlproc).
 3. The XML backlash begins: we should all apparently be using stuff
    like YAML (but don't worry if you haven't heard of it).
 4. ElementTree is released, people tell you that you shouldn't be
    using SAX or DOM any more, "pull" parsers are all the rage
    (although proponents overlook the presence of xml.dom.pulldom in
    the Python standard library).
 5. ElementTree enters the standard library as xml.etree; PyXML falls
    into apparent disuse (see remarks about SAX and DOM above).

I think I looked seriously at wrapping libxml2 (with libxml2dom [1])
when I experienced issues with both PyXML and 4Suite when used together
with mod_python, since each project used its own Expat libraries and the
resulting mis-linked software produced very bizarre results.
Moreover, only cDomlette from 4Suite seemed remotely fast, and yet did
not seem to be an adequate replacement for the usual PyXML
functionality.

People will, of course, tell you that you shouldn't use a DOM for
anything and that the "consensus" is to use ElementTree or lxml (see
above), but I can't help feeling that this has a damaging effect on the
XML situation for Python: some newcomers would actually benefit from the
traditional APIs, may already be familiar with them from other contexts,
and may consider Python lacking if the support for them is in apparent
decay. It requires a degree of motivation to actually attempt to
maintain software providing such APIs (which was my solution to the
problem), but if someone isn't totally bound to Python then they might
easily start looking at other languages and tools in order to get the
job done.

Meanwhile, here are some resources:

http://wiki.python.org/moin/PythonXml

Paul

[1] http://www.python.org/pypi/libxml2dom

-- 
http://mail.python.org/mailman/listinfo/python-list
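
P.S. A rough sketch of the two suggestions made earlier in this message: looking for attribute nodes whose value is None, and opening the output file in binary mode. The helper names find_none_attributes and write_document are invented for illustration, and the code assumes a current Python 3 rather than 2.5.1. It only locates suspicious attributes; it does not explain why a value ends up as None.

```python
import xml.dom.minidom


def find_none_attributes(node, found=None):
    """Return (element name, attribute name) pairs whose value is None.

    Hypothetical diagnostic helper: walks the tree below `node`.
    """
    if found is None:
        found = []
    if node.nodeType == node.ELEMENT_NODE:
        attrs = node.attributes
        for i in range(attrs.length):
            attr = attrs.item(i)
            if attr.value is None:
                found.append((node.tagName, attr.name))
    for child in node.childNodes:
        find_none_attributes(child, found)
    return found


def write_document(doc, path, encoding="utf-8"):
    """Serialise `doc` to `path`, opening the file in binary mode.

    On Python 3, toxml() returns bytes when an encoding is given,
    so no newline translation is applied on Windows.
    """
    with open(path, "wb") as out:
        out.write(doc.toxml(encoding=encoding))


if __name__ == "__main__":
    import sys

    document = xml.dom.minidom.parse(sys.argv[1])
    for tag_name, attr_name in find_none_attributes(document):
        print("None value: <%s> attribute %s" % (tag_name, attr_name))
    write_document(document, sys.argv[2])
```

Run as "python check.py input2.xml output.xml" (check.py being whatever name the script is saved under); any attributes reported would be the ones to look at in the input file before calling writexml.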