On Thursday, November 20, 2014 12:04:09 PM UTC-5, Denis McMahon wrote:
> On Wed, 19 Nov 2014 13:43:17 -0800, Novocastrian_Nomad wrote:
>
> > On Wednesday, November 19, 2014 2:08:27 PM UTC-7, Denis McMahon wrote:
> >> So what I'm looking for is a method to create an html5 document using
> >> "dom manipulation", ie:
> >>
> >> doc = new htmldocument(doctype="HTML")
> >> html = new html5element("html")
> >> doc.appendChild(html)
> >> head = new html5element("head")
> >> html.appendChild(head)
> >> body = new html5element("body")
> >> html.appendChild(body)
> >> title = new html5element("title")
> >> txt = new textnode("This Is The Title")
> >> title.appendChild(txt)
> >> head.appendChild(title)
> >> para = new html5element("p")
> >> txt = new textnode("This is some text.")
> >> para.appendChild(txt)
> >> body.appendChild(para)
> >>
> >> print(doc.serialise())
> >>
> >> generates:
> >>
> >> <!doctype HTML><html><head><title>This Is The Title</title></head><body><p>This is some text.</p></body></html>
> >>
> >> I'm finding various mechanisms to generate the structure from an
> >> existing piece of html (eg html5lib, beautifulsoup etc) but I can't
> >> seem to find any mechanism to generate, manipulate and produce html5
> >> documents using this dom manipulation approach. Where should I be
> >> looking?
>
> > Use a search engine (Google, DuckDuckGo etc) and search for 'python
> > write html'
>
> Surprise surprise, already tried that, can't find anything that holds the
> document in the sort of tree structure that I want to manipulate it in.
>
> Everything there seems to assume I'll be creating a document serially, eg
> that I won't get to some point in the document and decide that I want to
> add an element earlier.
>
> bs4 and html5lib will parse a document into a tree structure, but they're
> not so hot on manipulating the tree structure, eg adding and moving nodes.
>
> Actually it looks like bs4 is going to be my best bet, although limited
> it does have most of what I'm looking for. I just need to start by giving
> it "<html></html>" to parse.
>
> --
> Denis McMahon

I believe lxml should work for this. Here's a snippet that I have used to
create an HTML document:

    from lxml import etree

    # Root element, wrapped in an ElementTree so it can be written to a file
    page = etree.Element('html')
    doc = etree.ElementTree(page)

    # SubElement creates a child and appends it to the given parent
    head = etree.SubElement(page, 'head')
    body = etree.SubElement(page, 'body')
    table = etree.SubElement(body, 'table')

    # etc etc

    # Open in binary mode: write() emits encoded bytes
    with open('mynewfile.html', 'wb') as f:
        doc.write(f, pretty_print=True, method='html')

(You can leave out the method= option to get xhtml; lxml then serializes
the tree as XML, its default.)

Hope that helps,
--Tim

--
https://mail.python.org/mailman/listinfo/python-list
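
For anyone who wants to try the tree-building approach without installing
anything, here is a rough sketch of the document from the original question.
It uses the standard library's xml.etree.ElementTree rather than lxml (lxml's
etree follows the same ElementTree API, so Element, SubElement and insert
should carry over). The function names build_document and serialise, and the
extra h1 heading, are made up for illustration. ElementTree has no doctype
node, so the sketch simply prepends the doctype string by hand.

```python
import xml.etree.ElementTree as ET


def build_document():
    """Build the small html/head/body tree from the question."""
    html = ET.Element("html")
    head = ET.SubElement(html, "head")
    body = ET.SubElement(html, "body")

    # Text content lives in the .text attribute, not in separate text nodes
    title = ET.SubElement(head, "title")
    title.text = "This Is The Title"

    para = ET.SubElement(body, "p")
    para.text = "This is some text."
    return html


def serialise(root):
    """Serialize with HTML rules and prepend a doctype by hand."""
    return "<!DOCTYPE html>" + ET.tostring(root, encoding="unicode", method="html")


root = build_document()

# Decide later that an element belongs earlier in the document:
# insert() places it at a given position among the parent's children.
heading = ET.Element("h1")  # hypothetical extra element, not in the question
heading.text = "A Heading"
root.find("body").insert(0, heading)

result = serialise(root)
print(result)
```

The insert(0, ...) call is the part that addresses the "add an element
earlier" worry: the tree stays editable until it is serialized, so the order
in which elements are created does not have to match document order.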