Gelonida N, 27.11.2011 18:57:
I'd like to verify some (x)html / html5 / xml documents from a server.
These documents have a very limited number of different doc types / DTDs.
So what I would like to do is to build a small DTD cache and some code
that would avoid fetching the DTDs over and over from the net.
What would be the best way to do this?
Configure your XML catalogues.
I guess that
the fields of an ElementTree that I have to look at are
docinfo.public_id
docinfo.system_url
Yes, catalogue lookups generally happen through the public ID.
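To see which identifiers a catalogue would have to cover, it may help to print them for a sample document first. The following is only a minimal sketch, assuming lxml is installed; the XHTML snippet and the helper name `doctype_ids` are made up for illustration. Note that lxml exposes the system identifier as `docinfo.system_url`.

```python
from io import BytesIO

from lxml import etree

# Hypothetical sample document; any file with a DOCTYPE would do.
XHTML = b"""<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>test</title></head>
  <body/>
</html>"""


def doctype_ids(data):
    """Return (public_id, system_url) from the DOCTYPE of an XML document."""
    # Do not load the DTD here: we only want to read the identifiers,
    # not trigger a catalogue or network lookup.
    parser = etree.XMLParser(load_dtd=False, no_network=True)
    tree = etree.parse(BytesIO(data), parser)
    info = tree.docinfo
    return info.public_id, info.system_url


public_id, system_url = doctype_ids(XHTML)
print(public_id)
print(system_url)
```

The public ID printed here is the key a catalogue entry would need to match.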
There's also mention of a catalogue, but I don't know how to
use a catalogue or how to find out what is inside my catalogue
and what isn't.
Does this help?
http://lxml.de/resolvers.html#xml-catalogs
http://xmlsoft.org/catalog.html
They should normally come pre-configured on Linux distributions, but you
may have to install additional packages with the respective DTDs. Look for
any packages with "dtd" and "html" in their name, for example.
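If the distribution packages are not an option, a private catalogue can be written by hand or generated. The following is a rough sketch using only the standard library. The cache directory, the local file name `xhtml1-strict.dtd` and the helper `write_catalog` are assumptions for illustration, and the DTD file itself still has to be copied into that directory. libxml2 reads the `XML_CATALOG_FILES` environment variable, so it is safest to set it before the first document is parsed, or before Python starts.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Namespace of the OASIS XML Catalogs format understood by libxml2.
CATALOG_NS = "urn:oasis:names:tc:entity:xmlns:xml:catalog"


def write_catalog(path, entries):
    """Write a catalogue mapping public IDs to local DTD files.

    entries is an iterable of (public_id, uri) pairs; a relative uri is
    resolved relative to the catalogue file.
    """
    ET.register_namespace("", CATALOG_NS)
    root = ET.Element("{%s}catalog" % CATALOG_NS)
    for public_id, uri in entries:
        ET.SubElement(
            root,
            "{%s}public" % CATALOG_NS,
            {"publicId": public_id, "uri": uri},
        )
    ET.ElementTree(root).write(path, encoding="utf-8", xml_declaration=True)


# Hypothetical cache location; a fixed directory would be used in practice.
cache_dir = tempfile.mkdtemp(prefix="dtd-cache-")
catalog_path = os.path.join(cache_dir, "catalog.xml")
write_catalog(
    catalog_path,
    [("-//W3C//DTD XHTML 1.0 Strict//EN", "xhtml1-strict.dtd")],
)

# Point libxml2 (and therefore lxml) at the private catalogue.
os.environ["XML_CATALOG_FILES"] = catalog_path
print(catalog_path)
```

Listing the `public` entries of a catalogue file like this one is also a simple way to check what it does and does not cover.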
Stefan
--
http://mail.python.org/mailman/listinfo/python-list