Gelonida N, 27.11.2011 18:57:
I'd like to verify some (x)html / html5 / xml documents from a server.

These documents have a very limited number of different doc types / DTDs.

So what I would like to do is build a small DTD cache and some code
that avoids fetching the DTDs from the net over and over.

What would be the best way to do this?

Configure your XML catalogues.


I guess that
the fields of an ElementTree that I have to look at are
docinfo.public_id
docinfo.system_url

Yes, catalogue lookups generally happen through the public ID.
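
As a minimal sketch of where those identifiers come from: lxml exposes the DOCTYPE's public ID and system URL as docinfo.public_id and docinfo.system_url on a parsed tree. The sample document below is made up for illustration, and the parser options are only one reasonable choice (they keep the parser from loading the DTD, so nothing is fetched).

```python
from io import BytesIO

from lxml import etree

# Hypothetical sample document; any XHTML page with a DOCTYPE would do.
XHTML = (
    b'<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"\n'
    b'  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">\n'
    b'<html xmlns="http://www.w3.org/1999/xhtml">'
    b"<head><title>t</title></head><body/></html>"
)

# Do not load the DTD and do not touch the network; we only want the DOCTYPE.
parser = etree.XMLParser(load_dtd=False, no_network=True)
tree = etree.parse(BytesIO(XHTML), parser)

public_id = tree.docinfo.public_id    # key used for catalogue lookups
system_url = tree.docinfo.system_url  # fallback location of the DTD

print(public_id)
print(system_url)
```

The public ID is the value a catalogue entry would be matched against; the system URL is what gets fetched when no catalogue entry applies.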


There's also a mention of a catalogue, but I don't know how to
use a catalogue or how to find out what is inside my catalogue
and what isn't.

Does this help?

http://lxml.de/resolvers.html#xml-catalogs

http://xmlsoft.org/catalog.html

They should normally come pre-configured on Linux distributions, but you may have to install additional packages with the respective DTDs. Look for any packages with "dtd" and "html" in their name, for example.

Stefan
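
On the question of what is inside a catalogue: an XML catalogue is itself an XML file, so its direct mappings can be listed with the standard library. The following is only a rough sketch. The helper name list_catalog_entries and the DTD paths in the sample are hypothetical, and the code reads just the "public" and "system" entries of a single file; it does not follow nextCatalog or delegatePublic entries, which real system catalogues often use.

```python
import xml.etree.ElementTree as ET

# Namespace of OASIS XML catalogue files.
CATALOG_NS = "urn:oasis:names:tc:entity:xmlns:xml:catalog"

# Hypothetical catalogue content; the file paths are placeholders.
SAMPLE = """<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <public publicId="-//W3C//DTD XHTML 1.0 Strict//EN"
          uri="file:///usr/share/xml/xhtml/xhtml1-strict.dtd"/>
  <system systemId="http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"
          uri="file:///usr/share/xml/xhtml/xhtml1-strict.dtd"/>
</catalog>
"""


def list_catalog_entries(root):
    """Return (kind, identifier, uri) tuples for public and system entries."""
    entries = []
    for kind, attr in (("public", "publicId"), ("system", "systemId")):
        for el in root.iter("{%s}%s" % (CATALOG_NS, kind)):
            entries.append((kind, el.get(attr), el.get("uri")))
    return entries


# For a real file, something like ET.parse("/etc/xml/catalog").getroot()
# could be passed instead (that path is the usual libxml2 default).
entries = list_catalog_entries(ET.fromstring(SAMPLE))
for kind, identifier, uri in entries:
    print(kind, identifier, "->", uri)
```

libxml2 also ships an xmlcatalog command-line tool that can resolve an identifier against a catalogue file, which may be the quicker check on a system where it is installed.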

--
http://mail.python.org/mailman/listinfo/python-list