Thanks to everyone who replied! Your answers pointed me towards simply using a
different library to parse HTML. Several such libraries can be interfaced with
SAX (though not perfectly), which is what I was hoping for. I got it working
with the Nu htmlparser library.
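
For the archive, here is a minimal sketch of why the XML-only route fails. It
uses the JDK's built-in JAXP parser (itself derived from Xerces) rather than
Xerces-J directly. The sample markup and the class and method names are made up
for illustration; this is not the file from the original question.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class HtmlIsNotXml {
    // Made-up sample: fine as HTML, but not well-formed XML
    // (unquoted attribute value, unclosed <meta>, <br> and <p>).
    static final String HTML =
        "<!DOCTYPE html><html><head><meta charset=utf-8><title>t</title></head>"
        + "<body><p>Hello<br>world</body></html>";

    // Returns true if the JDK's XML parser accepts the text as well-formed XML.
    static boolean parsesAsXml(String text) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors and does not print them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(text)));
            return true;
        } catch (SAXException e) {
            return false; // well-formedness error reported by the XML parser
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The HTML sample is rejected; the XML-style variant is accepted.
        System.out.println(parsesAsXml(HTML));
        System.out.println(parsesAsXml("<html><body><p>Hello<br/>world</p></body></html>"));
    }
}
```

An HTML-aware parser such as Nu htmlparser applies the HTML parsing algorithm
instead of XML well-formedness rules, which is why switching libraries helps.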

On Thursday, 29 May 2025 at 19:02 +0300, Stanimir Stamenkov wrote:

Tue, 27 May 2025 17:08:55 +0000, /Olivier Cailloux/:

Can anyone point me towards some way of reading HTML (non XML) files
using Xerces-J? I tried various things using
org.apache.xerces.parsers.DOMParserImpl but parsing this file for
example (valid according to Nu validator) fails.
