On Mon, Dec 03, 2012 at 01:49:08AM -0800, Benito2313 wrote:
>     What is it that you're trying to do? HTML is an XML dialect, after
>     all (or can be, if XHTML). You should be able to parse it with all
>     XML tools.
>  
> My program handles XML files.
> I can see the script code of the HTML when I open it in Notepad. How can I
> tell if it is XHTML?

I just checked the HTML output from Tesseract. It is XHTML, so it is
a proper dialect of XML. You can tell from the <?xml declaration on the
first line, plus the doctype and the xmlns attribute on the following lines.
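
If you would rather check this programmatically than by eye, here is a
rough sketch using only Python's standard library. It is not taken from
Tesseract; the function name is_xhtml and the SAMPLE document are made up
for illustration, and I am assuming the file contains no HTML-only
entities such as &nbsp;, which a plain XML parser would reject.

```python
import xml.etree.ElementTree as ET

# Namespace URI that XHTML documents declare via xmlns on <html>.
XHTML_NS = "http://www.w3.org/1999/xhtml"


def is_xhtml(data: bytes) -> bool:
    """Return True if data is well-formed XML whose root is <html>
    in the XHTML namespace (hypothetical helper, for illustration)."""
    try:
        root = ET.fromstring(data)
    except ET.ParseError:
        # Not well-formed XML, so it cannot be XHTML.
        return False
    # ElementTree spells namespaced tags as "{namespace-uri}localname".
    return root.tag == "{%s}html" % XHTML_NS


# Made-up sample with the three markers mentioned above:
# the <?xml declaration, the doctype, and the xmlns attribute.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>sample</title></head>
<body><p>Hello</p></body>
</html>
"""

if __name__ == "__main__":
    print(is_xhtml(SAMPLE))
```

To test a real file, read it in binary mode and pass the bytes to
is_xhtml. If it returns True, your existing XML tools should be able to
load the file as well.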

Nick
