On Mon, Aug 30, 2021 at 10:00:51AM -0700, Per Bothner wrote:
> > However, having a customization variable to
> > output only numerical entities would be ok to me, maybe something like
> > 
> > USE_ONLY_NUMERICAL_ENTITY or NO_NAMED_ENTITY to avoid confusion with
> > USE_NUMERICAL_ENTITY.
> 
> I think more valuable would be an "XML_COMPATIBLE" variable.
> In addition to numeric entities, it would guarantee to close all tags.
> E.g. instead of <br> it would emit <br/> - which also works with
> most (all?) HTML parsers.  And possibly other issues.

I don't see the point in adding customization variables to fine-tune
details like whether to use named entities or not.  If named entities
are valid HTML, there's no problem.  I believe decimal entities
existed first, although hexadecimal entities would certainly be more
legible, especially for code points above 255.
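
To make the decimal versus hexadecimal comparison concrete, here is a
minimal Python sketch, not texi2any code.  The function name
to_numeric_entities and its hexadecimal flag are made up for
illustration, and the entity table is the HTML 4 one from Python's
standard html.entities module.  The entities predefined by XML are left
alone, as are names the table does not know.

```python
import re
from html.entities import name2codepoint

# Entities that XML predefines; these can stay as named references.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

_NAMED_ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def to_numeric_entities(text, hexadecimal=False):
    """Replace HTML 4 named entities with numeric character references."""

    def replace(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            # Leave XML's own entities and unknown names untouched.
            return match.group(0)
        codepoint = name2codepoint[name]
        if hexadecimal:
            return "&#x%X;" % codepoint
        return "&#%d;" % codepoint

    return _NAMED_ENTITY.sub(replace, text)


print(to_numeric_entities("a&nbsp;b &mdash; &amp;"))
print(to_numeric_entities("a&nbsp;b &mdash; &amp;", hexadecimal=True))
```

The hexadecimal form lines up with the usual U+2014 notation for code
points, which is the legibility argument above.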

> 
> What I'm looking for is:
> (1) Be able to post-process html output with xml tools, such as xslt.
> (2) Generate valid epub3 ebooks.

These seem like valid goals, so I would be happy to see patches that
produce XML output, likely as an option.
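
As a rough illustration of what goal (1) requires, the Python sketch
below closes void elements such as <br> and then checks that the result
parses as XML.  It is an assumption-laden sketch, not a proposed
implementation: close_void_tags and is_well_formed are hypothetical
names, the regular expression does not handle attribute values that
contain ">", and a real patch would emit the closed form directly
instead of rewriting the output afterwards.

```python
import re
import xml.etree.ElementTree as ET

# Some HTML void elements (elements that never have a closing tag).
VOID_ELEMENTS = ("area", "base", "br", "col", "embed", "hr", "img",
                 "input", "link", "meta", "source", "track", "wbr")

_VOID_TAG = re.compile(
    r"<(%s)\b([^<>]*?)\s*/?>" % "|".join(VOID_ELEMENTS), re.IGNORECASE)


def close_void_tags(html_fragment):
    """Rewrite <br> or <img ...> as <br/> or <img .../>."""
    return _VOID_TAG.sub(
        lambda m: "<%s%s/>" % (m.group(1), m.group(2)), html_fragment)


def is_well_formed(xml_text):
    """Return True if xml_text parses as a well-formed XML document."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


fragment = '<p>one<br>two<img src="a.png"></p>'
print(is_well_formed(fragment))
print(is_well_formed(close_void_tags(fragment)))
```

The same check rejects a named entity like &nbsp; when no DTD declares
it, while the numeric form &#160; is accepted.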
