Hi Nick,
Your glass of wine was inspiring: just removed
> ProxyHTMLCharsetOut * # Backend (Tomcat) charset is ISO-8859-1
and the problem's gone!
Also commented out
> ProxyHTMLMeta on
with no noticeable change in behaviour. As per the docs "turning ProxyHTMLMeta
Off will give a small performance boost", so off it goes.
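
For reference, a rough sketch of how the relevant directives look after these two changes. It contains only the directives quoted in this thread; the comments are explanatory and not copied from the real file, and the note on "*" reflects my reading of the mod_proxy_html docs rather than anything confirmed here.

```apache
ProxyPreserveHost on
ProxyHTMLEnable on
# Still enabled here; Nick suggests it is probably not wanted either.
ProxyHTMLExtended on

# Removed: with "*" the output is meant to use the same encoding as the
# input (ISO-8859-1 from this backend) instead of the default UTF-8.
# ProxyHTMLCharsetOut *

# Commented out, with no visible change in behaviour.
# ProxyHTMLMeta on
```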
Thank you so much!
FYI, after raising LogLevel to info, the error log shows:
[Fri May 08 07:42:35.790051 2020] [xml2enc:info] [pid 13183:tid
139823008806656] [client _redacted_:55344] AH01431: Got charset ISO-8859-1 from
HTTP headers
So our backend's stated charset is ISO-8859-1.
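
A minimal sketch of the logging change, in case anyone wants to reproduce that message. The first form is what was done here; the per-module form is an untested assumption based on the Apache 2.4 per-module LogLevel syntax, with the module name taken from the "[xml2enc:info]" tag in the log line above.

```apache
# Server-wide: raise everything to info.
LogLevel info

# Alternative (assumption, untested here): keep the global level at warn
# and raise only mod_xml2enc.
# LogLevel warn xml2enc:info
```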
About your questions:
> Are you sure your backend is sending literally those entities, as opposed to
> their byte representations in its charset?
> Note that libxml2 is doing the hard work here: what version of libxml2 do you
> have?
"Faulty" entities are coded verbatim (i.e. "→") in the backend JSP pages,
and are rendered exactly that way in non-proxied responses. libxml2 version is
2.9.4 (within Debian 10.3 amd64).
I can do further testing, if you need it.
FYI 2 (side point):
> <Location "/">
> ProxyHTMLURLMap "/backend-path/(.*)" "/$1" R
We had some previous experience with proxy URL mapping, and "/frontend-path/"
<-> "/backend-path/" has always worked fine for us without the regexp. But
mapping the root frontend path "/" gave us some trouble; maybe there's a better
solution, but that regexp solved the issue.
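
To make the two cases concrete, a sketch of both mappings. The paths are the placeholders used in this message, the first Location block is illustrative rather than copied from our config, and the ProxyPass lines are left out.

```apache
# Non-root frontend path: plain prefix mapping, no regexp needed.
<Location "/frontend-path/">
    ProxyHTMLURLMap "/backend-path/" "/frontend-path/"
</Location>

# Root frontend path: the regexp form quoted above
# (the R flag turns on regular expression matching).
<Location "/">
    ProxyHTMLURLMap "/backend-path/(.*)" "/$1" R
</Location>
```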
Thank you again. Best regards,
Antonio
----- Original message -----
From: "Nick Kew" <[email protected]>
To: "users" <[email protected]>
Sent: Friday, 8 May 2020 1:49:25
Subject: Re: [users@httpd] proxy_html / xml2enc won't handle certain HTML
entities
> On 7 May 2020, at 17:52, Antonio Suárez Pozuelo <[email protected]>
> wrote:
>
> Hi there,
Further to my last reply, I can see what may possibly be wrong:
> We have a Tomcat 8 backend server behind an Apache 2.4 proxy. Our Apache conf:
>
> ProxyPreserveHost on
> ProxyHTMLEnable on
> ProxyHTMLExtended on
You probably don't want that.
> ProxyHTMLCharsetOut * # Backend (Tomcat) charset is ISO-8859-1
I suspect that is very probably the culprit.
Does removing it fix the problem?
> ProxyHTMLMeta on
You probably also don't want that. I think the documentation of that
is misleadingly out-of-date, but I don't want to check now (late, and
after a glass of wine).
--
Nick Kew
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]