Thorsten,
On 5/2/25 2:49 PM, Thorsten Heit wrote:
please excuse the long delay in answering (unplanned holidays...)
Tomcat is never going to figure out what MIME type should be used for
a request like "/my/servlet/app?version=!!1.22.32-4-g8a3c060!!"
So I think Mark is probably right (well, he's right like 99.999% of
the time, so...) about this being related to
https://bz.apache.org/bugzilla/show_bug.cgi?id=69623 but I suspect your
servlet is not explicitly setting a content-type.
It took quite some time debugging into our application to find out the
place where the output difference happens between Tomcat 10.1.39 and
10.1.40:
We're using our own template engine that renders the HTML output,
depending on the actual state of the application for the (logged-in)
user. Basically the code is working the same way as this minimal
reproducer:
Thanks for the reproducer. I can confirm that on my Tomcat/10.1.41-dev
environment I am receiving this response:
$ curl -v http://localhost:8080/test/hello
* Host localhost:8080 was resolved.
* IPv6: ::1
* IPv4: 127.0.0.1
* Trying [::1]:8080...
* Connected to localhost (::1) port 8080
> GET /test/hello HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/8.7.1
> Accept: */*
>
* Request completely sent off
< HTTP/1.1 200
< Content-Type: content/unknown;charset=UTF-8
< Content-Length: 226
< Date: Mon, 05 May 2025 15:48:59 GMT
<
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" >
<title>Hello World!</title>
</head>
<body>
<p>Hello World!</p>
</body>
</html>
* Connection #0 to host localhost left intact
Note that your call to getResource("HelloWorld.html") is technically
illegal as your string must begin with a / but it worked once I got
HelloWorld.html into the right place :)
If I make the following change, I get the expected text/html
content-type coming back:
var output = baos.toByteArray();
response.setContentLength(length);
- response.setContentType(contentType);
+ //response.setContentType(contentType);
try (var os = response.getOutputStream()) {
os.write(output, 0, output.length);
That is, if you don't try to change the content-type a second time.
So it seems that URLConnection.getContentType is returning an unexpected
content type.
ServletContext.getResource().openConnection() is returning an instance
of
org.apache.catalina.webresources.CachedResource$CachedResourceURLConnection
The code for CachedResourceURLConnection.getContentType is:
@Override
public String getContentType() {
    // "content/unknown" is the value used by sun.net.www.URLConnection.
    // It is used here for consistency.
    return Objects.requireNonNullElse(getResource().getMimeType(), "content/unknown");
}
So I think this is "expected behavior", though it may not be expected by
you.
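To make the effect concrete, here is a small standalone sketch of the same null-handling. It assumes the resource's MIME type lookup returns null (which the content/unknown response suggests), and it uses a hypothetical stand-in method, contentTypeOrUnknown, rather than Tomcat's actual class:

```java
import java.util.Objects;

final class UnknownTypeDemo {

    // Hypothetical stand-in for CachedResourceURLConnection.getContentType();
    // mimeType plays the role of getResource().getMimeType().
    static String contentTypeOrUnknown(String mimeType) {
        return Objects.requireNonNullElse(mimeType, "content/unknown");
    }

    public static void main(String[] args) {
        String contentType = contentTypeOrUnknown(null);

        // Same check as in the reproducer: it only catches null, so the
        // fallback to text/html is skipped once a placeholder is returned.
        if (null == contentType) {
            contentType = "text/html";
        }

        System.out.println(contentType); // prints "content/unknown"
    }
}
```

Because the placeholder is a non-null string, a null check alone no longer triggers the servlet's own default.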
My assertion that returning content/unknown was a Tomcat bug was based
upon the idea that Tomcat was either providing the wrong default (as per
HTTP spec) MIME type for an HTTP response, or that Tomcat was somehow
ignoring the content type you explicitly requested, but neither is the
case.
The cached resource is not providing the correct MIME type, but that
content type is being used without re-checking it in any way.
Instead of using the content type coming from the URLConnection, perhaps
you want to use name-based MIME type detection?
contentType = request.getServletContext().getMimeType("HelloWorld.html");
But the change that did it was 84608e11906b4d56e74a3ea2f5a4df0b9e8ee09a:
+
+    @Override
+    public String getContentType() {
+        // "content/unknown" is the value used by sun.net.www.URLConnection.
+        // It is used here for consistency.
+        return Objects.requireNonNullElse(getResource().getMimeType(), "content/unknown");
+    }
This was added to fix this bug:
https://bz.apache.org/bugzilla/show_bug.cgi?id=69623
It seems that returning a null MIME type here causes problems
downstream. I might change your code to this:
- if (null == contentType) {
+ if (null == contentType || contentType.equals("content/unknown")) {
contentType = "text/html";
}
This will make the code compatible with Sun's URLConnection class when
it can't determine the content-type of a resource.
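If you would rather keep that check in one place, here is a minimal sketch of it as a helper. The class name ContentTypeFallback and the method effectiveContentType are hypothetical (they are not part of Tomcat or the servlet API), and the case-insensitive comparison is my own assumption:

```java
// Hypothetical helper, not part of Tomcat or the servlet API.
final class ContentTypeFallback {

    // Placeholder returned by sun.net.www.URLConnection (and, since 10.1.40,
    // by Tomcat's CachedResourceURLConnection) when the type is not known.
    static final String UNKNOWN = "content/unknown";

    private ContentTypeFallback() {
    }

    // Returns the detected type unless it is null or "content/unknown",
    // in which case the supplied fallback is returned instead.
    static String effectiveContentType(String detected, String fallback) {
        if (detected == null || UNKNOWN.equalsIgnoreCase(detected.trim())) {
            return fallback;
        }
        return detected;
    }

    public static void main(String[] args) {
        System.out.println(effectiveContentType(null, "text/html"));              // prints "text/html"
        System.out.println(effectiveContentType("content/unknown", "text/html")); // prints "text/html"
        System.out.println(effectiveContentType("image/png", "text/html"));       // prints "image/png"
    }
}
```

In the servlet this would replace the null check, e.g. contentType = ContentTypeFallback.effectiveContentType(connection.getContentType(), "text/html"); and it behaves the same on 10.1.39 and 10.1.40.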
I think an argument might be made for attempting to use Tomcat's
existing filename-based MIME registry from within this class, though.
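As a rough illustration of what a filename-based fallback looks like, here is a standalone sketch. It uses the JDK's URLConnection.guessContentTypeFromName as a stand-in, since Tomcat's own registry is only reachable inside a container (there you would call ServletContext.getMimeType instead). The class and method names are hypothetical:

```java
import java.net.URLConnection;

// Hypothetical sketch: prefer the detected type, then a name-based guess,
// then a caller-supplied default.
final class NameBasedMimeLookup {

    private NameBasedMimeLookup() {
    }

    static String resolve(String detected, String fileName, String defaultType) {
        // Trust the detected type only if it is a real value.
        if (detected != null && !"content/unknown".equals(detected)) {
            return detected;
        }
        // JDK stand-in for a container MIME registry; returns null when the
        // file name does not map to a known type.
        String byName = URLConnection.guessContentTypeFromName(fileName);
        return byName != null ? byName : defaultType;
    }

    public static void main(String[] args) {
        System.out.println(resolve("content/unknown", "HelloWorld.html", "application/octet-stream"));
        System.out.println(resolve("image/png", "HelloWorld.html", "application/octet-stream"));
    }
}
```

The JDK's table and Tomcat's MIME mappings are separate, so results can differ for less common extensions.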
-chris
package com.example;

import java.io.ByteArrayOutputStream;
import java.io.IOException;

import jakarta.servlet.ServletException;
import jakarta.servlet.annotation.WebServlet;
import jakarta.servlet.http.HttpServlet;
import jakarta.servlet.http.HttpServletRequest;
import jakarta.servlet.http.HttpServletResponse;

@WebServlet("/HelloWorld")
public class HelloWorld extends HttpServlet {

    @Override
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        response.setContentType("text/html; charset=UTF-8");

        var resource = request.getServletContext().getResource("HelloWorld.html");
        var connection = resource.openConnection();
        String contentType = connection.getContentType();
        if (null == contentType) {
            contentType = "text/html";
        }

        var length = connection.getContentLength();
        var baos = new ByteArrayOutputStream();
        try (var is = connection.getInputStream()) {
            is.transferTo(baos);
        }

        var output = baos.toByteArray();
        response.setContentLength(length);
        response.setContentType(contentType);
        try (var os = response.getOutputStream()) {
            os.write(output, 0, output.length);
        }
    }
}
web.xml:
<?xml version="1.0" encoding="UTF-8"?>
<web-app xmlns="http://xmlns.jcp.org/xml/ns/javaee"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://xmlns.jcp.org/xml/ns/javaee
http://xmlns.jcp.org/xml/ns/javaee/web-app_3_1.xsd"
version="3.1" metadata-complete="true">
<display-name>Hello World Web Application</display-name>
<servlet>
<servlet-name>HelloWorld</servlet-name>
<servlet-class>com.example.HelloWorld</servlet-class>
</servlet>
<servlet-mapping>
<servlet-name>HelloWorld</servlet-name>
<url-pattern>/*</url-pattern>
</servlet-mapping>
</web-app>
HelloWorld.html:
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" >
<title>Hello World!</title>
</head>
<body>
<p>Hello World!</p>
</body>
</html>
With Tomcat 10.1.39 I'm getting the following result with curl:
%> curl -v http://localhost:8080/hello
(...)
* Request completely sent off
< HTTP/1.1 200
< Content-Type: text/html;charset=UTF-8
< Content-Length: 212
(...)
With Tomcat 10.1.40:
(...)
* Request completely sent off
< HTTP/1.1 200
< Content-Type: content/unknown;charset=UTF-8
< Content-Length: 212
(...)
The reason is exactly what you assumed and the change that Mark mentioned:
Since 10.1.40 the class CachedResource$CachedResourceURLConnection now
has a new method "public String getContentType()" that is causing this
difference...
Ok, we could change our code so that if the content type is
"content/unknown" we replace it with "text/html". OTOH, with respect
to our customers this isn't really a good solution: on the one hand, some
of them run older releases that would have to be patched. On the other
hand, we normally have no control over their environment; we could
only give advice, in this case not to upgrade to Tomcat
10.1.40...
WDYT?
Thorsten
---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org