Copilot commented on code in PR #2870:
URL: https://github.com/apache/tika/pull/2870#discussion_r3363959200
##########
docs/modules/ROOT/pages/migration-to-4x/migrating-tika-server-4x.adoc:
##########
@@ -89,6 +89,25 @@ The separate `/config` endpoints have been removed. Configuration is now handled
**Migration:** Use `POST /tika` or `POST /tika/json` with a `config` part in your multipart request.
+=== Error Response Bodies Are Now JSON
+
+In 3.x, error responses from `/tika`, `/rmeta`, and `/unpack` returned a plain-text
+body such as `"Parse failed: TIMEOUT"`. In 4.x these endpoints return a JSON body:
+
+[source,json]
+----
+{"status": "TIMEOUT", "message": "Task timed out after 60000ms"}
+----
Review Comment:
The migration guide example implies the `message` field is always present in
JSON error bodies, but the implementation only includes it when
`returnStackTrace=true` (otherwise the body is typically status-only). This can
mislead client migrations that expect `message` to exist.
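A client that tolerates both body shapes could look roughly like the sketch below. It is an illustration only: `TikaErrorBody` is a hypothetical name, and the regex extraction is an assumption made to keep the example dependency-free (a real client would use a JSON parser such as Jackson). The point is that `status` is required while `message` is treated as optional.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical client-side view of a tika-server 4.x error body. */
final class TikaErrorBody {
    // Regex extraction keeps this sketch dependency-free; it only handles
    // the flat {"status": ..., "message": ...} shape shown in the guide.
    private static final Pattern STATUS =
            Pattern.compile("\"status\"\\s*:\\s*\"([^\"\\\\]*)\"");
    private static final Pattern MESSAGE =
            Pattern.compile("\"message\"\\s*:\\s*\"((?:[^\"\\\\]|\\\\.)*)\"");

    final String status;
    // Empty unless the server included a message (returnStackTrace=true).
    // The value is the raw JSON string content, with escapes left as-is.
    final Optional<String> rawMessage;

    private TikaErrorBody(String status, Optional<String> rawMessage) {
        this.status = status;
        this.rawMessage = rawMessage;
    }

    static TikaErrorBody parse(String json) {
        Matcher s = STATUS.matcher(json);
        if (!s.find()) {
            throw new IllegalArgumentException("no status field in error body");
        }
        Matcher m = MESSAGE.matcher(json);
        Optional<String> message =
                m.find() ? Optional.of(m.group(1)) : Optional.empty();
        return new TikaErrorBody(s.group(1), message);
    }
}
```

With this shape, `TikaErrorBody.parse("{\"status\":\"TIMEOUT\"}")` yields a status and an empty `rawMessage`, so callers branch on `status` and never depend on `message` being present. The guide could show both the status-only body and the body with `message` to make that expectation explicit.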
##########
tika-server/tika-server-core/src/main/java/org/apache/tika/server/core/resource/PipesParsingHelper.java:
##########
@@ -184,33 +194,59 @@ private String getSuffix(Metadata metadata) {
return ".tmp";
}
+ /**
+ * Builds a JSON error response carrying a subset of the {@code PipesResult}
+ * serialization. By default the body is just {@code {"status": "TIMEOUT"}}. The
+ * {@code PipesResult} message frequently contains a server-side stack trace
+ * (e.g. for {@code *_EXCEPTION} statuses), so the {@code message} field is included
+ * only when {@code returnStackTrace} is enabled — matching the legacy
+ * {@code TikaServerParseExceptionMapper}, which gates stack traces the same way.
+ * Successful-parse fields such as {@code emitData} are never part of an error body.
+ * <p>
+ * This allows clients to distinguish failure modes (TIMEOUT, OOM, UNSPECIFIED_CRASH, …)
+ * without parsing plain-text bodies or inspecting custom headers.
+ */
+ private Response buildProcessFailureResponse(PipesResult result) {
+ ObjectMapper mapper = new ObjectMapper();
+ ObjectNode node = mapper.createObjectNode();
+ node.put("status", result.status().name());
+ if (returnStackTrace && result.message() != null && !result.message().isBlank()) {
+ node.put("message", result.message());
+ }
+ String json;
+ try {
+ json = mapper.writeValueAsString(node);
+ } catch (Exception e) {
+ json = "{\"status\":\"" + result.status().name() + "\"}";
+ }
Review Comment:
Exceptions during JSON serialization are swallowed silently here. If
serialization fails, we’ll return a fallback body without any indication in
logs, which makes diagnosing unexpected failures harder.
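One possible shape for the fix is to log the exception before falling back to the status-only body. The sketch below is an assumption-laden illustration, not the PR's code: `ErrorBodyFallback` and `JsonWriter` are hypothetical names, `JsonWriter` stands in for the `mapper.writeValueAsString(node)` call, and `java.util.logging` is used only to keep the example dependency-free (the class would use whatever logger it already has).

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical helper showing a logged fallback for serialization errors. */
final class ErrorBodyFallback {
    private static final Logger LOG =
            Logger.getLogger(ErrorBodyFallback.class.getName());

    /** Hypothetical stand-in for mapper.writeValueAsString(node). */
    interface JsonWriter {
        String write() throws Exception;
    }

    static String toJsonOrFallback(String statusName, JsonWriter writer) {
        try {
            return writer.write();
        } catch (Exception e) {
            // Record the failure before degrading to the status-only body,
            // so the fallback is visible in the server logs.
            LOG.log(Level.WARNING,
                    "Failed to serialize error response for status "
                            + statusName + "; returning status-only body", e);
            return "{\"status\":\"" + statusName + "\"}";
        }
    }
}
```

The response body is unchanged in both paths; the only difference from the current code is that the fallback path leaves a warning with the cause attached.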