LantaoJin opened a new issue, #79:
URL: https://github.com/apache/datafusion-java/issues/79
### Is your feature request related to a problem or challenge?
Every error from the native side currently surfaces as a single
`java.lang.RuntimeException`. `try_unwrap_or_throw` in `native/src/errors.rs`
calls `env.throw_new("java/lang/RuntimeException", message)` for every
`Err(...)` it sees, so a `DataFusionError::Plan` and a
`DataFusionError::ResourcesExhausted` and a `DataFusionError::IoError` are
indistinguishable on the Java side. Callers that want to retry on transient
resource pressure but not on a malformed query plan have only the message
string to discriminate on — fragile and brittle to upstream wording changes.
A representative pain point: a service wrapper around `df.collect()` wants
to bubble plan errors up as 4xx and resource-exhausted as 5xx-with-retry, and
today that has to be done with
`e.getMessage().contains("ResourcesExhausted")`-style sniffing.
### Describe the solution you'd like
A small typed hierarchy under `org.apache.datafusion`, plus a routing pass
in `try_unwrap_or_throw` that picks the right subclass based on the originating
`DataFusionError` variant.
```
DataFusionException extends RuntimeException
├── PlanException // DataFusionError::Plan, ::SQL, ::SchemaError
├── ExecutionException // DataFusionError::Execution, ::ExecutionJoin
├── ResourcesExhaustedException
├── IoException // DataFusionError::IoError, ::ObjectStore,
::ArrowError, ::ParquetError, ::AvroError
├── NotImplementedException
└── ConfigurationException
```
Mapping (one Java class per "error category that matters to a caller";
multiple Rust variants fold into one Java class when they're the same kind of
problem from the caller's standpoint):
| `DataFusionError` variant | Java exception |
|---|---|
| `Plan`, `SQL`, `SchemaError` | `PlanException` |
| `Execution`, `ExecutionJoin`, `External`, `Ffi` | `ExecutionException` |
| `ResourcesExhausted` | `ResourcesExhaustedException` |
| `IoError`, `ObjectStore`, `ArrowError`, `ParquetError`, `AvroError` |
`IoException` |
| `NotImplemented` | `NotImplementedException` |
| `Configuration` | `ConfigurationException` |
| `Internal`, `Substrait`, `Diagnostic`, `Collection`, `Shared`, `Context` |
`DataFusionException` (parent; no good caller-facing category) |
| Rust panic (caught by `try_unwrap_or_throw`) | `DataFusionException`
(parent) with a `panic: ` prefix |
Every subclass is a `RuntimeException`, so existing callers that `catch
(RuntimeException)` keep working unchanged. The mapping is deliberately
conservative — the hierarchy is shallow, and "I don't know what to do with
this" stays as the parent class rather than inventing a class per Rust variant.
Each subclass has the standard pair of constructors:
```java
public PlanException(String message) { super(message); }
public PlanException(String message, Throwable cause) { super(message,
cause); }
```
The `(String, Throwable)` constructor is the seam for a follow-up that
propagates Java stack traces from upcalls (issue **#55**) — when a
Java-implemented UDF or table-provider scan throws, the resulting
`DataFusionException` can carry the original `Throwable` as its `cause` instead
of stringifying it. This PR doesn't actually wire that path — it just makes
sure the constructor exists so #55 doesn't have to change the public API
surface.
### Describe alternatives you've considered
**Mirror every `DataFusionError` variant 1:1.** Maximally faithful, but
exposes implementation details (`SchemaError`, `Diagnostic`, `Shared`,
`Collection`, `Context`) that aren't useful to a Java caller and that upstream
itself may renumber. The proposed mapping is a stable contract; the underlying
variant set is not.
### Additional context
This issue lands before #55 . Reasons:
1. #55 needs a Java parent class to carry the original `Throwable` as
`cause`. This PR creates it. Otherwise #55 has to invent its own and renames it
later.
2. This issue is mechanical (one match arm per variant); #55 has a real
design decision (stringify the Java stack trace, or keep the `Throwable` alive
across the JNI via a registry handle so it can become `cause`). That decision
is easier with the parent already merged.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]