LantaoJin opened a new issue, #79:
URL: https://github.com/apache/datafusion-java/issues/79

   ### Is your feature request related to a problem or challenge?
   
   Every error from the native side currently surfaces as a single 
`java.lang.RuntimeException`. `try_unwrap_or_throw` in `native/src/errors.rs` 
calls `env.throw_new("java/lang/RuntimeException", message)` for every 
`Err(...)` it sees, so a `DataFusionError::Plan` and a 
`DataFusionError::ResourcesExhausted` and a `DataFusionError::IoError` are 
indistinguishable on the Java side. Callers that want to retry on transient 
resource pressure but not on a malformed query plan have only the message 
string to discriminate on — fragile and brittle to upstream wording changes.
   
   A representative pain point: a service wrapper around `df.collect()` wants 
to bubble plan errors up as 4xx and resource-exhausted as 5xx-with-retry, and 
today that has to be done with 
`e.getMessage().contains("ResourcesExhausted")`-style sniffing.
   
   ### Describe the solution you'd like
   
   A small typed hierarchy under `org.apache.datafusion`, plus a routing pass 
in `try_unwrap_or_throw` that picks the right subclass based on the originating 
`DataFusionError` variant.
   
   ```
   DataFusionException        extends RuntimeException
   ├── PlanException          // DataFusionError::Plan, ::SQL, ::SchemaError
   ├── ExecutionException     // DataFusionError::Execution, ::ExecutionJoin
   ├── ResourcesExhaustedException
   ├── IoException            // DataFusionError::IoError, ::ObjectStore, 
::ArrowError, ::ParquetError, ::AvroError
   ├── NotImplementedException
   └── ConfigurationException
   ```
   
   Mapping (one Java class per "error category that matters to a caller"; 
multiple Rust variants fold into one Java class when they're the same kind of 
problem from the caller's standpoint):
   
   | `DataFusionError` variant | Java exception |
   |---|---|
   | `Plan`, `SQL`, `SchemaError` | `PlanException` |
   | `Execution`, `ExecutionJoin`, `External`, `Ffi` | `ExecutionException` |
   | `ResourcesExhausted` | `ResourcesExhaustedException` |
   | `IoError`, `ObjectStore`, `ArrowError`, `ParquetError`, `AvroError` | 
`IoException` |
   | `NotImplemented` | `NotImplementedException` |
   | `Configuration` | `ConfigurationException` |
   | `Internal`, `Substrait`, `Diagnostic`, `Collection`, `Shared`, `Context` | 
`DataFusionException` (parent; no good caller-facing category) |
   | Rust panic (caught by `try_unwrap_or_throw`) | `DataFusionException` 
(parent) with a `panic: ` prefix |
   
   Every subclass is a `RuntimeException`, so existing callers that `catch 
(RuntimeException)` keep working unchanged. The mapping is deliberately 
conservative — the hierarchy is shallow, and "I don't know what to do with 
this" stays as the parent class rather than inventing a class per Rust variant.
   
   Each subclass has the standard pair of constructors:
   
   ```java
   public PlanException(String message) { super(message); }
   public PlanException(String message, Throwable cause) { super(message, 
cause); }
   ```
   
   The `(String, Throwable)` constructor is the seam for a follow-up that 
propagates Java stack traces from upcalls (issue **#55**) — when a 
Java-implemented UDF or table-provider scan throws, the resulting 
`DataFusionException` can carry the original `Throwable` as its `cause` instead 
of stringifying it. This PR doesn't actually wire that path — it just makes 
sure the constructor exists so #55 doesn't have to change the public API 
surface.
   
   ### Describe alternatives you've considered
   
   **Mirror every `DataFusionError` variant 1:1.** Maximally faithful, but 
exposes implementation details (`SchemaError`, `Diagnostic`, `Shared`, 
`Collection`, `Context`) that aren't useful to a Java caller and that upstream 
itself may renumber. The proposed mapping is a stable contract; the underlying 
variant set is not.
   
   ### Additional context
   
   This issue lands before #55 . Reasons:
   1. #55 needs a Java parent class to carry the original `Throwable` as 
`cause`. This PR creates it. Otherwise #55 has to invent its own and renames it 
later.
   2. This issue is mechanical (one match arm per variant); #55 has a real 
design decision (stringify the Java stack trace, or keep the `Throwable` alive 
across the JNI via a registry handle so it can become `cause`). That decision 
is easier with the parent already merged.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to