timsaucer commented on code in PR #43: URL: https://github.com/apache/datafusion-ray/pull/43#discussion_r1856564321
########## src/context.rs: ########## @@ -45,22 +46,30 @@ pub struct PyContext { pub(crate) fn execution_plan_from_pyany( py_plan: &Bound<PyAny>, + py: Python, ) -> PyResult<Arc<dyn ExecutionPlan>> { - let py_proto = py_plan.call_method0("to_proto")?; - let plan_bytes: &[u8] = py_proto.extract()?; - let plan_node = protobuf::PhysicalPlanNode::try_decode(plan_bytes).map_err(|e| { - PyRuntimeError::new_err(format!( - "Unable to decode physical plan protobuf message: {}", - e - )) - })?; - - let codec = ShuffleCodec {}; - let runtime = RuntimeEnv::default(); - let registry = SessionContext::new(); - plan_node - .try_into_physical_plan(®istry, &runtime, &codec) - .map_err(|e| e.into()) + if let Ok(py_plan) = py_plan.to_object(py).downcast_bound::<PyExecutionPlan>(py) { + // For session contexts created with datafusion_ray.extended_session_context(), the inner + // execution plan can be used as such (and the enabled extensions are all available). + Ok(py_plan.borrow().plan.clone()) + } else { + // The session context originates from outside our library, so we'll grab the protobuf plan + // by calling the python method with no extension codecs. + let py_proto = py_plan.call_method0("to_proto")?; + let plan_bytes: &[u8] = py_proto.extract()?; + let plan_node = protobuf::PhysicalPlanNode::try_decode(plan_bytes).map_err(|e| { + PyRuntimeError::new_err(format!( + "Unable to decode physical plan protobuf message: {}", + e + )) + })?; + + let runtime = RuntimeEnv::default(); + let registry = SessionContext::new(); + plan_node + .try_into_physical_plan(®istry, &runtime, Extensions::codec()) + .map_err(|e| e.into()) + } Review Comment: One of my goals was to remove the `datafusion-python` dependency from `datafusion-ray` so that you wouldn't have hard requirements about having the exact same version between the two. It can be worse in that you also have to have the same *compiler* for both. Now for `datafusion-ray` we may be able to get away with it for official releases since we control the build pipeline for both. This does place a restriction on end users in that they have to make sure they keep these versions synced on their machine. In my opinion it would be better to lean on things like the FFI interface that is coming in `datafusion-python` 43.0.0. I know that right now doesn't solve the problem of having all extensions, though. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org