milenkovicm commented on code in PR #1374:
URL: 
https://github.com/apache/datafusion-ballista/pull/1374#discussion_r2678525417


##########
ballista/core/src/execution_plans/shuffle_reader.rs:
##########
@@ -521,33 +539,44 @@ async fn fetch_partition_remote(
 
 async fn fetch_partition_local(
     location: &PartitionLocation,
+    arrow_ipc_reader_skip_validation: bool,
 ) -> result::Result<SendableRecordBatchStream, BallistaError> {
     let path = &location.path;
     let metadata = &location.executor_meta;
     let partition_id = &location.partition_id;
 
-    let reader = fetch_partition_local_inner(path).map_err(|e| {
-        // return BallistaError::FetchFailed may let scheduler retry this task.
-        BallistaError::FetchFailed(
-            metadata.id.clone(),
-            partition_id.stage_id,
-            partition_id.partition_id,
-            e.to_string(),
-        )
-    })?;
+    let reader = fetch_partition_local_inner(path, 
arrow_ipc_reader_skip_validation)
+        .map_err(|e| {
+            // return BallistaError::FetchFailed may let scheduler retry this 
task.
+            BallistaError::FetchFailed(
+                metadata.id.clone(),
+                partition_id.stage_id,
+                partition_id.partition_id,
+                e.to_string(),
+            )
+        })?;
     Ok(Box::pin(LocalShuffleStream::new(reader)))
 }
 
 fn fetch_partition_local_inner(
     path: &str,
+    arrow_ipc_reader_skip_validation: bool,
 ) -> result::Result<StreamReader<BufReader<File>>, BallistaError> {
     let file = File::open(path).map_err(|e| {
         BallistaError::General(format!("Failed to open partition file at 
{path}: {e:?}"))
     })?;
     let file = BufReader::new(file);
-    let reader = StreamReader::try_new(file, None).map_err(|e| {
-        BallistaError::General(format!("Failed to new arrow FileReader at 
{path}: {e:?}"))
-    })?;
+    // Safety: setting `skip_validation` requires `unsafe`, user assures data 
is valid

Review Comment:
   I'm not sure if we're introducing security issue with `skip validation`, at 
the moment shuffle files use absolutely paths which may bring issues with skip 
validation. 
   
   as I mentioned already, would it make sense to make this compile time 
configuration so users could decide for themself if they want it or not?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to