XiangpengHao commented on code in PR #15325:
URL: https://github.com/apache/datafusion/pull/15325#discussion_r2008190901
##########
datafusion/wasmtest/src/lib.rs:
##########
@@ -182,4 +182,29 @@ mod test {
         let task_ctx = ctx.task_ctx();
         let _ = collect(physical_plan, task_ctx).await.unwrap();
     }
+
+    #[wasm_bindgen_test(unsupported = tokio::test)]
+    async fn test_parquet_write() {
+        let schema = Arc::new(Schema::new(vec![
+            Field::new("id", DataType::Int32, false),
+            Field::new("value", DataType::Utf8, false),
+        ]));
+
+        let data: Vec<ArrayRef> = vec![
+            Arc::new(Int32Array::from(vec![1])),
+            Arc::new(StringArray::from(vec!["a"])),
+        ];
+
+        let batch = RecordBatch::try_new(schema.clone(), data).unwrap();
+        let mut buffer = Vec::new();
+        let mut writer = datafusion::parquet::arrow::ArrowWriter::try_new(
+            &mut buffer,
+            schema.clone(),
+            None,
+        )
+        .unwrap();
+
+        writer.write(&batch).unwrap();

Review Comment:
   I agree; I think the current code only tests the re-exported Parquet functionality and does not touch any DataFusion-related code.

   Ideally, we should test the end-to-end Parquet reading process, which roughly looks like this:
   1. Create an [in-memory object_store](https://docs.rs/object_store/latest/object_store/memory/struct.InMemory.html) and put the Parquet data you generated into it.
   2. [Register the object_store](https://github.com/apache/datafusion/blob/3269f01b42021cfab181577d579b0544808b4fca/datafusion/core/src/execution/context/mod.rs#L494), along with the path, with DataFusion.
   3. Run a SQL query from the DataFusion side to check that the results can be read back.

   A loosely related test can be found here: https://github.com/XiangpengHao/parquet-viewer/blob/main/src/tests.rs#L9