alamb commented on PR #16351:
URL: https://github.com/apache/datafusion/pull/16351#issuecomment-3025176564
Thank you @corwinjoy
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comme
corwinjoy commented on PR #16351:
URL: https://github.com/apache/datafusion/pull/16351#issuecomment-3021088650
Thanks @alamb, the review and helpful feedback are much appreciated! We hope
to have a follow-up PR soon with a config to make encryption optional.
alamb commented on PR #16351:
URL: https://github.com/apache/datafusion/pull/16351#issuecomment-3015285765
Thanks again @corwinjoy / @adamreeve and everyone else. This is great
alamb merged PR #16351:
URL: https://github.com/apache/datafusion/pull/16351
corwinjoy commented on code in PR #16351:
URL: https://github.com/apache/datafusion/pull/16351#discussion_r2168131023
##
datafusion-cli/tests/sql/encrypted_parquet.sql:
##
@@ -0,0 +1,75 @@
+/*
+Test parquet encryption and decryption in DataFusion SQL.
+See datafusion/com
corwinjoy commented on code in PR #16351:
URL: https://github.com/apache/datafusion/pull/16351#discussion_r2168097037
##
datafusion/common/src/config.rs:
##
@@ -2017,6 +2056,305 @@ config_namespace_with_hashmap! {
}
}
+#[derive(Clone, Debug, Default, PartialEq)]
+pub str
corwinjoy commented on code in PR #16351:
URL: https://github.com/apache/datafusion/pull/16351#discussion_r2168085502
##
docs/source/user-guide/configs.md:
##
@@ -81,6 +81,8 @@ Environment variables are read during `SessionConfig`
initialisation so they mus
| datafusion.execut
corwinjoy commented on code in PR #16351:
URL: https://github.com/apache/datafusion/pull/16351#discussion_r2168078769
##
datafusion/proto-common/src/from_proto/mod.rs:
##
@@ -1066,6 +1066,7 @@ impl TryFrom<&protobuf::TableParquetOptions> for
TableParquetOptions {
adamreeve commented on code in PR #16351:
URL: https://github.com/apache/datafusion/pull/16351#discussion_r2167900882
##
datafusion/datasource-parquet/src/file_format.rs:
##
@@ -930,12 +959,14 @@ pub async fn fetch_parquet_metadata(
store: &dyn ObjectStore,
meta: &Obje
adamreeve commented on code in PR #16351:
URL: https://github.com/apache/datafusion/pull/16351#discussion_r2167883986
##
datafusion/datasource-parquet/src/file_format.rs:
##
@@ -930,12 +959,14 @@ pub async fn fetch_parquet_metadata(
store: &dyn ObjectStore,
meta: &Obje
alamb commented on code in PR #16351:
URL: https://github.com/apache/datafusion/pull/16351#discussion_r2167542855
##
docs/source/user-guide/configs.md:
##
@@ -81,6 +81,8 @@ Environment variables are read during `SessionConfig`
initialisation so they mus
| datafusion.execution.
alamb commented on code in PR #16351:
URL: https://github.com/apache/datafusion/pull/16351#discussion_r2167546893
##
datafusion-cli/tests/sql/encrypted_parquet.sql:
##
@@ -0,0 +1,75 @@
+/*
+Test parquet encryption and decryption in DataFusion SQL.
+See datafusion/common/
corwinjoy commented on PR #16351:
URL: https://github.com/apache/datafusion/pull/16351#issuecomment-3003280082
> Thank you @corwinjoy and @adamreeve -- this PR was a joy to read and
review. The code is clear, well commented, and well tested ❤️ 🏆
>
> I think we should follow up with:
corwinjoy commented on code in PR #16351:
URL: https://github.com/apache/datafusion/pull/16351#discussion_r2165756822
##
datafusion-cli/tests/sql/encrypted_parquet.sql:
##
@@ -0,0 +1,75 @@
+/*
+Test parquet encryption and decryption in DataFusion SQL.
+See datafusion/com
corwinjoy commented on code in PR #16351:
URL: https://github.com/apache/datafusion/pull/16351#discussion_r2165246656
##
docs/source/user-guide/configs.md:
##
@@ -81,6 +81,8 @@ Environment variables are read during `SessionConfig`
initialisation so they mus
| datafusion.execut
alamb commented on code in PR #16351:
URL: https://github.com/apache/datafusion/pull/16351#discussion_r2165219803
##
datafusion/core/src/dataframe/parquet.rs:
##
@@ -246,4 +246,72 @@ mod tests {
Ok(())
}
+
+#[tokio::test]
+async fn roundtrip_parquet_with_
corwinjoy commented on code in PR #16351:
URL: https://github.com/apache/datafusion/pull/16351#discussion_r2165202964
##
datafusion/core/src/dataframe/parquet.rs:
##
@@ -246,4 +246,72 @@ mod tests {
Ok(())
}
+
+#[tokio::test]
+async fn roundtrip_parquet_w
alamb commented on code in PR #16351:
URL: https://github.com/apache/datafusion/pull/16351#discussion_r2164809665
##
datafusion/core/src/dataframe/parquet.rs:
##
@@ -246,4 +246,72 @@ mod tests {
Ok(())
}
+
+#[tokio::test]
+async fn roundtrip_parquet_with_
mbutrovich commented on PR #16351:
URL: https://github.com/apache/datafusion/pull/16351#issuecomment-3000137303
> I am sorry I haven't had a chance to review this yet. It would be great if
@mbutrovich could also take a look. I have this on my list to review but I
haven't been able to find t
adamreeve commented on PR #16351:
URL: https://github.com/apache/datafusion/pull/16351#issuecomment-2998703950
I've been experimenting with how this work could be extended to support more
ways of configuring encryption beyond having fixed and known AES keys for all
files. For example, data
corwinjoy commented on code in PR #16351:
URL: https://github.com/apache/datafusion/pull/16351#discussion_r2151077910
##
datafusion/common/src/config.rs:
##
@@ -591,6 +930,12 @@ config_namespace! {
/// writing out already in-memory data, such as from a cached
/
adamreeve commented on code in PR #16351:
URL: https://github.com/apache/datafusion/pull/16351#discussion_r2151059658
##
datafusion/common/src/config.rs:
##
@@ -188,6 +195,338 @@ macro_rules! config_namespace {
}
}
+#[derive(Clone, Default, Debug, PartialEq)]
+pub struct
alamb commented on PR #16351:
URL: https://github.com/apache/datafusion/pull/16351#issuecomment-2978284818
I am sorry I haven't had a chance to review this yet. It would be great if
@mbutrovich could also take a look. I have this on my list to review but I
haven't been able to find the tim
mbutrovich commented on PR #16351:
URL: https://github.com/apache/datafusion/pull/16351#issuecomment-2977839997
Thank you and @adamreeve for driving so much of the modular encryption work!
I'll take a look at this branch this week and see how this might get Comet
supporting modular encrypti
corwinjoy commented on PR #16351:
URL: https://github.com/apache/datafusion/pull/16351#issuecomment-2968354595
@alamb One piece I would like to solicit feedback on is if there is a way to
leverage the existing tests to more thoroughly vet encryption. What I mean by
that, is that we uncovere
alamb commented on code in PR #16351:
URL: https://github.com/apache/datafusion/pull/16351#discussion_r2139135378
##
datafusion/common/src/config.rs:
##
@@ -591,6 +930,12 @@ config_namespace! {
/// writing out already in-memory data, such as from a cached
/// d
adamreeve commented on code in PR #16351:
URL: https://github.com/apache/datafusion/pull/16351#discussion_r2136730391
##
datafusion/datasource-parquet/src/file_format.rs:
##
@@ -1259,9 +1302,14 @@ impl FileSink for ParquetSink {
object_store: Arc,
) -> Result {
corwinjoy commented on code in PR #16351:
URL: https://github.com/apache/datafusion/pull/16351#discussion_r2136710028
##
benchmarks/src/bin/dfbench.rs:
##
@@ -60,11 +60,11 @@ pub async fn main() -> Result<()> {
Options::Cancellation(opt) => opt.run().await,
Opt
corwinjoy commented on code in PR #16351:
URL: https://github.com/apache/datafusion/pull/16351#discussion_r2136714217
##
datafusion/common/src/config.rs:
##
@@ -188,6 +195,338 @@ macro_rules! config_namespace {
}
}
+#[derive(Clone, Default, Debug, PartialEq)]
+pub struct
corwinjoy commented on PR #16351:
URL: https://github.com/apache/datafusion/pull/16351#issuecomment-2957368512
@adamreeve @rok
corwinjoy commented on code in PR #16351:
URL: https://github.com/apache/datafusion/pull/16351#discussion_r2136732397
##
datafusion/datasource-parquet/src/file_format.rs:
##
@@ -1259,9 +1302,14 @@ impl FileSink for ParquetSink {
object_store: Arc,
) -> Result {
adamreeve commented on code in PR #16351:
URL: https://github.com/apache/datafusion/pull/16351#discussion_r2136733767
##
datafusion/common/src/config.rs:
##
@@ -188,6 +195,338 @@ macro_rules! config_namespace {
}
}
+#[derive(Clone, Default, Debug, PartialEq)]
+pub struct
corwinjoy commented on code in PR #16351:
URL: https://github.com/apache/datafusion/pull/16351#discussion_r2136718671
##
datafusion/common/src/config.rs:
##
@@ -591,6 +930,12 @@ config_namespace! {
/// writing out already in-memory data, such as from a cached
/
corwinjoy commented on code in PR #16351:
URL: https://github.com/apache/datafusion/pull/16351#discussion_r2136721651
##
datafusion/datasource-parquet/src/file_format.rs:
##
@@ -1259,9 +1302,14 @@ impl FileSink for ParquetSink {
object_store: Arc,
) -> Result {
corwinjoy commented on code in PR #16351:
URL: https://github.com/apache/datafusion/pull/16351#discussion_r2136715685
##
datafusion/common/src/config.rs:
##
@@ -188,6 +195,338 @@ macro_rules! config_namespace {
}
}
+#[derive(Clone, Default, Debug, PartialEq)]
+pub struct
corwinjoy opened a new pull request, #16351:
URL: https://github.com/apache/datafusion/pull/16351
## Which issue does this PR close?
- Closes #15216.
## What changes are included in this PR?
This PR adds support for encryption in DataFusion’s Parquet implementation.