mattcuento commented on PR #1360:
URL: https://github.com/apache/datafusion-ballista/pull/1360#issuecomment-3708756589

   > there is one issue with docker build and looks like issue with disk space 
failing other (not sure how to fix)
   Thanks. It looks like the `substrait` build picks up a `protoc` version 
that is too old to support proto3 optional fields by default.
   ```
   error: failed to run custom build command for `substrait v0.62.2`
   note: To improve backtraces for build dependencies, set the CARGO_PROFILE_RELEASE_BUILD_OVERRIDE_DEBUG=true environment variable to enable debug information generation.
   
   Caused by:
     process didn't exit successfully: `/home/builder/workspace/target/release/build/substrait-339db7bdcba362ae/build-script-build` (exit status: 1)
     --- stdout
     cargo:rerun-if-env-changed=FORCE_REBUILD
     cargo:rerun-if-changed=substrait
     cargo:rerun-if-changed=substrait/text/dialect_schema.yaml
     cargo:rerun-if-changed=substrait/text/simple_extensions_schema.yaml
     cargo:rerun-if-changed=substrait/proto/substrait/plan.proto
     cargo:rerun-if-changed=substrait/proto/substrait/extensions/extensions.proto
     cargo:rerun-if-changed=substrait/proto/substrait/type.proto
     cargo:rerun-if-changed=substrait/proto/substrait/parameterized_types.proto
     cargo:rerun-if-changed=substrait/proto/substrait/algebra.proto
     cargo:rerun-if-changed=substrait/proto/substrait/extended_expression.proto
     cargo:rerun-if-changed=substrait/proto/substrait/capabilities.proto
     cargo:rerun-if-changed=substrait/proto/substrait/function.proto
     cargo:rerun-if-changed=substrait/proto/substrait/type_expressions.proto
   
     --- stderr
     Error: Custom { kind: Other, error: "protoc failed: substrait/type.proto: This file contains proto3 optional fields, but --experimental_allow_proto3_optional was not set.\n" }
   ```
   
   From this [issue](https://github.com/apache/datafusion/issues/13853) I 
gathered that we could compile `protoc` ourselves for the build, and that's 
what the latest commit does. However, it looks like a few distributions 
[don't ship with or download cmake for 
compilation](https://github.com/apache/datafusion-ballista/actions/runs/20702788219/job/59427713820?pr=1360).
 @milenkovicm, do you have any advice here? Should I just add cmake in 
the necessary Dockerfiles?
   
   
   > Maybe as a follow up we should put a bit more documentation around this 
and example(s)
   Agreed, happy to file an issue to track it. I'm still kind of curious if 
more changes would be desired for user ergonomics. Do we have existing examples 
of connecting to a scheduler besides the client? I'd imagine it might be useful 
to add some convenience/wrapper methods to create scheduler gRPC clients, such 
as combining `create_grpc_client_connection` + 
`SchedulerGrpcClient::new(connection)` like in `distributed_query.rs`:
   ```
       info!("Connecting to Ballista scheduler at {scheduler_url}");
       // TODO reuse the scheduler to avoid connecting to the Ballista scheduler again and again
       let connection = create_grpc_client_connection(scheduler_url, &grpc_config)
           .await
           .map_err(|e| DataFusionError::Execution(format!("{e:?}")))?;
   
       let mut scheduler = SchedulerGrpcClient::new(connection)
           .max_encoding_message_size(max_message_size)
           .max_decoding_message_size(max_message_size);
   ```
   
   I don't know much about Ibis, but will take a look to see how it would 
integrate with this.
   
   > Also, could we gate substrait with config option, which could be on by 
default?
   > Users not needing it could disable it at compile time.
   Done! I updated the PR description and added the support and its 
conditional dependencies under a `substrait` feature.
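
For reference, compile-time gating of this kind usually pairs an optional dependency and a default-enabled Cargo feature with `cfg` attributes in the code. The following is only a minimal sketch of that pattern, not the code in this PR: it assumes the feature is named `substrait`, and the function names `substrait_enabled` and `describe_plan_support` are hypothetical.

```rust
/// Hypothetical helper: reports whether the (assumed) `substrait`
/// feature was enabled when this crate was compiled.
pub fn substrait_enabled() -> bool {
    cfg!(feature = "substrait")
}

/// Compiled only when the `substrait` feature is on.
#[cfg(feature = "substrait")]
pub fn describe_plan_support() -> &'static str {
    "SQL and Substrait plans"
}

/// Fallback compiled when the feature is disabled, for example with
/// `cargo build --no-default-features`.
#[cfg(not(feature = "substrait"))]
pub fn describe_plan_support() -> &'static str {
    "SQL plans only"
}
```

With a layout like this, users who do not need Substrait can build with `--no-default-features` and skip the dependency, and with it the `protoc` requirement of its build script.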


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

