[ML] Setting Non-Transform Params for a Pipeline & PipelineModel

2018-09-05 Thread Aleksander Eskilson
I had originally sent this to the Dev list since the API discussed here is still marked as experimental in portions, but it occurs to me this may still be a general-use question. Sorry for the cross-listing. In a nutshell, what I'd like to do is instantiate a Pipeline (or extension class of Pipeli

Driver/Executor Memory values during Unit Testing

2016-12-07 Thread Aleksander Eskilson
Hi there, I've been trying to increase the spark.driver.memory and spark.executor.memory during some unit tests. Most of the information I can find about increasing memory for Spark is based on either flags to spark-submit, or settings in the spark-defaults.conf file. Running unit tests with Maven

[SQL/Catalyst] Janino Generated Code Debugging

2016-11-16 Thread Aleksander Eskilson
Hi there, I have some jobs generating Java code (via Janino) that I would like to inspect more directly during runtime. The Janino page seems to indicate an environment variable can be set to support debugging the generated code, allowing one to step into it directly and inspect variables and se

Extracting Row Value for Deserializer Expression

2016-10-04 Thread Aleksander Eskilson
Hi there, Currently working on a custom Encoder for a kind of schema-based Java object. For the object's schema, field positions and types are isomorphic to SQL column ordinals and types. The implementation should be quite similar to the JavaBean Encoder, but as we have a schema, class-based refl