Re: Question Regarding Spark Dependencies in Scala

2025-06-06 Thread Sem
> I may not need anything from Spark, but if I declare a dependency on Jackson or Guava with a different version than Spark already uses and package it, I might break things...

In that case I would recommend you to use assembly / assemblyShadeRules for sbt-assembly or maven-shade-plugin for Maven an
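
The shading approach recommended above could look roughly like the following `build.sbt` fragment. This is a minimal sketch, not the poster's actual build: it assumes the sbt-assembly plugin is already enabled in `project/plugins.sbt`, and the version numbers and the `myapp.shaded` target package are placeholders chosen for illustration.

```scala
// build.sbt (sketch; assumes sbt-assembly is enabled in project/plugins.sbt)

// Placeholder versions, not taken from the thread: use the ones your cluster and code need.
val sparkVersion = "3.5.1"
val guavaVersion = "33.0.0-jre"

libraryDependencies ++= Seq(
  // Spark is supplied by the cluster at runtime, so keep it out of the fat jar.
  "org.apache.spark" %% "spark-sql" % sparkVersion % "provided",
  // The application's own Guava, possibly newer than the copy Spark ships with.
  "com.google.guava" % "guava" % guavaVersion
)

// Relocate the bundled Guava and Jackson classes into a private namespace
// so they cannot clash with the versions on Spark's classpath.
// "myapp.shaded" is a hypothetical package name.
assembly / assemblyShadeRules := Seq(
  ShadeRule.rename("com.google.common.**" -> "myapp.shaded.guava.@1").inAll,
  ShadeRule.rename("com.fasterxml.jackson.**" -> "myapp.shaded.jackson.@1").inAll
)
```

With rules like these, `sbt assembly` rewrites both the relocated classes and the references to them inside the packaged jar, so the application code keeps importing `com.google.common` as usual. Whether Jackson needs shading at all depends on which Jackson modules the application pulls in.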

Re: [DISCUSS] SPIP: Declarative Pipelines

2025-04-10 Thread Sem
+1 (non-binding)

On April 9, 2025 7:29:40 AM GMT+02:00, Rishab Joshi wrote:
>+1 Exciting.
>Rishab Joshi
>
>On Tue, Apr 8, 2025, 10:04 PM Ruifeng Zheng wrote:
>
>> +1
>>
>> On Wed, Apr 9, 2025 at 12:57 PM Denny Lee wrote:
>>
>>> +1 (non-binding)
>>>
>>> On Tue, Apr 8, 2025 at 9:53 PM Yuming Wan

Re: [VOTE] SPIP: Declarative Pipelines

2025-04-09 Thread Sem
+1 (non-binding)

On Wed, 2025-04-09 at 07:22 -0700, Sandy Ryza wrote:
> We started to get some votes on the discussion thread, so I'd like to
> move to a formal vote on adding support for declarative pipelines.
>
> *Discussion thread:
> * https://lists.apache.org/thread/lsv8f829ps0bog41fjoqc45xk7

Re: [PROPOSAL] Unified PySpark-Pandas API to Bridge Data Engineering and ML Workflows

2025-02-12 Thread Sem
DuckDB provides "PySpark syntax" on top of a fast single-node engine: https://duckdb.org/docs/clients/python/spark_api.html

As I remember, DuckDB is much faster than pandas on a single node, and it already provides a Spark-compatible API.

On 2/10/25 1:02 PM, José Müller wrote:
Hi all, I'm new