Is there only one RowGroup for this file? You can check this by printing the
file's metadata using the `meta` command of `parquet-cli`.
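
To illustrate why the row group count matters, here is a minimal sketch in plain Python. It is not Spark or parquet-mr code. It assumes each row group is handed to the split whose byte range contains the row group's midpoint, and it assumes a 128 MB split size (the default of `spark.sql.files.maxPartitionBytes`). The function name `splits_with_data` and the offsets are hypothetical.

```python
# Sketch only: models which input splits receive Parquet row groups.
# Assumption: a row group belongs to the split containing its midpoint.

MB = 1024 * 1024


def splits_with_data(row_groups, split_size=128 * MB):
    """row_groups: list of (start_offset, length) pairs in bytes.

    Returns the sorted indices of the splits that get at least one row group.
    """
    used = set()
    for start, length in row_groups:
        midpoint = start + length // 2
        used.add(midpoint // split_size)
    return sorted(used)


file_size = 1900 * MB  # roughly the 1.9GB file discussed in this thread

# One giant row group: a single split has data, all others are empty.
single = splits_with_data([(4, file_size - 4)])

# The same bytes written as 128 MB row groups: the work spreads out.
many_groups = [
    (off, min(128 * MB, file_size - off))
    for off in range(0, file_size, 128 * MB)
]
many = splits_with_data(many_groups)

print(len(single), len(many))
```

Under these assumptions, a file with one row group keeps a single task busy no matter how many splits are planned, so the row group count reported by `meta` is worth checking first.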
Yang Jie
From: zhangliyun
Date: Thursday, March 23, 2023, 15:16
To: Spark Dev List
Subject: please help the problem of big parquet file can not be splitted to read
hi all
i
What is unclear to me is why we are introducing this integration and how
users will leverage it.
* Are we replacing spark-shell with it?
Given the existing gaps, this is not the case.
* Is it an example to showcase how to build an integration?
That could be interesting, and we can add it to extern
The goal of adding this is to make it easy for a user to connect a Scala
REPL to a Spark Connect server, just as Spark shell makes it easy to work
with a regular Spark environment.
It is not meant as a Spark shell replacement. They represent two different
modes of working with Spark, and they h
Sounds good, thanks for clarifying!
Regards,
Mridul
On Thu, Mar 23, 2023 at 9:09 AM Herman van Hovell
wrote:
> The goal of adding this, is to make it easy for a user to connect a scala
> REPL to a Spark Connect server. Just like Spark shell makes it easy to work
> with a regular Spark environm
I also support Herman's `SPARK-42884 Add Ammonite REPL integration` PR.
Thanks,
Dongjoon.
On Thu, Mar 23, 2023 at 7:51 AM Mridul Muralidharan
wrote:
>
> Sounds good, thanks for clarifying !
>
> Regards,
> Mridul
>
> On Thu, Mar 23, 2023 at 9:09 AM Herman van Hovell
> wrote:
>
>> The goal of a
+1 on better notebook and other REPL experience
On Thu, Mar 23, 2023 at 9:17 AM Dongjoon Hyun
wrote:
> I also support Herman's `SPARK-42884 Add Ammonite REPL integration` PR.
>
> Thanks,
> Dongjoon.
>
>
> On Thu, Mar 23, 2023 at 7:51 AM Mridul Muralidharan
> wrote:
>
>>
>> Sounds good, thanks f
I’m pretty sure a snappy file is not splittable. That’s why you have a single
task (and most likely a single core) reading the 1.9GB snappy file.
> On 23 Mar 2023, at 07:36, yangjie01 wrote:
>
> Is there only one RowGroup for this file? You can check this by printing the
> file's met
Hi,
Considering https://issues.apache.org/jira/browse/SPARK-42693 is a release
blocker, I would suggest we postpone the v3.4.0-rc5 until next week.
I appreciate the ongoing efforts on API auditing. Please feel free to
participate in the auditing if you are interested! Please refer to the
ticket