Re: [VOTE] Release Spark 4.0.0 (RC4)

2025-04-21 Thread Manu Zhang
I don't think PARQUET-2432 has any issue itself. It looks to have triggered a deadlock case like https://github.com/apache/spark/pull/50594. I'd suggest that we fix forward if possible. Thanks, Manu On Mon, Apr 21, 2025 at 11:19 PM Rozov, Vlad wrote: > The deadlock is reproducible without Parqu

Re: Support custom driver metrics in writing to v2 table

2024-10-28 Thread Manu Zhang
Hi all, Much appreciated if I can get some eyes on the PR! Thanks, Manu On Wed, Oct 23, 2024 at 9:41 AM Manu Zhang wrote: > Hi community, > > I've opened a PR[1] to support custom driver metrics in writing to the v2 > table. > Please help review and leave your comments. A

Support custom driver metrics in writing to v2 table

2024-10-22 Thread Manu Zhang
Hi community, I've opened a PR[1] to support custom driver metrics in writing to the v2 table. Please help review and leave your comments. Appreciate it! [1] https://github.com/apache/spark/pull/48573 Thanks, Manu

Re: Inconsistent behavior between SQL and DataFrame API on store assignment policy

2024-10-15 Thread Manu Zhang
te: > Yea looks like a bug, the SQL and DataFrame APIs should be consistent. > Please create a JIRA ticket, thanks! > > On Mon, Oct 14, 2024 at 3:29 PM Manu Zhang > wrote: > >> Hi community, >> >> With `spark.sql.storeAssignmentPolicy=LEGACY` in Spark 3.5, it

Inconsistent behavior between SQL and DataFrame API on store assignment policy

2024-10-13 Thread Manu Zhang
Hi community, With `spark.sql.storeAssignmentPolicy=LEGACY` in Spark 3.5, it's not allowed to write to DSv2 with insert SQL. However, this can be worked around with DataFrame API, i.e., `df.writeTo($dsv2Table).append()` Is this expected? Thanks, Manu

[DISCUSS] Spark 3.5.3 breaks Iceberg SparkSessionCatalog

2024-09-22 Thread Manu Zhang
Hi Iceberg and Spark community, I'd like to bring your attention to a recent change[1] in Spark 3.5.3 that effectively breaks Iceberg's SparkSessionCatalog[2] and blocks Iceberg upgrading to Spark 3.5.3[3]. SparkSessionCatalog, as a customized Spark V2 session catalog, supports creating a V1 tabl

Re: Spark 3.0.0 EOL

2023-07-26 Thread Manu Zhang
7:35 PM Sean Owen wrote: > There aren't "LTS" releases, though you might expect the last 3.x release > will see maintenance releases longer. See end of > https://spark.apache.org/versioning-policy.html > > On Wed, Jul 26, 2023 at 3:56 AM Manu Zhang > wrot

Re: Spark 3.0.0 EOL

2023-07-26 Thread Manu Zhang
Will Apache Spark 3.5 be an LTS version? Thanks, Manu On Mon, Jul 24, 2023 at 4:26 PM Dongjoon Hyun wrote: > As Hyukjin replied, Apache Spark 3.0.0 is already in EOL status. > > To Pralabh, FYI, in the community, > > - Apache Spark 3.2 also reached the EOL already. > https://lists.apache.org/t

Re: please provide the detail way to generate the tpcds data in TPCDSQueryBenchmark

2023-05-15 Thread Manu Zhang
Hi Kelly, You may follow the steps in the benchmark GitHub workflow https://github.com/apache/spark/blob/master/.github/workflows/benchmark.yml Regards, Manu On Mon, May 15, 2023 at 5:49 PM zhangliyun wrote: > hi > > i want to set up a tpcds benchmark to test some performance of some > spark

Re: CVE-2021-38296: Apache Spark Key Negotiation Vulnerability

2022-03-09 Thread Manu Zhang
> > On Wed, Mar 9, 2022 at 2:58 PM Manu Zhang wrote: > >> Hi Sean, >> >> I don't find it in 3.1.3 release notes >> https://spark.apache.org/releases/spark-release-3-1-3.html. Is it >> tracked somewhere? >> >> On Thu, Mar 10, 2022 at

Re: CVE-2021-38296: Apache Spark Key Negotiation Vulnerability

2022-03-09 Thread Manu Zhang
Hi Sean, I don't find it in 3.1.3 release notes https://spark.apache.org/releases/spark-release-3-1-3.html. Is it tracked somewhere? On Thu, Mar 10, 2022 at 6:14 AM Sean R. Owen wrote: > Severity: moderate > > Description: > > Apache Spark supports end-to-end encryption of RPC connections via >

Why is failing to get Hive token non fatal ?

2021-04-22 Thread Manu Zhang
swallowed the error and continued to submit the application. Is there any specific reason for this design? I also created https://issues.apache.org/jira/browse/SPARK-35160. Thanks, Manu Zhang

Re: scalastyle failing even after ./dev/scalafmt

2020-05-16 Thread Manu Zhang
I recently found it's slick to add `dev/lint-scala` to git pre-commit hooks. On Sun, May 17, 2020 at 9:22 AM Sean Owen wrote: > You just follow the standard style guide - pretty much copy what you see - > and run scalastyle locally to fix any few issues that pop up. > > On Sat, May 16, 2020, 6
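
One way to wire `dev/lint-scala` into a pre-commit hook is sketched below in Python. This is only an illustration, not the setup used in the thread: the helper name `run_check` and the exit code 127 for a missing script are assumptions made here, and the sketch assumes `dev/lint-scala` exits with a non-zero status when style violations are found.

```python
#!/usr/bin/env python3
"""Sketch of a git pre-commit hook that runs a lint command before each commit."""
import subprocess
import sys


def run_check(command):
    """Run the lint command and return its exit code (hypothetical helper).

    Returns 127 if the command cannot be started at all, mirroring the
    shell convention for "command not found" (an assumption of this sketch).
    """
    try:
        return subprocess.run(command).returncode
    except OSError:
        print(f"pre-commit: cannot run {command[0]}", file=sys.stderr)
        return 127


# In the installed hook, finish the script with the line below so that a
# non-zero lint result aborts the commit:
#     sys.exit(run_check(["dev/lint-scala"]))
```

To try it, the script would be saved as `.git/hooks/pre-commit` in a Spark checkout, with the final `sys.exit(...)` line uncommented, and made executable. Git runs the hook from the top of the working tree, so the relative path `dev/lint-scala` resolves there.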

Re: is there any tool to visualize the spark physical plan or spark plan

2020-04-30 Thread Manu Zhang
Hi Kelly, If you can parse the event log, then try listening for the `SparkListenerSQLExecutionStart` event and build a `SparkPlanGraph` like https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SQLAppStatusListener.scala#L306 . `SparkPlanGraph` has a `make
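
The event-log half of this suggestion can be sketched outside Spark. The Python snippet below is a minimal illustration, not the approach from the linked Scala code: it assumes the event log is a file of one JSON object per line, that SQL execution start events carry the class name `org.apache.spark.sql.execution.ui.SparkListenerSQLExecutionStart` in their `Event` field, and that the `executionId` and `physicalPlanDescription` fields hold the execution id and plan text. It only extracts the plan text and does not build a `SparkPlanGraph`, which is a Scala class inside Spark. The function name `sql_execution_starts` and the sample events are made up for the example.

```python
import json

# Assumed value of the "Event" field for SQL execution start events.
SQL_START = "org.apache.spark.sql.execution.ui.SparkListenerSQLExecutionStart"


def sql_execution_starts(lines):
    """Yield (executionId, physicalPlanDescription) for each SQL start event."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or non-JSON lines
        if isinstance(event, dict) and event.get("Event") == SQL_START:
            yield event.get("executionId"), event.get("physicalPlanDescription", "")


# Hand-written sample lines standing in for a real event log file.
sample = [
    json.dumps({"Event": "SparkListenerApplicationStart", "App Name": "demo"}),
    json.dumps({
        "Event": SQL_START,
        "executionId": 0,
        "physicalPlanDescription": "== Physical Plan ==\nRange (0, 10)",
    }),
]
plans = list(sql_execution_starts(sample))
```

With a real log, `lines` would be an open file handle; the extracted plan text could then be passed to whatever visualization tool is at hand.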

Re: [FYI] SBT Build Failure

2020-01-18 Thread Manu Zhang
Thanks Dongjoon for the information and the fix in https://github.com/apache/spark/pull/27242 On Fri, Jan 17, 2020 at 6:58 AM Sean Owen wrote: > Ah. The Maven build already long since points at https:// for > resolution for security. I tried just overriding the resolver for the > SBT build, but

Re: Minimum JDK8 version

2019-10-24 Thread Manu Zhang
> > Probably, but what is the difference that makes it different to > support u81 vs later? > How about docker support https://blog.softwaremill.com/docker-support-in-new-java-8-finally-fd595df0ca54 On Fri, Oct 25, 2019 at 9:05 AM Takeshi Yamamuro wrote: > Hi, Dongjoon > > It might be worth c

Re: [Performance] Spark DataFrame is slow with wide data. Polynomial complexity on the number of columns is observed. Why?

2018-08-20 Thread Manu Zhang
. There is `RuleExecutor.dumpTimeSpent` that prints analysis time, and turning on DEBUG logging will also give you much more info. Thanks, Manu Zhang On Mon, Aug 20, 2018 at 10:25 PM antonkulaga wrote: > makatun, did you try to test something more complex, like > dataframe.describe