[ANNOUNCE] Announcing Apache Spark 2.3.4

2019-09-09 Thread Kazuaki Ishizaki
. Kazuaki Ishizaki

RE: Release Spark 2.3.4

2019-08-22 Thread Kazuaki Ishizaki
ext Monday. Regards, Kazuaki Ishizaki From: "Kazuaki Ishizaki" To: "Kazuaki Ishizaki" Cc: Dilip Biswal , dev , Hyukjin Kwon , jzh...@apache.org, Takeshi Yamamuro , Xiao Li Date: 2019/08/20 13:12 Subject:[EXTERNAL] RE: Release Spark 2.3.4 Due to the r

Release Spark 2.3.4

2019-08-16 Thread Kazuaki Ishizaki
)%20ORDER%20BY%20priority%20DESC%2C%20key%20ASC Best Regards, Kazuaki Ishizaki

RE: Release Apache Spark 2.4.4

2019-08-13 Thread Kazuaki Ishizaki
Thanks, Dongjoon! +1 Kazuaki Ishizaki, From: Hyukjin Kwon To: Takeshi Yamamuro Cc: Dongjoon Hyun , dev , User Date: 2019/08/14 09:21 Subject:[EXTERNAL] Re: Release Apache Spark 2.4.4 +1 On Wed, Aug 14, 2019 at 9:13 AM, Takeshi Yamamuro wrote: Hi, Thanks for your

Re: Re: Release Apache Spark 2.4.4 before 3.0.0

2019-07-16 Thread Kazuaki Ishizaki
Thank you, Dongjoon, for being a release manager. If the assumed dates are OK, I would like to volunteer as the 2.3.4 release manager. Best Regards, Kazuaki Ishizaki, From: Dongjoon Hyun To: dev , "user @spark" , Apache Spark PMC Date: 2019/07/13 07:18 Subject:[EX

Re: [Help] Codegen Stage grows beyond 64 KB

2018-06-20 Thread Kazuaki Ishizaki
83647". The log has to include all of the generated Java methods. The community may take more time to address this problem than it would with a small program. Best Regards, Kazuaki Ishizaki From: Aakash Basu To: Kazuaki Ishizaki Cc: vaquar khan , Eyal Zituny , user Date:

Re: [Help] Codegen Stage grows beyond 64 KB

2018-06-20 Thread Kazuaki Ishizaki
that the community will address this problem. Best regards, Kazuaki Ishizaki From: vaquar khan To: Eyal Zituny Cc: Aakash Basu , user Date: 2018/06/18 01:57 Subject:Re: [Help] Codegen Stage grows beyond 64 KB Totally agreed with Eyal . The problem is that when Java

Re: Strange codegen error for SortMergeJoin in Spark 2.2.1

2018-06-06 Thread Kazuaki Ishizaki
Thank you for reporting a problem. Would it be possible to create a JIRA entry with a small program that can reproduce this problem? Best Regards, Kazuaki Ishizaki From: Rico Bergmann To: "user@spark.apache.org" Date: 2018/06/05 19:58 Subject:Strange codegen

Re: tuning - Spark data serialization for cache() ?

2017-08-07 Thread Kazuaki Ishizaki
, Kazuaki Ishizaki From: Ofir Manor To: Kazuaki Ishizaki Cc: user Date: 2017/08/08 03:12 Subject:Re: tuning - Spark data serialization for cache() ? Thanks a lot for the quick pointer! So, is the advice I linked to in official Spark 2.2 documentation misleading? You are

Re: tuning - Spark data serialization for cache() ?

2017-08-07 Thread Kazuaki Ishizaki
these PRs will be integrated into Spark 2.3. Kazuaki Ishizaki From: Ofir Manor To: user Date: 2017/08/08 02:04 Subject:tuning - Spark data serialization for cache() ? Hi, I'm using Spark 2.2, and have a big batch job, using dataframes (with built-in, basic types

Re: Java access to internal representation of DataTypes.DateType

2017-06-14 Thread Kazuaki Ishizaki
Does this code help you? https://github.com/apache/spark/blob/master/sql/core/src/test/java/test/org/apache/spark/sql/JavaDataFrameSuite.java#L156-L194 Kazuaki Ishizaki From: Anton Kravchenko To: "user @spark" Date: 2017/06/14 01:16 Subject:Java access t

Re: how do i force unit test to do whole stage codegen

2017-04-04 Thread Kazuaki Ishizaki
opic for d...@spark.apache.org. Kazuaki Ishizaki From: Koert Kuipers To: "user@spark.apache.org" Date: 2017/04/05 05:12 Subject:how do i force unit test to do whole stage codegen i wrote my own expression with eval and doGenCode, but doGenCode never gets called in test

Re: [Spark SQL & Core]: RDD to Dataset 1500 columns data with createDataFrame() throw exception of grows beyond 64 KB

2017-03-18 Thread Kazuaki Ishizaki
org.codehaus.janino.util.ClassFile.addConstantNameAndTypeInfo(ClassFile.java:439) at org.codehaus.janino.util.ClassFile.addConstantMethodrefInfo(ClassFile.java:358) ... While this PR https://github.com/apache/spark/pull/16648 addresses the issue with the number of constant pool entries, it has not been merged yet. Regards, Kazuaki Ishizaki

Re: Linear regression + Janino Exception

2016-11-21 Thread Kazuaki Ishizaki
Thank you for reporting the error. I think that this is associated with https://issues.apache.org/jira/browse/SPARK-18492 The reporter of this JIRA entry has not posted the program yet. Would it be possible to add your program that can reproduce this issue to this JIRA entry? Regards, Kazuaki

Re: Spark SQL is slower when DataFrame is cache in Memory

2016-10-27 Thread Kazuaki Ishizaki
Hi Chin Wei, Thank you for confirming this on 2.0.1; I am happy to hear it never happens. The performance will be improved when this PR ( https://github.com/apache/spark/pull/15219) is integrated. Regards, Kazuaki Ishizaki From: Chin Wei Low To: Kazuaki Ishizaki/Japan/IBM@IBMJP Cc

Re: [Spark 2.0.1] Error in generated code, possible regression?

2016-10-24 Thread Kazuaki Ishizaki
Do you have a smaller program that can reproduce the same error? If you could also create a JIRA entry, it would be great. Kazuaki Ishizaki From: Efe Selcuk To: "user @spark" Date: 2016/10/25 10:23 Subject:[Spark 2.0.1] Error in generated code, possible regression?

Re: Spark SQL is slower when DataFrame is cache in Memory

2016-10-24 Thread Kazuaki Ishizaki
Hi Chin Wei, I am sorry for the late reply. Got it. Interesting behavior. How did you measure the time between 1st and 2nd events? Best Regards, Kazuaki Ishizaki From: Chin Wei Low To: Kazuaki Ishizaki/Japan/IBM@IBMJP Cc: user@spark.apache.org Date: 2016/10/10 11:33 Subject

Re: Spark SQL is slower when DataFrame is cache in Memory

2016-10-07 Thread Kazuaki Ishizaki
you use to get data, cache or parquet? val res = sqlContext.sql("table1 union table2 union table3") res.explain(true) res.collect() Am I misunderstanding something? Best Regards, Kazuaki Ishizaki From: Chin Wei Low To: Kazuaki Ishizaki/Japan/IBM@IBMJP Cc: user@spark.apach

Re: Spark SQL is slower when DataFrame is cache in Memory

2016-10-07 Thread Kazuaki Ishizaki
/spark/pull/15219) is ready for review. It would achieve a 1.2x performance improvement for a compressed column and a large performance improvement for an uncompressed column. Best Regards, Kazuaki Ishizaki From: Chin Wei Low To: user@spark.apache.org Date: 2016/10/07 13:05 Subject

Re: Change nullable property in Dataset schema

2016-08-16 Thread Kazuaki Ishizaki
not work for my purpose. It actually does nothing. Kazuaki Ishizaki From: Jacek Laskowski To: Kazuaki Ishizaki/Japan/IBM@IBMJP Cc: user Date: 2016/08/15 04:56 Subject:Re: Change nullable property in Dataset schema On Wed, Aug 10, 2016 at 12:04 AM, Kazuaki Ishizaki wrote:

Re: Change nullable property in Dataset schema

2016-08-16 Thread Kazuaki Ishizaki
branches. For example, in the above URL, we can say the condition at line 45 is always false since the result of map() is never null by using our schema. As a result, we can eliminate assignments at lines 52 and 56, and conditional branches at lines 55 and 61. Kazuaki Ishizaki From: Koert Kuipers

Re: Spark 2.0.0 JaninoRuntimeException

2016-08-16 Thread Kazuaki Ishizaki
I just noticed it because it broke a build with Scala 2.10. https://github.com/apache/spark/commit/fa244e5a90690d6a31be50f2aa203ae1a2e9a1cf I can reproduce the problem in SPARK-15285 on the master branch. Should we reopen SPARK-15285? Best Regards, Kazuaki Ishizaki, From: Ted Yu To

Re: Change nullable property in Dataset schema

2016-08-10 Thread Kazuaki Ishizaki
alse), nullable = false))) .as(newDoubleArrayEncoder) ds1.printSchema ds2.printSchema } } root |-- value: array (nullable = true) |    |-- element: integer (containsNull = false) root |-- value: array (nullable = false) |    |-- element: integer (containsNull = false) K

Change nullable property in Dataset schema

2016-08-03 Thread Kazuaki Ishizaki
a ds2.printSchema } } root |-- value: array (nullable = true) |    |-- element: integer (containsNull = false) root |-- value: array (nullable = true) // Expected (nullable = false) |    |-- element: integer (containsNull = false) Kazuaki Ishizaki

Re: Spark GraphFrames

2016-08-02 Thread Kazuaki Ishizaki
Sorry, please ignore this mail. I apologize for misinterpreting GraphFrame in Spark. I thought it meant Frame Graph for a profiling tool. Kazuaki Ishizaki, From: Kazuaki Ishizaki/Japan/IBM@IBMJP To: Divya Gehlot Cc: "user @spark" Date: 2016/08/02 17:06 Subject:

Re: Spark GraphFrames

2016-08-02 Thread Kazuaki Ishizaki
Hi, Kay wrote a procedure to use GraphFrames with Spark. https://gist.github.com/kayousterhout/7008a8ebf2babeedc7ce6f8723fd1bf4 Kazuaki Ishizaki From: Divya Gehlot To: "user @spark" Date: 2016/08/02 14:52 Subject:Spark GraphFrames Hi, Has anybody has w

Re: Catalyst optimizer cpu/Io cost

2016-06-10 Thread Kazuaki Ishizaki
Hi, Yin Huai's slides are available at http://www.slideshare.net/databricks/deep-dive-into-catalyst-apache-spark-20s-optimizer Kazuaki Ishizaki From: Takeshi Yamamuro To: Srinivasan Hariharan02 Cc: "user@spark.apache.org" Date: 2016/06/10 18:09 Subject: