Re: [VOTE] SPIP: Structured Streaming - Arbitrary State API v2

2024-01-09 Thread Shixiong Zhu
+1 (binding) Best Regards, Shixiong Zhu On Tue, Jan 9, 2024 at 6:47 PM 刘唯 wrote: > This is a good addition! +1 > > Raghu Angadi wrote on Tue, Jan 9, 2024 at 13:17: > >> +1. This is a major improvement to the state API. >> >> Raghu. >> >> On Tue, Jan 9, 2024 at 1:

Re: [DISCUSS] SPIP: Structured Streaming - Arbitrary State API v2

2024-01-05 Thread Shixiong Zhu
+1. Looking forward to seeing how the new API brings in new streaming use cases! Best Regards, Shixiong Zhu On Wed, Nov 29, 2023 at 6:42 PM Anish Shrigondekar wrote: > Hi dev, > > Addressed the comments that Jungtaek had on the doc. Bumping the thread > once again to see if othe

Re: [VOTE] SPIP: State Data Source - Reader

2023-10-25 Thread Shixiong Zhu
+1 Best Regards, Shixiong Zhu On Wed, Oct 25, 2023 at 4:20 PM Yuanjian Li wrote: > +1 > > Jungtaek Lim wrote on Wed, Oct 25, 2023 at 01:06: > >> Friendly reminder: the VOTE thread got 2 binding votes and needs 1 more >> binding vote to pass. >> >> On Wed, Oct 2

Re: [DISCUSS] Deprecate DStream in 3.4

2023-01-12 Thread Shixiong Zhu
+1 On Thu, Jan 12, 2023 at 5:08 PM Tathagata Das wrote: > +1 > > On Thu, Jan 12, 2023 at 7:46 PM Hyukjin Kwon wrote: > >> +1 >> >> On Fri, 13 Jan 2023 at 08:51, Jungtaek Lim >> wrote: >> >>> bump for more visibility. >>> >>> On Wed, Jan 11, 2023 at 12:20 PM Jungtaek Lim < >>> kabhwan.opensou.

Re: [VOTE][SPIP] Asynchronous Offset Management in Structured Streaming

2022-11-30 Thread Shixiong Zhu
+1 On Wed, Nov 30, 2022 at 8:04 PM Hyukjin Kwon wrote: > +1 > > On Thu, 1 Dec 2022 at 12:39, Mridul Muralidharan wrote: > >> >> +1 >> >> Regards, >> Mridul >> >> On Wed, Nov 30, 2022 at 8:55 PM Xingbo Jiang >> wrote: >> >>> +1 >>> >>> On Wed, Nov 30, 2022 at 5:59 PM Jungtaek Lim < >>> kabhwan

Re: [DISCUSSION] SPIP: Asynchronous Offset Management in Structured Streaming

2022-11-30 Thread Shixiong Zhu
+1 This is exciting. I agree with Jerry that this SPIP and continuous processing are orthogonal. This SPIP itself would be a great improvement and impact most Structured Streaming users. Best Regards, Shixiong On Wed, Nov 30, 2022 at 6:57 AM Mridul Muralidharan wrote: > > Thanks for all the c

Welcome Jose Torres as a Spark committer

2019-01-29 Thread Shixiong Zhu
Hi all, The Apache Spark PMC recently added Jose Torres as a committer on the project. Jose has been a major contributor to Structured Streaming. Please join me in welcoming him! Best Regards, Shixiong Zhu

Re: Spark Streaming Application is Stuck Under Heavy Load Due to DeadLock

2016-01-04 Thread Shixiong Zhu
Hey Rachana, could you provide the full jstack output? Maybe it's the same as https://issues.apache.org/jira/browse/SPARK-11104 Best Regards, Shixiong Zhu 2016-01-04 12:56 GMT-08:00 Rachana Srivastava < rachana.srivast...@markmonitor.com>: > Hello All, > > > > I am runni

Re: Spark streaming 1.6.0-RC4 NullPointerException using mapWithState

2015-12-29 Thread Shixiong Zhu
Could you create a JIRA? We can continue the discussion there. Thanks! Best Regards, Shixiong Zhu 2015-12-29 3:42 GMT-08:00 Jan Uyttenhove : > Hi guys, > > I upgraded to the RC4 of Spark (streaming) 1.6.0 to (re)test the new > mapWithState API, after previously reporting issue

Re: A bug in Spark standalone? Worker registration and deregistration

2015-12-10 Thread Shixiong Zhu
Jacek, could you create a JIRA for it? I just reproduced it. It's a bug in how the Master handles Worker disconnection. Best Regards, Shixiong Zhu 2015-12-10 2:45 GMT-08:00 Jacek Laskowski : > Hi, > > I'm on yesterday's master HEAD. > > Regards, > Jacek

Re: tests blocked at "don't call ssc.stop in listener"

2015-11-26 Thread Shixiong Zhu
Just found a potential deadlock in this test. I will send a PR to fix it soon. Best Regards, Shixiong Zhu 2015-11-26 18:55 GMT-08:00 Saisai Shao : > Might be related to this JIRA ( > https://issues.apache.org/jira/browse/SPARK-11761), not very sure about > it. > > On Fri, Nov 27

Re: NettyRpcEnv advertisedPort

2015-11-26 Thread Shixiong Zhu
I think you are right. The executor gets the driver port from "RpcEnv.address". Best Regards, Shixiong Zhu 2015-11-26 11:45 GMT-08:00 Rad Gruchalski : > Dear all, > > I am currently looking at modifying NettyRpcEnv for this PR: > https://github.com/apache/spark/pull/96

Re: Why there's no api for SparkContext#textFiles to support multiple inputs ?

2015-11-11 Thread Shixiong Zhu
In addition, if you have more than two text files, you can just put them into a Seq and use "reduce(_ ++ _)". Best Regards, Shixiong Zhu 2015-11-11 10:21 GMT-08:00 Jakob Odersky : > Hey Jeff, > Do you mean reading from multiple text files? In that case, as a > workaround,
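The Seq-and-reduce approach described above can be sketched as follows. This is a minimal illustration, assuming a live SparkContext named `sc` and hypothetical file paths; `++` on RDDs is an alias for `union`:

```scala
import org.apache.spark.rdd.RDD

// Hypothetical input paths; assumes an existing SparkContext `sc`.
val paths = Seq("/data/a.txt", "/data/b.txt", "/data/c.txt")

// Read each file into its own RDD, then union them all pairwise.
val combined: RDD[String] = paths.map(sc.textFile(_)).reduce(_ ++ _)
```

Equivalently, `sc.union(paths.map(sc.textFile(_)))` produces one n-ary union instead of a chain of binary unions, which keeps the lineage shallower for large path lists.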

Re: [SQL] Memory leak with spark streaming and spark sql in spark 1.5.1

2015-10-14 Thread Shixiong Zhu
Thanks for reporting it Terry. I submitted a PR to fix it: https://github.com/apache/spark/pull/9132 Best Regards, Shixiong Zhu 2015-10-15 2:39 GMT+08:00 Reynold Xin : > +dev list > > On Wed, Oct 14, 2015 at 1:07 AM, Terry Hoo wrote: > >> All, >> >> Does anyo

Re: Get only updated RDDs from or after updateStateBykey

2015-09-24 Thread Shixiong Zhu
ss database } else { // update to new state and save to database } // return new state } TaskContext.get().addTaskCompletionListener(_ => db.disconnect()) } Best Regards, Shixiong Zhu 2015-09-24 17:42 GMT+08:00 Bin Wang : > It seems like a work ar
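The pattern referenced in the truncated snippet above — touching a database inside the state update function and disconnecting when the task completes — can be sketched roughly as follows. `Database`, `pairStream`, and the save logic are hypothetical placeholders; the real point is `TaskContext.get().addTaskCompletionListener`, which registers a callback that runs when the enclosing task finishes:

```scala
import org.apache.spark.TaskContext

// Hypothetical connection helper; stands in for whatever client you use.
class Database {
  def save(state: Int): Unit = ()  // placeholder
  def disconnect(): Unit = ()      // placeholder
}
object Database { def connect(): Database = new Database }

// Assumes `pairStream` is an existing DStream[(String, Int)].
// Connect inside the update function and register a completion
// listener so the connection is closed when the task ends.
val stateStream = pairStream.updateStateByKey[Int] {
  (values: Seq[Int], state: Option[Int]) =>
    val db = Database.connect()
    TaskContext.get().addTaskCompletionListener(_ => db.disconnect())
    val newState = state.getOrElse(0) + values.sum
    db.save(newState)
    Some(newState)
}
```

In practice you would cache the connection per task (e.g. a lazy singleton keyed by task attempt) rather than reconnect on every key invocation; the sketch keeps it inline for brevity.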

Re: Get only updated RDDs from or after updateStateBykey

2015-09-24 Thread Shixiong Zhu
rg/jira/browse/SPARK-2629 but doesn't have a doc now... Best Regards, Shixiong Zhu 2015-09-24 17:26 GMT+08:00 Bin Wang : > Data that are not updated should be saved earlier: while the data added to > the DStream at the first time, it should be considered as updated. So save > th

Re: Get only updated RDDs from or after updateStateBykey

2015-09-24 Thread Shixiong Zhu
For data that are not updated, where do you save them? Or do you only want to avoid accessing the database for entries that are not updated? Besides, the community is working on optimizing "updateStateByKey"'s performance. Hope it will be delivered soon. Best Regards, Shixiong Zhu 2015-09-

Re: 答复: bug in Worker.scala, ExecutorRunner is not serializable

2015-09-18 Thread Shixiong Zhu
I'm wondering if we should create a tag trait (e.g., LocalMessage) for messages like this and add the comment in the trait. Looks better than adding inline comments for all these messages. Best Regards, Shixiong Zhu 2015-09-18 15:10 GMT+08:00 Reynold Xin : > Maybe we should add som
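The tag-trait idea could look something like the sketch below. The names are illustrative, not the actual Spark code; the point is that one documented marker trait replaces scattered inline comments:

```scala
/** Marker trait: messages extending this are exchanged only between
  * components running in the same JVM. They are never serialized, so
  * they need not be Serializable. */
sealed trait LocalMessage

// Illustrative only: an internal Worker <-> WorkerWebUI message would
// simply mix in the trait instead of carrying an inline comment.
case object RequestWorkerState extends LocalMessage
```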

Re: bug in Worker.scala, ExecutorRunner is not serializable

2015-09-17 Thread Shixiong Zhu
RequestWorkerState is an internal message between the Worker and the WorkerWebUI. Since they are in the same process, that's fine. Actually, these are not public APIs. Could you elaborate on your use case? Best Regards, Shixiong Zhu 2015-09-17 16:36 GMT+08:00 Huangguowei : > > > Is it p

Re: jenkins failing on Kinesis shard limits

2015-07-25 Thread Shixiong Zhu
The issue is in KinesisBackedBlockRDDSuite. I have sent https://github.com/apache/spark/pull/7661 to remove the streams created by KinesisBackedBlockRDDSuite and https://github.com/apache/spark/pull/7663 to fix the issue. Best Regards, Shixiong Zhu 2015-07-25 14:46 GMT+08:00 Prabeesh K. : >

Re: Spark 1.5.0-SNAPSHOT broken with Scala 2.11

2015-06-28 Thread Shixiong Zhu
Could you update your Maven to 3.3.3? I'm not sure if this is a known issue, but the exception message looks the same. See https://github.com/apache/spark/pull/6770 Best Regards, Shixiong Zhu 2015-06-29 9:02 GMT+08:00 Alessandro Baretta : > I am building the current master branch with Sc

Re: Welcoming three new committers

2015-02-03 Thread Shixiong Zhu
Congrats guys! Best Regards, Shixiong Zhu 2015-02-04 6:34 GMT+08:00 Matei Zaharia : > Hi all, > > The PMC recently voted to add three new committers: Cheng Lian, Joseph > Bradley and Sean Owen. All three have been major contributors to Spark in > the past year: Cheng on Spark

Why the major.minor version of the new hive-exec is 51.0?

2014-12-30 Thread Shixiong Zhu
ClassLoader.java:247) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:169) at Test.main(Test.java:5) Best Regards, Shixiong Zhu

Re: Announcing Spark 1.2!

2014-12-19 Thread Shixiong Zhu
Congrats! A little question about this release: which commit is this release based on? v1.2.0 and v1.2.0-rc2 point to different commits in https://github.com/apache/spark/releases Best Regards, Shixiong Zhu 2014-12-19 16:52 GMT+08:00 Patrick Wendell : > > I'm happy to a

Re: Eliminate copy while sending data : any Akka experts here ?

2014-11-24 Thread Shixiong Zhu
commit for this idea here: https://github.com/zsxwing/spark/commit/c998856cdf747aa0452d030e58c3c2dd4ef7f97d Best Regards, Shixiong Zhu 2014-11-21 12:28 GMT+08:00 Reynold Xin : > Can you elaborate? Not 100% sure if I understand what you mean. > > On Thu, Nov 20, 2014 at 7:14 PM, Shixiong Z

Re: Eliminate copy while sending data : any Akka experts here ?

2014-11-20 Thread Shixiong Zhu
, Shixiong Zhu 2014-09-20 16:24 GMT+08:00 Reynold Xin : > BTW - a partial solution here: https://github.com/apache/spark/pull/2470 > > This doesn't address the 0 size block problem yet, but makes my large job > on hundreds of terabytes of data much more reliable. > > > On F

Re: About implicit rddToPairRDDFunctions

2014-11-13 Thread Shixiong Zhu
OK. I'll take it. Best Regards, Shixiong Zhu 2014-11-14 12:34 GMT+08:00 Reynold Xin : > That seems like a great idea. Can you submit a pull request? > > > On Thu, Nov 13, 2014 at 7:13 PM, Shixiong Zhu wrote: > >> If we put the `implicit` into "package object r

Re: About implicit rddToPairRDDFunctions

2014-11-13 Thread Shixiong Zhu
hem explicitly. Here is a post about the implicit search logic: http://eed3si9n.com/revisiting-implicits-without-import-tax To maintain compatibility, we can keep `rddToPairRDDFunctions` in SparkContext but remove `implicit`. The disadvantage is that there would be two copies of the same code. Best Re

About implicit rddToPairRDDFunctions

2014-11-06 Thread Shixiong Zhu
g[K], vt: ClassTag[V], ord: Ordering[K] = null) = { new PairRDDFunctions(rdd) } If so, the conversion will be automatic and there will be no need to import org.apache.spark.SparkContext._ I searched for prior discussion but found nothing. Best Regards, Shixiong Zhu
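The (truncated) signature in the message is the standard conversion; the gist of the proposal is to declare it somewhere Scala's implicit search checks automatically — such as the companion object of `RDD` — so callers no longer need `import org.apache.spark.SparkContext._`. A sketch, with the object name chosen here only for illustration:

```scala
import scala.reflect.ClassTag
import org.apache.spark.rdd.{PairRDDFunctions, RDD}

// Sketch: placing the conversion in RDD's companion object lets the
// compiler find it through the implicit scope of RDD[(K, V)], with no
// explicit import at the call site.
object RddImplicits {  // illustrative; in Spark this would sit in `object RDD`
  implicit def rddToPairRDDFunctions[K, V](rdd: RDD[(K, V)])
      (implicit kt: ClassTag[K], vt: ClassTag[V],
       ord: Ordering[K] = null): PairRDDFunctions[K, V] =
    new PairRDDFunctions(rdd)
}
```

With the conversion in the companion object, `rdd.reduceByKey(_ + _)` compiles on any `RDD[(K, V)]` without extra imports, which is the behavior later Spark versions adopted.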