> On Fri, 14 Apr 2023 at 17:42, Yuval Itzchakov wrote:
Hi,
ATM I see the most used option for a Spark operator is the one provided by
Google: https://github.com/GoogleCloudPlatform/spark-on-k8s-operator
Unfortunately, it doesn't seem to be actively maintained. Are there any plans to
support an official, community-driven Apache Spark operator?
)" <
yur...@gmail.com> wrote:
> Yeah, but can't you use the following?
> 1. For data files: my/path/part-
> 2. For partitioned data: my/path/partition=
>
>
> Best regards
>
> On 13 Apr 2023, at 12:58, Yuval Itzchakov wrote:
>
>
> The problem is that specifyi
sue
>
> Best regards
>
> > On 13 Apr 2023, at 11:52, Yuval Itzchakov wrote:
> >
> >
> > Hi everyone,
> >
> > I am using Spark's FileStreamSink in order to write files to S3. On the
> S3 bucket, I have a lifecycle policy that deletes data older than
Anyone run into a similar problem?
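For reference, a minimal sketch of the kind of FileStreamSink job being
described (the source, paths and trigger are illustrative assumptions, not
the actual job):

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.streaming.Trigger

// Stream from some source and write Parquet files to S3 via the FileStreamSink.
// Bucket names, paths and the trigger interval are hypothetical.
val spark = SparkSession.builder().appName("file-sink-sketch").getOrCreate()

val events = spark.readStream
  .format("kafka")                                      // any streaming source
  .option("kafka.bootstrap.servers", "broker:9092")
  .option("subscribe", "events")
  .load()

events.writeStream
  .format("parquet")
  .option("path", "s3a://my-bucket/events/")            // the prefix the lifecycle rule targets
  .option("checkpointLocation", "s3a://my-bucket/checkpoints/events/")
  .trigger(Trigger.ProcessingTime("1 minute"))
  .start()

Note that the sink also keeps a _spark_metadata log under the output path, so
an expiration rule on that prefix can end up removing files the log still
references.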
--
Best Regards,
Yuval Itzchakov.
Hi,
I've recently installed Spark 2.2.1, and it seems the SQL tab isn't
getting updated at all. Although the "Jobs" tab gets updated with new
incoming jobs, the SQL tab remains empty the whole time.
I was wondering if anyone has noticed such a regression in 2.2.1?
--
tion
> threads are all daemon threads, so it should not affect the termination of
> the application whether the queries are active or not. May be something
> else is keeping the application alive?
>
>
>
> On Tue, Aug 29, 2017 at 2:09 AM, Yuval Itzchakov
> wrote:
>
>> I
y it isn't consuming anything from the
source and is logically dead.
Should this be the behavior? I think that perhaps there should be a
configuration that asks whether to completely shut down the application on
source failure.
What do you guys think?
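A minimal sketch of the behavior I have in mind, wired up manually with a
StreamingQueryListener (the logging and exit code are arbitrary choices for
illustration):

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.streaming.StreamingQueryListener
import org.apache.spark.sql.streaming.StreamingQueryListener._

// Shut the whole application down when any streaming query dies with an
// error, instead of leaving a JVM running that no longer consumes anything.
val spark = SparkSession.builder().appName("fail-fast-sketch").getOrCreate()

spark.streams.addListener(new StreamingQueryListener {
  override def onQueryStarted(event: QueryStartedEvent): Unit = ()
  override def onQueryProgress(event: QueryProgressEvent): Unit = ()
  override def onQueryTerminated(event: QueryTerminatedEvent): Unit = {
    // exception is defined when the query terminated because of a failure
    if (event.exception.isDefined) {
      System.err.println(s"Query ${event.id} failed: ${event.exception.get}")
      sys.exit(1)
    }
  }
})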
--
Best Regards,
Yuval Itzchakov.
DFS (Parquet), NoSQL (HBase, Cassandra), RDBMS
> (PostgreSQL, MySQL), Object Store (S3, Swift), or anything else I can’t think of
> going to be the underlying near real-time storage system?
>
> Thanks,
> Ben
>
>
> On May 15, 2016, at 3:36 PM, Yuval Itzchakov wrote:
>
> H
how that will be
> handled...
>
> At a quick glance at the code, it seems to be used already in streaming
> aggregations.
>
> Just my two cents,
>
> Ofir Manor
>
> Co-Founder & CTO | Equalum
>
> Mobile: +972-54-7801286 | Email: ofir.ma...@equalum.io
>
>
r how
things like "mapWithState" are going to be translated into RQ, and I think
that's the gap that's causing my misunderstanding.
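For concreteness, the streaming-aggregation case is the part that already
maps cleanly; a minimal sketch over an unbounded DataFrame (the source,
bucketing and sink are illustrative assumptions, not from this thread):

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

// Keyed running aggregation over an unbounded DataFrame.
val spark = SparkSession.builder().appName("streaming-agg-sketch").getOrCreate()

val events = spark.readStream
  .format("rate")                 // built-in test source emitting (timestamp, value)
  .option("rowsPerSecond", "10")
  .load()

val counts = events
  .groupBy((col("value") % 10).as("bucket"))
  .count()

counts.writeStream
  .outputMode("complete")         // keep the full aggregate table up to date
  .format("console")
  .start()
  .awaitTermination()

What I can't yet picture is the equivalent of an arbitrary, user-defined
per-key state update (a la mapWithState) in that model.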
On Mon, May 16, 2016 at 1:36 AM Yuval Itzchakov wrote:
> Hi Ofir,
> Thanks for the detailed answer. I have read both documents, where they
>
Hi Ofir,
Thanks for the detailed answer. I have read both documents, where they only
touch lightly on infinite DataFrames/Datasets. However, they do not go into
depth regarding how existing transformations on DStreams, for example, will be
translated into the Dataset APIs. I've been browsing the
> -Andrew
>
> 2016-03-08 11:21 GMT-08:00 Silvio Fiorito :
>
>> There’s a script to start it up under sbin, start-shuffle-service.sh. Run
>> that on each of your worker nodes.
Actually, I assumed that setting the flag in the spark job would turn on
the shuffle service in the workers. I now understand that assumption was
wrong.
Is there any way to set the flag via the driver? Or must I manually set it
via spark-env.sh on each worker?
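For completeness, this is the driver-side flag in question (app name is
illustrative), which, as it turns out, only makes executors use an external
shuffle service; it does not start one on the workers:

import org.apache.spark.{SparkConf, SparkContext}

// Driver-side configuration: executors will fetch shuffle data from an
// external shuffle service. This does NOT start the service itself; that
// still has to be done on each worker, e.g. via sbin/start-shuffle-service.sh
// or by enabling it in the worker's own configuration.
val conf = new SparkConf()
  .setAppName("shuffle-service-sketch")
  .set("spark.shuffle.service.enabled", "true")
  .set("spark.dynamicAllocation.enabled", "true")

val sc = new SparkContext(conf)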
On Tue, Mar 8, 2016, 20:14 Silvio Fiorito wrote:
As I said, it is the method which eventually serializes the object. It is
declared inside a companion object of a case class.
The problem is that Spark will still try to serialize the method, as it
needs to execute on the worker. How will that change the fact that
`EncodeJson[T]` is not serializable?
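One pattern that sidesteps the serialization issue, for the record, is to
construct the encoder on the executors instead of shipping it from the
driver; a sketch with a stand-in encoder type (not the real EncodeJson[T])
follows:

import org.apache.spark.{SparkConf, SparkContext}

// Stand-in for the real encoder type class; the point is only that it is
// not Serializable, like EncodeJson[T].
trait Encode[T] { def encode(t: T): String }

case class Event(id: String, value: Int)

object EncoderOnExecutorSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("encoder-sketch").setMaster("local[*]"))

    val events = sc.parallelize(Seq(Event("a", 1), Event("b", 2)))

    // Build the encoder inside mapPartitions, on the executor, so it never
    // has to be serialized as part of a closure.
    val json = events.mapPartitions { iter =>
      val enc: Encode[Event] = new Encode[Event] {
        def encode(e: Event): String = s"""{"id":"${e.id}","value":${e.value}}"""
      }
      iter.map(enc.encode)
    }

    json.collect().foreach(println)
    sc.stop()
  }
}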
Awesome. Thanks for the super fast reply.
On Thu, Feb 4, 2016, 21:16 Tathagata Das
wrote:
> Shixiong has already opened the PR -
> https://github.com/apache/spark/pull/11081
>
> On Thu, Feb 4, 2016 at 11:11 AM, Yuval Itzchakov
> wrote:
>
>> Let me know if you do need a
>> } else if (wrappedState.isUpdated || timeoutThresholdTime.isDefined)
>> /* <--- problem is here */ {
>> newStateMap.put(key, wrappedState.get(), batchTime.milliseconds)
>> }
>> mappedData ++= returned
>> }
>> {code}
>>
>> In case the stream has a timeout set, but the state wasn't set at all, the
>> "else-if" will still follow through because the timeout is defined but
>> "wrappedState" is empty and wasn't set.
>>
>> If it is mandatory to update state for each entry of *mapWithState*, then
>> this code should throw a better exception than "NoSuchElementException",
>> which doesn't really say anything to the developer.
>>
>> I haven't provided a fix myself because I'm not familiar with the Spark
>> implementation, but it seems there needs to either be an extra check for
>> whether the state is set, or, as previously stated, a better exception message.
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/PairDStreamFunctions-mapWithState-fails-in-case-timeout-is-set-without-updating-State-S-tp26147.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
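For anyone hitting the same thing: a sketch of a mapping function shape that
avoids the exception when a timeout is set, by always updating the state for
keys that are not being timed out (the types and update logic are
illustrative only):

import org.apache.spark.streaming.{Seconds, State, StateSpec}

// With a timeout configured, update the state for every non-timed-out key
// the function sees, so the "else if" branch quoted above never observes an
// unset wrappedState.
val spec = StateSpec.function(
  (key: String, value: Option[Int], state: State[Int]) => {
    if (state.isTimingOut()) {
      // Key is being timed out; the state can no longer be updated here.
      (key, state.getOption().getOrElse(0))
    } else {
      val newCount = state.getOption().getOrElse(0) + value.getOrElse(0)
      state.update(newCount) // always update, even if the value is unchanged
      (key, newCount)
    }
  }
).timeout(Seconds(600))

// usage: pairs.mapWithState(spec)   where pairs: DStream[(String, Int)]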
--
Best Regards,
Yuval Itzchakov.