> On Fri, 14 Apr 2023 at 17:42, Yuval Itzchakov wrote:
Hi,
ATM I see the most used option for a Spark operator is the one provided by
Google: https://github.com/GoogleCloudPlatform/spark-on-k8s-operator
Unfortunately, it doesn't seem to be actively maintained. Are there any plans to
support an official, community-driven Apache Spark operator?
)" <
yur...@gmail.com> wrote:
> Yeah, but can't you use the following?
> 1. For data files: my/path/part-
> 2. For partitioned data: my/path/partition=
>
>
> Best regards
>
> On 13 Apr 2023, at 12:58, Yuval Itzchakov wrote:
>
>
> The problem is that specifyi
sue
>
> Best regards
>
> > On 13 Apr 2023, at 11:52, Yuval Itzchakov wrote:
> >
> >
> > Hi everyone,
> >
> > I am using Spark's FileStreamSink in order to write files to S3. On the
> S3 bucket, I have a lifecycle policy that deletes data older than
Anyone run into a similar problem?
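For reference, a minimal sketch of the kind of FileStreamSink job being
described (the source, paths and trigger are illustrative assumptions, not
the actual job):

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.streaming.Trigger

// Stream from some source and write Parquet files to S3 via the FileStreamSink.
// Bucket names, paths and the trigger interval are hypothetical.
val spark = SparkSession.builder().appName("file-sink-sketch").getOrCreate()

val events = spark.readStream
  .format("kafka")                                      // any streaming source
  .option("kafka.bootstrap.servers", "broker:9092")
  .option("subscribe", "events")
  .load()

events.writeStream
  .format("parquet")
  .option("path", "s3a://my-bucket/events/")            // the prefix the lifecycle rule targets
  .option("checkpointLocation", "s3a://my-bucket/checkpoints/events/")
  .trigger(Trigger.ProcessingTime("1 minute"))
  .start()

Note that the sink also keeps a _spark_metadata log under the output path, so
an expiration rule on that prefix can end up removing files the log still
references.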
--
Best Regards,
Yuval Itzchakov.
Hi,
I've recently installed Spark 2.2.1, and it seems the SQL tab isn't
getting updated at all. Although the "Jobs" tab gets updated with new
incoming jobs, the SQL tab remains empty the whole time.
I was wondering if anyone has noticed such a regression in 2.2.1?
--
tion
> threads are all daemon threads, so it should not affect the termination of
> the application whether the queries are active or not. May be something
> else is keeping the application alive?
>
>
>
> On Tue, Aug 29, 2017 at 2:09 AM, Yuval Itzchakov
> wrote:
>
>> I
y it isn't consuming anything from the
source and is logically dead.
Should this be the behavior? I think that perhaps there should be a
configuration that asks whether to completely shut down the application on
source failure.
What do you guys think?
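A minimal sketch of the behavior I have in mind, wired up manually with a
StreamingQueryListener (the logging and exit code are arbitrary choices for
illustration):

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.streaming.StreamingQueryListener
import org.apache.spark.sql.streaming.StreamingQueryListener._

// Shut the whole application down when any streaming query dies with an
// error, instead of leaving a JVM running that no longer consumes anything.
val spark = SparkSession.builder().appName("fail-fast-sketch").getOrCreate()

spark.streams.addListener(new StreamingQueryListener {
  override def onQueryStarted(event: QueryStartedEvent): Unit = ()
  override def onQueryProgress(event: QueryProgressEvent): Unit = ()
  override def onQueryTerminated(event: QueryTerminatedEvent): Unit = {
    // exception is defined when the query terminated because of a failure
    if (event.exception.isDefined) {
      System.err.println(s"Query ${event.id} failed: ${event.exception.get}")
      sys.exit(1)
    }
  }
})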
--
Best Regards,
Yuval Itzchakov.
DFS (Parquet), NoSQL (HBase, Cassandra), RDBMS
> (PostgreSQL, MySQL), Object Store (S3, Swift), or anything else I can’t think of
> going to be the underlying near real-time storage system?
>
> Thanks,
> Ben
>
>
> On May 15, 2016, at 3:36 PM, Yuval Itzchakov wrote:
>
> H
how that will be
> handled...
>
> At a quick glance at the code, it seems to be used already in streaming
> aggregations.
>
> Just my two cents,
>
> Ofir Manor
>
> Co-Founder & CTO | Equalum
>
> Mobile: +972-54-7801286 | Email: ofir.ma...@equalum.io
>
>
r how
things like "mapWithState" are going to be translated into RQ, and I think
that's the gap that's causing my misunderstanding.
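For concreteness, the streaming-aggregation case is the part that already
maps cleanly; a minimal sketch over an unbounded DataFrame (the source,
bucketing and sink are illustrative assumptions, not from this thread):

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

// Keyed running aggregation over an unbounded DataFrame.
val spark = SparkSession.builder().appName("streaming-agg-sketch").getOrCreate()

val events = spark.readStream
  .format("rate")                 // built-in test source emitting (timestamp, value)
  .option("rowsPerSecond", "10")
  .load()

val counts = events
  .groupBy((col("value") % 10).as("bucket"))
  .count()

counts.writeStream
  .outputMode("complete")         // keep the full aggregate table up to date
  .format("console")
  .start()
  .awaitTermination()

What I can't yet picture is the equivalent of an arbitrary, user-defined
per-key state update (a la mapWithState) in that model.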
On Mon, May 16, 2016 at 1:36 AM Yuval Itzchakov wrote:
> Hi Ofir,
> Thanks for the detailed answer. I have read both documents, where they
>
Hi Ofir,
Thanks for the detailed answer. I have read both documents, where they only
touch lightly on infinite DataFrames/Datasets. However, they do not go into
depth regarding how existing transformations on DStreams, for example, will be
translated into the Dataset APIs. I've been browsing the
> -Andrew
>
> 2016-03-08 11:21 GMT-08:00 Silvio Fiorito :
>
>> There’s a script to start it up under sbin, start-shuffle-service.sh. Run
>> that on each of your worker nodes.
Actually, I assumed that setting the flag in the spark job would turn on
the shuffle service in the workers. I now understand that assumption was
wrong.
Is there any way to set the flag via the driver? Or must I manually set it
via spark-env.sh on each worker?
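For completeness, this is the driver-side flag in question (app name is
illustrative), which, as it turns out, only makes executors use an external
shuffle service; it does not start one on the workers:

import org.apache.spark.{SparkConf, SparkContext}

// Driver-side configuration: executors will fetch shuffle data from an
// external shuffle service. This does NOT start the service itself; that
// still has to be done on each worker, e.g. via sbin/start-shuffle-service.sh
// or by enabling it in the worker's own configuration.
val conf = new SparkConf()
  .setAppName("shuffle-service-sketch")
  .set("spark.shuffle.service.enabled", "true")
  .set("spark.dynamicAllocation.enabled", "true")

val sc = new SparkContext(conf)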
On Tue, Mar 8, 2016, 20:14 Silvio Fiorito wrote:
As I said, it is the method which eventually serializes the object. It is
declared inside a companion object of a case class.
The problem is that Spark will still try to serialize the method, as it
needs to execute on the worker. How will that change the fact that
`EncodeJson[T]` is not serializable?
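One pattern that sidesteps the serialization issue, for the record, is to
construct the encoder on the executors instead of shipping it from the
driver; a sketch with a stand-in encoder type (not the real EncodeJson[T])
follows:

import org.apache.spark.{SparkConf, SparkContext}

// Stand-in for the real encoder type class; the point is only that it is
// not Serializable, like EncodeJson[T].
trait Encode[T] { def encode(t: T): String }

case class Event(id: String, value: Int)

object EncoderOnExecutorSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("encoder-sketch").setMaster("local[*]"))

    val events = sc.parallelize(Seq(Event("a", 1), Event("b", 2)))

    // Build the encoder inside mapPartitions, on the executor, so it never
    // has to be serialized as part of a closure.
    val json = events.mapPartitions { iter =>
      val enc: Encode[Event] = new Encode[Event] {
        def encode(e: Event): String = s"""{"id":"${e.id}","value":${e.value}}"""
      }
      iter.map(enc.encode)
    }

    json.collect().foreach(println)
    sc.stop()
  }
}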
Awesome. Thanks for the super fast reply.
On Thu, Feb 4, 2016, 21:16 Tathagata Das
wrote:
> Shixiong has already opened the PR -
> https://github.com/apache/spark/pull/11081
>
> On Thu, Feb 4, 2016 at 11:11 AM, Yuval Itzchakov
> wrote:
>
>> Let me know if you do need a
>> } else if (wrappedState.isUpdated || timeoutThresholdTime.isDefined)
>> /* <--- problem is here */ {
>> newStateMap.put(key, wrappedState.get(), batchTime.milliseconds)
>> }
>> mappedData ++= returned
>> }
>> {code}
>>
>> In case the stream has a timeout set, but the state wasn't set at all, the
>> "else-if" will still follow through because the timeout is defined but
>> "wrappedState" is empty and wasn't set.
>>
>> If it is mandatory to update state for each entry of *mapWithState*, then
>> this code should throw a better exception than "NoSuchElementException",
>> which doesn't really say anything to the developer.
>>
>> I haven't provided a fix myself because I'm not familiar with the Spark
>> implementation, but it seems there needs to either be an extra check for
>> whether the state is set, or, as previously stated, a better exception message.
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/PairDStreamFunctions-mapWithState-fails-in-case-timeout-is-set-without-updating-State-S-tp26147.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
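For anyone hitting the same thing: a sketch of a mapping function shape that
avoids the exception when a timeout is set, by always updating the state for
keys that are not being timed out (the types and update logic are
illustrative only):

import org.apache.spark.streaming.{Seconds, State, StateSpec}

// With a timeout configured, update the state for every non-timed-out key
// the function sees, so the "else if" branch quoted above never observes an
// unset wrappedState.
val spec = StateSpec.function(
  (key: String, value: Option[Int], state: State[Int]) => {
    if (state.isTimingOut()) {
      // Key is being timed out; the state can no longer be updated here.
      (key, state.getOption().getOrElse(0))
    } else {
      val newCount = state.getOption().getOrElse(0) + value.getOrElse(0)
      state.update(newCount) // always update, even if the value is unchanged
      (key, newCount)
    }
  }
).timeout(Seconds(600))

// usage: pairs.mapWithState(spec)   where pairs: DStream[(String, Int)]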
--
Best Regards,
Yuval Itzchakov.