integration? I
came across a library called Apache Bahir, but is it a must to use a
library like that?
The code for my example can be found here:
https://github.com/pramodbiligiri/pubsub-spark
Pramod Biligiri
I was able to get it working. The job needed to instantiate a SparkSession
and then wait for a termination signal from the user. In my case I used a
StreamingContext -
https://spark.apache.org/docs/2.2.0/api/java/org/apache/spark/streaming/StreamingContext.html
Pramod Biligiri
On Sun, Aug 7, 2022 at 9
Hi,
Is there an easy way to see how a SparkSQL query plan maps to different
stages of the generated Spark job? The WebUI is entirely in terms of RDD
stages and I'm having a hard time mapping it back to my query.
Pramod
Hi,
Has anyone successfully used Java Flight Recorder (JFR) with Spark
Streaming on Oracle Java 8? JFR works for me on batch jobs but not with
Streaming.
I'm running my streaming job on Amazon EMR. I have enabled Java Flight
Recorder (JFR) to profile CPU usage. But at the end of the job, the JFR
o
+1. I would love to have the code for this as well.
Pramod
On Fri, Apr 3, 2015 at 12:47 PM, Tom wrote:
> Hi all,
>
> As we all know, Spark has set the record for sorting data, as published on:
> https://databricks.com/blog/2014/10/10/spark-petabyte-sort.html.
>
> Here at our group, we would lov
Hi,
I remember seeing a similar performance problem with Apache Shark last year
when compared to Hive, though that was in a company-specific port of the
code. Unfortunately I no longer have access to that code. The problem then
was reflection-based class creation in the critical path of reading eac
Hi,
I'm running Spark tasks with speculation enabled. I'm noticing that Spark
seems to wait in a given stage for all stragglers to finish, even though
the speculated alternative might have finished sooner. Is that correct?
Is there a way to indicate to Spark not to wait for stragglers to finish?
as
without speculation.
Pramod
On Mon, Sep 15, 2014 at 4:22 PM, Du Li wrote:
> There is a parameter spark.speculation that is turned off by default.
> Look at the configuration doc:
> http://spark.apache.org/docs/latest/configuration.html
>
>
>
> From: Pramod Biligiri
> D
Hi,
I'm trying to read some data in RCFiles using Spark, but can't seem to find
a suitable example anywhere. Currently I've written the following bit of
code that lets me count() the number of records, but when I try to do a
collect() or a map(), it fails with a ConcurrentModificationException. I'm
ru
naged by Hive (and thus present in a Hive metastore)? In
>> that case, Spark SQL (
>> https://spark.apache.org/docs/latest/sql-programming-guide.html) is the
>> easiest way.
>>
>> Matei
>>
>> On September 23, 2014 at 2:26:10 PM, Pramod Biligiri (
>> pram