Re: How to run spark connect in kubernetes?

2024-10-02 Thread kant kodali
Please ignore this; it was a DNS issue. On Wed, Oct 2, 2024 at 11:16 AM kant kodali wrote: > Here > <https://stackoverflow.com/questions/79048006/how-to-run-spark-connect-server-in-kubernetes-as-a-spark-driver-in-client-mode> > are more details about my question that I posted i

Re: How to run spark connect in kubernetes?

2024-10-02 Thread kant kodali
Here <https://stackoverflow.com/questions/79048006/how-to-run-spark-connect-server-in-kubernetes-as-a-spark-driver-in-client-mode> are more details about my question that I posted in SO On Tue, Oct 1, 2024 at 11:32 PM kant kodali wrote: > Hi All, > > Is it possible to run a Spark

How to run spark connect in kubernetes?

2024-10-01 Thread kant kodali
Hi All, Is it possible to run a Spark Connect server in Kubernetes while configuring it to communicate with Kubernetes as the cluster manager? If so, is there any example? Thanks
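A minimal sketch of the client side, assuming a Spark 3.4+ Connect server is already running in the cluster and exposed through a Kubernetes Service; the service name, namespace, and port below are hypothetical placeholders, and the cluster-manager settings (e.g. --master k8s://https://<api-server>) belong to the server launch, not to this client. The client only needs the spark-connect-client-jvm artifact.

```scala
// Hedged sketch: connect a Scala client to a Spark Connect server assumed to be
// reachable at a made-up Kubernetes Service address on the default Connect port.
import org.apache.spark.sql.SparkSession

object ConnectClientSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .remote("sc://spark-connect.default.svc.cluster.local:15002") // placeholder endpoint
      .getOrCreate()

    // Any Dataset operation is shipped to the Connect server for execution.
    spark.range(10).show()
  }
}
```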

Re: structured streaming join of streaming dataframe with static dataframe performance

2022-08-04 Thread kant kodali
I suspect it is probably because the incoming rows, when joined with the static frame, can lead to a variable degree of skewness over time; if so, it is probably better to employ different join strategies at run time. But if you know your Dataset, I believe you can just do a broadcast join for your cas
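A minimal sketch of the broadcast-join suggestion above, assuming the static side is small enough to ship to every executor; the topic, paths, and column names are invented for illustration.

```scala
// Hedged sketch: stream-static join with an explicit broadcast hint on the static side.
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.broadcast

object StreamStaticBroadcastJoin {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("stream-static-join").getOrCreate()

    // Small static dimension table, read once up front (placeholder path).
    val staticDf = spark.read.parquet("/data/dimensions")

    // Streaming fact data arriving from Kafka (placeholder topic and columns).
    val streamDf = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092")
      .option("subscribe", "events")
      .load()
      .selectExpr("CAST(key AS STRING) AS id", "CAST(value AS STRING) AS payload")

    // Broadcasting the static side avoids shuffling the (possibly skewed) streaming side.
    val joined = streamDf.join(broadcast(staticDf), Seq("id"))

    joined.writeStream
      .format("console")
      .option("checkpointLocation", "/tmp/checkpoints/stream-static-join")
      .start()
      .awaitTermination()
  }
}
```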

Re: https://spark-project.atlassian.net/browse/SPARK-1153

2020-02-24 Thread kant kodali
's great! Thanks On Sun, Feb 23, 2020 at 3:53 PM kant kodali wrote: > Hi All, > > Any chance of fixing this one ? > https://spark-project.atlassian.net/browse/SPARK-1153 or offer some work > around may be? > > Currently, I got bunch of events streaming into kafka across var

Re: SparkGraph review process

2020-02-23 Thread kant kodali
Hi Sean, In that case, can we have GraphFrames as part of the Spark release? Or a separate release is also fine. Currently, I don't see any releases w.r.t. GraphFrames. Thanks On Fri, Feb 14, 2020 at 9:06 AM Sean Owen wrote: > This will not be Spark 3.0, no. > > On Fri, Feb 14, 2020 a

https://spark-project.atlassian.net/browse/SPARK-1153

2020-02-23 Thread kant kodali
Hi All, Any chance of fixing this one? https://spark-project.atlassian.net/browse/SPARK-1153 or offering some workaround, maybe? Currently, I have a bunch of events streaming into Kafka across various topics, and they are stamped with a UUIDv1 for each event, so it is easy to construct edges using UU
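Assuming the ask here is non-Long vertex identifiers in GraphX (which the UUID-per-event description suggests), the usual workaround is to derive a Long vertex id from each UUID. The sketch below invents the event/edge fields; folding a 128-bit UUID into 64 bits can collide, so a zipWithUniqueId-based id mapping is the safer (if heavier) alternative.

```scala
// Hedged sketch: map UUID keys to Long vertex ids so the edges fit GraphX's API.
import java.util.UUID
import org.apache.spark.sql.SparkSession
import org.apache.spark.graphx.{Edge, Graph}

object UuidGraphSketch {
  // Collapse a UUID into a Long vertex id (simple, but collision-prone).
  def toVertexId(u: UUID): Long = u.getMostSignificantBits ^ u.getLeastSignificantBits

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("uuid-graph").getOrCreate()
    val sc = spark.sparkContext

    // (eventId, parentEventId) pairs, e.g. reconstructed from the Kafka topics.
    val links = sc.parallelize(Seq(
      (UUID.randomUUID(), UUID.randomUUID()),
      (UUID.randomUUID(), UUID.randomUUID())
    ))

    val edges = links.map { case (src, dst) =>
      Edge(toVertexId(src), toVertexId(dst), "caused-by")
    }
    // Keep the original UUID string as the vertex attribute so nothing is lost.
    val vertices = links.flatMap { case (a, b) => Seq(a, b) }
      .distinct()
      .map(u => (toVertexId(u), u.toString))

    val graph = Graph(vertices, edges)
    println(s"vertices=${graph.numVertices}, edges=${graph.numEdges}")
  }
}
```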

Re: SparkGraph review process

2020-02-13 Thread kant kodali
any update on this? Is spark graph going to make it into Spark or no? On Mon, Oct 14, 2019 at 12:26 PM Holden Karau wrote: > Maybe let’s ask the folks from Lightbend who helped with the previous > scala upgrade for their thoughts? > > On Mon, Oct 14, 2019 at 8:24 PM Xiao Li wrote: > >> 1. On th

https://github.com/google/zetasql

2019-05-21 Thread kant kodali
https://github.com/google/zetasql

Re: queryable state & streaming

2019-03-16 Thread kant kodali
Any update on this? On Wed, Oct 24, 2018 at 4:26 PM Arun Mahadevan wrote: > I don't think separate API or RPCs etc might be necessary for queryable > state if the state can be exposed as just another datasource. Then the sql > queries can be issued against it just like executing sql queries agai

Re: Plan on Structured Streaming in next major/minor release?

2018-11-01 Thread kant kodali
If I can add one thing to this list, I would say stateless aggregations using raw SQL. For example: as I read micro-batches from Kafka, I want to do, say, a count of that micro-batch and spit it out using raw SQL (no count aggregation across batches). On Tue, Oct 30, 2018 at 4:55 PM Jungtaek Lim
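A minimal sketch of one way to get this per-batch behaviour, assuming Spark 2.4+ where foreachBatch is available: each micro-batch is registered as a temp view and counted with plain SQL, with no aggregation state carried across batches. The topic, view name, and paths are placeholders.

```scala
// Hedged sketch: stateless per-micro-batch count via foreachBatch and raw SQL.
import org.apache.spark.sql.{DataFrame, SparkSession}

object PerBatchCount {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("per-batch-count").getOrCreate()

    val stream = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092")
      .option("subscribe", "events")
      .load()

    // Register each micro-batch as a temp view and count it with plain SQL;
    // nothing is carried over from one batch to the next.
    val countBatch: (DataFrame, Long) => Unit = (batchDf, batchId) => {
      batchDf.createOrReplaceTempView("current_batch")
      batchDf.sparkSession
        .sql(s"SELECT $batchId AS batch_id, COUNT(*) AS cnt FROM current_batch")
        .show()
    }

    stream.writeStream
      .foreachBatch(countBatch)
      .option("checkpointLocation", "/tmp/checkpoints/per-batch-count")
      .start()
      .awaitTermination()
  }
}
```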

Re: Plan on Structured Streaming in next major/minor release?

2018-10-20 Thread kant kodali
+1 For Raising all this. +1 For Queryable State (SPARK-16738 [3]) On Thu, Oct 18, 2018 at 9:59 PM Jungtaek Lim wrote: > Small correction: "timeout" in map/flatmapGroupsWithState would not work > similar as State TTL when event time and watermark is set. So timeout in > map/flatmapGroupsWithState
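A small sketch of the timeout mechanism being discussed, assuming a made-up per-user click-count example: per-key state in mapGroupsWithState with a processing-time timeout, which (as noted above) behaves differently from an event-time/watermark-driven TTL. The event shape, topic, and 30-minute timeout are invented.

```scala
// Hedged sketch: keyed state with a processing-time timeout in mapGroupsWithState.
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.streaming.{GroupState, GroupStateTimeout, OutputMode}

case class Click(userId: String)
case class UserCount(userId: String, clicks: Long)

object TimeoutSketch {
  def updateState(userId: String,
                  events: Iterator[Click],
                  state: GroupState[Long]): UserCount = {
    if (state.hasTimedOut) {
      // Timeout fired with no new data for this key: emit the total and drop the state.
      val total = state.get
      state.remove()
      UserCount(userId, total)
    } else {
      val total = state.getOption.getOrElse(0L) + events.size
      state.update(total)
      // Re-arm a 30-minute processing-time timeout whenever new data arrives.
      state.setTimeoutDuration("30 minutes")
      UserCount(userId, total)
    }
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("timeout-sketch").getOrCreate()
    import spark.implicits._

    val clicks = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092")
      .option("subscribe", "clicks")
      .load()
      .selectExpr("CAST(key AS STRING) AS userId")
      .as[Click]

    val counts = clicks
      .groupByKey(_.userId)
      .mapGroupsWithState(GroupStateTimeout.ProcessingTimeTimeout)(updateState)

    counts.writeStream
      .outputMode(OutputMode.Update())
      .format("console")
      .option("checkpointLocation", "/tmp/checkpoints/timeout-sketch")
      .start()
      .awaitTermination()
  }
}
```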

Re: Feature request: Java-specific transform method in Dataset

2018-07-01 Thread kant kodali
I am not affiliated with Flink or Spark, but I do think some of the thoughts here make sense. On Sun, Jul 1, 2018 at 4:12 PM, Sean Owen wrote: > It's true, that is

Re: [VOTE] Spark 2.3.1 (RC1)

2018-05-16 Thread kant kodali
16, 2018 at 3:22 AM, Marco Gaido wrote: > I'd be against having a new feature in a minor maintenance release. I > think such a release should contain only bugfixes. > > 2018-05-16 12:11 GMT+02:00 kant kodali : > >> Can this https://issues.apache.org/jira/browse/SPARK-23

Re: [VOTE] Spark 2.3.1 (RC1)

2018-05-16 Thread kant kodali
Can this https://issues.apache.org/jira/browse/SPARK-23406 be part of 2.3.1? On Tue, May 15, 2018 at 2:07 PM, Marcelo Vanzin wrote: > Bummer. People should still feel welcome to test the existing RC so we > can rule out other issues. > > On Tue, May 15, 2018 at 2:04 PM, Xiao Li wrote: > > -1 >

Re: [VOTE] Spark 2.3.0 (RC4)

2018-02-21 Thread kant kodali
Hi All, +1 for the tickets proposed by Ryan Blue. Any possible chance of this one https://issues.apache.org/jira/browse/SPARK-23406 getting into 2.3.0? It's a very important feature for us, so if it doesn't make the cut I would have to cherry-pick this commit and compile from the source for our pro
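Assuming SPARK-23406 refers to enabling stream-stream self-joins (as shipped in Spark 2.4), here is a hedged sketch of that pattern: a stream joined with itself via aliases, with watermarks and a time-range condition to keep the join state bounded. The topic, columns, and one-hour window are made up.

```scala
// Hedged sketch: stream-stream self-join with aliases, watermark, and a time constraint.
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.expr

object SelfJoinSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("self-join").getOrCreate()

    val events = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092")
      .option("subscribe", "events")
      .load()
      .selectExpr(
        "CAST(key AS STRING) AS id",
        "CAST(value AS STRING) AS parentId",
        "timestamp AS eventTime")
      .withWatermark("eventTime", "1 hour")

    // Join the stream with itself: child events matched against their parents.
    val joined = events.as("child").join(
      events.as("parent"),
      expr("""
        child.parentId = parent.id AND
        child.eventTime BETWEEN parent.eventTime AND parent.eventTime + INTERVAL 1 HOUR
      """))

    joined.writeStream
      .format("console")
      .option("checkpointLocation", "/tmp/checkpoints/self-join")
      .start()
      .awaitTermination()
  }
}
```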

Re: [ANNOUNCE] Announcing Apache Spark 2.2.0

2017-07-17 Thread kant kodali
+1 On Tue, Jul 11, 2017 at 3:56 PM, Jean Georges Perrin wrote: > Awesome! Congrats! Can't wait!! > > jg > > > On Jul 11, 2017, at 18:48, Michael Armbrust > wrote: > > Hi all, > > Apache Spark 2.2.0 is the third release of the Spark 2.x line. This > release removes the experimental tag from Stru

Re: Question on Spark code

2017-06-25 Thread kant kodali
/simple/SimpleLogger.java#L599 >> >> Please correct me if I am wrong. >> >> >> >> >> On Sun, Jun 25, 2017 at 3:04 AM, Sean Owen wrote: >> >>> Maybe you are looking for declarations like this. "=> String" means the >>> arg

Re: Question on Spark code

2017-06-25 Thread kant kodali
with log > statements. The message isn't constructed unless it will be logged. > > protected def logInfo(msg: => String) { > > > On Sun, Jun 25, 2017 at 10:28 AM kant kodali wrote: > >> Hi All, >> >> I came across this file https://github.com/apache/spark/b
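A standalone sketch of the by-name (`msg: => String`) trick explained above: the message expression is only evaluated if the level is enabled, so an expensive string never gets built for suppressed log calls. The tiny logger here is illustrative only, not Spark's actual Logging trait.

```scala
// Hedged sketch: a by-name parameter defers evaluation of the log message.
object LazyLoggingSketch {
  var infoEnabled: Boolean = false

  // `msg` is passed by name: the argument expression is not evaluated
  // until (and unless) `msg` is referenced inside the method body.
  def logInfo(msg: => String): Unit = {
    if (infoEnabled) println(s"INFO $msg")
  }

  def expensiveDump(): String = {
    println("building the message...") // side effect to show when evaluation happens
    (1 to 1000).mkString(",")
  }

  def main(args: Array[String]): Unit = {
    logInfo(s"state = ${expensiveDump()}")  // infoEnabled is false: expensiveDump() never runs
    infoEnabled = true
    logInfo(s"state = ${expensiveDump()}")  // now the message is built and printed
  }
}
```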

Question on Spark code

2017-06-25 Thread kant kodali
Hi All, I came across this file https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/internal/Logging.scala and I am wondering what the purpose of this is. In particular, it doesn't prevent any string concatenation, and the if checks are already done by the library itse

Re: Running into the same problem as JIRA SPARK-19268

2017-05-24 Thread kant kodali
Even if I do simple count aggregation like below I get the same error as https://issues.apache.org/jira/browse/SPARK-19268 Dataset df2 = df1.groupBy(functions.window(df1.col("Timestamp5"), "24 hours", "24 hours"), df1.col("AppName")).count(); On Wed, May

Running into the same problem as JIRA SPARK-19268

2017-05-24 Thread kant kodali
Hi All, I am using Spark 2.1.1 and running in Standalone mode using HDFS and Kafka. I am running into the same problem as https://issues.apache.org/jira/browse/SPARK-19268 with my app (not KafkaWordCount). Here is my sample code. *Here is how I create ReadStream* sparkSession.readStream()
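Not a confirmed fix for SPARK-19268, just a hedged sketch of the configuration usually checked first in this situation: give the query an explicit checkpoint location on HDFS rather than relying on a temporary local directory. The source, paths, and column names mirror the quoted snippet but are otherwise placeholders.

```scala
// Hedged sketch: windowed count with an explicit, durable checkpoint directory.
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.window

object WindowedCountWithCheckpoint {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("windowed-count").getOrCreate()

    val df1 = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092")
      .option("subscribe", "events")
      .load()
      .selectExpr("timestamp AS Timestamp5", "CAST(key AS STRING) AS AppName")

    val df2 = df1
      .groupBy(window(df1.col("Timestamp5"), "24 hours", "24 hours"), df1.col("AppName"))
      .count()

    df2.writeStream
      .outputMode("complete")
      .format("console")
      // Durable checkpoint directory on HDFS, reused across restarts.
      .option("checkpointLocation", "hdfs:///user/spark/checkpoints/windowed-count")
      .start()
      .awaitTermination()
  }
}
```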

Spark 2.2.0 or Spark 2.3.0?

2017-05-01 Thread kant kodali
Hi All, If I understand the Spark standard release process correctly, it looks like the official release is going to be sometime at the end of this month and it is going to be 2.2.0, right (not 2.3.0)? I am eagerly looking for Spark 2.2.0 because of the "update mode" option in Spark Streaming. Please corr
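A small sketch of the "update" output mode mentioned above (available from Spark 2.2): only the rows whose aggregate changed in a trigger are written out. The rate source and bucketing column are invented for illustration.

```scala
// Hedged sketch: streaming aggregation emitted with outputMode("update").
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

object UpdateModeSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("update-mode").getOrCreate()

    // The built-in rate source just generates (timestamp, value) rows.
    val counts = spark.readStream
      .format("rate")
      .option("rowsPerSecond", "10")
      .load()
      .groupBy((col("value") % 5).as("bucket"))
      .count()

    counts.writeStream
      .outputMode("update")            // emit only buckets updated in this trigger
      .format("console")
      .option("checkpointLocation", "/tmp/checkpoints/update-mode")
      .start()
      .awaitTermination()
  }
}
```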

Re: is there a way to persist the lineages generated by spark?

2017-04-06 Thread kant kodali
ion, but in the end the description is wrong. > > > On 4. Apr 2017, at 05:19, kant kodali wrote: > > > > Hi All, > > > > I am wondering if there a way to persist the lineages generated by spark > underneath? Some of our clients want us to prove if the result of t

is there a way to persist the lineages generated by spark?

2017-04-03 Thread kant kodali
Hi All, I am wondering if there is a way to persist the lineages generated by Spark underneath? Some of our clients want us to prove that the result of the computation we are showing on a dashboard is correct, and for that, if we can show the lineage of transformations that are executed to get to th
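A hedged sketch of one way to keep an audit trail of how a result was produced: capture the RDD lineage (toDebugString) and the DataFrame query plans (the queryExecution text) and write them out alongside the result. Paths and the computation itself are placeholders; this records the plan text, not a formal provenance proof.

```scala
// Hedged sketch: persist the plan/lineage text next to the computed result.
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Paths}
import org.apache.spark.sql.SparkSession

object PersistLineageSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("persist-lineage").master("local[*]").getOrCreate()

    val result = spark.range(1000)
      .filter("id % 2 = 0")
      .selectExpr("sum(id) AS total")

    // Logical + physical plans for the DataFrame computation.
    val planText = result.queryExecution.toString
    // Classic RDD lineage string for the underlying RDD.
    val rddLineage = result.rdd.toDebugString

    Files.createDirectories(Paths.get("/tmp/audit"))
    Files.write(Paths.get("/tmp/audit/plan.txt"), planText.getBytes(StandardCharsets.UTF_8))
    Files.write(Paths.get("/tmp/audit/rdd-lineage.txt"), rddLineage.getBytes(StandardCharsets.UTF_8))

    result.show()
    spark.stop()
  }
}
```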

Are we still dependent on Guava jar in Spark 2.1.0 as well?

2017-02-26 Thread kant kodali
Are we still dependent on Guava jar in Spark 2.1.0 as well (Given Guava jar incompatibility issues)?

Re: Java 9

2017-02-07 Thread kant kodali
Well, and the module system! On Tue, Feb 7, 2017 at 4:03 AM, Timur Shenkao wrote: > If I'm not wrong, they got rid of *sun.misc.Unsafe* in Java 9. > > This class is still used by several libraries & frameworks. > > http://mishadoff.com/blog/java-magic-part-4-sun-dot-misc-dot-unsafe/ > > On Tue

Re: Wrting data from Spark streaming to AWS Redshift?

2016-12-11 Thread kant kodali
@shyla a side question: what can Redshift do that Spark cannot? Trying to understand your use case. On Fri, Dec 9, 2016 at 8:47 PM, ayan guha wrote: > Ideally, saving data to external sources should not be any different. Give > the write options as stated in the blog a shot, but changing

Re: How do I convert json_encoded_blob_column into a data frame? (This may be a feature request)

2016-11-17 Thread kant kodali
e mapping your JSON payloads to > tractable data structures will depend on business logic. > > The strategy of pulling out a blob into its own RDD and feeding it into the > JSON loader should work for any data source once you have your data > strategy figured out. > > On Wed,

Another Interesting Question on SPARK SQL

2016-11-17 Thread kant kodali
Which parts in the diagram above are executed by DataSource connectors and which parts are executed by Tungsten? Or, to put it another way, in which phase in the diagram above does Tungsten leverage the DataSource connectors (such as, say, the Cassandra connector)? My understanding so far is that con

How do I convert json_encoded_blob_column into a data frame? (This may be a feature request)

2016-11-16 Thread kant kodali
https://spark.apache.org/docs/2.0.2/sql-programming-guide.html#json-datasets "Spark SQL can automatically infer the schema of a JSON dataset and load it as a DataFrame. This conversion can be done using SQLContext.read.json() on either an RDD of String, or a JSON file." val df = spark.sql("SELECT
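A minimal sketch of the "pull the blob into its own Dataset of strings and feed it to the JSON reader" approach discussed in this thread; the table, column, and JSON shape are invented. The Dataset[String] overload used here is Spark 2.2+ (on 2.0 the RDD[String] overload plays the same role), and on newer versions from_json(col, schema) is usually the cleaner option.

```scala
// Hedged sketch: parse a column of JSON-encoded strings into its own DataFrame.
import org.apache.spark.sql.SparkSession

object JsonBlobColumnSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("json-blob").master("local[*]").getOrCreate()
    import spark.implicits._

    // Stand-in for a table whose `payload` column holds JSON-encoded blobs.
    val raw = Seq(
      (1L, """{"user":"a","score":10}"""),
      (2L, """{"user":"b","score":20}""")
    ).toDF("id", "payload")

    // Extract the blob column as Dataset[String] and let Spark infer the schema.
    val parsed = spark.read.json(raw.select("payload").as[String])

    parsed.printSchema()
    parsed.show()
    spark.stop()
  }
}
```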

Re: Spark Improvement Proposals

2016-10-12 Thread kant kodali
Some of you guys may have already seen this but in case if you haven't you may want to check it out. http://www.slideshare.net/sbaltagi/flink-vs-spark On Tue, Oct 11, 2016 at 1:57 PM, Ryan Blue wrote: > I don't think we will have trouble with whatever rule that is adopted for > accepting prop

Re: This Exception has been really hard to trace

2016-10-10 Thread kant kodali
'META-INF/*.DSA' zip64 true} This successfully creates the jar but the error still persists. On Sun, Oct 9, 2016 11:44 PM, Shixiong(Ryan) Zhu shixi...@databricks.com wrote: Seems the runtime Spark is different from the compiled one. You should mark the Spark components "provided"
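A sketch of the "mark the Spark components provided" advice above, expressed as an sbt build definition rather than the Gradle/shadow build used in the thread (the Gradle equivalent would typically use compileOnly); the artifact versions are illustrative. With "provided", Spark classes are available at compile time but are not bundled into the fat jar, so the cluster's own Spark runtime is the one actually loaded.

```scala
// Hedged build.sbt sketch: Spark as "provided", connector bundled in the assembly.
name := "my-streaming-app"
scalaVersion := "2.11.8"

libraryDependencies ++= Seq(
  "org.apache.spark"   %% "spark-core"                % "2.0.1" % "provided",
  "org.apache.spark"   %% "spark-sql"                 % "2.0.1" % "provided",
  // The Cassandra connector stays at compile scope so it ends up in the fat jar.
  "com.datastax.spark" %% "spark-cassandra-connector" % "2.0.0-M3"
)
```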

Re: This Exception has been really hard to trace

2016-10-09 Thread kant kodali
Hi Reynold, Actually, I did that well before posting my question here. Thanks, kant On Sun, Oct 9, 2016 8:48 PM, Reynold Xin r...@databricks.com wrote: You should probably check with DataStax who build the Cassandra connector for Spark. On Sun, Oct 9, 2016 at 8:13 PM, kant kodali wrote

This Exception has been really hard to trace

2016-10-09 Thread kant kodali
I tried SpanBy but it looks like there is a strange error happening no matter which way I try. Like the one described here for the Java solution: http://qaoverflow.com/question/how-to-use-spanby-in-java/ java.lang.ClassCastException: cannot assign instance of scala.collection.immutable.List$Serializ

Fwd: seeing this message repeatedly.

2016-09-05 Thread kant kodali
-- Forwarded message -- From: kant kodali Date: Sat, Sep 3, 2016 at 5:39 PM Subject: seeing this message repeatedly. To: "user @spark" Hi Guys, I am running my driver program on my local machine and my spark cluster is on AWS. The big question is I don't kn

Re: What are the names of the network protocols used between Spark Driver, Master and Workers?

2016-08-30 Thread kant kodali
OK, I will answer my own question. Looks like Netty-based RPC. On Mon, Aug 29, 2016 9:22 PM, kant kodali kanth...@gmail.com wrote: What are the names of the network protocols used between Spark Driver, Master and Workers?

What are the names of the network protocols used between Spark Driver, Master and Workers?

2016-08-29 Thread kant kodali
What are the names of the network protocols used between Spark Driver, Master and Workers?

Re: is the Lineage of RDD stored as a byte code in memory or a file?

2016-08-24 Thread kant kodali
Wed, Aug 24, 2016 at 2:00 AM, kant kodali < kanth...@gmail.com > wrote: Hi Guys, I have this question for a very long time and after diving into the source code(specifically from the links below) I have a feeling that the lineage of an RDD (the transformations) are converted into byte code and sto

is the Lineage of RDD stored as a byte code in memory or a file?

2016-08-23 Thread kant kodali
Hi Guys, I have had this question for a very long time, and after diving into the source code (specifically from the links below) I have a feeling that the lineage of an RDD (the transformations) is converted into byte code and stored in memory or on disk. Or, if I were to ask another question on a similar
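A tiny sketch for poking at this directly: the lineage is kept as a chain of RDD objects and their Dependency links on the driver (the closures inside them are ordinary serialized objects), and toDebugString renders that chain as text. Run locally to inspect it; the transformations below are arbitrary.

```scala
// Hedged sketch: inspect an RDD's lineage graph rather than any bytecode blob.
import org.apache.spark.sql.SparkSession

object LineageInspection {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("lineage").master("local[*]").getOrCreate()
    val sc = spark.sparkContext

    val base = sc.parallelize(1 to 1000)
    val derived = base.map(_ * 2).filter(_ % 3 == 0)

    // Human-readable rendering of the RDD dependency graph.
    println(derived.toDebugString)
    // The parent links themselves are plain objects reachable from the RDD.
    derived.dependencies.foreach(dep => println(dep.rdd))

    spark.stop()
  }
}
```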