Custom Session Windowing in Spark using Scala/Python

2023-08-03 Thread Ravi Teja
Hi, I am new to Spark and looking for help regarding session windowing. I want to create session windows on a user activity stream with a gap duration of `x` minutes and also have a
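
For the gap-duration part of the question, Spark 3.2+ ships a built-in `session_window` function in `org.apache.spark.sql.functions`. Below is a minimal Scala sketch, assuming a stream with hypothetical `userId` and `eventTime` columns and a 5-minute gap standing in for `x`; the `rate` source is just a stand-in so the example runs end to end:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, count, lit, session_window}

object SessionWindowExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("SessionWindowExample")
      .master("local[*]")
      .getOrCreate()

    // Stand-in source: "rate" emits (timestamp, value) rows; rename them
    // to look like a user-activity stream for illustration purposes.
    val events = spark.readStream
      .format("rate")
      .load()
      .withColumnRenamed("timestamp", "eventTime")
      .withColumnRenamed("value", "userId")

    // session_window closes a session once a user has been idle for the
    // full gap duration ("5 minutes" here stands in for `x`). A watermark
    // is required so finished sessions can be finalized and emitted.
    val sessions = events
      .withWatermark("eventTime", "10 minutes")
      .groupBy(col("userId"), session_window(col("eventTime"), "5 minutes"))
      .agg(count(lit(1)).as("events"))

    sessions.writeStream
      .outputMode("append")
      .format("console")
      .start()
      .awaitTermination()
  }
}
```

On Spark versions before 3.2 there is no built-in session window, so the usual route is `flatMapGroupsWithState` with hand-rolled session state.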

Reg: Creating Spark on a virtual machine through Chef

2023-04-24 Thread sunkara akhil sai teja
Hi team, this is Akhil. I am trying to set up Spark on a virtual machine through Chef. Could you please help us with how we can do it? If possible, could you please share the documentation. Regards, Akhil

[Spark Java] Longest Continuous Subsequence

2020-12-13 Thread Ravi Teja
Hi All, any help in writing code to find the longest continuous subsequence between two columns? For example: col1: "sparkJava", col2: "Java8" --> result: "Java". Thanks in advance. Regards, Raviteja
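
What is being asked for here is essentially the longest common substring of two string columns, which a classic dynamic-programming table wrapped in a UDF can compute. A minimal Scala sketch, assuming hypothetical column names `col1` and `col2`:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, udf}

object LongestCommonSubstring {
  // Classic O(m*n) longest-common-substring dynamic program:
  // dp(i)(j) = length of the common suffix of a(0..i) and b(0..j).
  def lcs(a: String, b: String): String = {
    if (a == null || b == null) return ""
    val dp = Array.ofDim[Int](a.length + 1, b.length + 1)
    var best = 0 // length of the longest match found so far
    var endA = 0 // index in `a` just past that match
    for (i <- 1 to a.length; j <- 1 to b.length) {
      if (a(i - 1) == b(j - 1)) {
        dp(i)(j) = dp(i - 1)(j - 1) + 1
        if (dp(i)(j) > best) { best = dp(i)(j); endA = i }
      }
    }
    a.substring(endA - best, endA)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("LongestCommonSubstring")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    val lcsUdf = udf(lcs _)
    val df = Seq(("sparkJava", "Java8")).toDF("col1", "col2")
    df.withColumn("result", lcsUdf(col("col1"), col("col2"))).show()
    // "sparkJava" vs "Java8" -> "Java"
  }
}
```

Note this finds the longest common *substring* (contiguous), which matches the "sparkJava"/"Java8" -> "Java" example; a longest common *subsequence* (non-contiguous) would need a different recurrence.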

Re: LiveListenerBus is occupying most of the Driver Memory and frequent GC is degrading the performance

2020-09-11 Thread Teja
so much memory? This will help us determine whether the memory usage is due to some user code/library holding references to large objects/graphs of objects, or whether the memory usage is actually in listener-related code. Regards, Mridul

Re: LiveListenerBus is occupying most of the Driver Memory and frequent GC is degrading the performance

2020-09-11 Thread Teja
Sorry for the poor formatting.

LiveListenerBus is occupying most of the Driver Memory and frequent GC is degrading the performance

2020-08-11 Thread Teja
We have ~120 executors with 5 cores each for a very long-running job which crunches ~2.5 TB of data and has too many filters to query. Currently, we have ~30k partitions, which makes ~90 MB per partition. We are using Spark v2.2.2 as of now. The major problem we are facing is due to GC on the driver
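
Two knobs commonly discussed for this symptom are bounding how much history the UI listeners retain on the driver and capping the listener-bus event queue itself. A minimal sketch, assuming the property names from the Spark configuration docs; the values are illustrative, not recommendations, and on Spark 2.2.x the queue key is `spark.scheduler.listenerbus.eventqueue.size` rather than `.capacity`:

```scala
import org.apache.spark.sql.SparkSession

object ListenerBusTuning {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("ListenerBusTuning")
      // Bound how much job/stage/task history the UI listeners keep on
      // the driver; with ~30k tasks per stage this state adds up fast.
      .config("spark.ui.retainedJobs", "100")
      .config("spark.ui.retainedStages", "100")
      .config("spark.ui.retainedTasks", "10000")
      // Cap the listener-bus event queue; when it fills up, further
      // events are dropped with a warning instead of buffering in
      // driver memory (at the cost of incomplete UI/metrics).
      .config("spark.scheduler.listenerbus.eventqueue.capacity", "10000")
      .getOrCreate()

    // Fewer, larger partitions also mean fewer task start/end events on
    // the bus, e.g. coalescing from ~90 MB toward ~128-256 MB each.
    spark.stop()
  }
}
```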

Is it possible to use Hadoop 3.x and Hive 3.x using spark 2.4?

2020-07-06 Thread Teja
We use Spark 2.4.0 to connect to a Hadoop 2.7 cluster and query Hive Metastore version 2.3, but the cluster management team has decided to upgrade to Hadoop 3.x and Hive 3.x. We could not migrate to Spark 3 yet, which is compatible with Hadoop 3 and Hive 3, as we could not test whether anything breaks.
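
For the metastore side specifically, Spark can be pointed at a metastore version different from its built-in Hive client via `spark.sql.hive.metastore.version` and `spark.sql.hive.metastore.jars`, provided the requested version is within the range that particular Spark release supports (the Spark 2.4 docs list support through the 2.3.x metastore; Hive 3.x metastore versions are documented for Spark 3.x). A minimal sketch, assuming a version inside the supported range and Hive support on the classpath:

```scala
import org.apache.spark.sql.SparkSession

object HiveMetastoreVersionExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("HiveMetastoreVersionExample")
      // Ask Spark for a specific Hive metastore client version; it must
      // be within the range supported by this Spark release.
      .config("spark.sql.hive.metastore.version", "2.3.3")
      // "maven" downloads matching client jars at startup; production
      // deployments usually point this at pre-staged jars instead.
      .config("spark.sql.hive.metastore.jars", "maven")
      .enableHiveSupport()
      .getOrCreate()

    spark.sql("SHOW DATABASES").show()
    spark.stop()
  }
}
```

Whether a 2.3.x metastore client keeps working against an upgraded Hive 3.x metastore service is a compatibility question for the Hive side, so testing against the upgraded cluster is still the safe path.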