Hi Team,
I am trying to use regexp_replace in Spark SQL and it is throwing this error:
expected , but found Scalar
in 'reader', line 9, column 45:
... select translate(payload, '"', '"') as payload
I am trying to replace every \\\" in the payload with a plain ".
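Roughly what I am trying to do, as a minimal sketch with the DataFrame API (df and the column name payload are placeholders; the pattern needs one level of escaping for the Scala string and one for the regex):

import org.apache.spark.sql.functions.{col, regexp_replace}

// Replace a literal backslash-quote sequence (\") with a plain quote (").
val cleaned = df.withColumn("payload", regexp_replace(col("payload"), "\\\\\"", "\""))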
Hi Team,
I have Kafka messages where the JSON comes in as a string. How can I create a table
after converting the JSON string into a parsed structure using Spark SQL?
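A minimal sketch of what I have in mind, assuming the Kafka value column holds the JSON string and the schema is known up front (kafkaDf and the field names are placeholders):

import org.apache.spark.sql.functions.{col, from_json}
import org.apache.spark.sql.types.{LongType, StringType, StructType}

// Placeholder schema; replace with the real fields of the JSON messages.
val schema = new StructType()
  .add("id", LongType)
  .add("name", StringType)

val parsed = kafkaDf
  .selectExpr("CAST(value AS STRING) AS json")       // Kafka value arrives as bytes
  .select(from_json(col("json"), schema).as("data")) // parse the JSON string
  .select("data.*")                                  // flatten into columns

parsed.createOrReplaceTempView("events")             // now queryable from Spark SQL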
Hey Jay,
How are you creating your cluster? Are you using a Spark cluster?
All of this should be set up automatically.
Sent from my iPhone
> On Feb 21, 2019, at 12:12 PM, Felix Cheung wrote:
>
> You should check with HDInsight support
>
> From: Jay Singh
> Sent: Wednesday, February 20, 2019
Hi Team,
The way we do this today is to call a Java program that executes the stored procedure.
Is there any way we can achieve the same thing using PySpark?
Hi Lehak,
You can make a Scala project with an Oozie workflow
and one runner class that ships your Python file to the cluster.
Define an Oozie coordinator with a Spark action or a shell action.
This is how we deploy our PySpark-based machine learning code.
Sent from my iPhone
> On Aug 2, 2018, at 8:46 AM, Lehak D
Hi Guys,
PySpark is not picking up the correct Python version on Azure HDInsight.
The properties are set in spark2-env:
PYSPARK_PYTHON=${PYSPARK3_PYTHON:-/usr/bin/anaconda/envs/py35/bin/python3}
export PYSPARK_DRIVER_PYTHON=${PYSPARK3_PYTHON:-/usr/bin/anaconda/envs/py35/bin/python3}
Thanks
.coalesce(1)
.as[String]
.writeStream
.trigger(ProcessingTime("60 seconds"))
.option("checkpointLocation", checkpointUrl)
.foreach(new SimpleSqlServerSink(jdbcUrl, connectionProperties))
On Sat, May 5, 2018 at 12:20 PM, amit kumar singh
wrote:
> Hi Community,
>
>
Hi Team,
I have a requirement where I need to combine all of the JSON messages arriving
in a Structured Streaming batch into one single message, separated by a comma
or any other delimiter, and then store it.
I have tried grouping by Kafka partition and I have tried using concat, but it is
not working.
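Roughly what I am after, as a minimal sketch with foreachBatch (Spark 2.4+), assuming the messages arrive as a string value column; kafkaDf and the output path are placeholders:

import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{collect_list, concat_ws}

val query = kafkaDf
  .selectExpr("CAST(value AS STRING) AS json")
  .writeStream
  .foreachBatch { (batch: DataFrame, batchId: Long) =>
    // Collapse every message in this micro-batch into one comma-separated string.
    val combined = batch.agg(concat_ws(",", collect_list("json")).as("payload"))
    combined.write.mode("append").text(s"/tmp/combined/batch_$batchId")  // placeholder sink
  }
  .start()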
Hi Team,
We are trying to move data from one Azure subscription to another Azure
subscription. Is there a faster way to do this through Spark?
I am using distcp and it is taking forever.
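For reference, a plain Spark-side copy would look roughly like this (the account, container, and path names are placeholders, and it assumes both storage accounts are reachable from the cluster):

// Read from the source subscription's storage account and write to the target's;
// parquet is a placeholder format.
val df = spark.read.parquet("wasbs://data@sourceaccount.blob.core.windows.net/path/in")
df.write.mode("overwrite").parquet("wasbs://data@targetaccount.blob.core.windows.net/path/out")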
thanks
rohit
Hi Community,
I have a use case where I need to call a stored procedure from Structured
Streaming.
I am able to send the Kafka message and call the stored procedure,
but the foreach sink keeps executing the stored procedure once per message.
I want to combine all of the messages into a single DataFrame and then call the stored procedure once per batch.
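A minimal sketch of the per-batch version with foreachBatch (Spark 2.4+), assuming a SQL Server JDBC driver on the classpath; the staging table, the procedure name dbo.LoadEvents, and the connection settings are placeholders:

import java.sql.DriverManager
import org.apache.spark.sql.DataFrame

val query = kafkaDf
  .selectExpr("CAST(value AS STRING) AS json")
  .writeStream
  .foreachBatch { (batch: DataFrame, _: Long) =>
    // Bulk-load the whole micro-batch into a staging table in one write...
    batch.write.mode("append").jdbc(jdbcUrl, "dbo.staging_events", connectionProperties)
    // ...then call the stored procedure once per batch instead of once per message.
    val conn = DriverManager.getConnection(jdbcUrl, connectionProperties)
    try conn.prepareCall("{call dbo.LoadEvents()}").execute()
    finally conn.close()
  }
  .start()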
Hi Team,
I am working on structured streaming
I have added all of the libraries in build.sbt, but it is still not picking up the right
library and is failing with this error:
User class threw exception: java.lang.ClassNotFoundException: Failed to
find data source: kafka. Please find packages at
http://spark.apache.org/
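For reference, the Kafka source needs the spark-sql-kafka dependency both at compile time and on the cluster. A minimal sketch of the build.sbt line, assuming Spark 2.x on Scala 2.11 (adjust the versions to match the cluster):

libraryDependencies += "org.apache.spark" %% "spark-sql-kafka-0-10" % "2.4.0"

If the job is launched with spark-submit, the same artifact can also be passed with --packages org.apache.spark:spark-sql-kafka-0-10_2.11:2.4.0 or bundled into an assembly jar, since a compile-time dependency alone does not put the class on the executors.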
Hi Guys,
I have a stored procedure that does the transformation and writes it to a SQL
Server table.
I am not able to execute this through Spark. Is there any way a Spark
Streaming sink can simply call these stored procedures?
Also, how do I bulk insert using a Spark Streaming job?
Sent from my iPhone
iao
wrote:
> Would you mind share your code with us to analyze?
>
> > On Feb 10, 2018, at 10:18 AM, amit kumar singh
> wrote:
> >
> > Hi Team,
> >
> > We have hive external table which has 50 tb of data partitioned on year
> month day
> >
> >
Hi Team,
We have a Hive external table with 50 TB of data, partitioned on year,
month, and day.
I want to move the last 2 months of data into another table.
When I try to do this through Spark, more than 120k tasks get created.
What is the best way to do this?
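For reference, a minimal sketch of a partition-pruned copy, assuming the partition columns are year, month, and day and the target table already exists; the table names and the two-month window are placeholders. Filtering only on partition columns is what keeps Spark from scanning all 50 TB:

// Dynamic partition insert; may need hive.exec.dynamic.partition.mode=nonstrict.
spark.sql("""
  INSERT INTO TABLE target_table PARTITION (year, month, day)
  SELECT * FROM source_table
  WHERE year = 2018 AND month IN (1, 2)
""")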
thanks
Rohit
Hello everyone
I want to use Spark with the Java API.
Please let me know how I can configure it.
Thanks
A
Hi Evan, Patrick and Tobias,
So, it worked for what I needed it to do. I followed Yana's suggestion of
using a parameterized type of [T <: Product : ClassTag : TypeTag].
More concretely, I was trying to make the query process a bit more fluent.
Some pseudocode, but with the correct types:
val table:SparkTa
Hi All,
I am having some trouble trying to write generic code that uses sqlContext
and RDDs. Can you suggest what might be wrong?
class SparkTable[T : ClassTag](val sqlContext:SQLContext, val extractor:
(String) => (T) ) {
private[this] var location:Option[String] =None
private[this] var na
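A self-contained sketch of the pattern that came out of this thread (the Product : ClassTag : TypeTag bound mentioned above), targeting the SQLContext API used here; everything except SQLContext and the reflection imports is a placeholder:

import scala.reflect.ClassTag
import scala.reflect.runtime.universe.TypeTag
import org.apache.spark.sql.{DataFrame, SQLContext}

class SparkTable[T <: Product : ClassTag : TypeTag](
    val sqlContext: SQLContext,
    val extractor: String => T) extends Serializable {

  // The Product + TypeTag bounds let createDataFrame derive a schema from T;
  // a ClassTag alone is not enough for that call.
  def load(path: String): DataFrame = {
    val rdd = sqlContext.sparkContext.textFile(path).map(extractor)
    sqlContext.createDataFrame(rdd)
  }
}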
Hey guys,
What is the best way for me to get an RDD[(K, V)] from a MapFile created by
MapFile.Writer?
The MapFile has a Text key and MyArrayWritable as the value.
Something akin to sc.textFile($path).
So far I have tried two approaches: sc.hadoopFile and sc.sequenceFile.
#1
val rdd= sc.hadoopFile[
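For reference, a minimal sketch of the sc.sequenceFile route, which generally works because a MapFile's data part is itself a SequenceFile; the path is a placeholder and MyArrayWritable is assumed to be on the classpath:

import org.apache.hadoop.io.Text

val rdd = sc.sequenceFile("/path/to/mapfile", classOf[Text], classOf[MyArrayWritable])
  .map { case (k, v) => (k.toString, v) }  // copy the Text key; Hadoop reuses Writable objects

Note that the value is also a reused Writable, so it should be copied or converted to a plain type before caching or collecting the RDD.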
Hi Folks,
I am new to Spark, and this is probably a basic question.
I have a file on HDFS:
1, one
1, uno
2, two
2, dos
I want to create a multi-map RDD, i.e. RDD[Map[String,List[String]]]:
{"1"->["one","uno"], "2"->["two","dos"]}
First I read the file:
val identityData:RDD[String] = sc.textFile(
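A minimal sketch of one way to finish this, assuming the comma-separated layout above; it builds an RDD[(String, List[String])] with groupByKey, which gives the {"1"->["one","uno"], "2"->["two","dos"]} grouping (the exact RDD[Map[...]] type could be built on top of it if it is really needed):

val identityData = sc.textFile("hdfs:///path/to/file")    // placeholder path

val grouped: org.apache.spark.rdd.RDD[(String, List[String])] =
  identityData
    .map(_.split(",").map(_.trim))                        // "1, one" -> Array("1", "one")
    .collect { case Array(key, value) => (key, value) }   // keep well-formed lines only
    .groupByKey()
    .mapValues(_.toList)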