Hi
@Marco, the multiple rows written are not dupes, as the current-timestamp field
is different in each of them.
@Ayan, I checked and found that my whole code is rerun twice. Although there
seems to be no error, is the cluster manager's re-run behaviour configurable?
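For what it's worth, if the cluster manager is YARN (an assumption; the thread does not say which one is in use), the number of application attempts can be capped so that a failed driver is not silently rerun. A minimal sketch, with a hypothetical app name:

import org.apache.spark.sql.SparkSession

// Sketch only: assumes the job runs on YARN in cluster mode. Capping attempts at 1
// stops YARN from retrying the ApplicationMaster, and with it the whole driver program.
val spark = SparkSession.builder()
  .appName("parquet-append-job")              // hypothetical app name
  .config("spark.yarn.maxAppAttempts", "1")   // otherwise yarn.resourcemanager.am.max-attempts applies (commonly 2)
  .getOrCreate()

The same setting can also be passed at submit time via --conf spark.yarn.maxAppAttempts=1.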
On Tue, Oct 17, 2017 at 6:45 PM, ayan g wrote:
It should not be parallel exec, as the logging code is called in the driver.
Have you checked whether your driver is rerun by the cluster manager due to
any failure or error situation?
On Tue, Oct 17, 2017 at 11:52 PM, Marco Mistroni wrote:
Hi
If the problem is really with parallel execution, you can try to call
repartition(1) before you save.
Alternatively, try to store the data in a CSV file and see if you get the same
behaviour, to exclude DynamoDB issues.
Also, are the multiple rows being written dupes (do they all have the same
fields/values)?
Hth
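A minimal sketch of the repartition(1) suggestion, assuming data_df is the DataFrame from the code shared in this thread and using a hypothetical output path:

// Sketch of the suggestion above, not the poster's actual code: collapsing to a
// single partition means only one task performs the save.
data_df
  .repartition(1)
  .write
  .mode("append")
  .parquet("hdfs:///data/output_parquet")   // hypothetical output path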
This is the code -
hdfs_path=
// choose the reader based on the file extension in the path
if(hdfs_path.contains(".avro")){
  data_df = spark.read.format("com.databricks.spark.avro").load(hdfs_path)
}else if(hdfs_path.contains(".tsv")){
  data_df = spark.read.option("delimiter","\t").option("header","true").csv(hdfs_path)
}else if(hdfs_path.c
Can you share your code?
On Tue, 17 Oct 2017 at 10:22 pm, Harsh Choudhary wrote:
Hi
I'm running a Spark job in which I am appending new data into a Parquet file.
At the end, I make a log entry in my DynamoDB table stating the number of
records appended, the time, etc. Instead of one single entry in the database,
multiple entries are being made to it. Is it because of parallel execution?
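For context, a minimal sketch of what such a driver-side DynamoDB log write could look like; the table name, attribute names, and use of the AWS SDK v1 client are illustrative assumptions, not the poster's actual code:

import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder
import com.amazonaws.services.dynamodbv2.model.AttributeValue
import scala.collection.JavaConverters._

// Runs once in the driver after the Parquet append has finished.
val recordsAppended = data_df.count()                     // records written in this run
val dynamo = AmazonDynamoDBClientBuilder.defaultClient()  // credentials/region from the environment
val item = Map(
  "job_name"     -> new AttributeValue().withS("parquet-append"),                // hypothetical partition key
  "records"      -> new AttributeValue().withN(recordsAppended.toString),
  "logged_at_ms" -> new AttributeValue().withN(System.currentTimeMillis.toString)
).asJava
dynamo.putItem("job_log", item)                           // hypothetical table name

If a block like this is only called once in the driver, seeing it executed twice usually points to the whole application being re-attempted rather than to parallel execution of tasks.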