Bagavath,
Sometimes we need to merge existing records because the whole dataset gets
recomputed. I don't think we could achieve this with a plain insert, or is
there a way?
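For reference, the kind of upsert I mean can be expressed as a single Oracle MERGE statement and still batched through JDBC like a plain insert. This is only a sketch; the table and column names (RESULTS, LOG_KEY, LOG_COUNT) are made up for illustration:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Map;

public class MergeSketch {

    // Oracle MERGE (upsert): update the row if the key exists, insert otherwise.
    // Table and column names are hypothetical.
    static String buildMergeSql(String table) {
        return "MERGE INTO " + table + " t "
             + "USING (SELECT ? AS log_key, ? AS log_count FROM dual) s "
             + "ON (t.log_key = s.log_key) "
             + "WHEN MATCHED THEN UPDATE SET t.log_count = s.log_count "
             + "WHEN NOT MATCHED THEN INSERT (log_key, log_count) "
             + "VALUES (s.log_key, s.log_count)";
    }

    // MERGE rows batch through JDBC exactly like plain inserts.
    static void upsertAll(Connection conn, Map<String, Long> counts) throws SQLException {
        conn.setAutoCommit(false);
        try (PreparedStatement ps = conn.prepareStatement(buildMergeSql("RESULTS"))) {
            for (Map.Entry<String, Long> e : counts.entrySet()) {
                ps.setString(1, e.getKey());
                ps.setLong(2, e.getValue());
                ps.addBatch();
            }
            ps.executeBatch();
            conn.commit();
        }
    }

    public static void main(String[] args) {
        System.out.println(buildMergeSql("RESULTS"));
    }
}
```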
On 24 July 2015 at 08:53, Bagavath wrote:
Try using insert instead of merge. Typically we use insert append (the
direct-path /*+ APPEND */ hint) to do bulk inserts into Oracle.
On Thu, Jul 23, 2015 at 1:12 AM, diplomatic Guru wrote:
Thanks Robin for your reply.

I'm pretty sure that writing to Oracle is what takes longer, since writing to
HDFS only takes ~5 minutes.

The job writes about 5 million records. I've set the job to call
executeBatch() when the batch size reaches 200,000 records, so I assume
that the commit will only happen at those batch boundaries.
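To make the commit behaviour explicit, the batching pattern I have in mind looks roughly like this sketch (table and column names are illustrative; commitCount is just the arithmetic, and note executeBatch() alone does not commit unless auto-commit is on):

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Iterator;

public class BatchWriter {

    // Number of executeBatch()/commit cycles for a given record count
    // (ceiling division).
    static long commitCount(long totalRecords, int batchSize) {
        return (totalRecords + batchSize - 1) / batchSize;
    }

    // Batched insert with one commit per executeBatch().
    static void writeAll(Connection conn, Iterator<String[]> rows) throws SQLException {
        final int batchSize = 200_000;
        conn.setAutoCommit(false);  // make commit boundaries explicit
        try (PreparedStatement ps = conn.prepareStatement(
                "INSERT INTO results (log_key, log_count) VALUES (?, ?)")) {
            int pending = 0;
            while (rows.hasNext()) {
                String[] row = rows.next();
                ps.setString(1, row[0]);
                ps.setLong(2, Long.parseLong(row[1]));
                ps.addBatch();
                if (++pending == batchSize) {
                    ps.executeBatch();
                    conn.commit();  // one commit per 200,000 rows, ~25 total for 5M
                    pending = 0;
                }
            }
            if (pending > 0) {
                ps.executeBatch();
                conn.commit();
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(commitCount(5_000_000, 200_000)); // prints 25
    }
}
```

So for 5 million records at a batch size of 200,000, that would be about 25 commits in total.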
The first question I would ask is: have you determined whether you have a
performance issue writing to Oracle? In particular, how many commits are you
making? If you are issuing a lot of commits, that would be a performance
problem.
Robin
On 22 Jul 2015, at 19:11, diplomatic Guru wrote:
Hello all,
We are having a major performance issue with Spark, which is blocking us
from going live.

We have a job that carries out computation on log files and writes the
results into an Oracle DB.
The reducer 'reduceByKey' has been set to a parallelism of 4, as we don't
want to establish too many connections to the database.
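The idea is that reduceByKey(sumFunc, 4) gives us 4 partitions, and each partition later gets exactly one DB connection, so at most 4 concurrent connections. The hash-partitioning part of that, sketched in plain Java without Spark (key names are illustrative):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class PartitionSketch {

    // Hash-partition the reduced (key -> count) results into `parallelism`
    // groups, mirroring what reduceByKey(sumFunc, 4) does in Spark. If each
    // partition later opens exactly one JDBC connection (e.g. inside
    // foreachPartition), 4 partitions means at most 4 concurrent connections.
    static List<Map<String, Long>> partition(Map<String, Long> reduced, int parallelism) {
        List<Map<String, Long>> parts = new ArrayList<>();
        for (int i = 0; i < parallelism; i++) {
            parts.add(new TreeMap<>());
        }
        for (Map.Entry<String, Long> e : reduced.entrySet()) {
            // floorMod keeps the partition index non-negative for any hashCode
            int p = Math.floorMod(e.getKey().hashCode(), parallelism);
            parts.get(p).put(e.getKey(), e.getValue());
        }
        return parts;
    }

    public static void main(String[] args) {
        Map<String, Long> reduced = new TreeMap<>();
        reduced.put("host-a", 120L);
        reduced.put("host-b", 45L);
        reduced.put("host-c", 300L);
        System.out.println(partition(reduced, 4).size()); // prints 4
    }
}
```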