I have a similar requirement to export data to MySQL. I just wanted to know
what the best approach is, given the research you've all done so far.
Currently I'm thinking of saving to HDFS and using Sqoop to handle the export.
Is that the best approach, or is there another way to write to MySQL?
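Concretely, I'm picturing something like this (untested sketch; the RDD type,
paths, and table names are placeholders):

  rdd.map { case (id, name) => s"$id,$name" }          // format rows as CSV lines
     .saveAsTextFile("hdfs:///user/etl/mysql_export")
  // then, from the shell:
  //   sqoop export \
  //     --connect jdbc:mysql://dbhost/mydb --table users \
  //     --export-dir /user/etl/mysql_export \
  //     --input-fields-terminated-by ','

Thanks!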
https://www.youtube.com/watch?v=C7gWtxelYNM&feature=youtu.be
Jim Donahue
Adobe
-----Original Message-----
From: Ron Gonzalez [mailto:zlgonza...@yahoo.com.INVALID]
Sent: Wednesday, August 06, 2014 7:18 AM
To: Vida Ha
Cc: u...@spark.incubator.apache.org
Subject: Re: Save an RDD to a SQL Database
Hi Vida,
That's a good idea - to write to files first and then load. Thanks.
On Thu, Aug 7, 2014 at 11:26 AM, Flavio Pompermaier wrote:
> Isn't sqoop export meant for that?
> http://hadooped.blogspot.it/2013/06/apache-sqoop-part-3-data-transfer.html?m=1
Isn't sqoop export meant for that?
http://hadooped.blogspot.it/2013/06/apache-sqoop-part-3-data-transfer.html?m=1
On Aug 7, 2014 7:59 PM, "Nicholas Chammas" wrote:
> Vida,
> What kind of database are you trying to write to?
Vida,
What kind of database are you trying to write to?
For example, I found that for loading into Redshift, by far the easiest
thing to do was to save my output from Spark as a CSV to S3, and then load
it from there into Redshift. This is not as slow as you think, because Spark
can write the output in parallel.
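Roughly like this (sketch only; bucket and table names are made up, and I'm
assuming an RDD of pairs):

  rdd.map { case (id, name) => s"$id,$name" }          // one CSV line per record
     .saveAsTextFile("s3n://my-bucket/redshift-load/")
  // then in Redshift:
  //   COPY my_table FROM 's3://my-bucket/redshift-load/'
  //   CREDENTIALS 'aws_access_key_id=...;aws_secret_access_key=...'
  //   DELIMITER ',';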
The use case I was thinking of was outputting calculations made in Spark
into a SQL database for the presentation layer to access. So in other
words, having a Spark backend in Java that writes to a SQL database and
then having a Rails front-end that can display the data nicely.
Right, Spark acts more like an OLAP system; I believe no one will use Spark
as an OLTP system, so there is always the question of how to share data
between these two platforms efficiently.
More importantly, most enterprise BI tools rely on an RDBMS, or at least a
JDBC/ODBC interface.
On Thu, Aug 7, 2014 at 11:25 AM, Cheng Lian wrote:
> Maybe a little off topic, but would you mind sharing your motivation for
> saving the RDD into an SQL DB?
Many possible reasons (Vida, please chime in with yours!):
- You have an existing database you want to load new data into so ever
On Thu, Aug 7, 2014 at 11:08 AM, 诺铁 wrote:
> What if the network breaks halfway through the process? Should we drop all
> the data in the database and restart from the beginning?
The best way to deal with this -- which, unfortunately, is not commonly
supported -- is with a two-phase commit that can span connections.
Maybe a little off topic, but would you mind sharing your motivation for
saving the RDD into an SQL DB?
If you're just trying to do further transformations/queries with SQL for
convenience, then you can just use Spark SQL directly within your Spark
application without saving the data into a DB:
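For example, something along these lines (rough sketch against the 1.0-era
API; the case class and data are made up, and sc is an existing SparkContext):

  import org.apache.spark.sql.SQLContext

  case class Record(id: Int, name: String)

  val sqlContext = new SQLContext(sc)
  import sqlContext.createSchemaRDD                // implicit RDD -> SchemaRDD conversion
  val records = sc.parallelize(Seq(Record(1, "a"), Record(2, "b")))
  records.registerAsTable("records")               // registerTempTable in later versions
  sqlContext.sql("SELECT name FROM records WHERE id > 1").collect()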
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/rdd/JdbcRDD.scala
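And if you need to pull existing tables back into Spark, the JdbcRDD linked
above can do that; a rough sketch (connection details and query are made up):

  import java.sql.{DriverManager, ResultSet}
  import org.apache.spark.rdd.JdbcRDD

  val jdbcRdd = new JdbcRDD(
    sc,
    () => DriverManager.getConnection("jdbc:mysql://dbhost/mydb", "user", "pass"),
    "SELECT id, name FROM users WHERE id >= ? AND id <= ?",
    1, 100000, 10,                                 // lowerBound, upperBound, numPartitions
    (r: ResultSet) => (r.getInt(1), r.getString(2)))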
On Thu, Aug 7, 2014 at 8:08 AM, 诺铁 wrote:
> I haven't seen people write directly to a SQL database,
> mainly because it's difficult to deal with failure:
> what if the network breaks halfway through the process?
I haven't seen people write directly to a SQL database,
mainly because it's difficult to deal with failure.
What if the network breaks halfway through the process? Should we drop all the
data in the database and restart from the beginning? And if the process is
appending data to the database, things become even more complex.
Hi Vida,
I am writing to a DB -- or trying to :).
I believe the best practice for this (you can search the mailing list
archives) is to use a combination of mapPartitions and a grouped iterator.
Look at this thread, esp. the comment from A. Boisvert and Matei's comment
above it:
https://groups
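In sketch form it looks like this (untested; connection details and the table
are made up, assuming an RDD of (id, score) pairs):

  rdd.foreachPartition { rows =>
    val conn = java.sql.DriverManager.getConnection("jdbc:mysql://dbhost/mydb", "user", "pass")
    conn.setAutoCommit(false)
    val stmt = conn.prepareStatement("INSERT INTO results (id, score) VALUES (?, ?)")
    try {
      rows.grouped(500).foreach { batch =>         // the grouped iterator: insert in batches
        batch.foreach { case (id, score) =>
          stmt.setInt(1, id)
          stmt.setDouble(2, score)
          stmt.addBatch()
        }
        stmt.executeBatch()
        conn.commit()                              // one commit per batch
      }
    } finally {
      stmt.close()
      conn.close()
    }
  }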
Hi Vida,
It's possible to save an RDD as a Hadoop file using Hadoop output formats. It
might be worthwhile to investigate using DBOutputFormat and see if it will
work for you.
I haven't personally written to a DB, but I'd imagine this would be one way
to do it.
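Something like the following might work (untested sketch; driver, table, and
column names are made up):

  import java.sql.{PreparedStatement, ResultSet}
  import org.apache.hadoop.io.NullWritable
  import org.apache.hadoop.mapred.JobConf
  import org.apache.hadoop.mapred.lib.db.{DBConfiguration, DBOutputFormat, DBWritable}
  import org.apache.spark.SparkContext._           // saveAsHadoopDataset on pair RDDs

  class ResultRow(var id: Int, var name: String) extends DBWritable {
    def this() = this(0, null)                     // Hadoop needs a no-arg constructor
    def write(st: PreparedStatement): Unit = { st.setInt(1, id); st.setString(2, name) }
    def readFields(rs: ResultSet): Unit = { id = rs.getInt(1); name = rs.getString(2) }
  }

  val conf = new JobConf()
  DBConfiguration.configureDB(conf, "com.mysql.jdbc.Driver", "jdbc:mysql://dbhost/mydb", "user", "pass")
  DBOutputFormat.setOutput(conf, "results", "id", "name")  // also sets DBOutputFormat as the output format
  rdd.map(t => (new ResultRow(t._1, t._2), NullWritable.get())).saveAsHadoopDataset(conf)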
Thanks,
Ron