Both C* and MySQL support is available in Spark. For C*, the datastax:spark-cassandra-connector package is needed. Reading and writing data in Spark is very simple.
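First, create a SparkSession. A minimal sketch follows; the app name and the Cassandra contact host are placeholders, so adjust them for your cluster:

from pyspark.sql import SparkSession

# Placeholder app name and Cassandra host; change for your environment.
spark = SparkSession.builder \
    .appName('cassandra_to_mysql') \
    .config('spark.cassandra.connection.host', '127.0.0.1') \
    .getOrCreate()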
To read a C* table, use:

df = spark.read.format("org.apache.spark.sql.cassandra") \
    .options(keyspace='test', table='test_table') \
    .load()

and to write the data to a MySQL table, use:

df.write.format('jdbc').options(
    url='jdbc:mysql://localhost/database_name',
    driver='com.mysql.jdbc.Driver',
    dbtable='DestinationTableName',
    user='your_user_name',
    password='your_password').mode('append').save()

When submitting the Spark program, use the following command:

bin/spark-submit \
    --packages datastax:spark-cassandra-connector:2.0.7-s_2.11 \
    --jars external/mysql-connector-java-5.1.40-bin.jar \
    /path_to_your_program/spark_database.py

This should solve your problem and save you time.

On Tue, May 15, 2018 at 11:04 PM, kurt greaves <k...@instaclustr.com> wrote:

> COPY might work, but over hundreds of gigabytes you'll probably run into
> issues if you're overloaded. If you've got access to Spark, that would be
> an efficient way to pull down an entire table and dump it out using the
> spark-cassandra-connector.
>
> On 15 May 2018 at 10:59, Jing Meng <self.rel...@gmail.com> wrote:
>
>> Hi guys, for some historical reason, our cassandra cluster is currently
>> overloaded, and operating on it has somehow become a nightmare. Anyway,
>> (sadly) we're planning to migrate the cassandra data back to mysql...
>>
>> So we're not quite clear on how to migrate the historical data from
>> cassandra.
>>
>> While I know there is the COPY command, I wonder if it works in a
>> production env where hundreds of gigabytes of data are present. And, if
>> it does, would it impact server performance significantly?
>>
>> Apart from that, I know the spark-connector can be used to scan data
>> from the c* cluster, but I'm not that familiar with spark and still not
>> sure whether writing data to a mysql database can be done naturally with
>> the spark-connector.
>>
>> Are there any suggestions/best-practices/reading-materials for doing
>> this?
>>
>> Thanks!

--
Regards,
Arbab Khalil
Software Design Engineer