Re: Use case for data in SQL Server

2015-02-24 Thread Denny Lee
Hi Suhel, My team currently works with a lot of SQL Server databases as one of our many data sources, and ultimately we pull the data from SQL Server into HDFS. As we had a lot of SQL Server databases to hit, we used the jTDS driver and Sqoop to extract the data out of SQL Server and into HDFS (smal

Re: Use case for data in SQL Server

2015-02-24 Thread Cheng Lian
There is a newly introduced JDBC data source in Spark 1.3.0 (not the JdbcRDD in Spark core), which may be useful. However, there is currently no SQL Server-specific logic implemented. I'd assume standard SQL queries should work. Cheng
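
For anyone following along, here is a minimal sketch of what reading a SQL Server table through that JDBC data source might look like from PySpark 1.3. It is not from the thread: the helper names, host, database, and table are hypothetical, and the choice of the jTDS driver (mentioned elsewhere in this thread for Sqoop) is an assumption; the jTDS jar would need to be on the Spark classpath. The option keys (`url`, `dbtable`, `driver`, `partitionColumn`, `lowerBound`, `upperBound`, `numPartitions`) are the ones documented for the Spark 1.3 JDBC data source.

```python
def sqlserver_jdbc_options(host, database, table, port=1433,
                           partition_column=None, lower_bound=None,
                           upper_bound=None, num_partitions=None):
    """Build the option map for Spark's JDBC data source (hypothetical helper).

    Assumes the jTDS driver; swap the URL and driver class if you use
    Microsoft's own JDBC driver instead.
    """
    opts = {
        "url": "jdbc:jtds:sqlserver://%s:%d/%s" % (host, port, database),
        "driver": "net.sourceforge.jtds.jdbc.Driver",
        "dbtable": table,
    }
    partitioning = (partition_column, lower_bound, upper_bound, num_partitions)
    if any(v is not None for v in partitioning):
        # Spark expects the four partitioning options together or not at all.
        if any(v is None for v in partitioning):
            raise ValueError("partition_column, lower_bound, upper_bound and "
                             "num_partitions must be given together")
        opts.update({
            "partitionColumn": partition_column,
            # Data source options are passed as strings.
            "lowerBound": str(lower_bound),
            "upperBound": str(upper_bound),
            "numPartitions": str(num_partitions),
        })
    return opts


def load_sqlserver_table(sql_context, **kwargs):
    """Load a table as a DataFrame via sqlContext.load(source="jdbc", ...)."""
    return sql_context.load(source="jdbc", **sqlserver_jdbc_options(**kwargs))
```

With a real `SQLContext`, a call such as `load_sqlserver_table(sqlContext, host="dbhost", database="activity", table="dbo.events", partition_column="id", lower_bound=1, upper_bound=1000000, num_partitions=16)` would split the read into 16 range queries on `id`, which probably matters at the row counts discussed here. The bounds only decide how the ranges are cut; they do not filter rows.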

Use case for data in SQL Server

2015-02-24 Thread Suhel M
Hey, I am trying to work out the best way to leverage Spark for crunching data that is sitting in SQL Server databases. The ideal scenario is being able to work efficiently with big data (10 billion+ rows of activity data). We need to shape this data for machine learning problems and want