joining streams from multiple kafka clusters

2018-07-17 Thread sathich
Hi, my question is about the ability to integrate Spark Streaming with multiple clusters. Is it a supported use case? An example is two topics owned by different groups, each with its own Kafka infra. Can I have two DataFrames as a result of spark.readStream listening to different kafk
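
A minimal sketch of the setup the question describes, not a confirmed answer from the thread: each Kafka source takes its own `kafka.bootstrap.servers` option, so two streaming DataFrames can point at different clusters. The broker addresses (`cluster-a:9092`, `cluster-b:9092`) and topic names (`topic-a`, `topic-b`) are hypothetical, and the `spark-sql-kafka-0-10` package is assumed to be on the classpath.

```scala
import org.apache.spark.sql.SparkSession

object TwoKafkaClusters {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("two-kafka-clusters").getOrCreate()

    // Stream from the first cluster (hypothetical broker address and topic).
    val dfA = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "cluster-a:9092")
      .option("subscribe", "topic-a")
      .load()

    // Stream from the second cluster, configured independently of the first.
    val dfB = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "cluster-b:9092")
      .option("subscribe", "topic-b")
      .load()

    // Both sources expose the same Kafka schema, so they can be combined.
    val cols = Seq("CAST(key AS STRING) AS key", "CAST(value AS STRING) AS value", "topic")
    val combined = dfA.selectExpr(cols: _*).union(dfB.selectExpr(cols: _*))

    // Print the combined stream to the console for inspection.
    val query = combined.writeStream
      .format("console")
      .outputMode("append")
      .start()

    query.awaitTermination()
  }
}
```

The sketch uses a union to keep the example small; a stream-stream join on the two DataFrames would additionally need join keys and, for outer joins, watermarks.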

Re: Dataset - withColumn and withColumnRenamed that accept Column type

2018-07-17 Thread sathich
This may work: val df_post = listCustomCols.foldLeft(df_pre) { (tempDF, listValue) => tempDF.withColumn(listValue.name, new Column(listValue.name.toString + funcUDF(listValue.name))) } and outsource the renaming to a UDF, or you can rename the c
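
A minimal sketch of the `foldLeft` pattern mentioned above, applied to renaming with `withColumnRenamed`. The sample data and the `renames` mapping are hypothetical and are not taken from the thread; a local SparkSession is assumed.

```scala
import org.apache.spark.sql.SparkSession

object RenameWithFoldLeft {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder
      .master("local[*]")
      .appName("rename-with-foldleft")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical input DataFrame.
    val dfPre = Seq((1, "a"), (2, "b")).toDF("id", "label")

    // Hypothetical old-name -> new-name pairs.
    val renames = Seq("id" -> "user_id", "label" -> "user_label")

    // Thread the DataFrame through the list, renaming one column per step.
    val dfPost = renames.foldLeft(dfPre) { case (tempDF, (oldName, newName)) =>
      tempDF.withColumnRenamed(oldName, newName)
    }

    dfPost.printSchema()
    spark.stop()
  }
}
```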

Re: Custom Data Source for getting data from Rest based services

2017-11-22 Thread sathich
Hi Sourav, This is quite a useful addition to the Spark family; this is a use case that comes up more often than it is talked about: * getting 3rd-party mapping data (geo coordinates), * accessing database data through REST, * downloading data from a bulk data API service. It will be really useful to be