You do NOT need dataframes, I mean.....

On Sat, Feb 17, 2018 at 3:58 PM, ayan guha <guha.a...@gmail.com> wrote:
> Hi
>
> A couple of suggestions:
>
> 1. Do not use Dataset, use Dataframe in this scenario. There is no benefit
> of Dataset features here. Using Dataframe, you can write an arbitrary UDF
> which can do what you want to do.
> 2. In fact you do need dataframes here. You would be better off with RDD
> here. Just create an RDD of symbols and use map to do the processing.
>
> On Sat, Feb 17, 2018 at 12:40 PM, Irving Duran <irving.du...@gmail.com>
> wrote:
>
>> Do you only want to use Scala? Because otherwise, I think with pyspark
>> and pandas read_table you should be able to accomplish what you want.
>>
>> Thank you,
>>
>> Irving Duran
>>
>> On 02/16/2018 06:10 PM, Lian Jiang wrote:
>>
>> Hi,
>>
>> I have a use case:
>>
>> I want to download S&P 500 stock data from the Yahoo API in parallel
>> using Spark. I have got all stock symbols as a Dataset. Then I used the
>> code below to call the Yahoo API for each symbol:
>>
>> case class Symbol(symbol: String, sector: String)
>>
>> case class Tick(symbol: String, sector: String, open: Double, close: Double)
>>
>> // symbolDs is Dataset[Symbol]; pullSymbolFromYahoo returns Dataset[Tick]
>>
>> symbolDs.map { k =>
>>   pullSymbolFromYahoo(k.symbol, k.sector)
>> }
>>
>> This statement does not compile:
>>
>> Unable to find encoder for type stored in a Dataset. Primitive types
>> (Int, String, etc) and Product types (case classes) are supported by
>> importing spark.implicits._ Support for serializing other types will be
>> added in future releases.
>>
>> My questions are:
>>
>> 1. As you can see, this scenario is not traditional dataset handling such
>> as count or a SQL query. Instead, it is more like a UDF which applies an
>> arbitrary operation to each record. Is Spark good at handling such a
>> scenario?
>>
>> 2. Regarding the compilation error, is there any fix? I did not find a
>> satisfactory solution online.
>>
>> Thanks for the help!
>
> --
> Best Regards,
> Ayan Guha

--
Best Regards,
Ayan Guha
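
For reference, the encoder error in the original snippet comes from returning a Dataset[Tick] from inside map: Spark has no encoder for a nested Dataset, and a Dataset cannot be built on the executors anyway. A minimal sketch of one way around it, assuming a hypothetical pullTicksFromYahoo that returns plain Tick values (the actual Yahoo call is elided):

import org.apache.spark.sql.{Dataset, SparkSession}

case class Symbol(symbol: String, sector: String)
case class Tick(symbol: String, sector: String, open: Double, close: Double)

object YahooPull {
  // Hypothetical fetch: returns plain Tick values, not a Dataset, so the
  // case-class encoder brought in by spark.implicits._ applies.
  def pullTicksFromYahoo(symbol: String, sector: String): Seq[Tick] = {
    // ... call the Yahoo API here and parse the response ...
    Seq.empty[Tick]
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("yahoo-pull")
      .master("local[*]") // for local testing; drop when submitting to a cluster
      .getOrCreate()
    import spark.implicits._ // encoders for the case classes

    val symbolDs: Dataset[Symbol] = Seq(Symbol("AAPL", "Tech")).toDS()

    // flatMap over plain case classes; each Symbol yields zero or more Ticks
    val tickDs: Dataset[Tick] = symbolDs.flatMap { s =>
      pullTicksFromYahoo(s.symbol, s.sector)
    }

    tickDs.show()
    spark.stop()
  }
}

With spark.implicits._ in scope, the case-class encoders cover Symbol and Tick, and flatMap flattens the per-symbol sequences into a single Dataset[Tick].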
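
And a sketch of the RDD route Ayan describes (create an RDD of symbols and do the processing in a map/flatMap), again with a placeholder fetch function; the partition count is an assumption and controls how many API calls run concurrently:

import org.apache.spark.sql.SparkSession

// Same case classes as in the original post
case class Symbol(symbol: String, sector: String)
case class Tick(symbol: String, sector: String, open: Double, close: Double)

object YahooPullRdd {
  // Hypothetical fetch function, one HTTP call per symbol (body elided)
  def pullTicksFromYahoo(symbol: String, sector: String): Seq[Tick] =
    Seq.empty[Tick]

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("yahoo-pull-rdd")
      .master("local[*]") // for local testing only
      .getOrCreate()
    val sc = spark.sparkContext

    val symbols = Seq(Symbol("AAPL", "Tech"), Symbol("XOM", "Energy"))

    // Spread the symbols over 8 partitions; each partition fetches its
    // share of symbols independently, so the API calls run in parallel.
    val tickRdd = sc.parallelize(symbols, numSlices = 8)
      .flatMap(s => pullTicksFromYahoo(s.symbol, s.sector))

    import spark.implicits._
    val tickDf = tickRdd.toDF() // back to a DataFrame if needed downstream
    tickDf.show()

    spark.stop()
  }
}

Converting back with toDF() at the end gives a DataFrame for any downstream SQL-style work.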