If the number of user-item pairs to predict isn't too large, say millions,
you could transform the target DataFrame and save the result to a Hive
table, then build a cache based on that table for online services.
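A minimal sketch of that first suggestion, assuming an ALS recommendation
model (the thread doesn't name the model type) and illustrative table and
path names:

```scala
import org.apache.spark.ml.recommendation.ALSModel
import org.apache.spark.sql.{SaveMode, SparkSession}

val spark = SparkSession.builder()
  .appName("batch-scoring")
  .enableHiveSupport() // needed to write into Hive tables
  .getOrCreate()

// Assumption: a previously trained ALS model was saved at this path.
val model = ALSModel.load("/models/als")

// Assumption: a Hive table holds the user-item pairs to predict, with
// columns matching the model's userCol/itemCol.
val pairs = spark.table("target_user_item_pairs")

// transform() adds a "prediction" column; nothing runs until the write.
val predictions = model.transform(pairs)

// Persist the scored pairs; the online cache can be built from this table.
predictions.write
  .mode(SaveMode.Overwrite)
  .saveAsTable("user_item_predictions")
```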
If that's not the case (such as billions of user-item pairs to predict), you
have to start a
To my understanding, all transformations are thread-safe because a DataFrame
is just a description of the computation and is immutable, so the case
above is fine. Just be careful with the actions.
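A small sketch of that point: several threads can each build their own
transformations on the same immutable DataFrame; the concurrent collect-style
actions simply run as separate Spark jobs. The data and thresholds are made up
for illustration:

```scala
import org.apache.spark.sql.SparkSession
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

val spark = SparkSession.builder().appName("concurrent-jobs").getOrCreate()
import spark.implicits._

val base = Seq((1, "a"), (2, "b"), (3, "c")).toDF("id", "value")

// Each future defines its own lineage from the shared DataFrame; filter()
// is a transformation, count() is the action that triggers a job.
val jobs = (1 to 3).map { n =>
  Future { base.filter($"id" >= n).count() }
}

val counts = Await.result(Future.sequence(jobs), 5.minutes)
counts.foreach(println)
```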
On Sun, Feb 12, 2017 at 4:06 PM, Mendelson, Assaf wrote:
> Hi,
>
> I was wondering if dataframe
Why not sync the binlog of MySQL (hopefully the data is immutable and the
table is append-only), send the log through Kafka, and then consume it with
Spark Streaming?
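A hedged sketch of the Kafka-consuming end of that pipeline, using the
spark-streaming-kafka-0-10 DStream API. The broker address, topic name, and
group id are illustrative assumptions; getting the binlog into Kafka in the
first place would be handled by an external CDC tool:

```scala
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent

val conf = new SparkConf().setAppName("binlog-consumer")
val ssc = new StreamingContext(conf, Seconds(10))

val kafkaParams = Map[String, Object](
  "bootstrap.servers"  -> "localhost:9092",
  "key.deserializer"   -> classOf[StringDeserializer],
  "value.deserializer" -> classOf[StringDeserializer],
  "group.id"           -> "binlog-consumers",
  "auto.offset.reset"  -> "earliest"
)

// Each record's value is one binlog event published by the CDC pipeline.
val stream = KafkaUtils.createDirectStream[String, String](
  ssc,
  PreferConsistent,
  Subscribe[String, String](Seq("mysql-binlog"), kafkaParams)
)

stream.foreachRDD { rdd =>
  rdd.map(_.value()).foreach(println) // replace with real processing
}

ssc.start()
ssc.awaitTermination()
```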
On Fri, Dec 30, 2016 at 9:01 AM, Michael Armbrust wrote:
> We don't support this yet, but I've opened this JIRA as it sounds
> generally