Re: how to find the nearest holiday

2017-04-25 Thread Wen Pei Yu
TypeError: unorderable types: str() >= datetime.date() You need to convert the string to a date type before comparing. Yu Wenpei. ----- Original message ----- From: Zeming Yu To: user Cc: Subject: how to find the nearest holiday Date: Tue, Apr 25, 2017 3:39 PM I have a column of dates (date type), just tryin
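A minimal sketch of the fix suggested in this reply, in plain Python rather than Spark. It assumes the holidays arrive as ISO-formatted strings ("YYYY-MM-DD"); that format, the helper names `to_date` and `nearest_holiday`, and the sample holiday list are all illustrative assumptions, not taken from the original thread.

```python
from datetime import date, datetime


def to_date(value):
    """Return a datetime.date; strings are parsed as 'YYYY-MM-DD' (assumed format)."""
    # datetime is a subclass of date, so it has to be checked first.
    if isinstance(value, datetime):
        return value.date()
    if isinstance(value, date):
        return value
    return datetime.strptime(value, "%Y-%m-%d").date()


def nearest_holiday(day, holidays):
    """Return the holiday closest to `day`, before or after it.

    On a tie, the holiday that appears first in `holidays` wins.
    """
    day = to_date(day)
    parsed = [to_date(h) for h in holidays]  # convert strings before comparing
    return min(parsed, key=lambda h: abs((h - day).days))


# Illustrative values only; a real list would come from the data set.
holidays = ["2017-01-01", "2017-04-14", "2017-12-25"]
print(nearest_holiday(date(2017, 4, 25), holidays))
```

Converting both sides to `datetime.date` up front means every later comparison or subtraction works on one type, which avoids the `str() >= datetime.date()` error quoted above.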

Re: Aggregated column name

2017-03-23 Thread Wen Pei Yu
will give you explicit control over the name of the resulting column. In your example, that would look something like: df.groupby(col("...")).agg(count("number").alias("ColumnNameCount")) Hope that helps! Kevin On Thu, Mar 23, 2017 at 2:41 AM, Wen Pei Yu wrote: Hi Al

Aggregated column name

2017-03-23 Thread Wen Pei Yu
Hi All, I found that some Spark versions (Spark 1.4) return upper-case aggregated column names, and some return lower-case ones. For example, df.groupby(col("...")).agg(count("number")) may return COUNT(number) in Spark 1.4 and count(number) in Spark 1.6. Does anyone know if there is a configuration parameter for thi

Re: Apply ML to grouped dataframe

2016-08-23 Thread Wen Pei Yu
3| [2.0,16.0]| |12462589343|3| [1.0,1.0]| +---+-++ From: ayan guha To: Wen Pei Yu/China/IBM@IBMCN Cc: user , Nirmal Fernando Date: 08/23/2016 05:13 PM Subject: Re: Apply ML to grouped dataframe I would suggest you construct a toy probl

Re: Apply ML to grouped dataframe

2016-08-22 Thread Wen Pei Yu
Hi Nirmal, Filter works fine if I want to handle one of the grouped dataframes. But I have multiple grouped dataframes, and I would like to apply an ML algorithm to all of them in one job rather than in for loops. Wenpei. From: Nirmal Fernando To: Wen Pei Yu/China/IBM@IBMCN Cc: User Date: 08/23/2016

Re: Apply ML to grouped dataframe

2016-08-22 Thread Wen Pei Yu
: Nirmal Fernando To: Wen Pei Yu/China/IBM@IBMCN Cc: User Date: 08/23/2016 01:14 PM Subject: Re: Apply ML to grouped dataframe Hi Wen, AFAIK Spark MLlib implements its machine learning algorithms on top of the Spark dataframe API. What did you mean by a grouped dataframe? On T

Re: Apply ML to grouped dataframe

2016-08-22 Thread Wen Pei Yu
Hi Nirmal, I didn't get your point. Can you tell me more about how to apply MLlib to a grouped dataframe? Regards. Wenpei. From: Nirmal Fernando To: Wen Pei Yu/China/IBM@IBMCN Cc: User Date: 08/23/2016 10:26 AM Subject: Re: Apply ML to grouped dataframe You can use

Apply ML to grouped dataframe

2016-08-22 Thread Wen Pei Yu
Hi, We have a dataframe and want to group it and apply an ML algorithm or a statistic (say, a t-test) to each group. Is there an efficient way to do this? Currently we switch to PySpark, use groupByKey, and apply a NumPy function to each group's array. But this isn't an efficient way, right? Regards. Wenpei
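To make the question concrete, here is a small plain-Python sketch of the "group, then apply a statistic to each group" pattern described above. It is not Spark code and says nothing about efficiency on a cluster; it only shows the shape of the per-group function. The one-sample t statistic against a hypothesized mean `mu`, and the names `group_values`, `one_sample_t`, and `t_per_group`, are assumptions chosen for illustration.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev


def group_values(rows):
    """Collect (key, value) pairs into a dict of key -> list of values."""
    groups = defaultdict(list)
    for key, value in rows:
        groups[key].append(value)
    return groups


def one_sample_t(values, mu=0.0):
    """One-sample t statistic: (sample mean - mu) / (sample stdev / sqrt(n)).

    Needs at least two values; a group with zero variance raises
    ZeroDivisionError.
    """
    n = len(values)
    return (mean(values) - mu) / (stdev(values) / sqrt(n))


def t_per_group(rows, mu=0.0):
    """Apply one_sample_t to every group that has at least two values."""
    return {
        key: one_sample_t(values, mu)
        for key, values in group_values(rows).items()
        if len(values) > 1
    }


# Illustrative data only: group "b" has a single value and is skipped.
rows = [("a", 1.0), ("a", 2.0), ("a", 3.0), ("b", 5.0)]
result = t_per_group(rows)
print(result)
```

In the setup described in the message, `one_sample_t` plays the role of the NumPy function applied to each group's values after grouping by key.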

Re: LogisticsRegression in ML pipeline help page

2016-01-06 Thread Wen Pei Yu
You can find documentation for older releases at http://spark.apache.org/documentation.html The linear methods docs for 1.5.2 are here: http://spark.apache.org/docs/1.5.2/mllib-linear-methods.html#logistic-regression http://spark.apache.org/docs/1.5.2/ml-linear-methods.html Regards. Yu Wenpei. From: Arunkumar Pillai To: