Till 1.3, you have to prepare the DF appropriately
def setupCondition(t):
if t[1] > 100:
v = 1
else:
v = 0
return Row(col1=t[0],col2=t[1],col3=t[2],col4=v)
d1=[[1001,100,50],[1001,200,100],[1002,100,99]]
d1RDD = sc.parallelize(d1).map(setupCondition)
d1DF = s
Yes. I think that your sql solution will work but I was looking for a
solution with DataFrame API. I had thought to use UDF such as:
val myFunc = udf {(input: Int) => {if (input > 100) 1 else 0}}
Although I'd like to know if it's possible to do it directly in the
aggregation inserting a lambda fu
For this, I can give you a SQL solution:
joined data.registerTempTable('j')
Res=ssc.sql('select col1,col2, count(1) counter, min(col3) minimum,
sum(case when endrscp>100 then 1 else 0 end test from j'
Let me know if this works.
On 26 May 2015 23:47, "Masf" wrote:
> Hi
> I don't know how it wor
Hi
I don't know how it works. For example:
val result = joinedData.groupBy("col1","col2").agg(
count(lit(1)).as("counter"),
min(col3).as("minimum"),
sum("case when endrscp> 100 then 1 else 0 end").as("test")
)
How can I do it?
Thanks
Regards.
Miguel.
On Tue, May 26, 2015 at 12:35 AM,
Case when col2>100 then 1 else col2 end
On 26 May 2015 00:25, "Masf" wrote:
> Hi.
>
> In a dataframe, How can I execution a conditional sentence in a
> aggregation. For example, Can I translate this SQL statement to DataFrame?:
>
> SELECT name, SUM(IF table.col2 > 100 THEN 1 ELSE table.col1)
> FR
Hi.
In a dataframe, How can I execution a conditional sentence in a
aggregation. For example, Can I translate this SQL statement to DataFrame?:
SELECT name, SUM(IF table.col2 > 100 THEN 1 ELSE table.col1)
FROM table
GROUP BY name
Thanks
--
Regards.
Miguel