Please refer to the link below; drop() provides options for dropping rows that
contain null values in some or all of their columns. Hope it also helps.

https://spark.apache.org/docs/1.5.2/api/scala/index.html#org.apache.spark.sql.DataFrameNaFunctions
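
For example, something like this (a rough sketch, assuming a DataFrame df with flat
columns named dpid and dpid_sha1; note that drop() targets null values, not empty strings):

// Keep only rows where the dpid column is non-null (column names are assumptions)
val non_null_dpid = df.na.drop(Seq("dpid"))

// Drop a row only when *all* of the listed columns are null
val any_id_present = df.na.drop("all", Seq("dpid", "dpid_sha1"))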



Thanks & Regards,
Gokula Krishnan* (Gokul)*

On Wed, Dec 9, 2015 at 11:12 AM, Gokula Krishnan D <email2...@gmail.com>
wrote:

> Ok, then you can change it slightly, like this:
>
> [image: Inline image 1]
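>
> (The inline image above is not preserved, so here is a rough sketch of the likely change:
> keep only rows where the column is non-null and non-empty. The req_logs DataFrame and the
> req_info.dpid column are taken from the original question; everything else is an assumption.)
>
> // SQL-string form: avoids the filter(Boolean) overload error seen earlier in the thread
> val req_logs_with_dpid =
>   req_logs.filter("req_info.dpid is not null and req_info.dpid != ''")
>
> // Equivalent Column-based form; in Spark 1.5, !== returns a Column rather than a Boolean
> val req_logs_with_dpid2 =
>   req_logs.filter(req_logs("req_info.dpid").isNotNull && (req_logs("req_info.dpid") !== ""))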
>
> Thanks & Regards,
> Gokula Krishnan* (Gokul)*
>
>
> On Wed, Dec 9, 2015 at 11:09 AM, Prashant Bhardwaj <
> prashant2006s...@gmail.com> wrote:
>
>> I have to do the opposite of what you're doing. I have to filter non-empty
>> records.
>>
>> On Wed, Dec 9, 2015 at 9:33 PM, Gokula Krishnan D <email2...@gmail.com>
>> wrote:
>>
>>> Hello Prashant -
>>>
>>> Can you please try something like this:
>>>
>>> For instance, the input file name is "student_detail.txt" and it contains:
>>>
>>> ID,Name,Sex,Age
>>> ===============
>>> 101,Alfred,Male,30
>>> 102,Benjamin,Male,31
>>> 103,Charlie,Female,30
>>> 104,Julie,Female,30
>>> 105,Maven,Male,30
>>> 106,Dexter,Male,30
>>> 107,Lundy,Male,32
>>> 108,Rita,Female,30
>>> 109,Aster,Female,30
>>> 110,Harrison,Male,15
>>> 111,Rita,,30
>>> 112,Aster,,30
>>> 113,Harrison,,15
>>> 114,Rita,Male,20
>>> 115,Aster,,30
>>> 116,Harrison,,20
>>>
>>> [image: Inline image 2]
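>>>
>>> (A rough reconstruction of the code shown in the image above, since only its output
>>> survives; the file path, variable names, and header handling are assumptions.)
>>>
>>> val studentRDD = sc.textFile("student_detail.txt")
>>>   .filter(line => !line.startsWith("ID") && !line.startsWith("="))  // skip header lines
>>>   .map(_.split(",", -1))                                            // -1 keeps empty fields
>>>
>>> val noSex = studentRDD.filter(fields => fields(2).isEmpty)
>>> println("Total No.of Records without SEX " + noSex.count())
>>> noSex.collect().foreach(f => println(f.mkString("[", ",", "]")))
>>>
>>> val age15 = studentRDD.filter(fields => fields(3).toInt <= 15)
>>> println("Total No.of Records with AGE <=15 " + age15.count())
>>> age15.collect().foreach(f => println(f.mkString("[", ",", "]")))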
>>>
>>> *Output:*
>>>
>>> Total No.of Records without SEX 5
>>> [111,Rita,,30]
>>> [112,Aster,,30]
>>> [113,Harrison,,15]
>>> [115,Aster,,30]
>>> [116,Harrison,,20]
>>>
>>> Total No.of Records with AGE <=15 2
>>> [110,Harrison,Male,15]
>>> [113,Harrison,,15]
>>>
>>> Thanks & Regards,
>>> Gokula Krishnan* (Gokul)*
>>> Contact :+1 980-298-1740
>>>
>>> On Wed, Dec 9, 2015 at 8:24 AM, Prashant Bhardwaj <
>>> prashant2006s...@gmail.com> wrote:
>>>
>>>> Already tried it, but I am getting the following error.
>>>>
>>>> overloaded method value filter with alternatives:
>>>>   (conditionExpr: String)org.apache.spark.sql.DataFrame <and>
>>>>   (condition: org.apache.spark.sql.Column)org.apache.spark.sql.DataFrame
>>>> cannot be applied to (Boolean)
>>>>
>>>> Also tried:
>>>>
>>>> val req_logs_with_dpid =
>>>>   req_logs.filter(req_logs("req_info.dpid").toString.length != 0)
>>>>
>>>> But I am getting the same error.
>>>>
>>>>
>>>> On Wed, Dec 9, 2015 at 6:45 PM, Fengdong Yu <fengdo...@everstring.com>
>>>> wrote:
>>>>
>>>>> val req_logs_with_dpid = req_logs.filter(req_logs("req_info.pid") != "")
>>>>>
>>>>> Azuryy Yu
>>>>> Sr. Infrastructure Engineer
>>>>>
>>>>> cell: 158-0164-9103
>>>>> wechat: azuryy
>>>>>
>>>>>
>>>>> On Wed, Dec 9, 2015 at 7:43 PM, Prashant Bhardwaj <
>>>>> prashant2006s...@gmail.com> wrote:
>>>>>
>>>>>> Hi
>>>>>>
>>>>>> I have two columns in my JSON which can have null, empty, or
>>>>>> non-empty strings as values.
>>>>>> I know how to filter records which have a non-null value using the
>>>>>> following:
>>>>>>
>>>>>> val req_logs = sqlContext.read.json(filePath)
>>>>>>
>>>>>> val req_logs_with_dpid =
>>>>>>   req_logs.filter("req_info.dpid is not null or req_info.dpid_sha1 is not null")
>>>>>>
>>>>>> But how do I filter if the value of a column is an empty string?
>>>>>> --
>>>>>> Regards
>>>>>> Prashant
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Regards
>>>> Prashant
>>>>
>>>
>>>
>>
>>
>> --
>> Regards
>> Prashant
>>
>
>
