Please refer to the link below; drop() provides features to drop rows with null / non-null columns. Hope it also helps.
https://spark.apache.org/docs/1.5.2/api/scala/index.html#org.apache.spark.sql.DataFrameNaFunctions

Thanks & Regards,
Gokula Krishnan (Gokul)

On Wed, Dec 9, 2015 at 11:12 AM, Gokula Krishnan D <email2...@gmail.com> wrote:

> Ok, then you can change it slightly, like this:
>
> [image: Inline image 1]
>
> Thanks & Regards,
> Gokula Krishnan (Gokul)
>
> On Wed, Dec 9, 2015 at 11:09 AM, Prashant Bhardwaj <prashant2006s...@gmail.com> wrote:
>
>> I have to do the opposite of what you're doing: I have to filter the
>> non-empty records.
>>
>> On Wed, Dec 9, 2015 at 9:33 PM, Gokula Krishnan D <email2...@gmail.com> wrote:
>>
>>> Hello Prashant -
>>>
>>> Can you please try it like this?
>>>
>>> For instance, the input file name is "student_detail.txt" and it contains:
>>>
>>> ID,Name,Sex,Age
>>> ===============
>>> 101,Alfred,Male,30
>>> 102,Benjamin,Male,31
>>> 103,Charlie,Female,30
>>> 104,Julie,Female,30
>>> 105,Maven,Male,30
>>> 106,Dexter,Male,30
>>> 107,Lundy,Male,32
>>> 108,Rita,Female,30
>>> 109,Aster,Female,30
>>> 110,Harrison,Male,15
>>> 111,Rita,,30
>>> 112,Aster,,30
>>> 113,Harrison,,15
>>> 114,Rita,Male,20
>>> 115,Aster,,30
>>> 116,Harrison,,20
>>>
>>> [image: Inline image 2]
>>>
>>> *Output:*
>>>
>>> Total no. of records without SEX: 5
>>> [111,Rita,,30]
>>> [112,Aster,,30]
>>> [113,Harrison,,15]
>>> [115,Aster,,30]
>>> [116,Harrison,,20]
>>>
>>> Total no. of records with AGE <= 15: 2
>>> [110,Harrison,Male,15]
>>> [113,Harrison,,15]
>>>
>>> Thanks & Regards,
>>> Gokula Krishnan (Gokul)
>>> Contact: +1 980-298-1740
>>>
>>> On Wed, Dec 9, 2015 at 8:24 AM, Prashant Bhardwaj <prashant2006s...@gmail.com> wrote:
>>>
>>>> Already tried it, but I'm getting the following error:
>>>>
>>>> overloaded method value filter with alternatives:
>>>>   (conditionExpr: String)org.apache.spark.sql.DataFrame <and>
>>>>   (condition: org.apache.spark.sql.Column)org.apache.spark.sql.DataFrame
>>>> cannot be applied to (Boolean)
>>>>
>>>> Also tried:
>>>>
>>>> val req_logs_with_dpid =
>>>>   req_logs.filter(req_logs("req_info.dpid").toString.length != 0)
>>>>
>>>> but I'm getting the same error.
>>>>
>>>> On Wed, Dec 9, 2015 at 6:45 PM, Fengdong Yu <fengdo...@everstring.com> wrote:
>>>>
>>>>> val req_logs_with_dpid = req_logs.filter(req_logs("req_info.pid") != "")
>>>>>
>>>>> Azuryy Yu
>>>>> Sr. Infrastructure Engineer
>>>>>
>>>>> cel: 158-0164-9103
>>>>> wechat: azuryy
>>>>>
>>>>> On Wed, Dec 9, 2015 at 7:43 PM, Prashant Bhardwaj <prashant2006s...@gmail.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I have two columns in my JSON which can have null, empty, and
>>>>>> non-empty strings as values. I know how to filter records which have a
>>>>>> non-null value using the following:
>>>>>>
>>>>>> val req_logs = sqlContext.read.json(filePath)
>>>>>>
>>>>>> val req_logs_with_dpid = req_logs.filter("req_info.dpid is not null
>>>>>> or req_info.dpid_sha1 is not null")
>>>>>>
>>>>>> But how do I filter records where the value of a column is an empty string?
>>>>>> --
>>>>>> Regards,
>>>>>> Prashant
>>>>>
>>>>
>>>> --
>>>> Regards,
>>>> Prashant
>>>
>>
>> --
>> Regards,
>> Prashant
>
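For the archives, a note on why the suggested filter fails to compile: `req_logs("req_info.dpid") != ""` uses Scala's built-in `!=` (defined on `Any`), which returns a plain `Boolean`, but `DataFrame.filter` only accepts a `Column` or a SQL `String` — hence "cannot be applied to (Boolean)". Spark 1.x's `Column` defines its own inequality operator, `!==` (renamed `=!=` in Spark 2.x), which builds a `Column` expression instead. Here is a minimal, Spark-free sketch of that distinction; `Expr` below is a hypothetical stand-in for `org.apache.spark.sql.Column`, not Spark's real internals:

```scala
// A hypothetical stand-in for org.apache.spark.sql.Column, just to show
// why `col != ""` yields a Boolean while `col !== ""` builds an
// expression that a filter() method could accept. Not Spark's real code.
case class Expr(sql: String) {
  // Spark 1.x's Column similarly defines !== to build a new expression.
  def !==(other: Any): Expr  = Expr(s"($sql != '$other')")
  def isNotNull: Expr        = Expr(s"($sql IS NOT NULL)")
  def and(other: Expr): Expr = Expr(s"($sql AND ${other.sql})")
}

object Demo extends App {
  val dpid = Expr("req_info.dpid")

  // Scala's Any.!= returns Boolean (plain value inequality), so a
  // filter(...) call would see a Boolean, not an expression.
  val wrong: Boolean = dpid != ""

  // The custom operator builds an expression to be evaluated per row:
  val right: Expr = dpid.isNotNull.and(dpid !== "")
  println(right.sql) // ((req_info.dpid IS NOT NULL) AND (req_info.dpid != ''))
}
```

With the real API, the filter would read along the lines of `req_logs.filter(req_logs("req_info.dpid").isNotNull && (req_logs("req_info.dpid") !== ""))`, or, using the SQL-string overload that already worked for the null check, `req_logs.filter("req_info.dpid is not null and req_info.dpid != ''")`. For dropping rows that are merely null, the `na.drop` methods in the DataFrameNaFunctions link above are another option.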