Hi
How can I rename nested columns in a dataframe through the Scala API? For example, given the
following schema:
> |-- site: struct (nullable = false)
> |    |-- site_id: string (nullable = true)
> |    |-- site_name: string (nullable = true)
> |    |-- site_domain: string (nullable = true)
> |    |-- site_cat:
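One way to do this (a minimal sketch, assuming the dataframe is called df and the new field names used here are only placeholders) is to rebuild the struct with aliased fields:

import org.apache.spark.sql.functions.{col, struct}

// Replace the site struct with one whose fields carry the new names.
// Only the fields visible in the schema above are included.
val renamed = df.withColumn("site", struct(
  col("site.site_id").as("id"),
  col("site.site_name").as("name"),
  col("site.site_domain").as("domain")))

Casting the site column to a new StructType with the desired field names is another option if the types stay the same.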
ct($"a", $"b", $"c")).show()
>
> +---+---+---+-------+
> |  A|  B|  C|      D|
> +---+---+---+-------+
> |  a|  b|  c|[a,b,c]|
> +---+---+---+-------+
>
> You can repeat to get the inner nesting.
>
> Xinh
>
> On Fri, May 13, 2016 at 4:51 AM, Prashant Bhardwaj <
> prasha
Hi
Let's say I have a flat dataframe with 6 columns, like:
{
"a": "somevalue",
"b": "somevalue",
"c": "somevalue",
"d": "somevalue",
"e": "somevalue",
"f": "somevalue"
}
Now I want to convert this dataframe to contain nested columns, like:
{
"nested_obj1": {
"a": "somevalue",
"b": "somevalue"
},
"ne
Anyway, I got it. I have to use !== instead of ===. Thanks BTW.
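For anyone searching later, the difference in a minimal sketch (Spark 1.x, where !== is the Column inequality operator; df and the column name are placeholders):

import org.apache.spark.sql.functions.col

val empty    = df.filter(col("dpid") === "")  // keeps rows whose dpid IS the empty string
val nonEmpty = df.filter(col("dpid") !== "")  // keeps rows whose dpid is NOT the empty string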
On Wed, Dec 9, 2015 at 9:39 PM, Prashant Bhardwaj <
prashant2006s...@gmail.com> wrote:
> I have to do the opposite of what you're doing. I have to filter non-empty
> records.
>
> On Wed, Dec 9, 2015 at 9:33 PM, Gok
> [116,Harrison,,20]
>
> Total No.of Records with AGE <=15 2
> [110,Harrison,Male,15]
> [113,Harrison,,15]
>
> Thanks & Regards,
> Gokula Krishnan* (Gokul)*
> Contact :+1 980-298-1740
>
> On Wed, Dec 9, 2015 at 8:24 AM, Prashant Bhardwaj <
> prash
Engineer
>
> cell: 158-0164-9103
> wechat: azuryy
>
>
> On Wed, Dec 9, 2015 at 7:43 PM, Prashant Bhardwaj <
> prashant2006s...@gmail.com> wrote:
>
>> Hi
>>
>> I have two columns in my json which can have null, empty and non-empty
>> string as valu
Hi
I have two columns in my JSON which can have null, empty, or non-empty
strings as values.
I know how to filter records which have a non-null value using the following:
val req_logs = sqlContext.read.json(filePath)
val req_logs_with_dpid = req_logs.filter("req_info.dpid is not null or
req_info.dpid_sha
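In case the full expression helps, a sketch that drops both nulls and empty strings on dpid (only dpid is shown here, since the second column name is cut off above; this assumes Spark 1.x, where !== is the Column inequality operator):

import org.apache.spark.sql.functions.col

val req_logs = sqlContext.read.json(filePath)
// Keep rows where dpid exists and is not an empty string.
val req_logs_with_dpid = req_logs.filter(
  col("req_info.dpid").isNotNull && (col("req_info.dpid") !== ""))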
Hi
Some Background:
We have a Kafka cluster with ~45 topics. Some of the topics contain logs in
JSON format and some in PSV (pipe-separated value) format. Now I want to
consume these logs using Spark Streaming and store them in Parquet format
in HDFS.
Now my question is:
1. Can we create a InputDStre
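The question is cut off above, but for the background described, here is a minimal sketch using the direct Kafka stream from spark-streaming-kafka (the 0.8 receiver-less API); the broker list, topic names, batch interval, and HDFS path are placeholders:

import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.sql.SQLContext
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

val ssc = new StreamingContext(new SparkConf().setAppName("KafkaToParquet"), Seconds(60))
val kafkaParams = Map("metadata.broker.list" -> "broker1:9092,broker2:9092")

// One stream per format; this one covers the JSON topics. Values are the raw messages.
val jsonLines = KafkaUtils
  .createDirectStream[String, String, StringDecoder, StringDecoder](ssc, kafkaParams, Set("json_topic"))
  .map(_._2)

jsonLines.foreachRDD { rdd =>
  if (!rdd.isEmpty()) {
    val sqlContext = SQLContext.getOrCreate(rdd.sparkContext)
    // Infer the schema from the JSON strings and append to Parquet on HDFS.
    sqlContext.read.json(rdd).write.mode("append").parquet("hdfs:///logs/json_topic")
  }
}

ssc.start()
ssc.awaitTermination()

The PSV topics would need a similar stream, but with the values split on "|" and converted to a DataFrame with an explicit schema before writing, since there is no built-in PSV reader.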