 |-- foo: string (nullable = false)
 |-- level_0_struct: struct (nullable = true)
 |    |-- level_1_a: string (nullable = true)
 |    |-- level_1_b: integer (nullable = false)
The same problem applies if I try to rename the fields: they become array
columns.
Is there any way to recursively manipulate repeated columns without
completely breaking their structure into individually repeated fields?
Best
--
Samuel
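One possible direction for the question above, offered as a sketch rather than a confirmed answer: transform the schema itself recursively, instead of touching the nested fields one by one. The example works on the JSON form of a Spark schema (dictionaries with `type`, `fields`, `elementType`, and so on). The function name `rename_fields`, the `items` column, and the upper-casing rule are hypothetical and only there for illustration. Converting the result back with PySpark's `StructType.fromJson` and applying it to a DataFrame is assumed to be possible but is not shown or tested here.

```python
def rename_fields(dtype, rename):
    """Return a copy of a Spark schema (JSON form) with every struct field renamed.

    `dtype` is either a primitive type name such as "string" or a dict
    describing a struct, array, or map. `rename` maps an old field name
    to a new one. The input is not modified.
    """
    if isinstance(dtype, str):
        # Primitive types carry no field names, so there is nothing to rename.
        return dtype
    out = dict(dtype)  # shallow copy; nested parts are rebuilt below
    kind = dtype.get("type")
    if kind == "struct":
        out["fields"] = [
            {**f, "name": rename(f["name"]), "type": rename_fields(f["type"], rename)}
            for f in dtype["fields"]
        ]
    elif kind == "array":
        # Repeated columns: recurse into the element type, keep the array wrapper.
        out["elementType"] = rename_fields(dtype["elementType"], rename)
    elif kind == "map":
        out["keyType"] = rename_fields(dtype["keyType"], rename)
        out["valueType"] = rename_fields(dtype["valueType"], rename)
    return out


# Hypothetical schema modelled on the one in the post, plus one repeated struct.
schema = {
    "type": "struct",
    "fields": [
        {"name": "foo", "type": "string", "nullable": False, "metadata": {}},
        {
            "name": "level_0_struct",
            "type": {
                "type": "struct",
                "fields": [
                    {"name": "level_1_a", "type": "string", "nullable": True, "metadata": {}},
                    {"name": "level_1_b", "type": "integer", "nullable": False, "metadata": {}},
                ],
            },
            "nullable": True,
            "metadata": {},
        },
        {
            "name": "items",
            "type": {
                "type": "array",
                "elementType": {
                    "type": "struct",
                    "fields": [
                        {"name": "item_id", "type": "long", "nullable": True, "metadata": {}},
                    ],
                },
                "containsNull": True,
            },
            "nullable": True,
            "metadata": {},
        },
    ],
}

renamed = rename_fields(schema, str.upper)
```

The array wrapper around `items` is kept and only the field names inside it change, so the repeated struct is not split into separate array columns at the schema level.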
batch
size by a sampling function to save the traffic from Kafka?
Thanks!
Samuel
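For illustration only, here is what per-batch sampling looks like in plain Python, with no Spark or Kafka involved. The name `sample_batches` and the `fraction` and `seed` parameters are hypothetical. Note that sampling of this kind happens after the records have been fetched, so it thins out what is processed downstream but would not by itself reduce the traffic from Kafka. In Spark Streaming the comparable step would presumably be a `transform` on the stream that calls `sample` on each batch RDD, which is an assumption not verified here.

```python
import random


def sample_batches(batches, fraction, seed=None):
    """Lazily yield each batch reduced to roughly `fraction` of its records.

    Each record is kept independently with probability `fraction`
    (Bernoulli sampling), so the size of a sampled batch varies.
    """
    rng = random.Random(seed)
    for batch in batches:
        yield [record for record in batch if rng.random() < fraction]


# Three hypothetical micro-batches of ten records each.
batches = [list(range(i * 10, i * 10 + 10)) for i in range(3)]
sampled = list(sample_batches(batches, fraction=0.3, seed=42))
```

Passing a fixed `seed` makes the sample reproducible, which helps when comparing runs.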
u know if there is a way to do sampling for a stream when creating it?
Thanks,
Samuel
On Mon, May 16, 2016 at 12:54 AM, Mich Talebzadeh wrote:
> Hi Samuel,
>
> How do you create your RDD based on Kafka direct stream?
>
> Do you have your code snippet?
>
> HTH
>
>
>
Hi,
Does anyone know how to get the batch information (like batch time, input
size, processing time, status) from the Streaming UI by using the Scala/Java API?
Because I want to put the information in log files and the streaming jobs
are managed by YARN.
Thanks,
Samuel
Hi,
In Spark's in-memory logic, without cache, elements are accessed in an
iterator-based streaming style [
http://www.slideshare.net/liancheng/dtcc-14-spark-runtime-internals?next_slideshow=1
]
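As a rough analogy for the iterator-based style described above (plain Python generators, not Spark's actual implementation), the sketch below chains two lazy stages and records the order in which work happens. The names `read_lines`, `to_upper`, and `log` are hypothetical.

```python
def read_lines(lines, log):
    """Stand-in for reading records from storage one at a time."""
    for line in lines:
        log.append(("read", line))
        yield line


def to_upper(records, log):
    """Stand-in for a transformation applied to each record."""
    for record in records:
        log.append(("map", record))
        yield record.upper()


log = []
source = ["a", "b"]

# Nothing runs until the chain is consumed; each element is then pulled
# through both stages before the next element is read.
pipeline = to_upper(read_lines(source, log), log)
result = list(pipeline)

# Without caching, a second pass has to rebuild and re-run the whole chain.
second_log = []
second_result = list(to_upper(read_lines(source, second_log), second_log))
```

After the first pass, `log` interleaves reads and maps element by element rather than listing all reads first. No intermediate collection of all read lines is materialized between the two stages.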
I have two questions:
1. if elements are read one line at a time from HDFS (disk) and then
tr
Hi all,
We are having a strange issue for which I can't find any previous reports on
the mailing list or Stack Overflow.
Frequently we are getting "ACTOR SYSTEM CORRUPTED!! A Dispatcher can't have
less than 0 inhabitants!" with a stack trace, from Akka, in the executor
logs, and the executor is marked as