Thanks for the prompt reply.
May I ask why keyBy(f) is not supported on DStreams? Is there any
particular reason, or could it be added in a future release? Writing
"stream.map(record => (keyFunction(record), record))" by hand looks tedious.
I checked the Python source code; keyBy looks like a "shortcut"
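For reference, the keyBy-style workaround mentioned above can be sketched in plain Python, using ordinary lists to stand in for the records of a DStream batch (no Spark required here; key_function is a hypothetical example key extractor, not an API from the thread):

```python
# Plain-Python sketch of the RDD.keyBy pattern: turn each record into a
# (key, record) pair, which is what DStream.join expects as input.
def key_by(records, key_function):
    """Return (key, record) pairs, mirroring RDD.keyBy(f)."""
    return [(key_function(r), r) for r in records]

records = [{"id": 1, "name": "x"}, {"id": 2, "name": "y"}]
pairs = key_by(records, lambda r: r["id"])
# pairs == [(1, {"id": 1, "name": "x"}), (2, {"id": 2, "name": "y"})]
```

On an actual DStream the same pattern is `stream.map(lambda record: (key_function(record), record))`.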
Hi Spark Experts,
I'm trying to use join(otherStream, [numTasks]) on DStreams; it must be
called on two DStreams of (K, V) and (K, W) pairs.
On a plain RDD we can usually use keyBy(f) to build the (K, V) pairs,
but I could not find it on DStream.
My question is:
What is the expected
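To make the join requirement above concrete, here is a plain-Python stand-in for an inner join of two keyed collections, mirroring what join() does on (K, V) and (K, W) pair DStreams per batch (a sketch only, no Spark involved):

```python
from collections import defaultdict

def join_pairs(kv, kw):
    """Inner-join two lists of (key, value) pairs on key,
    returning (key, (v, w)) tuples like DStream.join."""
    by_key = defaultdict(list)
    for k, w in kw:
        by_key[k].append(w)
    return [(k, (v, w)) for k, v in kv for w in by_key[k]]

left = [("a", 1), ("b", 2)]
right = [("a", "x"), ("c", "y")]
# join_pairs(left, right) == [("a", (1, "x"))]
```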
Thanks Davies, it works in 1.2.
--
View this message in context:
http://apache-spark-developers-list.1001551.n3.nabble.com/Help-pyspark-sql-List-flatMap-results-become-tuple-tp9961p9975.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.
---
Named tuples degenerate to plain tuples:
A400.map(lambda i: map(None, i.INTEREST))
===
[(u'x', 1), (u'y', 2)]
[(u'x', 2), (u'y', 3)]
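The "degenerates to a tuple" effect can be reproduced in pure Python: rebuilding a named tuple through a generic constructor (map(None, ...) in Python 2, or tuple(...) generally) drops the field names. Here Row is a local stand-in for the row objects discussed above, not pyspark's class:

```python
from collections import namedtuple

# Stand-in for a SQL row with named fields.
Row = namedtuple("Row", ["INFO", "INTEREST_NO"])
r = Row(INFO=u"x", INTEREST_NO=1)

plain = tuple(r)  # rebuilds as a plain tuple; field names are lost
# r.INFO is accessible, but plain is just (u'x', 1) with no .INFO
```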
Hi pyspark guys,
I have a JSON file, and its structure looks like this:
{"NAME":"George", "AGE":35, "ADD_ID":1212, "POSTAL_AREA":1,
"TIME_ZONE_ID":1, "INTEREST":[{"INTEREST_NO":1, "INFO":"x"},
{"INTEREST_NO":2, "INFO":"y"}]}
{"NAME":"John", "AGE":45, "ADD_ID":1213, "POSTAL_AREA":1, "TIME_ZONE_ID":1,
"IN