eady) to reduce memory usage. Playing
around with different Storage levels (MEMORY_ONLY_SER, for example) might
also help.
Best
Gaurav Jain
Master's Student, D-INFK
ETH Zurich
Email: jaing at student dot ethz dot ch
KTH talks about this:
http://www.diva-portal.org/smash/get/diva2:605106/FULLTEXT01.pdf
Best
-
Gaurav Jain
Master's Student, D-INFK
ETH Zurich
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/rdd-cache-is-not-faster-tp7804p7835.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
Hello Spark Streaming Experts
I have a use-case, where I have a bunch of log-entries coming in, say every
10 seconds (Batch-interval). I create a JavaPairDStream[K,V] from these
log-entries. Now, there are two things I want to do with this
JavaPairDStream:
1. Use key-dependent state (updated by u
Probably not the answer you are looking for, but you can always include the
'key' in each of the 'new values' itself. Something like:
class MyVal<K, V> {
    K key;
    V myData;
}
and in your updateStateByKey function, access the 'key' as val.key (which
would be the same for each of the items in the List of new values).
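A standalone sketch of this wrapper approach (class, field, and method names are illustrative, and the plain static method stands in for the update function you would pass to updateStateByKey; it is not the Spark API itself):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Illustrative wrapper: each new value carries its own key, so the
// update function can read it as val.key even though updateStateByKey
// only hands it the list of values.
class MyVal<K, V> {
    final K key;
    final V myData;
    MyVal(K key, V myData) { this.key = key; this.myData = myData; }
}

public class KeyInValueSketch {
    // Stand-in for the update function: every item in newValues shares
    // the same key, so val.key identifies which state is being updated.
    static Optional<Integer> update(List<MyVal<String, Integer>> newValues,
                                    Optional<Integer> state) {
        int sum = state.orElse(0);
        for (MyVal<String, Integer> val : newValues) {
            sum += val.myData;  // val.key is available here if needed
        }
        return Optional.of(sum);
    }

    public static void main(String[] args) {
        List<MyVal<String, Integer>> batch = Arrays.asList(
            new MyVal<>("flow-1", 2),
            new MyVal<>("flow-1", 3));
        System.out.println(update(batch, Optional.empty()).get()); // prints 5
    }
}
```

(Note that Spark 1.x's Java streaming API uses its own Optional type rather than java.util.Optional; the sketch above only illustrates the key-in-value trick.)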
I have a simple Java class as follows, that I want to use as a key while
applying groupByKey or reduceByKey functions:
private static class FlowId {
    public String dcxId;
    public String trxId;
    public String msgType;
    pub
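Though the message is cut off above, for a custom Java class to work as a key with groupByKey or reduceByKey it needs to be Serializable (keys are shipped between nodes) and must override equals() and hashCode() (Spark's hash partitioner and key comparison rely on them). A sketch along those lines, with the constructor added for convenience:

```java
import java.io.Serializable;
import java.util.Objects;

// Sketch: making FlowId usable as a shuffle key. Two FlowIds with the
// same field values must be equal and must hash identically, otherwise
// "identical" keys land in different groups.
public class FlowId implements Serializable {
    public String dcxId;
    public String trxId;
    public String msgType;

    public FlowId(String dcxId, String trxId, String msgType) {
        this.dcxId = dcxId;
        this.trxId = trxId;
        this.msgType = msgType;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof FlowId)) return false;
        FlowId other = (FlowId) o;
        return Objects.equals(dcxId, other.dcxId)
            && Objects.equals(trxId, other.trxId)
            && Objects.equals(msgType, other.msgType);
    }

    @Override
    public int hashCode() {
        return Objects.hash(dcxId, trxId, msgType);
    }
}
```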
I am getting a strange null pointer exception when trying to list the first
entry of a JavaPairRDD after calling groupByKey on it. Following is my code:
JavaPairRDD, List> KeyToAppList =
KeyToApp.distinct().groupByKey();
// System.out.println("First