Hi,
I wanted to understand forEachPartition logic. In the code below, I am
assuming the iterator is executing in a distributed fashion.
1. Assuming I have a stream which has timestamp data which is sorted. Will
the stringiterator in foreachPartition process each line in order?
2. Assuming I have a static pool of Kafka connections, where should I get a
connection from a pool to be used to send data to Kafka?
addMTSUnmatched.foreachRDD(
new Function<JavaRDD<String>, Void>() {
@Override
public Void call(JavaRDD<String> stringJavaRDD) throws Exception {
stringJavaRDD.foreachPartition(
new VoidFunction<Iterator<String>>() {
@Override
public void call(Iterator<String>
stringIterator) throws Exception {
while(stringIterator.hasNext()){
String str = stringIterator.next();
if(OnlineUtils.ESFlag) {
OnlineUtils.printToFile(str,
1, type1_outputFile, OnlineUtils.client);
}else{
OnlineUtils.printToFile(str,
1, type1_outputFile);
}
}
}
}
);
return null;
}
}
);
Thanks
Nipun