Can anybody help me understand why PySpark Streaming uses a Py4J callback to execute Python code, while PySpark batch uses worker.py?
Regarding PySpark Streaming: is the Py4J callback used only for DStream operations, with worker.py still used for RDDs? Thanks, Renyi.