Hey Users,
I want to run a Spark job from a Python virtual environment.
Please note that I am creating the virtual environment with python3 -m venv env.
I see that there are three Python-related variables that we have to set:
PYTHONPATH
PYSPARK_DRIVER_PYTHON
PYSPARK_PYTHON
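
For reference, here is a minimal sketch of how these variables could point at the venv created above. It assumes the venv directory is named env, as in the command above. The helper names venv_python and pyspark_env are hypothetical and are not part of PySpark. PYSPARK_PYTHON selects the interpreter for the driver and the executors, PYSPARK_DRIVER_PYTHON overrides it for the driver only, and PYTHONPATH is the ordinary Python module search path, which the sketch leaves untouched.

```python
import os


def venv_python(venv_dir):
    """Return the path of the interpreter inside a venv directory."""
    # venv puts the interpreter under Scripts\ on Windows and bin/ elsewhere.
    subdir = "Scripts" if os.name == "nt" else "bin"
    exe = "python.exe" if os.name == "nt" else "python"
    return os.path.join(venv_dir, subdir, exe)


def pyspark_env(venv_dir, base_env=None):
    """Return a copy of the environment with both PySpark variables set."""
    env = dict(os.environ if base_env is None else base_env)
    python = venv_python(venv_dir)
    env["PYSPARK_PYTHON"] = python         # driver and executors
    env["PYSPARK_DRIVER_PYTHON"] = python  # driver only
    return env


if __name__ == "__main__":
    env = pyspark_env("env")
    print(env["PYSPARK_PYTHON"])
    print(env["PYSPARK_DRIVER_PYTHON"])
```

The returned dictionary could then be passed as the env argument of subprocess.run when launching spark-submit. On a cluster this only works if the same interpreter path also exists on every worker node, which is an assumption the sketch does not check.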
I have two questions:
1. If I want to use Virt
Hi all,
I have a Spark application with arbitrary stateful aggregation implemented
with FlatMapGroupsWithStateFunction.
I want to compute some statistics about incoming events inside
FlatMapGroupsWithStateFunction.
The statistics are computed from an event property which, on the one hand, has
dynamic val