Kelly Zhang,
You can add a SparkListener to your Spark context:
sparkContext.addSparkListener(new SparkListener {})
In that listener you can override onTaskEnd, which receives a
SparkListenerTaskEnd for each task. That instance gives you access to
the task metrics.
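For example, a rough sketch (the fields printed here are just a few of the ones available on TaskMetrics):

import org.apache.spark.scheduler.{SparkListener, SparkListenerTaskEnd}

sparkContext.addSparkListener(new SparkListener {
  override def onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit = {
    val metrics = taskEnd.taskMetrics
    if (metrics != null) {  // metrics may be missing for failed tasks
      println(s"stage=${taskEnd.stageId} task=${taskEnd.taskInfo.taskId} " +
        s"runTime=${metrics.executorRunTime}ms gc=${metrics.jvmGCTime}ms")
    }
  }
})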
See:
- https://spark.apache.org/doc
Hi Sean,
I understand your approach, but there's a slight problem.
If we generate rules after filtering for our desired consequent, we are
introducing some bias into our rules.
The confidence of a rule on the filtered input can be very high, but this
may not hold on the entire dataset.
T
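To make the bias concrete, here is a toy example (hypothetical data, plain Scala, no Spark needed): the rule {A} -> {B} has confidence 0.25 on the full data, but 1.0 once the transactions are pre-filtered on the consequent B.

val full = Seq(
  Seq("A", "B"), Seq("A"), Seq("A"), Seq("A", "C"), Seq("B"), Seq("B", "C"))
val filtered = full.filter(_.contains("B"))  // keep only baskets with the consequent

// confidence({A} -> {B}) = count(A and B) / count(A)
def confidence(tx: Seq[Seq[String]]): Double =
  tx.count(t => t.contains("A") && t.contains("B")).toDouble / tx.count(_.contains("A"))

println(confidence(full))      // 1/4 = 0.25 on the whole dataset
println(confidence(filtered))  // 1/1 = 1.0 on the filtered input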
You could just filter the input for sets containing the desired item
and discard the rest. That doesn't mean all of the mined item sets will
contain that item, so you'd still have to filter the rules, but it may
be much faster to compute.
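A rough sketch of that, assuming a DataFrame named transactions with an array column "items" (the names and thresholds are illustrative):

import org.apache.spark.ml.fpm.FPGrowth
import org.apache.spark.sql.functions.{array_contains, col}

val targetItem = "milk"  // the consequent we care about (hypothetical)

// Keep only baskets that contain the target item before mining.
val filteredTx = transactions.filter(array_contains(col("items"), targetItem))

val model = new FPGrowth()
  .setItemsCol("items")
  .setMinSupport(0.01)
  .setMinConfidence(0.3)
  .fit(filteredTx)

// Not every mined rule will have the target as consequent, so still filter.
val rulesForTarget = model.associationRules
  .filter(array_contains(col("consequent"), targetItem))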
Increasing min support might generally have the effect of smaller
rules, though it do
Hi Everyone,
I was wondering if we could make any enhancements to the FP-Growth
algorithm in Spark/PySpark.
Many times I am looking for a rule for a particular consequent, so I don't
need the rules for all the other consequents. I know I can filter the rules
to get the desired output, but if I co
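For reference, the rule-filtering workaround I mean looks roughly like this (assuming a transactions DataFrame with an "items" array column; names and thresholds are illustrative):

import org.apache.spark.ml.fpm.FPGrowth
import org.apache.spark.sql.functions.{array_contains, col}

// Mine rules over the full dataset, then keep only the desired consequent.
val model = new FPGrowth().setItemsCol("items").setMinSupport(0.01).fit(transactions)
val rulesForTarget = model.associationRules
  .filter(array_contains(col("consequent"), "milk"))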