Absolutely. I meant that since the confidence calculation depends on the
support calculations, skipping the unneeded consequents would avoid those
support calculations and hence reduce the time. Thanks for pointing
that out.
On Thu, 7 May 2020, 11:56 pm Sean Owen wrote:
The confidence calculation is pretty trivial; the work is finding the
supports needed. Not sure how to optimize that.
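For concreteness, here is that relationship in a minimal plain-Python sketch (toy data, not the Spark implementation): once the two supports are counted, confidence is a single division, so all of the real work is in the support counting.

```python
# Toy transactions; in Spark these would be the rows fed to FP-Growth.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk", "butter"},
]

def support(itemset, rows):
    """Fraction of rows that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= row for row in rows) / len(rows)

def confidence(antecedent, consequent, rows):
    """confidence(A -> B) = support(A | B) / support(A): trivial once
    both supports have been counted."""
    return support(set(antecedent) | set(consequent), rows) / support(antecedent, rows)

print(confidence({"bread"}, {"milk"}, transactions))  # 2/3
```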
On Thu, May 7, 2020, 1:12 PM Aditya Addepalli wrote:
Hi Sean,
1.
I was thinking that by specifying the consequent we can (somehow?) skip the
confidence calculation for all the other consequents.
This would greatly reduce the time taken as we avoid computation for
consequents we don't care about.
2.
Is limiting rule size even possible? I thought b
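On point 1 above, a rough sketch (plain Python over precomputed supports, not the Spark code) of what single-consequent rule generation could look like: only frequent itemsets containing the target item produce a rule, and each rule needs just two support lookups. As far as I can tell, Spark's association rules already have single-item consequents, so the saving would come from skipping itemsets that don't contain the target.

```python
def rules_for_consequent(freq_itemsets, target, min_confidence=0.0):
    """Generate rules antecedent -> {target} from a dict mapping
    frozenset itemsets to support counts (as FP-Growth would produce).
    Itemsets without the target are skipped entirely."""
    rules = []
    for itemset, supp in freq_itemsets.items():
        if target not in itemset or len(itemset) < 2:
            continue
        antecedent = itemset - {target}
        # Subsets of frequent itemsets are frequent, so this lookup succeeds.
        conf = supp / freq_itemsets[antecedent]
        if conf >= min_confidence:
            rules.append((antecedent, target, conf))
    return rules

# Hypothetical frequent-itemset support counts:
freq = {
    frozenset({"bread"}): 3,
    frozenset({"milk"}): 3,
    frozenset({"bread", "milk"}): 2,
}
print(rules_for_consequent(freq, "milk"))
```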
Yes, you can get the correct support this way by accounting for how
many rows were filtered out, but not the right confidence, as it
depends on counting support in rows without the items of interest.
But computing confidence depends on computing all that support; how
would you optimize it even if
Hi,
I understand that this is not a priority with everything going on, but if
you think generating rules for only a single consequent adds value, I would
like to contribute.
Thanks & Regards,
Aditya
On Sat, May 2, 2020 at 9:34 PM Aditya Addepalli wrote:
Hi Sean,
I understand your approach, but there's a slight problem.
If we generate rules after filtering for our desired consequent, we are
introducing some bias into our rules.
The confidence of the rules on the filtered input can be very high but this
may not be the case on the entire dataset.
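A tiny worked example of that bias (toy data, plain Python): on rows filtered to contain "x", the rule a -> x looks perfect, but on the full dataset its confidence is only 1/3.

```python
transactions = [
    {"a", "x"},
    {"a", "x"},
    {"a"}, {"a"}, {"a"}, {"a"},  # many 'a' rows without 'x'
]

def confidence(antecedent, consequent, rows):
    both = sum((antecedent | consequent) <= row for row in rows)
    ante = sum(antecedent <= row for row in rows)
    return both / ante

print(confidence({"a"}, {"x"}, transactions))  # 2/6, about 0.33
filtered = [row for row in transactions if "x" in row]
print(confidence({"a"}, {"x"}, filtered))      # 2/2 = 1.0
```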
You could just filter the input for sets containing the desired item,
and discard the rest. That doesn't mean all of the item sets have that
item, and you'd still have to filter, but it may be much faster to
compute.
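Both halves of the earlier point can be seen on toy data (plain Python sketch): supports of itemsets that contain the target survive the filter as long as the denominator stays the original row count, but the antecedent-only counts that confidence needs do not.

```python
transactions = [
    {"a", "x"}, {"b", "x"},
    {"a"}, {"b"}, {"a", "b"},
]
# Keep only the rows containing the item of interest.
filtered = [row for row in transactions if "x" in row]

def count(itemset, rows):
    return sum(itemset <= row for row in rows)

# Any itemset containing "x" is counted correctly on the filtered rows,
# provided we divide by the ORIGINAL number of rows:
assert count({"a", "x"}, filtered) == count({"a", "x"}, transactions)

# But the antecedent-only support that confidence needs is lost:
print(count({"a"}, filtered), count({"a"}, transactions))  # 1 vs 3
```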
Increasing min support might generally have the effect of smaller
rules, though it do
Hi Everyone,
I was wondering if we could make any enhancements to the FP-Growth
algorithm in spark/pyspark.
Many times I am looking for a rule for a particular consequent, so I don't
need the rules for all the other consequents. I know I can filter the rules
to get the desired output, but if I co