Measuring the interest of a rule in Association Rules

The importance of knowing the correlation between events in marketing strategies

Reading Time: 4 minutes

Post published on 10/12/2020 by Donata Petrelli and released under the CC BY-NC-ND 3.0 IT licence (Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Italy)

On a shoe e-commerce site (yes, my favourite) how important is it to suggest a high heel rather than a low one?
The answer to this question is called “lift” and it is a measure that is part of the Association Rules algorithm.
Thanks to your much-appreciated feedback on the previous article on Association Rules, I have seen that there is a lot of interest in this topic, and many of you have asked me to continue talking about it.
I have therefore decided to complete the introduction to Association Rules by addressing the last important topic, the lift. I hope it will be of help to all those who have to manage the sales marketing of countless products… not just shoes. 🙂

SUMMARY OF THE PREVIOUS EPISODE

The Association Rules algorithm finds hidden relationships between elements of a set through two measures (here X ∪ Y denotes the union of the two itemsets, i.e. a transaction that contains both X and Y):

1) Support

\[ \text{Support}(X \rightarrow Y) = p(X \cup Y) \]

2) Confidence

\[ \text{Confidence}(X \rightarrow Y) = p(Y \mid X) = \frac{p(X \cup Y)}{p(X)} \]
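As a minimal sketch of the two definitions above, the following Python snippet counts transactions directly. The function names and the tiny `baskets` list are illustrative assumptions, not part of the original article or of any particular library.

```python
from typing import List, Set


def support(transactions: List[Set[str]], x: Set[str], y: Set[str]) -> float:
    """Fraction of all transactions that contain every item of X and of Y."""
    both = x | y
    hits = sum(1 for t in transactions if both <= t)
    return hits / len(transactions)


def confidence(transactions: List[Set[str]], x: Set[str], y: Set[str]) -> float:
    """Among the transactions that contain X, the fraction that also contain Y."""
    with_x = [t for t in transactions if x <= t]
    if not with_x:
        raise ValueError("X never occurs, so the confidence is undefined")
    return sum(1 for t in with_x if y <= t) / len(with_x)


# Hypothetical baskets, made up purely for illustration.
baskets = [{"bread", "milk"}, {"bread"}, {"milk"}, {"bread", "milk", "eggs"}]

s = support(baskets, {"bread"}, {"milk"})     # 2 of the 4 baskets hold both
c = confidence(baskets, {"bread"}, {"milk"})  # 2 of the 3 baskets with bread
print(s, c)
```

Representing each transaction as a set keeps the "contains both X and Y" test down to a single subset comparison.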

Based on these metrics, the algorithm keeps the rules that exceed minimum thresholds of support and confidence.
However, rules selected this way often turn out to be uninteresting, or even useless, for commercial or marketing purposes.
Let us see why.

LIMITATIONS

The support/confidence method for a rule X → Y does not take into account an important aspect: the absolute probability of the event Y.
Following only the support/confidence logic, we can therefore end up with useless rules. A rule whose support and confidence clear the thresholds we have set, but whose confidence is still lower than the probability with which Y occurs on its own, makes little sense.
We therefore need a measure of the “correlation between events”, i.e. of how much the occurrence of one event raises the probability of the other.
For shopping on an e-commerce site, this means a measure that tells us how much more likely we are to find Y in the basket once we know that X is there.
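
As a purely hypothetical illustration (the numbers are made up, not taken from any dataset), suppose Y appears in 90% of all baskets and the rule X → Y has a confidence of 75%. With a minimum confidence threshold of, say, 70%, the rule is accepted, and yet:

\[ p(Y \mid X) = 0.75 < 0.90 = p(Y) \]

In this situation, knowing that X is in the basket makes Y less likely than it is on average, so the rule would be of little use for a promotion.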

SOLUTION

Given two events X, Y we define the correlation coefficient between the two as

\[ \text{Corr}_{X,Y} = \frac{p(X \cap Y)}{p(X) \, p(Y)} \]
If Corr_{X,Y} = 1, the two events are independent.
If Corr_{X,Y} > 1, the two events are positively correlated.
If Corr_{X,Y} < 1, the two events are negatively correlated.
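
The three cases can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the tolerance used to decide that a value "equals" 1, and the input probabilities are all hypothetical. Note that the joint probability is divided by the product p(X) · p(Y).

```python
import math


def correlation(p_xy: float, p_x: float, p_y: float) -> float:
    """Joint probability divided by the product of the two marginals."""
    if p_x <= 0 or p_y <= 0:
        raise ValueError("p(X) and p(Y) must be positive")
    return p_xy / (p_x * p_y)


def interpret(corr: float, tol: float = 1e-9) -> str:
    """Map a correlation value onto the three cases listed above."""
    if math.isclose(corr, 1.0, abs_tol=tol):
        return "independent"
    return "positively correlated" if corr > 1 else "negatively correlated"


# Hypothetical probabilities, chosen only to hit each of the three cases.
labels = [
    interpret(correlation(0.25, 0.5, 0.5)),
    interpret(correlation(0.40, 0.5, 0.5)),
    interpret(correlation(0.10, 0.5, 0.5)),
]
print(labels)
```

On real data the estimate will rarely be exactly 1, so the tolerance (or a proper statistical test) is a design choice left to the analyst.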

If X → Y is an association rule, the value Corr_{X,Y} is called the lift of the rule.
Specifically:

\[ \text{Lift}(X \rightarrow Y) = \frac{p(Y \mid X)}{p(Y)} \]

and since

\[ p(Y \mid X) = \text{Confidence}(X \rightarrow Y) \]

we obtain that:

\[ \text{Lift}(X \rightarrow Y) = \frac{\text{Confidence}(X \rightarrow Y)}{p(Y)} \]
Thanks to this new metric, it turns out that a strong rule is not always interesting.
Furthermore, it is important to note that:

\[ \text{Lift}(X \rightarrow Y) = \text{Lift}(Y \rightarrow X) \]
so a rule that is interesting in one direction is equally interesting in the other.
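
The symmetry follows from the definition of conditional probability, p(Y | X) = p(X ∩ Y) / p(X). Writing X and Y for the two sides of the rule:

\[ \text{Lift}(X \rightarrow Y) = \frac{p(Y \mid X)}{p(Y)} = \frac{p(X \cap Y)}{p(X) \, p(Y)} = \frac{p(X \mid Y)}{p(X)} = \text{Lift}(Y \rightarrow X) \]

The middle expression treats the two events identically, which is why swapping them changes nothing. Confidence, by contrast, is generally not symmetric.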
Let us see it in practice.

EXAMPLE

Let us take the following transactions as an example:

Transaction ID   Items
1                X, Y
2                X, Y, Z
3                X, Z
4                X, Z
5                Z
6                Z
7                Z
8                Z

Where:

  • X : Heels
  • Y : Handbag
  • Z : Flats

We then calculate the support, confidence and lift values for the following three rules:

Rule     Support   Confidence   Lift
X -> Y   25.0 %    50.0 %       2.00
X -> Z   37.5 %    75.0 %       0.86
Y -> Z   12.5 %    50.0 %       0.57
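
The figures in the table can be reproduced with a short Python sketch. The transactions are the eight listed above; the helper names `prob` and `metrics` are illustrative choices, and exact fractions are used so that no rounding error creeps in.

```python
from fractions import Fraction

# The eight example transactions (X: heels, Y: handbag, Z: flats).
transactions = [
    {"X", "Y"}, {"X", "Y", "Z"}, {"X", "Z"}, {"X", "Z"},
    {"Z"}, {"Z"}, {"Z"}, {"Z"},
]


def prob(items):
    """Fraction of transactions that contain every item in `items`."""
    hits = sum(1 for t in transactions if items <= t)
    return Fraction(hits, len(transactions))


def metrics(x, y):
    """Return (support, confidence, lift) for the rule x -> y."""
    sup = prob({x, y})
    conf = sup / prob({x})
    return sup, conf, conf / prob({y})


rules = [("X", "Y"), ("X", "Z"), ("Y", "Z")]
results = {f"{x} -> {y}": metrics(x, y) for x, y in rules}

for rule, (sup, conf, lift) in results.items():
    print(f"{rule}: support={float(sup):.1%} "
          f"confidence={float(conf):.1%} lift={float(lift):.2f}")
```

The exact lifts are 2 for X -> Y, 6/7 (about 0.857) for X -> Z and 4/7 (about 0.571) for Y -> Z.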

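For an association rule to be useful, its lift must be greater than 1: a lift of exactly 1 means that X and Y are independent, and a lift below 1 means that X makes Y less likely.
Therefore, we can conclude that although the X -> Z rule is stronger (higher support and confidence), the X -> Y rule is the more interesting one.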

CONCLUSION

Thanks to the lift measure, once rules have been found using the Association Rules algorithm, we can keep only the interesting ones.
In the example shown, the choice is to promote high heels on the website in order to push the sale of the bag at the same time and thus optimise the budget.
Personally, I would have promoted high heels anyway … but this is pure passion and not associative calculation 😀
It is nevertheless important to consider this last aspect of association rules in order to optimise our marketing campaigns and our business.
