Discover Association Rule Learning, a technique used to uncover hidden relationships between items. This lesson explains support, confidence, lift, and how Market Basket Analysis helps businesses improve recommendations and sales strategies.
Association rule learning is an unsupervised machine learning technique that discovers hidden patterns and relationships in transactional data. Originally developed for analyzing shopping cart data, association rules answer questions like "What products are frequently bought together?"
The most famous application is market basket analysis, where retailers identify product combinations to optimize store layouts, create bundle offers, and power recommendation engines. However, association rules extend far beyond retail into healthcare, web usage mining, and fraud detection.
An association rule expresses a relationship between items in the form:
{Antecedent} → {Consequent}
For example: {Bread, Butter} → {Milk}
This rule suggests that customers who buy bread and butter are likely to also buy milk. The antecedent (left side) represents the condition, while the consequent (right side) represents the predicted item.
Three fundamental metrics evaluate the quality and usefulness of association rules.
Support measures how frequently an itemset appears in all transactions:
Support(A) = (Transactions containing A) / (Total transactions)
For a rule A → B: Support(A → B) = (Transactions containing both A and B) / (Total transactions)
Support filters out rare combinations that may not be practically useful.
Confidence measures how often the rule is correct when the antecedent is present:
Confidence(A → B) = Support(A → B) / Support(A)
This tells you the probability of finding B in a transaction that contains A.
Lift measures how much more likely B is to be purchased when A is purchased, compared with B's baseline purchase rate:
Lift(A → B) = Confidence(A → B) / Support(B)
A lift greater than 1 indicates a positive association, a lift of exactly 1 indicates the items are independent, and a lift below 1 indicates a negative association.
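Before turning to a library, all three metrics can be computed by hand. Here is a minimal sketch using a tiny illustrative transaction set (separate from the lesson's dataset below):

```python
# Hand computation of support, confidence, and lift
# on a small illustrative transaction set.
transactions = [
    {'Bread', 'Milk'},
    {'Bread', 'Butter', 'Milk'},
    {'Milk', 'Eggs'},
    {'Bread', 'Butter'},
]
n = len(transactions)

def support(itemset):
    """Fraction of transactions containing every item in `itemset`."""
    return sum(itemset <= t for t in transactions) / n

antecedent, consequent = {'Bread'}, {'Milk'}
supp_rule = support(antecedent | consequent)   # P(A and B) = 2/4
confidence = supp_rule / support(antecedent)   # P(B | A) = (2/4)/(3/4)
lift = confidence / support(consequent)        # P(B | A) / P(B)

print(f"Support:    {supp_rule:.2f}")   # 0.50
print(f"Confidence: {confidence:.2f}")  # 0.67
print(f"Lift:       {lift:.2f}")        # 0.89
```

Here the lift is below 1: in this tiny dataset, buying bread actually makes milk slightly *less* likely than its 75% baseline.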
The Apriori algorithm is the foundational method for mining association rules efficiently.
The algorithm is based on a simple but powerful observation, known as the Apriori property: if an itemset is infrequent, all of its supersets must also be infrequent.
This means if {Bread} doesn't meet the minimum support threshold, we don't need to check {Bread, Milk}, {Bread, Butter}, or any other combination containing Bread.
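The pruning idea can be sketched in a few lines. This is a toy illustration of the principle, not mlxtend's actual implementation:

```python
from itertools import combinations

# Toy sketch of Apriori's level-wise pruning: candidate pairs are built
# only from items that were frequent on their own, so any itemset
# containing an infrequent item is never even counted.
transactions = [
    {'Bread', 'Milk'}, {'Bread', 'Butter'}, {'Bread', 'Milk', 'Butter'},
    {'Milk', 'Butter'}, {'Milk', 'Eggs'},
]
min_support = 0.4
n = len(transactions)

def frequent(itemsets):
    """Keep only itemsets meeting the minimum support threshold."""
    return {s for s in itemsets
            if sum(s <= t for t in transactions) / n >= min_support}

# Level 1: frequent single items ('Eggs' appears in 1/5 and is pruned)
items = {item for t in transactions for item in t}
L1 = frequent(frozenset([i]) for i in items)

# Level 2: candidates come only from surviving items, so pairs
# like {'Milk', 'Eggs'} are skipped without being counted
surviving = sorted({i for s in L1 for i in s})
L2 = frequent(frozenset(pair) for pair in combinations(surviving, 2))
print(sorted(map(sorted, L2)))
```

With the 40% threshold above, 'Eggs' is pruned at level 1, so no pair containing it is ever generated.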
Association rule learning powers numerous business applications, including recommendation engines, cross-selling and bundle offers, store layout optimization, web usage mining, and fraud detection.
Let's implement market basket analysis using the mlxtend library.
import pandas as pd
import numpy as np
# Sample transaction data
transactions = [
    ['Bread', 'Milk', 'Eggs'],
    ['Bread', 'Butter', 'Milk'],
    ['Milk', 'Eggs', 'Cheese'],
    ['Bread', 'Milk', 'Butter', 'Eggs'],
    ['Bread', 'Milk'],
    ['Eggs', 'Cheese'],
    ['Bread', 'Butter'],
    ['Milk', 'Eggs', 'Bread', 'Butter'],
    ['Bread', 'Milk', 'Eggs'],
    ['Butter', 'Eggs', 'Milk']
]
print(f"Number of transactions: {len(transactions)}")
Each transaction is a list of items purchased together in a single shopping trip.
Association rule algorithms require a specific data format called one-hot encoding:
from mlxtend.preprocessing import TransactionEncoder
# Transform transactions to one-hot encoded format
encoder = TransactionEncoder()
encoded_array = encoder.fit_transform(transactions)
# Create DataFrame for readability
df = pd.DataFrame(encoded_array, columns=encoder.columns_)
print("Encoded transaction format:")
print(df.head())
The TransactionEncoder converts transaction lists into a binary matrix where each column represents an item and each row represents a transaction.
from mlxtend.frequent_patterns import apriori
# Find frequent itemsets with minimum support of 30%
frequent_itemsets = apriori(df, min_support=0.3, use_colnames=True)
print("Frequent Itemsets:")
print(frequent_itemsets.sort_values('support', ascending=False))
The Apriori algorithm identifies all itemsets appearing in at least 30% of transactions. Lower support thresholds find more patterns but may include less meaningful associations.
from mlxtend.frequent_patterns import association_rules
# Generate rules with minimum confidence of 60%
rules = association_rules(frequent_itemsets,
                          metric="confidence",
                          min_threshold=0.6)
# Select relevant columns
rules_display = rules[['antecedents', 'consequents',
                       'support', 'confidence', 'lift']]
print("Association Rules:")
print(rules_display.round(3))
This generates association rules from frequent itemsets, filtering by a minimum confidence threshold of 60%.
# Find the strongest rules by lift
top_rules = rules.nlargest(5, 'lift')[
    ['antecedents', 'consequents', 'support', 'confidence', 'lift']
]
print("Top 5 rules by lift:")
for idx, row in top_rules.iterrows():
    print(f"\n{set(row['antecedents'])} → {set(row['consequents'])}")
    print(f"  Support: {row['support']:.2%}")
    print(f"  Confidence: {row['confidence']:.2%}")
    print(f"  Lift: {row['lift']:.2f}")
Rules with high lift values indicate strong associations worth acting upon. A lift of 1.5 means customers are 50% more likely to buy the consequent when they buy the antecedent.
# Find strong, actionable rules
strong_rules = rules[
    (rules['confidence'] >= 0.7) &
    (rules['lift'] >= 1.2) &
    (rules['support'] >= 0.2)
]
print(f"Strong rules found: {len(strong_rules)}")
Combining multiple thresholds ensures you focus on rules that are frequent enough to matter, reliable enough to trust, and strong enough to act on.
# Find rules where 'Milk' is in the consequent
milk_rules = rules[rules['consequents'].apply(
    lambda x: 'Milk' in x
)]
print("Rules that lead to Milk purchases:")
print(milk_rules[['antecedents', 'confidence', 'lift']].round(3))
This filtering helps answer targeted questions like "What products lead customers to buy Milk?"
| Metric | Typical Range | Considerations |
|---|---|---|
| Support | 0.01 - 0.1 | Lower for rare but valuable associations |
| Confidence | 0.5 - 0.8 | Higher for reliable predictions |
| Lift | > 1.0 | Focus on values significantly above 1 |
# Example: Filter out items with very high support
item_support = df.sum() / len(df)
common_items = item_support[item_support > 0.9].index.tolist()
print(f"Items to potentially exclude (>90% support): {common_items}")
Extremely common items often create obvious rules that provide little actionable insight.
Association rules identify correlations, not causal relationships. A rule like {Diapers} → {Beer} became famous in data mining, but the association doesn't mean buying diapers causes people to buy beer—it reflects a demographic pattern.
Many generated rules may be redundant or variations of the same pattern. Focus on rules with the highest lift and practical business value.
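One rough way to collapse such variations is to keep only the highest-lift rule for each consequent. Here is a sketch on a hypothetical rules table shaped like mlxtend's output (the column names match, but the values are made up):

```python
import pandas as pd

# Hypothetical rules table in the same shape as mlxtend's output
rules = pd.DataFrame({
    'antecedents': [frozenset({'Bread'}),
                    frozenset({'Bread', 'Butter'}),
                    frozenset({'Eggs'})],
    'consequents': [frozenset({'Milk'}),
                    frozenset({'Milk'}),
                    frozenset({'Cheese'})],
    'lift': [1.4, 1.1, 2.0],
})

# Keep only the highest-lift rule per consequent, a rough heuristic
# for collapsing near-duplicate variations of the same pattern
best = (rules.sort_values('lift', ascending=False)
             .drop_duplicates(subset='consequents'))
print(best[['antecedents', 'consequents', 'lift']])
```

In this toy table, the weaker {Bread, Butter} → {Milk} rule is dropped because {Bread} → {Milk} already covers the same consequent with a higher lift.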