Discover Association Rule Learning, a technique used to uncover hidden relationships between items. This lesson explains support, confidence, lift, and how Market Basket Analysis helps businesses improve recommendations and sales strategies.
Association rule learning is an unsupervised machine learning technique that discovers hidden patterns and relationships in transactional data. Originally developed for analyzing shopping cart data, association rules answer questions like "What products are frequently bought together?"
The most famous application is market basket analysis, where retailers identify product combinations to optimize store layouts, create bundle offers, and power recommendation engines. However, association rules extend far beyond retail into healthcare, web usage mining, and fraud detection.
An association rule expresses a relationship between items in the form:
{Antecedent} → {Consequent}
For example: {Bread, Butter} → {Milk}
This rule suggests that customers who buy bread and butter are likely to also buy milk. The antecedent (left side) represents the condition, while the consequent (right side) represents the predicted item.
Three fundamental metrics evaluate the quality and usefulness of association rules.
Support measures how frequently an itemset appears in all transactions:
Support(A) = (Transactions containing A) / (Total transactions)
For a rule A → B: Support(A → B) = (Transactions containing both A and B) / (Total transactions)
Support filters out rare combinations that may not be practically useful.
Confidence measures how often the rule is correct when the antecedent is present:
Confidence(A → B) = Support(A → B) / Support(A)
This tells you the probability of finding B in a transaction that contains A.
Lift measures how much more likely B is to be purchased when A is purchased, compared with B's baseline purchase rate:
Lift(A → B) = Confidence(A → B) / Support(B)
A lift greater than 1 indicates a positive association, a lift of exactly 1 indicates the items are independent, and a lift below 1 indicates a negative association.
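Before turning to a library, all three metrics can be computed by hand. Here is a minimal sketch using a tiny illustrative transaction set (separate from the lesson's dataset below):

```python
# Hand computation of support, confidence, and lift
# on a small illustrative transaction set.
transactions = [
    {'Bread', 'Milk'},
    {'Bread', 'Butter', 'Milk'},
    {'Milk', 'Eggs'},
    {'Bread', 'Butter'},
]
n = len(transactions)

def support(itemset):
    """Fraction of transactions containing every item in `itemset`."""
    return sum(itemset <= t for t in transactions) / n

antecedent, consequent = {'Bread'}, {'Milk'}
supp_rule = support(antecedent | consequent)   # P(A and B) = 2/4
confidence = supp_rule / support(antecedent)   # P(B | A) = (2/4)/(3/4)
lift = confidence / support(consequent)        # P(B | A) / P(B)

print(f"Support:    {supp_rule:.2f}")   # 0.50
print(f"Confidence: {confidence:.2f}")  # 0.67
print(f"Lift:       {lift:.2f}")        # 0.89
```

Here the lift is below 1: in this tiny dataset, buying bread actually makes milk slightly *less* likely than its 75% baseline.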
The Apriori algorithm is the foundational method for mining association rules efficiently.
The algorithm is based on a simple but powerful observation, known as the Apriori property: if an itemset is infrequent, all of its supersets must also be infrequent.
This means if {Bread} doesn't meet the minimum support threshold, we don't need to check {Bread, Milk}, {Bread, Butter}, or any other combination containing Bread.
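The pruning idea can be sketched in a few lines. This is a toy illustration of the principle, not mlxtend's actual implementation:

```python
from itertools import combinations

# Toy sketch of Apriori's level-wise pruning: candidate pairs are built
# only from items that were frequent on their own, so any itemset
# containing an infrequent item is never even counted.
transactions = [
    {'Bread', 'Milk'}, {'Bread', 'Butter'}, {'Bread', 'Milk', 'Butter'},
    {'Milk', 'Butter'}, {'Milk', 'Eggs'},
]
min_support = 0.4
n = len(transactions)

def frequent(itemsets):
    """Keep only itemsets meeting the minimum support threshold."""
    return {s for s in itemsets
            if sum(s <= t for t in transactions) / n >= min_support}

# Level 1: frequent single items ('Eggs' appears in 1/5 and is pruned)
items = {item for t in transactions for item in t}
L1 = frequent(frozenset([i]) for i in items)

# Level 2: candidates come only from surviving items, so pairs
# like {'Milk', 'Eggs'} are skipped without being counted
surviving = sorted({i for s in L1 for i in s})
L2 = frequent(frozenset(pair) for pair in combinations(surviving, 2))
print(sorted(map(sorted, L2)))
```

With the 40% threshold above, 'Eggs' is pruned at level 1, so no pair containing it is ever generated.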
Association rule learning powers numerous business applications, including recommendation engines, cross-selling and bundle offers, store layout optimization, web usage mining, and fraud detection.
Let's implement market basket analysis using the mlxtend library.
import pandas as pd
import numpy as np
# Sample transaction data
transactions = [
    ['Bread', 'Milk', 'Eggs'],
    ['Bread', 'Butter', 'Milk'],
    ['Milk', 'Eggs', 'Cheese'],
    ['Bread', 'Milk', 'Butter', 'Eggs'],
    ['Bread', 'Milk'],
    ['Eggs', 'Cheese'],
    ['Bread', 'Butter'],
    ['Milk', 'Eggs', 'Bread', 'Butter'],
    ['Bread', 'Milk', 'Eggs'],
    ['Butter', 'Eggs', 'Milk']
]
print(f"Number of transactions: {len(transactions)}")
Each transaction is a list of items purchased together in a single shopping trip.
Association rule algorithms require a specific data format called one-hot encoding:
from mlxtend.preprocessing import TransactionEncoder
# Transform transactions to one-hot encoded format
encoder = TransactionEncoder()
encoded_array = encoder.fit_transform(transactions)
# Create DataFrame for readability
df = pd.DataFrame(encoded_array, columns=encoder.columns_)
print("Encoded transaction format:")
print(df.head())
The TransactionEncoder converts transaction lists into a binary matrix where each column represents an item and each row represents a transaction.
from mlxtend.frequent_patterns import apriori
# Find frequent itemsets with minimum support of 30%
frequent_itemsets = apriori(df, min_support=0.3, use_colnames=True)
print("Frequent Itemsets:")
print(frequent_itemsets.sort_values('support', ascending=False))
The Apriori algorithm identifies all itemsets appearing in at least 30% of transactions. Lower support thresholds find more patterns but may include less meaningful associations.
from mlxtend.frequent_patterns import association_rules
# Generate rules with minimum confidence of 60%
rules = association_rules(frequent_itemsets,
                          metric="confidence",
                          min_threshold=0.6)
# Select relevant columns
rules_display = rules[['antecedents', 'consequents',
                       'support', 'confidence', 'lift']]
print("Association Rules:")
print(rules_display.round(3))
This generates association rules from frequent itemsets, filtering by a minimum confidence threshold of 60%.
# Find the strongest rules by lift
top_rules = rules.nlargest(5, 'lift')[
    ['antecedents', 'consequents', 'support', 'confidence', 'lift']
]
print("Top 5 rules by lift:")
for idx, row in top_rules.iterrows():
    print(f"\n{set(row['antecedents'])} → {set(row['consequents'])}")
    print(f"  Support: {row['support']:.2%}")
    print(f"  Confidence: {row['confidence']:.2%}")
    print(f"  Lift: {row['lift']:.2f}")
Rules with high lift values indicate strong associations worth acting upon. A lift of 1.5 means customers are 50% more likely to buy the consequent when they buy the antecedent.
# Find strong, actionable rules
strong_rules = rules[
    (rules['confidence'] >= 0.7) &
    (rules['lift'] >= 1.2) &
    (rules['support'] >= 0.2)
]
print(f"Strong rules found: {len(strong_rules)}")
Combining multiple thresholds ensures you focus on rules that are frequent enough to matter, reliable enough to trust, and strong enough to act on.
# Find rules where 'Milk' is in the consequent
milk_rules = rules[rules['consequents'].apply(
    lambda x: 'Milk' in x
)]
print("Rules that lead to Milk purchases:")
print(milk_rules[['antecedents', 'confidence', 'lift']].round(3))
This filtering helps answer targeted questions like "What products lead customers to buy Milk?"
| Metric | Typical Range | Considerations |
|---|---|---|
| Support | 0.01 - 0.1 | Lower for rare but valuable associations |
| Confidence | 0.5 - 0.8 | Higher for reliable predictions |
| Lift | > 1.0 | Focus on values significantly above 1 |
# Example: Filter out items with very high support
item_support = df.sum() / len(df)
common_items = item_support[item_support > 0.9].index.tolist()
print(f"Items to potentially exclude (>90% support): {common_items}")
Extremely common items often create obvious rules that provide little actionable insight.
Association rules identify correlations, not causal relationships. A rule like {Diapers} → {Beer} became famous in data mining, but the association doesn't mean buying diapers causes people to buy beer—it reflects a demographic pattern.
Many generated rules may be redundant or variations of the same pattern. Focus on rules with the highest lift and practical business value.
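One rough way to collapse such variations is to keep only the highest-lift rule for each consequent. Here is a sketch on a hypothetical rules table shaped like mlxtend's output (the column names match, but the values are made up):

```python
import pandas as pd

# Hypothetical rules table in the same shape as mlxtend's output
rules = pd.DataFrame({
    'antecedents': [frozenset({'Bread'}),
                    frozenset({'Bread', 'Butter'}),
                    frozenset({'Eggs'})],
    'consequents': [frozenset({'Milk'}),
                    frozenset({'Milk'}),
                    frozenset({'Cheese'})],
    'lift': [1.4, 1.1, 2.0],
})

# Keep only the highest-lift rule per consequent, a rough heuristic
# for collapsing near-duplicate variations of the same pattern
best = (rules.sort_values('lift', ascending=False)
             .drop_duplicates(subset='consequents'))
print(best[['antecedents', 'consequents', 'lift']])
```

In this toy table, the weaker {Bread, Butter} → {Milk} rule is dropped because {Bread} → {Milk} already covers the same consequent with a higher lift.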