Explore Multiple and Polynomial Regression techniques to capture complex patterns in data. This lesson teaches how to model multiple features and nonlinear relationships for more powerful and flexible predictions.
Real-world prediction problems rarely depend on just one variable. House prices depend on size, location, bedrooms, and age. Salary depends on experience, education, and skills. Multiple regression handles these multi-feature scenarios effectively.
Similarly, many relationships in nature aren't perfectly linear. Polynomial regression captures curved patterns that simple linear models miss. Together, these techniques significantly expand your predictive modeling capabilities.
Multiple linear regression extends simple linear regression to include multiple input features. Instead of fitting a line, it fits a hyperplane through multi-dimensional space.
ŷ = w₀ + w₁x₁ + w₂x₂ + w₃x₃ + ... + wₙxₙ
Here ŷ is the predicted value, w₀ is the intercept, and each wᵢ is the weight for feature xᵢ. Each weight represents the change in the prediction for a one-unit increase in its feature, assuming all other features remain constant.
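The formula above is just an intercept plus a dot product. A minimal numeric sketch, using hypothetical weights and feature values (not learned from any data), makes this concrete:

```python
import numpy as np

# Hypothetical weights for illustration only: intercept and one weight per feature
w0 = 85.0
w = np.array([0.12, 25.0, -4.0])   # size_sqft, bedrooms, age_years
x = np.array([2000, 3, 7])         # a single house's feature values

# The multiple regression prediction is the intercept plus a dot product
y_hat = w0 + np.dot(w, x)
print(y_hat)
```

With these made-up numbers the prediction is 85 + 240 + 75 − 28 = 372, i.e. $372k under the lesson's price units.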
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, r2_score
# Create sample dataset
data = {
'size_sqft': [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450],
'bedrooms': [3, 3, 2, 4, 2, 3, 4, 4],
'age_years': [10, 15, 20, 12, 25, 8, 5, 3],
'price': [245, 312, 279, 308, 199, 325, 405, 450]
}
df = pd.DataFrame(data)
This creates a dataset with multiple features (square footage, bedrooms, age) to predict house prices.
X = df[['size_sqft', 'bedrooms', 'age_years']]
y = df['price']
X_train, X_test, y_train, y_test = train_test_split(
X, y, test_size=0.25, random_state=42
)
The features matrix X contains all input variables, while y holds the target variable (price).
model = LinearRegression()
model.fit(X_train, y_train)
# View coefficients
feature_importance = pd.DataFrame({
'Feature': X.columns,
'Coefficient': model.coef_
})
print(feature_importance)
print(f"\nIntercept: {model.intercept_:.2f}")
Output:
Feature Coefficient
0 size_sqft 0.12
1 bedrooms 25.40
2 age_years -3.85
Intercept: 85.23
This reveals that, in this model, each additional square foot adds about $0.12k to the predicted price, each additional bedroom adds about $25.40k, and each year of age reduces the prediction by about $3.85k, with a baseline intercept of $85.23k.
y_pred = model.predict(X_test)
# Predict price for a new house (a DataFrame keeps feature names consistent with training)
new_house = pd.DataFrame([[2000, 3, 7]], columns=X.columns)
predicted_price = model.predict(new_house)
print(f"Predicted price: ${predicted_price[0]:.2f}k")
The model combines all feature values with their respective weights to generate the final prediction.
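You can verify this "weights times features plus intercept" arithmetic yourself. A self-contained sketch (with a small made-up dataset of the same shape as the lesson's) shows that a manual dot product reproduces `model.predict` exactly:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Small illustrative dataset: size_sqft, bedrooms, age_years -> price ($k)
X = np.array([[1400, 3, 10], [1600, 3, 15], [2350, 4, 5], [1100, 2, 25]])
y = np.array([245, 312, 405, 199])

model = LinearRegression().fit(X, y)

# A prediction is just intercept + sum(coef_i * feature_i)
new_house = np.array([2000, 3, 7])
manual = model.intercept_ + np.dot(model.coef_, new_house)
sklearn_pred = model.predict(new_house.reshape(1, -1))[0]
print(manual, sklearn_pred)  # the two values agree
```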
Understanding which features matter most helps in feature selection and model interpretation.
# Calculate absolute importance
importance = pd.DataFrame({
'Feature': X.columns,
'Coefficient': model.coef_,
'Abs_Importance': np.abs(model.coef_)
})
importance = importance.sort_values('Abs_Importance', ascending=False)
print(importance)
Note: When features have different scales, coefficients alone don't indicate true importance. Feature scaling (covered in preprocessing) allows fair comparison.
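As a preview of that idea, here is a minimal sketch using scikit-learn's `StandardScaler` on the lesson's house data. After standardizing, every feature has mean 0 and standard deviation 1, so coefficient magnitudes become directly comparable:

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

# size_sqft is on a much larger scale than the other features
X = pd.DataFrame({
    'size_sqft': [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450],
    'bedrooms':  [3, 3, 2, 4, 2, 3, 4, 4],
    'age_years': [10, 15, 20, 12, 25, 8, 5, 3],
})
y = [245, 312, 279, 308, 199, 325, 405, 450]

# Standardize, then fit on the scaled features
X_scaled = StandardScaler().fit_transform(X)
model = LinearRegression().fit(X_scaled, y)

# Coefficient magnitudes are now comparable across features
for name, coef in zip(X.columns, model.coef_):
    print(f"{name}: {coef:.2f}")
```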
Polynomial regression models non-linear relationships by adding polynomial terms (squared, cubed, etc.) of the original features. Despite the curved fit, it remains a linear model because it's linear in its coefficients.
For a single feature with degree 2:
ŷ = w₀ + w₁x + w₂x²
For degree 3:
ŷ = w₀ + w₁x + w₂x² + w₃x³
Higher degrees allow the model to capture more complex curves.
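To see why this is still "linear", note that once you compute the powers of x as separate features, the prediction is an ordinary linear combination. A tiny sketch with arbitrary degree-3 weights (made up for demonstration):

```python
import numpy as np

# Arbitrary illustrative weights w0..w3 (not fitted to any data)
w = np.array([2.0, 3.0, -0.5, 0.1])
x = 2.0

# Build the polynomial feature vector [1, x, x^2, x^3];
# the prediction is then a plain dot product, linear in the weights
features = np.array([x**d for d in range(4)])
y_hat = np.dot(w, features)
print(y_hat)
```

Here y_hat = 2 + 3(2) − 0.5(4) + 0.1(8) = 6.8. Fitting the model means learning w; the nonlinearity lives entirely in the features.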
Polynomial regression is appropriate when a scatter plot shows clear curvature, when a linear fit leaves a systematic pattern in its residuals, or when domain knowledge suggests a curved relationship.
Real-world examples include growth curves, projectile trajectories, and diminishing-returns relationships such as crop yield versus fertilizer amount.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
# Generate data with quadratic relationship
np.random.seed(42)
X = np.linspace(0, 10, 50).reshape(-1, 1)
y = 2 + 3*X.flatten() - 0.5*X.flatten()**2 + np.random.randn(50)*2
This generates data following a quadratic pattern with some random noise, simulating real-world non-linear data.
plt.scatter(X, y, color='blue', alpha=0.6)
plt.xlabel('Feature X')
plt.ylabel('Target y')
plt.title('Non-Linear Data Pattern')
plt.show()
The scatter plot reveals a curved relationship that a straight line cannot capture effectively.
from sklearn.preprocessing import PolynomialFeatures
# Create polynomial features of degree 2
poly_features = PolynomialFeatures(degree=2, include_bias=False)
X_poly = poly_features.fit_transform(X)
print(f"Original shape: {X.shape}")
print(f"Polynomial shape: {X_poly.shape}")
print(f"Feature names: {poly_features.get_feature_names_out()}")
Output:
Original shape: (50, 1)
Polynomial shape: (50, 2)
Feature names: ['x0' 'x0^2']
PolynomialFeatures transforms the original feature x into [x, x²], enabling the linear regression model to fit a curve.
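You can confirm this transformation directly: with `include_bias=False`, the first output column is the original x and the second is x². A minimal check:

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

X = np.array([[1.0], [2.0], [3.0]])
X_poly = PolynomialFeatures(degree=2, include_bias=False).fit_transform(X)

# Column 0 is the original x, column 1 is x squared
print(X_poly)
```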
model = LinearRegression()
model.fit(X_poly, y)
# Make predictions
y_pred = model.predict(X_poly)
print(f"Coefficients: {model.coef_}")
print(f"Intercept: {model.intercept_:.2f}")
The model learns coefficients for both x and x², effectively fitting a parabola to the data.
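Since the data was generated as y = 2 + 3x − 0.5x² plus noise, the learned coefficients should land near the true generating values. A self-contained sketch that regenerates the data and checks this:

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

# Regenerate the quadratic data from the lesson
np.random.seed(42)
X = np.linspace(0, 10, 50).reshape(-1, 1)
y = 2 + 3*X.flatten() - 0.5*X.flatten()**2 + np.random.randn(50)*2

X_poly = PolynomialFeatures(degree=2, include_bias=False).fit_transform(X)
model = LinearRegression().fit(X_poly, y)

# Coefficients should be close to the true values 3 and -0.5
print(model.coef_, model.intercept_)
```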
# Sort for smooth line plot
sort_idx = X.flatten().argsort()
plt.scatter(X, y, color='blue', alpha=0.6, label='Data')
plt.plot(X[sort_idx], y_pred[sort_idx], color='red',
linewidth=2, label='Polynomial Fit')
plt.xlabel('Feature X')
plt.ylabel('Target y')
plt.title('Polynomial Regression (Degree 2)')
plt.legend()
plt.show()
The curved red line shows how polynomial regression captures the non-linear pattern in the data.
Scikit-learn's Pipeline combines preprocessing and modeling into a single object, making code cleaner and preventing data leakage.
from sklearn.pipeline import Pipeline
# Create polynomial regression pipeline
poly_pipeline = Pipeline([
('poly_features', PolynomialFeatures(degree=2)),
('linear_regression', LinearRegression())
])
# Fit and predict in one step
poly_pipeline.fit(X, y)
y_pred = poly_pipeline.predict(X)
# Evaluate
r2 = poly_pipeline.score(X, y)
print(f"R² Score: {r2:.4f}")
The pipeline automatically transforms features before fitting, simplifying the workflow considerably.
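A pipeline's fitted steps remain accessible by name through `named_steps`, which is handy when you want to inspect the learned coefficients after fitting. A small sketch (with freshly generated quadratic data for self-containment):

```python
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

np.random.seed(0)
X = np.linspace(0, 10, 30).reshape(-1, 1)
y = 1 + 2*X.flatten() - 0.3*X.flatten()**2 + np.random.randn(30)

pipe = Pipeline([
    ('poly', PolynomialFeatures(degree=2)),
    ('reg', LinearRegression()),
])
pipe.fit(X, y)

# Each step is reachable by its name, so the fitted regressor can be inspected
reg = pipe.named_steps['reg']
print(reg.coef_, reg.intercept_)
```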
Selecting the polynomial degree involves balancing underfitting and overfitting:
from sklearn.metrics import mean_squared_error
degrees = [1, 2, 3, 5, 10]
plt.figure(figsize=(12, 4))
for i, degree in enumerate(degrees, 1):
    plt.subplot(1, 5, i)
    poly = PolynomialFeatures(degree=degree)
    X_poly = poly.fit_transform(X)
    model = LinearRegression()
    model.fit(X_poly, y)
    y_pred = model.predict(X_poly)
    mse = mean_squared_error(y, y_pred)
    sort_idx = X.flatten().argsort()
    plt.scatter(X, y, alpha=0.5, s=20)
    plt.plot(X[sort_idx], y_pred[sort_idx], 'r-', linewidth=2)
    plt.title(f'Degree {degree}\nMSE: {mse:.2f}')
plt.tight_layout()
plt.show()
This comparison shows how different degrees affect the fit. Degree 2 typically fits quadratic data well, while degree 10 often creates an overly complex curve.
from sklearn.model_selection import cross_val_score
best_degree = 1
best_score = -np.inf
for degree in range(1, 8):
    pipeline = Pipeline([
        ('poly', PolynomialFeatures(degree=degree)),
        ('model', LinearRegression())
    ])
    scores = cross_val_score(pipeline, X, y, cv=5, scoring='r2')
    mean_score = scores.mean()
    print(f"Degree {degree}: R² = {mean_score:.4f}")
    if mean_score > best_score:
        best_score = mean_score
        best_degree = degree
print(f"\nBest degree: {best_degree}")
Cross-validation tests each degree on held-out data, helping identify the degree that generalizes best.
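Once cross-validation has picked a degree, the usual final step is to refit a pipeline with that degree on all the data. A sketch assuming the selected degree is 2 (which matches the quadratic data generated earlier in the lesson):

```python
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

# Regenerate the quadratic data from the lesson
np.random.seed(42)
X = np.linspace(0, 10, 50).reshape(-1, 1)
y = 2 + 3*X.flatten() - 0.5*X.flatten()**2 + np.random.randn(50)*2

best_degree = 2  # assumed result of the cross-validation loop above
final_model = Pipeline([
    ('poly', PolynomialFeatures(degree=best_degree)),
    ('model', LinearRegression()),
]).fit(X, y)

# R² on the training data should be high for the correct degree
print(f"R²: {final_model.score(X, y):.4f}")
```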
When you have multiple features and non-linear relationships, PolynomialFeatures creates both polynomial terms and interaction terms for all features.
# Example with 2 features
X_multi = np.array([[1, 2], [3, 4], [5, 6]])
poly = PolynomialFeatures(degree=2, include_bias=False)
X_transformed = poly.fit_transform(X_multi)
print("Feature names:")
print(poly.get_feature_names_out())
Output:
Feature names:
['x0' 'x1' 'x0^2' 'x0 x1' 'x1^2']
The transformation creates:
- the original features x0 and x1
- the squared terms x0² and x1²
- the interaction term x0 × x1
Warning: Feature count grows rapidly with more features and higher degrees. This can lead to overfitting and computational expense.
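You can quantify that growth directly via the fitted transformer's `n_output_features_` attribute. A small sketch:

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# Count output features for growing input sizes and degrees
for n_features in [2, 5, 10]:
    for degree in [2, 3]:
        poly = PolynomialFeatures(degree=degree, include_bias=False)
        poly.fit(np.zeros((1, n_features)))
        print(f"{n_features} features, degree {degree}: "
              f"{poly.n_output_features_} polynomial features")
```

With 10 input features at degree 3, the transform already produces 285 columns, which illustrates why high degrees on wide datasets quickly become impractical.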
Multiple and polynomial regression extend basic linear regression to handle real-world complexity.
Key takeaways:
- Multiple regression extends linear regression to several input features; each coefficient measures a feature's effect with the others held constant.
- Polynomial regression fits curves by adding powers of the features while remaining linear in its coefficients.
- Use PolynomialFeatures to transform features before fitting, ideally inside a Pipeline.
- Choose the polynomial degree with cross-validation to balance underfitting and overfitting.