Linear transformations and eigenvalues are key concepts in machine learning for understanding how data is manipulated and represented. This section explains how linear transformations map vectors to new spaces and how eigenvalues and eigenvectors reveal important properties of matrices, which are essential in techniques such as principal component analysis (PCA) and spectral analysis.
A linear transformation is a function that maps vectors to new vectors while preserving vector addition and scalar multiplication. Matrices represent linear transformations.
import numpy as np
# Transformation matrix (rotation by 90 degrees)
rotation_90 = np.array([[0, -1],
                        [1,  0]])
# Original vector
v = np.array([1, 0])
# Apply transformation
v_transformed = rotation_90 @ v
print(f"Original: {v}")
print(f"Transformed: {v_transformed}") # [0, 1] - rotated 90°
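The two defining properties of linearity can be checked numerically. A quick sketch using the rotation matrix above (the vectors `u`, `v` and scalar `c` are arbitrary test values, not from the original example):

```python
import numpy as np

# Check the two linearity properties of the 90-degree rotation:
# T(u + v) = T(u) + T(v)  and  T(c * u) = c * T(u)
rotation_90 = np.array([[0, -1],
                        [1,  0]])
u = np.array([2.0, 1.0])
v = np.array([-1.0, 3.0])
c = 2.5

additivity = np.allclose(rotation_90 @ (u + v),
                         rotation_90 @ u + rotation_90 @ v)
homogeneity = np.allclose(rotation_90 @ (c * u),
                          c * (rotation_90 @ u))
print(f"Preserves addition: {additivity}")   # True
print(f"Preserves scaling: {homogeneity}")   # True
```

Any matrix satisfies both properties, which is exactly why matrices can represent linear transformations.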
Scaling:
import numpy as np
# Scale x by 2, y by 0.5
scaling_matrix = np.array([[2, 0],
                           [0, 0.5]])
points = np.array([[1, 1],
                   [2, 2],
                   [3, 1]])
scaled_points = points @ scaling_matrix.T
print("Scaled points:")
print(scaled_points)
Eigenvectors are special vectors that a matrix only scales: the transformation changes their length (and flips their direction when the eigenvalue is negative) but keeps them on the same line. The scaling factor is the eigenvalue.
Mathematical Definition: For a square matrix A, a nonzero vector v is an eigenvector with eigenvalue λ if Av = λv:
import numpy as np
# Matrix
A = np.array([[4, 1],
              [2, 3]])
# Calculate eigenvalues and eigenvectors
eigenvalues, eigenvectors = np.linalg.eig(A)
print("Eigenvalues:", eigenvalues)
print("\nEigenvectors (as columns):")
print(eigenvectors)
# Verify: A @ v = λ * v
v = eigenvectors[:, 0] # First eigenvector
lambda_val = eigenvalues[0] # First eigenvalue
Av = A @ v
lambda_v = lambda_val * v
print(f"\nA @ v: {Av}")
print(f"λ * v: {lambda_v}")
print(f"Equal: {np.allclose(Av, lambda_v)}")
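When a matrix has a full set of independent eigenvectors, the eigendecomposition also lets you rebuild it as V diag(λ) V⁻¹ and compute matrix powers cheaply. A short sketch continuing with the same matrix A:

```python
import numpy as np

A = np.array([[4, 1],
              [2, 3]])
eigenvalues, eigenvectors = np.linalg.eig(A)

# Reconstruct A from its eigenvalues and eigenvectors: A = V diag(λ) V^{-1}
V_inv = np.linalg.inv(eigenvectors)
A_rebuilt = eigenvectors @ np.diag(eigenvalues) @ V_inv
print(np.allclose(A, A_rebuilt))  # True

# Powers become diagonal operations: A^5 = V diag(λ^5) V^{-1}
A_pow5 = eigenvectors @ np.diag(eigenvalues**5) @ V_inv
print(np.allclose(A_pow5, np.linalg.matrix_power(A, 5)))  # True
```

This is the identity that makes eigendecomposition useful beyond verification: repeated applications of A reduce to exponentiating scalars.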
1. Principal Component Analysis (PCA): PCA uses eigenvectors of the covariance matrix to find directions of maximum variance.
import numpy as np
# Sample data
np.random.seed(42)
X = np.random.randn(100, 2)
X[:, 1] = X[:, 0] * 0.8 + np.random.randn(100) * 0.3 # Correlated
# Center the data
X_centered = X - X.mean(axis=0)
# Compute covariance matrix
cov_matrix = np.cov(X_centered.T)
print("Covariance matrix:")
print(cov_matrix)
# Eigendecomposition
eigenvalues, eigenvectors = np.linalg.eig(cov_matrix)
# Sort by eigenvalue (largest first)
sorted_idx = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[sorted_idx]
eigenvectors = eigenvectors[:, sorted_idx]
print(f"\nEigenvalues: {eigenvalues}")
print("First principal component (direction of max variance):")
print(eigenvectors[:, 0])
# Explained variance ratio
explained_variance = eigenvalues / eigenvalues.sum()
print(f"\nExplained variance ratio: {explained_variance}")
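The eigenvectors are not just directions to inspect: projecting the centered data onto them is the PCA "transform" step, and it decorrelates the features. A sketch that repeats the setup above and adds the projection:

```python
import numpy as np

# Same correlated data as above
np.random.seed(42)
X = np.random.randn(100, 2)
X[:, 1] = X[:, 0] * 0.8 + np.random.randn(100) * 0.3
X_centered = X - X.mean(axis=0)

eigenvalues, eigenvectors = np.linalg.eig(np.cov(X_centered.T))
order = np.argsort(eigenvalues)[::-1]
eigenvectors = eigenvectors[:, order]

# Coordinates of each sample in the eigenvector basis
X_projected = X_centered @ eigenvectors
print(X_projected.shape)  # (100, 2)

# The projected features are uncorrelated: the covariance is (near) diagonal
print(np.round(np.cov(X_projected.T), 6))
```

Keeping only the first column of `X_projected` would give the 1-D PCA reduction of this dataset.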
2. Understanding Data Structure: Eigenvalues reveal important properties of a matrix: whether it is invertible (no zero eigenvalues), whether a symmetric matrix is positive definite (all eigenvalues positive), and how the variance of a dataset is distributed across directions (larger eigenvalues of the covariance matrix mean more variance in that direction).
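A minimal sketch of reading such properties off the eigenvalues, using an arbitrary symmetric example matrix:

```python
import numpy as np

# Example symmetric matrix (chosen for illustration)
S = np.array([[2.0, 1.0],
              [1.0, 2.0]])
vals = np.linalg.eigvalsh(S)  # eigvalsh: eigenvalues of a symmetric matrix

# Definiteness: all eigenvalues > 0 means positive definite
print("Positive definite:", np.all(vals > 0))  # True

# Rank: number of nonzero eigenvalues
print("Rank:", int(np.sum(~np.isclose(vals, 0))))  # 2

# Conditioning: ratio of largest to smallest eigenvalue
print("Condition number:", vals.max() / vals.min())  # 3.0
```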
Matrix decomposition breaks matrices into simpler components, essential for many ML algorithms.
Singular Value Decomposition (SVD):
import numpy as np
# Data matrix
A = np.array([[ 1,  2,  3],
              [ 4,  5,  6],
              [ 7,  8,  9],
              [10, 11, 12]])
# SVD decomposition
U, S, Vt = np.linalg.svd(A)
print(f"U shape: {U.shape}") # (4, 4)
print(f"S shape: {S.shape}") # (3,) - singular values
print(f"Vt shape: {Vt.shape}") # (3, 3)
print(f"\nSingular values: {S}")
# Reconstruct from the thin factors (exact up to floating-point error)
reconstructed = U[:, :3] @ np.diag(S) @ Vt
print(f"\nReconstruction close to original: {np.allclose(A, reconstructed)}")
SVD is used in dimensionality reduction, recommendation systems, and image compression.
import numpy as np
# High-dimensional data (100 samples, 10 features)
np.random.seed(42)
X = np.random.randn(100, 10)
# Use SVD for dimensionality reduction
U, S, Vt = np.linalg.svd(X, full_matrices=False)
# Keep only top 3 components
n_components = 3
X_reduced = U[:, :n_components] @ np.diag(S[:n_components])
print(f"Original shape: {X.shape}")
print(f"Reduced shape: {X_reduced.shape}")
# Variance explained
variance_explained = (S[:n_components]**2).sum() / (S**2).sum()
print(f"Variance retained: {variance_explained:.2%}")
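Truncating the SVD also gives the best rank-k approximation of a matrix in the least-squares sense, which is the idea behind SVD-based compression. A sketch on a small random matrix (the data here is synthetic, chosen only to illustrate):

```python
import numpy as np

np.random.seed(0)
A = np.random.randn(8, 6)

U, S, Vt = np.linalg.svd(A, full_matrices=False)

# Keep only the k largest singular values/vectors
k = 2
A_k = U[:, :k] @ np.diag(S[:k]) @ Vt[:k, :]
print(f"Rank of approximation: {np.linalg.matrix_rank(A_k)}")  # 2

# The approximation error is exactly the energy in the discarded
# singular values: ||A - A_k||_F = sqrt(sum of S[k:]^2)
err = np.linalg.norm(A - A_k, 'fro')
print(f"Frobenius error: {err:.4f}")
print(f"sqrt(discarded energy): {np.sqrt((S[k:]**2).sum()):.4f}")
```

For image compression, `A` would be a matrix of pixel intensities and `A_k` the compressed image, stored using only k singular triplets.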
Vectors and matrices are the core structures used to represent and manipulate data in machine learning. This section introduces these building blocks, explaining how they encode features, datasets, and transformations, and why efficient matrix operations are central to modern machine learning algorithms.
Matrix operations are fundamental to machine learning, enabling efficient computation and data transformations. This section covers key operations—addition, multiplication, transposition, inversion, and element-wise operations—highlighting their role in representing datasets, performing calculations, and implementing algorithms like linear regression and neural networks.
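The operations listed above can be sketched in a few lines of NumPy (the matrices are arbitrary illustrative values):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[5.0, 6.0],
              [7.0, 8.0]])

print(A + B)              # addition (element-wise)
print(A @ B)              # matrix multiplication
print(A.T)                # transposition
print(np.linalg.inv(A))   # inversion (A must be non-singular)
print(A * B)              # element-wise (Hadamard) product

# Sanity check: a matrix times its inverse gives the identity
print(np.allclose(A @ np.linalg.inv(A), np.eye(2)))  # True
```

Note that `@` and `*` are different operations: `A @ B` contracts rows against columns, while `A * B` multiplies corresponding entries.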