CLO-1 · Asymptotic Analysis
| n | Naive n³ | Strassen n^2.807 | Ratio (Naive/Strassen) | Ops Saved |
|---|
Worst-Case Demo
CLO-2 · Element Access Patterns
Impact of Removing Optimization
CLO-3 · Divide & Conquer Paradigm
1. Divide: Split each n×n matrix into four n/2 × n/2 submatrices.
2. Conquer: Compute 7 products P₁–P₇ (instead of 8) using clever linear combinations.
3. Combine: Add/subtract P₁–P₇ to form the four quadrants of result C.
Recursion Tree
Why 7 and Not 8? The Key Insight
Counterexample: Why Greedy Fails Here
📖 Algorithm Overview
What it does: Strassen's algorithm multiplies two n×n matrices faster than the naive O(n³) approach by reducing the number of recursive multiplications from 8 to 7.
Why it matters: In 1969, Volker Strassen proved that the obvious O(n³) bound is not tight. His algorithm runs in O(nlog₂7) ≈ O(n2.807), which is a meaningful improvement for large matrices used in graphics, ML, and scientific computing.
When to use: Theoretically better for n > ~32, but in practice cache effects mean standard BLAS libraries often outperform Strassen for moderate n. It shines in distributed computing for very large matrices.
🧮 Fully Worked Example (Step-by-Step)
💡 Key Concepts
P₁ = (A₁₁+A₂₂)(B₁₁+B₂₂) P₂ = (A₂₁+A₂₂)B₁₁
P₃ = A₁₁(B₁₂−B₂₂) P₄ = A₂₂(B₂₁−B₁₁)
P₅ = (A₁₁+A₁₂)B₂₂ P₆ = (A₂₁−A₁₁)(B₁₁+B₁₂)
P₇ = (A₁₂−A₂₂)(B₂₁+B₂₂)
C₁₁ = P₁+P₄−P₅+P₇ C₁₂ = P₃+P₅
C₂₁ = P₂+P₄ C₂₂ = P₁−P₂+P₃+P₆
• Confusing P₄ = A₂₂(B₂₁−B₁₁) with (A₂₁−A₁₁)B₂₂ (that's P₆)
• Forgetting the subtraction in C₁₁ = P₁+P₄−P₅+P₇
• Thinking the algorithm works for odd-size matrices without padding
📐 Space Complexity Analysis
Time: O(nlog₂7) ≈ O(n2.807) — from the Master Theorem: a=7, b=2, f(n)=Θ(n²), log₂7 ≈ 2.807 > 2, so Case 1 applies.
Space: O(n²) for the 7 intermediate product matrices P₁–P₇ at each recursion level, plus O(log n) stack frames → O(n² log n) total auxiliary space.
Comparison: Naive uses O(n²) space (just the output matrix). Strassen uses more memory — a trade-off for fewer multiplications.