In a recent post, Bill used the announcement of a new AI multiplication algorithm to discuss the applications of Strassen's famous algorithm. For this post I'd like to focus on the new algorithm itself, Alpha Tensor, the algorithm behind the algorithm, what it has actually accomplished and what it means for us theorists.
To multiply two 2x2 matrices in the usual way you need eight multiplication steps. In 1969 Strassen surprised the world by showing how to multiply those matrices using only seven multiplications.
You can recurse on larger matrices. For 4x4 matrices you can use 72=49 multiplications instead of the naïve 64. In general for nxn matrices you need roughly nlog27 ≈ n2.81 multiplications.
No one has found an algorithm for 4x4 matrices that uses less than 49 from recursing on Strassen. Alpha Tensor does so for the special case of working over GF, where addition and subtraction are interchangeable. Their algorithm does not work for general fields such as the real numbers.
Here's the full table of Alpha Tensor results from the Nature paper for multiplying a nxm matrix by a mxp matrix. The Modular column is for GF and the standard column is for general fields. Alpha tensor does improve on the best known for general fields for specific problems like multiplying a 3x4 matrix by a 4x5 matrix. Much of the press failed to make this distinction for 4x4 multiplication leading to some confusion.