Regression and Linear Combinations

exmadscientist · on May 5, 2021

Worth noting: the pseudoinverse of a matrix is exactly the same as the direct solution to a linear least-squares regression. So if you have a linear system you want to "best effort" solve, for example in the case where you have the "wrong" number of data points and this makes things overdetermined or underdetermined, you can just swap out matrix inverses for matrix pseudoinverses and you'll get the least-squares solution!

Bet they didn't teach that in your linear algebra class. (Okay, maybe they did. But they didn't in mine... all three of them.)

(Obligatory note: Performance will probably be poor, even worse than taking the regular matrix inverse, so definitely don't do this in performance-sensitive applications. For small matrices, of course, computers are fast so it's fine.)

wenc · on May 5, 2021

So a typical linear algebra courses wouldn't cover solution methods (outside of a few basic ones like Gaussian decomposition or LU factorization) because those methods are usually covered in courses with names like "numerical computation" or "scientific computation". These courses go into depth on solutions for Ax = b, including the Moore-Penrose pseudoinverse, among others. Turns out computing solutions for Ax = b is a deep subject.

Just a note for others listening in that the Moore-Penrose pseudoinverse is rarely computed in practice because computing inverses is rarely done in practice, as you've astutely noted [1] -- one almost never tries to calculate x = inv(A)*B except for very small systems. Also at first glance one can reuse inv(A) for solving other b's, but turns out this isn't the best approach either (see linked article above).

Instead, the general strategy is to cast it into Ax=b and figure out what properties of A to exploit. This almost always results in a more efficient computation strategy than computing the inverse. For instance, if A is positive definite, one might try a Cholesky. If A is sparse, one can use any number of sparse methods. If A has special structure (symmetric, bandedness, circulant, triangular, etc.), one can use specific algorithms to exploit the structure. If A is ill-conditioned (underdetermined), one might try a power-method like NIPALS if one is only interested in preserving the higher singular values. Or if one doesn't know anything about A, even a simple LU decomposition would outperform an inverse computation.

In Matlab, x = A\b [2] is often recommended over x = pinv(A)*b or x = inv(A)*b. The former uses a decision tree to figure out which algorithm to use to solve the system.

Just adding a bit of context.

[1] Don't invert that matrix https://www.johndcook.com/blog/2010/01/19/dont-invert-that-m...

[2] https://www.mathworks.com/help/matlab/ref/mldivide.html