At the beginning of this article series, we discussed how to derive polynomial regression using LSE (Least Squares Estimation) here. In this post, we will discuss linear regression from a Bayesian point of view. Note that linear and polynomial regression share the same derivation; the only difference is the design matrix. You may want to revisit our two previous articles here and here.
Let me state up front that, assuming Gaussian noise, the final result of deriving regression using LSE is equal to the result of deriving linear regression using MLE (Maximum Likelihood Estimation) in the Bayesian method. Furthermore, the result of deriving regression using LSE with (L2) regularization is equal to the result of deriving it using MAP (Maximum A Posteriori) estimation with a Gaussian prior on the weights. In this post, we will prove both claims. We will then derive the posterior update formula for online learning using a conjugate prior.
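Before the derivation, here is a quick numerical sanity check of the two claims above. It is only a minimal sketch under stated assumptions: the synthetic data, the quadratic design matrix, the noise precision `beta`, the prior precision `alpha`, and all variable names are made up for illustration and are not taken from the derivation in this series. It assumes i.i.d. Gaussian noise and a zero-mean isotropic Gaussian prior on the weights, in which case the regularization strength is `alpha / beta`.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Synthetic data (assumed for illustration): quadratic trend plus Gaussian noise.
N, sigma = 50, 0.1
x = rng.uniform(-1.0, 1.0, size=N)
y = 1.0 + 2.0 * x - 3.0 * x**2 + rng.normal(0.0, sigma, size=N)

# Design matrix with columns 1, x, x^2 (polynomial regression of degree 2).
Phi = np.vander(x, 3, increasing=True)

beta = 1.0 / sigma**2   # noise precision (assumed known)
alpha = 2.0             # prior precision on the weights (hypothetical value)
lam = alpha / beta      # equivalent L2 regularization strength

def neg_log_posterior(w, a):
    """Negative log posterior up to an additive constant.
    With a = 0 the prior term vanishes and this is the negative log likelihood."""
    r = y - Phi @ w
    return 0.5 * beta * (r @ r) + 0.5 * a * (w @ w)

def grad(w, a):
    """Analytic gradient of neg_log_posterior with respect to w."""
    return -beta * (Phi.T @ (y - Phi @ w)) + a * w

w0 = np.zeros(Phi.shape[1])

# Claim 1: least squares vs. numerically maximized likelihood.
w_lse = np.linalg.lstsq(Phi, y, rcond=None)[0]
w_mle = minimize(neg_log_posterior, w0, args=(0.0,), jac=grad, method="BFGS").x

# Claim 2: regularized least squares vs. numerically maximized posterior.
w_ridge = np.linalg.solve(Phi.T @ Phi + lam * np.eye(Phi.shape[1]), Phi.T @ y)
w_map = minimize(neg_log_posterior, w0, args=(alpha,), jac=grad, method="BFGS").x

print("LSE   :", w_lse)
print("MLE   :", w_mle)
print("Ridge :", w_ridge)
print("MAP   :", w_map)
```

The optimizer is used on purpose instead of a second closed form: the MLE and MAP weights are found by directly minimizing the negative log likelihood and negative log posterior, so any agreement with the least squares solutions is not built in by construction.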
(1) Is regression using LSE equal to Bayesian MLE?
See the picture below.