(1) Multivariate Gaussian
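We have already discussed the Gaussian distribution with one variable (univariate) here. In this post, we discuss the Gaussian distribution with several variables (multivariate), which is the general form of the Gaussian distribution. For a $D$-dimensional vector $\mathbf{x}$, the multivariate Gaussian distribution is defined as follows.
$$\mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}, \boldsymbol{\Sigma}) = \frac{1}{(2\pi)^{D/2}\,|\boldsymbol{\Sigma}|^{1/2}} \exp\left\{-\frac{1}{2}(\mathbf{x}-\boldsymbol{\mu})^T \boldsymbol{\Sigma}^{-1} (\mathbf{x}-\boldsymbol{\mu})\right\}$$
where $\boldsymbol{\mu}$ is a $D$-dimensional mean vector, $\boldsymbol{\Sigma}$ is the $D \times D$ covariance matrix, and $|\boldsymbol{\Sigma}|$ is the determinant of $\boldsymbol{\Sigma}$.
The quantity $\Delta = \sqrt{(\mathbf{x}-\boldsymbol{\mu})^T \boldsymbol{\Sigma}^{-1} (\mathbf{x}-\boldsymbol{\mu})}$, which appears squared in the exponent, is called the Mahalanobis distance from $\boldsymbol{\mu}$ to $\mathbf{x}$. It reduces to the Euclidean distance when $\boldsymbol{\Sigma}$ is the identity matrix.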
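As a rough numerical illustration of the density and the Mahalanobis distance described above, here is a minimal NumPy sketch. The function names `mahalanobis_sq` and `gaussian_pdf` and the sample values are assumptions made for this example, not something taken from the post.

```python
import numpy as np

def mahalanobis_sq(x, mu, sigma):
    """Squared Mahalanobis distance (x - mu)^T Sigma^{-1} (x - mu)."""
    diff = x - mu
    # Solve Sigma v = diff instead of forming the explicit inverse.
    return float(diff @ np.linalg.solve(sigma, diff))

def gaussian_pdf(x, mu, sigma):
    """Density of N(x | mu, Sigma) for a D-dimensional vector x."""
    d = mu.shape[0]
    norm_const = (2.0 * np.pi) ** (d / 2.0) * np.sqrt(np.linalg.det(sigma))
    return float(np.exp(-0.5 * mahalanobis_sq(x, mu, sigma)) / norm_const)

# Illustrative values (hypothetical, chosen only for this sketch).
x = np.array([1.0, 2.0])
mu = np.array([0.0, 0.0])

# With an identity covariance, the squared Mahalanobis distance equals
# the squared Euclidean distance.
d2_identity = mahalanobis_sq(x, mu, np.eye(2))
d2_euclid = float(np.sum((x - mu) ** 2))

# At the mean of a 2-D standard Gaussian the density is 1 / (2 * pi).
density_at_mean = gaussian_pdf(mu, mu, np.eye(2))
```

Using `np.linalg.solve` rather than `np.linalg.inv` is a common numerical choice here; either would work for a small, well-conditioned covariance matrix.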
(2) Gaussian properties
In this section, we discuss three useful properties of Gaussians: the affine property, the marginal distribution property, and the conditional distribution property.
2.1 Affine property of Gaussian distribution
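Gaussians have an affine property that is useful for forming a new Gaussian distribution whose input is a combination of inputs that are themselves Gaussian, for example the sum of two Gaussian-distributed inputs. The affine property of the Gaussian distribution is defined as follows.
$$\mathbf{x} \sim \mathcal{N}(\boldsymbol{\mu}, \boldsymbol{\Sigma}) \;\Longrightarrow\; A\mathbf{x} + \mathbf{b} \sim \mathcal{N}(A\boldsymbol{\mu} + \mathbf{b},\; A\boldsymbol{\Sigma}A^T)$$
Let’s demonstrate this by summing two Gaussian-distributed inputs. Suppose we are given data $x_1$ and $x_2$ whose distributions are Gaussian, $x_1 \sim \mathcal{N}(\mu_1, \sigma_1^2)$ and $x_2 \sim \mathcal{N}(\mu_2, \sigma_2^2)$. To derive the distribution of $y = x_1 + x_2$, we can use the affine property. Here we go.
First, we rewrite $y = x_1 + x_2$ in the form $y = A\mathbf{x} + b$.
$$y = \begin{bmatrix} 1 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} + 0$$
Thus, we get $A = \begin{bmatrix} 1 & 1 \end{bmatrix}$ and $b = 0$. By using the affine property, we can derive our new Gaussian distribution. Here we go.
$$\mu_y = A\boldsymbol{\mu} + b = \begin{bmatrix} 1 & 1 \end{bmatrix} \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix} = \mu_1 + \mu_2, \qquad \sigma_y^2 = A\boldsymbol{\Sigma}A^T = \begin{bmatrix} 1 & 1 \end{bmatrix} \begin{bmatrix} \sigma_1^2 & 0 \\ 0 & \sigma_2^2 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \sigma_1^2 + \sigma_2^2$$
Here we assume that $x_1$ and $x_2$ are independent, so $\boldsymbol{\Sigma}$ is a diagonal matrix (covariance = 0). Thus, our new Gaussian distribution is $\mathcal{N}(\mu_y, \sigma_y^2)$, where $\mu_y = \mu_1 + \mu_2$ and $\sigma_y^2 = \sigma_1^2 + \sigma_2^2$. It makes sense, right? The new variable is the sum of two variables, and its mean and variance are the sums of their means and variances. The same use of the affine property extends to other combinations of Gaussian inputs.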
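The sum of two independent Gaussian variables can also be checked numerically. The sketch below applies the affine rule (new mean $A\boldsymbol{\mu} + b$, new covariance $A\boldsymbol{\Sigma}A^T$) with NumPy. The helper name `affine_transform` and the numbers are assumptions picked for illustration.

```python
import numpy as np

def affine_transform(mu, sigma, A, b):
    """Mean and covariance of y = A x + b when x ~ N(mu, sigma)."""
    return A @ mu + b, A @ sigma @ A.T

# Hypothetical inputs: x1 ~ N(1, 4) and x2 ~ N(3, 9), assumed independent.
mu = np.array([1.0, 3.0])
sigma = np.diag([4.0, 9.0])   # zero off-diagonal terms encode independence

A = np.array([[1.0, 1.0]])    # y = x1 + x2
b = np.array([0.0])

mu_y, sigma_y = affine_transform(mu, sigma, A, b)
# mu_y is [4.0] (1 + 3) and sigma_y is [[13.0]] (4 + 9)
```

If the two inputs were correlated, the off-diagonal entries of `sigma` would be non-zero and the variance of the sum would pick up twice the covariance, which the same function handles without changes.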
2.2 Marginal Gaussian distribution
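Two other important properties of Gaussians are the following: if two sets of variables are jointly Gaussian, then the marginal distribution of one set is also Gaussian, and the conditional distribution of one set given the other is also Gaussian. The marginal distribution is given below.
$$p(\mathbf{x}_a) = \int p(\mathbf{x}_a, \mathbf{x}_b)\, d\mathbf{x}_b$$
The intuition is that if we have a joint probability $p(\mathbf{x}_a, \mathbf{x}_b)$, we can marginalize out (remove) one of its variables. For example, we marginalize out $\mathbf{x}_b$ so that we get $p(\mathbf{x}_a)$. Since $p(\mathbf{x}_a, \mathbf{x}_b)$ is Gaussian, the marginal distribution is also Gaussian, with mean (expected value) and covariance matrix as follows.
$$\mathbb{E}[\mathbf{x}_a] = \boldsymbol{\mu}_a, \qquad \mathrm{cov}[\mathbf{x}_a] = \boldsymbol{\Sigma}_{aa}, \qquad p(\mathbf{x}_a) = \mathcal{N}(\mathbf{x}_a \mid \boldsymbol{\mu}_a, \boldsymbol{\Sigma}_{aa})$$
where $\boldsymbol{\mu}_a$ and $\boldsymbol{\Sigma}_{aa}$ are the blocks of the joint mean $\boldsymbol{\mu} = \begin{bmatrix} \boldsymbol{\mu}_a \\ \boldsymbol{\mu}_b \end{bmatrix}$ and joint covariance $\boldsymbol{\Sigma} = \begin{bmatrix} \boldsymbol{\Sigma}_{aa} & \boldsymbol{\Sigma}_{ab} \\ \boldsymbol{\Sigma}_{ba} & \boldsymbol{\Sigma}_{bb} \end{bmatrix}$ that correspond to $\mathbf{x}_a$. The matrix $\boldsymbol{\Lambda} \equiv \boldsymbol{\Sigma}^{-1}$ is called the precision matrix; in terms of its blocks, $\boldsymbol{\Sigma}_{aa} = (\boldsymbol{\Lambda}_{aa} - \boldsymbol{\Lambda}_{ab}\boldsymbol{\Lambda}_{bb}^{-1}\boldsymbol{\Lambda}_{ba})^{-1}$.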
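To make marginalization concrete, the sketch below picks out the mean entries and covariance block for one subset of a joint Gaussian, then checks the same covariance against the blocks of the precision matrix using the standard Schur-complement identity. The joint parameters and the index lists `a` and `b` are made-up values for this example.

```python
import numpy as np

# Hypothetical joint Gaussian over three variables:
# x_a is the first two variables, x_b is the last one.
mu = np.array([0.0, 1.0, 2.0])
sigma = np.array([[2.0, 0.5, 0.3],
                  [0.5, 1.0, 0.2],
                  [0.3, 0.2, 1.5]])
a = [0, 1]
b = [2]

# Marginal over x_a: keep the matching entries of the mean and the
# matching block of the covariance. No integration is needed.
mu_a = mu[a]
sigma_aa = sigma[np.ix_(a, a)]

# The same covariance recovered from the blocks of the precision matrix.
lam = np.linalg.inv(sigma)
lam_aa = lam[np.ix_(a, a)]
lam_ab = lam[np.ix_(a, b)]
lam_ba = lam[np.ix_(b, a)]
lam_bb = lam[np.ix_(b, b)]
sigma_aa_from_precision = np.linalg.inv(
    lam_aa - lam_ab @ np.linalg.inv(lam_bb) @ lam_ba
)
```

The first route is the one to use in practice; the second only shows why the marginal covariance is more awkward to express when a model is parameterized by its precision matrix.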
2.3 Conditional Gaussian distribution
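Given two jointly Gaussian sets of variables $\mathbf{x}_a$ and $\mathbf{x}_b$, the conditional probability is as follows.
$$p(\mathbf{x}_a \mid \mathbf{x}_b) = \mathcal{N}(\mathbf{x}_a \mid \boldsymbol{\mu}_{a|b}, \boldsymbol{\Sigma}_{a|b})$$
where $\boldsymbol{\mu}_{a|b} = \boldsymbol{\mu}_a + \boldsymbol{\Sigma}_{ab}\boldsymbol{\Sigma}_{bb}^{-1}(\mathbf{x}_b - \boldsymbol{\mu}_b)$ and $\boldsymbol{\Sigma}_{a|b} = \boldsymbol{\Sigma}_{aa} - \boldsymbol{\Sigma}_{ab}\boldsymbol{\Sigma}_{bb}^{-1}\boldsymbol{\Sigma}_{ba}$. Here $\boldsymbol{\mu}_a$ and $\boldsymbol{\mu}_b$ are the means of the two sets, and $\boldsymbol{\Sigma}_{aa}$, $\boldsymbol{\Sigma}_{ab}$, $\boldsymbol{\Sigma}_{ba}$, $\boldsymbol{\Sigma}_{bb}$ are the corresponding blocks of the joint covariance matrix. Equivalently, $\boldsymbol{\Sigma}_{a|b} = \boldsymbol{\Lambda}_{aa}^{-1}$, where $\boldsymbol{\Lambda} = \boldsymbol{\Sigma}^{-1}$ is the precision matrix.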
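The conditional distribution can be computed with a few lines of NumPy. This is a minimal sketch using the standard covariance-block form of the conditional mean and covariance. The function name `condition_gaussian` and the two-variable example are assumptions for illustration.

```python
import numpy as np

def condition_gaussian(mu, sigma, a, b, x_b):
    """Mean and covariance of p(x_a | x_b) for a joint Gaussian N(mu, sigma).

    a and b are lists of indices selecting the two sets of variables.
    """
    sigma_aa = sigma[np.ix_(a, a)]
    sigma_ab = sigma[np.ix_(a, b)]
    sigma_bb = sigma[np.ix_(b, b)]
    # gain = Sigma_ab Sigma_bb^{-1}, via a linear solve (Sigma_bb is symmetric).
    gain = np.linalg.solve(sigma_bb, sigma_ab.T).T
    mu_cond = mu[a] + gain @ (x_b - mu[b])
    sigma_cond = sigma_aa - gain @ sigma_ab.T
    return mu_cond, sigma_cond

# Hypothetical example: two unit-variance variables with covariance 0.8.
mu = np.array([0.0, 0.0])
sigma = np.array([[1.0, 0.8],
                  [0.8, 1.0]])

# Observe the second variable at 1.0 and condition the first on it.
mu_cond, sigma_cond = condition_gaussian(mu, sigma, [0], [1], np.array([1.0]))
# mu_cond is [0.8] and sigma_cond is [[0.36]]
```

Note that the conditional covariance does not depend on the observed value, only the conditional mean does. Observing a correlated variable always shrinks (or leaves unchanged) the variance of the other.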
(3) Gaussian Mixture Model
In reality, a single Gaussian is sometimes not enough to model a probability distribution. See the picture below.
In the picture on the left, the green dots are an example of data that a single Gaussian distribution models poorly, because the data has two peaks. To model the distribution of the green dots, we can use a linear combination of Gaussian distributions (picture on the right), which fits much better than a single Gaussian. Such a combination of Gaussians is usually called a GMM (Gaussian Mixture Model). For a clearer understanding, see the picture below, which shows a Gaussian mixture distribution in one dimension formed by three Gaussians.
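We can form the new distribution shown by the red line above by combining the three Gaussian distributions shown by the blue lines. A Gaussian mixture distribution is formed using the equation below.
$$p(\mathbf{x}) = \sum_{k=1}^{K} \pi_k\, \mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k)$$
Each Gaussian density $\mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k)$ is called a component of the mixture and has its own mean $\boldsymbol{\mu}_k$ and covariance $\boldsymbol{\Sigma}_k$. The parameters $\pi_k$ are called mixing coefficients. They are non-negative and sum to 1, $\sum_{k=1}^{K} \pi_k = 1$, so that the mixture density integrates to 1.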
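A one-dimensional mixture of three Gaussians, like the one in the picture, can be sketched in plain Python. The weights, means, and variances below are invented for illustration and are not the values behind the figure. The names `normal_pdf` and `mixture_pdf` are likewise assumptions.

```python
import math

def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian N(x | mean, var)."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def mixture_pdf(x, weights, means, variances):
    """Weighted sum of Gaussian components; weights are the mixing coefficients."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("mixing coefficients must sum to 1")
    return sum(w * normal_pdf(x, m, v)
               for w, m, v in zip(weights, means, variances))

# Hypothetical three-component mixture.
weights = [0.5, 0.3, 0.2]
means = [-2.0, 0.0, 3.0]
variances = [0.5, 1.0, 2.0]

# Crude Riemann sum over [-20, 20] to check that the mixture integrates
# to (approximately) 1 because the mixing coefficients sum to 1.
step = 0.01
area = step * sum(
    mixture_pdf(-20.0 + i * step, weights, means, variances)
    for i in range(4001)
)
```

If the weights summed to something other than 1, the area under the curve would equal that sum instead, which is why the constraint on the mixing coefficients is needed for a valid density.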