Gaussian processes model a probability distribution over functions.
Let $f$ be some function mapping vectors to vectors. Then we can write:

$$f(x) \sim \mathcal{GP}(m(x), K(x, x'))$$

where $m(x)$ represents the mean vector:

$$m(x) = \mathbb{E}[f(x)]$$

and $K(x, x')$ is the kernel function.
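One way to make this definition concrete is the standard finite-dimensional view: for any finite set of inputs $x_1, \dots, x_n$, the corresponding function values are jointly Gaussian. The display below is a sketch that assumes a scalar-valued $f$ for readability, with $m$ denoting the mean function and $K$ the kernel function.

$$
\begin{bmatrix} f(x_1) \\ \vdots \\ f(x_n) \end{bmatrix}
\sim \mathcal{N}\left(
\begin{bmatrix} m(x_1) \\ \vdots \\ m(x_n) \end{bmatrix},
\begin{bmatrix}
K(x_1, x_1) & \cdots & K(x_1, x_n) \\
\vdots & \ddots & \vdots \\
K(x_n, x_1) & \cdots & K(x_n, x_n)
\end{bmatrix}
\right)
$$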
The kernel is the covariance function of the Gaussian process: it gives the covariance between the function values at any two inputs.
The kernel can be thought of as a prior for the shape of the function, encoding our expectations for the amount of smoothness or non-linearity.
Not all conceivable kernels are valid. For any set of inputs, the kernel must produce a covariance matrix that is symmetric and positive semi-definite.
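A quick numerical check can illustrate the validity condition. The sketch below assumes one-dimensional inputs; the function names (`gaussian_kernel`, `negative_distance`, `is_valid_covariance`) and the tolerance are hypothetical choices for illustration. It tests symmetry and the sign of the smallest eigenvalue of the covariance matrix on one particular grid, which can expose an invalid kernel but cannot prove that a kernel is valid for all inputs.

```python
import numpy as np

def gaussian_kernel(a, b, sigma=1.0):
    """Gaussian kernel between two 1-D input arrays (sigma is an assumed length scale)."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return np.exp(-sq_dist / (2.0 * sigma ** 2))

def negative_distance(a, b):
    """Deliberately invalid 'kernel': the negated distance -|a - b|."""
    return -np.abs(a[:, None] - b[None, :])

def is_valid_covariance(K, tol=1e-8):
    """Return True if K is symmetric and its smallest eigenvalue is >= -tol."""
    symmetric = np.allclose(K, K.T)
    eigenvalues = np.linalg.eigvalsh(K)  # eigenvalues of a symmetric matrix
    return bool(symmetric and eigenvalues.min() >= -tol)

x = np.linspace(-3.0, 3.0, 20)
print(is_valid_covariance(gaussian_kernel(x, x)))
print(is_valid_covariance(negative_distance(x, x)))
```

The small tolerance is there because floating-point round-off can push eigenvalues that are mathematically zero or tiny slightly below zero.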
Some functions sampled from a Gaussian process with a linear kernel:
Functions sampled from a Gaussian process with a polynomial kernel:
Some functions sampled from a GP with a Gaussian kernel:
Functions sampled from a GP with a Laplacian kernel:
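For reference, the four kernels named above can be written in a few lines of NumPy. This is a sketch using common textbook forms for one-dimensional inputs; the parameter names (`alpha`, `c`, `degree`, `sigma`) and their default values are assumptions for illustration, not the settings used to produce the samples described here.

```python
import numpy as np

def linear_kernel(a, b):
    """k(x, x') = x * x'"""
    return a[:, None] * b[None, :]

def polynomial_kernel(a, b, alpha=1.0, c=1.0, degree=3):
    """k(x, x') = (alpha * x * x' + c) ** degree"""
    return (alpha * a[:, None] * b[None, :] + c) ** degree

def gaussian_kernel(a, b, sigma=1.0):
    """k(x, x') = exp(-(x - x')^2 / (2 * sigma^2))"""
    return np.exp(-((a[:, None] - b[None, :]) ** 2) / (2.0 * sigma ** 2))

def laplacian_kernel(a, b, sigma=1.0):
    """k(x, x') = exp(-|x - x'| / sigma)"""
    return np.exp(-np.abs(a[:, None] - b[None, :]) / sigma)

x = np.array([0.0, 1.0, 2.0])
print(gaussian_kernel(x, x))
```

Each function takes two 1-D arrays and returns the matrix of pairwise kernel values, so calling it with the same array twice yields the covariance matrix for those inputs.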
Pseudocode to sample from a Gaussian process:
- Decide on a vector of inputs $x$ for which we want to compute $f(x)$, where $f$ has been sampled from the Gaussian process.
- Compute the covariance matrix $K(x, x)$.
- Perform Cholesky decomposition on $K(x, x)$, yielding a lower triangular matrix $L$.
- Sample a vector $s$ of numbers from a standard Gaussian distribution, one per input.
- Take the dot product of $L$ and $s$ to get the samples for $f(x)$. This assumes a zero mean; otherwise, add the mean vector.
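The steps above can be sketched in NumPy as follows. This is a minimal illustration, not a definitive implementation: it assumes a zero mean, one-dimensional inputs, and a Gaussian kernel. The helper name `sample_gp`, the variable names `K`, `L`, and `s`, and the `jitter` term are assumptions introduced here. The jitter (a small multiple of the identity added to the covariance matrix) is a common numerical safeguard, because the Cholesky decomposition can fail when the matrix is only positive semi-definite up to round-off.

```python
import numpy as np

def gaussian_kernel(a, b, sigma=1.0):
    """Gaussian kernel between two 1-D input arrays (sigma is an assumed length scale)."""
    return np.exp(-((a[:, None] - b[None, :]) ** 2) / (2.0 * sigma ** 2))

def sample_gp(x, kernel, n_samples=3, jitter=1e-6, seed=0):
    """Draw n_samples functions from a zero-mean GP, evaluated at inputs x."""
    rng = np.random.default_rng(seed)
    # Covariance matrix between all pairs of inputs.
    K = kernel(x, x)
    # Lower triangular factor with L @ L.T equal to K plus the jitter term.
    L = np.linalg.cholesky(K + jitter * np.eye(len(x)))
    # Standard Gaussian draws: one column per sampled function.
    s = rng.standard_normal((len(x), n_samples))
    # Correlate the independent draws; each column is one sample of f(x).
    return L @ s

x = np.linspace(-5.0, 5.0, 50)
samples = sample_gp(x, gaussian_kernel)
print(samples.shape)  # (50, 3)
```

Multiplying by `L` works because if `s` has identity covariance, then `L @ s` has covariance `L @ L.T`, which is the desired covariance matrix.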