# Notation

## Functions

| Symbol | Meaning |
|---|---|
| $\sum$ | Sum |
| $\prod$ | Product |
| $\sigma$ | Sigmoid function |
| $\mathbb{E}$ | Expectation |
| $\ln$ | Natural logarithm |
| $\lvert x \rvert$ | Absolute value of x |
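
As a rough illustration of the functions listed above, the following sketch evaluates each one with Python's standard library. The helper name `sigmoid` and the sample values are assumptions made for this example, and the sigmoid is written in its common logistic form without any numerical hardening for very large negative inputs.

```python
import math

def sigmoid(z: float) -> float:
    """Logistic sigmoid, 1 / (1 + e^(-z)). Hypothetical helper for illustration."""
    return 1.0 / (1.0 + math.exp(-z))

values = [1.0, 2.0, 3.0, 4.0]          # made-up sample values
probs = [0.1, 0.2, 0.3, 0.4]           # made-up probabilities, summing to 1

total = sum(values)                     # sum
product = math.prod(values)             # product
log_e = math.log(math.e)                # natural logarithm (base e)
magnitude = abs(-2.5)                   # absolute value
# Expectation of a discrete random variable: sum of value * probability.
expectation = sum(p * v for p, v in zip(probs, values))
```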

## Sets

| Symbol | Meaning |
|---|---|
| $\mathbb{R}$ | The set of real numbers |
| $\mathbb{R}^n$ | The set of vectors of real numbers of length $n$ |
| $\mathbb{R}^{n \times m}$ | The set of matrices of real numbers of size $n \times m$ |

## Calculus

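| Symbol | Meaning |
|---|---|
| $f'$ | Derivative of f, shorthand for $f'(x)$ |
| $f''$ | Second derivative of f |
| $\frac{dy}{dx}$ | Derivative of y with respect to x |
| $\frac{d^2 y}{dx^2}$ | Second derivative of y with respect to x |
| $\frac{\partial y}{\partial x}$ | Partial derivative of y with respect to x |
| $\frac{\partial^2 y}{\partial x^2}$ | Second partial derivative of y with respect to x |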
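
To make the derivative entries above concrete, here is a minimal numerical sketch using central finite differences. The function names, the step sizes `h`, and the test functions are assumptions chosen for illustration; this approximates the derivatives and is not a symbolic or exact method.

```python
def derivative(f, x, h=1e-5):
    """Approximate the first derivative of f at x by a central difference."""
    return (f(x + h) - f(x - h)) / (2 * h)

def second_derivative(f, x, h=1e-4):
    """Approximate the second derivative of f at x by a central difference."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)

def cube(x):
    return x ** 3                      # exact derivatives: 3x^2 and 6x

def g(x, y):
    return x ** 2 * y                  # exact partial w.r.t. x: 2xy

d1 = derivative(cube, 2.0)             # close to 12
d2 = second_derivative(cube, 2.0)      # close to 12
# Partial derivative with respect to x: hold y fixed and differentiate in x.
dg_dx = derivative(lambda x: g(x, 2.0), 3.0)   # close to 12
```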

## Information theory

| Symbol | Meaning |
|---|---|
| $D_{\mathrm{KL}}(P \parallel Q)$ | KL-divergence between two distributions, P and Q |
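
For discrete distributions, the KL-divergence is commonly computed as the sum over outcomes of `p * ln(p / q)`. The sketch below assumes both distributions are given as lists of probabilities over the same outcomes and that `q` is nonzero wherever `p` is nonzero; the function name and sample distributions are hypothetical.

```python
import math

def kl_divergence(p, q):
    """KL-divergence of Q from P for discrete distributions, in nats.

    Assumes len(p) == len(q) and q[i] > 0 wherever p[i] > 0.
    Terms with p[i] == 0 contribute nothing, by the usual convention.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.5]                 # made-up distribution P
q = [0.9, 0.1]                 # made-up distribution Q

kl_pq = kl_divergence(p, q)    # divergence of Q from P
kl_qp = kl_divergence(q, p)    # generally differs: KL is not symmetric
```

Note that swapping the arguments changes the result, which is why the order of P and Q in the notation matters.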

## Linear algebra

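| Symbol | Meaning |
|---|---|
| $\boldsymbol{x}$ | A vector |
| $\boldsymbol{X}$ | A matrix |
| $X^\top$ | Transpose of X |
| $X^\dagger$ | Conjugate transpose of X |
| $X^{-1}$ | Inverse of X |
| $\lVert x \rVert$ | Euclidean norm of x |
| $I$ | Identity matrix |
| $X \odot Y$ | Element-wise product of X and Y |
| $X \otimes Y$ | Kronecker product of X and Y |
| $x \cdot y$ | Dot product of x and y |
| $\mathrm{tr}(X)$ | Trace of X |
| $\det(X)$ | Determinant of X |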
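
Several of the operations above can be sketched in plain Python with nested lists standing in for vectors and matrices. The sample values are made up, the determinant formula shown only covers the 2×2 case, and in practice a numerical library would normally be used instead.

```python
import math

x = [3.0, 4.0]                       # made-up vectors
y = [1.0, 2.0]
X = [[1.0, 2.0], [3.0, 4.0]]         # made-up 2x2 matrices
Y = [[0.0, 1.0], [2.0, 3.0]]

dot = sum(a * b for a, b in zip(x, y))             # dot product of x and y
norm = math.sqrt(sum(a * a for a in x))            # Euclidean norm of x
X_T = [list(col) for col in zip(*X)]               # transpose of X
trace = sum(X[i][i] for i in range(len(X)))        # trace of X
det = X[0][0] * X[1][1] - X[0][1] * X[1][0]        # determinant (2x2 only)
# Element-wise product: multiply matching entries of X and Y.
elementwise = [[a * b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]
```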

## Probability

| Symbol | Meaning |
|---|---|
| $X$ | A random variable |
| $P(x)$ | Probability of a particular value of X. Shorthand for $P(X = x)$ |
| $\mathcal{U}$ | Uniform distribution |
| $\mathcal{N}$ | Normal distribution |

## Statistics

| Symbol | Meaning |
|---|---|
| $\mu$ | Mean |
| $\sigma$ | Standard deviation |
| $\mathrm{Var}(X)$ | Variance of X |
| $\mathrm{Cov}(X, Y)$ | Covariance of X and Y |
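
The quantities above can be computed on a small sample as follows. This sketch assumes the population (divide by N) definitions rather than the sample (divide by N - 1) ones, and the data values are made up for illustration.

```python
import statistics

xs = [2.0, 4.0, 6.0, 8.0]            # made-up observations of X
ys = [1.0, 3.0, 5.0, 7.0]            # made-up observations of Y

mean_x = statistics.fmean(xs)        # mean
mean_y = statistics.fmean(ys)
var_x = statistics.pvariance(xs)     # population variance of X
std_x = statistics.pstdev(xs)        # population standard deviation of X

# Population covariance: average product of deviations from the means.
cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys)) / len(xs)
```

The standard deviation is the square root of the variance, so `std_x ** 2` agrees with `var_x` up to floating-point rounding.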

## Machine learning

| Symbol | Meaning |
|---|---|
| $\theta$ | Parameters of the model |
| $X$ | Observations or data |
| $x$ | Feature vector |
| $L$ | Loss function |
| $y$ | Label |
| $\hat{y}$ | Prediction |
| $g_t$ | Gradient at time t |
| $\Delta\theta_t$ | Parameter update at time t |
| $\eta$ | Learning rate |
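
One way to see how the gradient, parameter update, and learning rate relate is plain gradient descent, where each update is the negative gradient scaled by the learning rate. The sketch below assumes that update rule (the table itself does not prescribe one) and uses a made-up one-parameter quadratic loss; all names and values are hypothetical.

```python
def loss(theta: float) -> float:
    """Made-up loss with its minimum at theta = 3."""
    return (theta - 3.0) ** 2

def gradient(theta: float) -> float:
    """Analytic gradient of the loss above."""
    return 2.0 * (theta - 3.0)

learning_rate = 0.1      # assumed value for illustration
theta = 0.0              # initial parameter

for t in range(100):
    g_t = gradient(theta)                 # gradient at time t
    update_t = -learning_rate * g_t       # parameter update at time t
    theta = theta + update_t              # apply the update

final_loss = loss(theta)                  # close to zero after the loop
```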

## Reinforcement learning

| Symbol | Meaning |
|---|---|
| $\pi$ | Policy |
| $a_t$ | Action at time t |
| $s_t$ | State at time t |
| $r_t$ | Reward at time t |
| $V$ | Value function |
| $\mathcal{A}$ | Action set |
| $\gamma$ | Discount factor |
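
The discount factor and per-step rewards are typically combined into a discounted return, and a value function is commonly defined as the expected value of that return. The sketch below computes the return for a single made-up reward sequence; the function name and the numbers are assumptions for illustration.

```python
def discounted_return(rewards, discount):
    """Sum of rewards where the reward at step t is weighted by discount**t."""
    total = 0.0
    # Work backwards so each step applies the discount exactly once.
    for reward in reversed(rewards):
        total = reward + discount * total
    return total

rewards = [1.0, 1.0, 1.0]                 # made-up rewards at t = 0, 1, 2
ret = discounted_return(rewards, 0.5)     # 1 + 0.5 + 0.25
```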