Deep Latent Gaussian Models

Deep latent Gaussian models (DLGMs) are a general class of deep directed graphical models with Gaussian latent variables at each layer of a processing hierarchy. The model consists of $L$ layers of latent variables. To generate a sample from the model, we begin at the top-most layer ($L$) by drawing from a Gaussian distribution.

The activation \(\mathbf{h}_{l}\) at any lower layer is formed by a non-linear transformation of the layer above \(\mathbf{h}_{l+1}\), perturbed by Gaussian noise. We descend through the hierarchy and generate observations $\mathbf{v}$ by sampling from the observation likelihood using the activation of the lowest layer \(\mathbf{h}_1\). This process is shown graphically in figure 1(a).

This generative process is described as follows:

\[\begin{aligned} & \mathbf{\xi}_l \sim \mathcal{N}(\mathbf{\xi}_l \mid \mathbf{0}, \mathbf{I}), \enspace l = 1, \dots, L \\ & \mathbf{h}_L = \mathbf{G}_L\mathbf{\xi}_L, \\ & \mathbf{h}_l = \mathit{T}_l(\mathbf{h}_{l+1}) + \mathbf{G}_l\mathbf{\xi}_l, \enspace l = 1, \dots, L - 1 \\ & \mathbf{v} \sim \pi(\mathbf{v} \mid \mathit{T}_0(\mathbf{h}_1)), \end{aligned}\]

where $\mathbf{\xi}_l$ are mutually independent Gaussian variables. The transformations $\mathit{T}_l$ represent multi-layer perceptrons (MLPs) and $\mathbf{G}_l$ are matrices. At the visible layer, the data is generated from any appropriate distribution $\pi(\mathbf{v}\mid\cdot)$ whose parameters are specified by a transformation of the first latent layer.
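To make this concrete, here is a minimal sketch of ancestral sampling from a DLGM in NumPy. The layer widths, the one-hidden-layer tanh form of the MLPs $\mathit{T}_l$, and the Bernoulli observation likelihood $\pi$ are illustrative assumptions, not choices made above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions): 2 latent layers of width 4, 8 visible units.
L, dim_h, dim_v = 2, 4, 8

def make_mlp(d_in, d_out, width=16):
    """A random-weight one-hidden-layer MLP: T(h) = W2 @ tanh(W1 @ h + b1) + b2."""
    W1 = rng.normal(0, 0.5, (width, d_in)); b1 = np.zeros(width)
    W2 = rng.normal(0, 0.5, (d_out, width)); b2 = np.zeros(d_out)
    return lambda h: W2 @ np.tanh(W1 @ h + b1) + b2

T = {l: make_mlp(dim_h, dim_h) for l in range(1, L)}        # T_1, ..., T_{L-1}
T0 = make_mlp(dim_h, dim_v)                                 # maps h_1 to likelihood params
G = {l: rng.normal(0, 0.3, (dim_h, dim_h)) for l in range(1, L + 1)}

def sample():
    # Top layer: h_L = G_L xi_L, with xi_L ~ N(0, I).
    h = G[L] @ rng.standard_normal(dim_h)
    # Descend the hierarchy: h_l = T_l(h_{l+1}) + G_l xi_l.
    for l in range(L - 1, 0, -1):
        h = T[l](h) + G[l] @ rng.standard_normal(dim_h)
    # Observation: v ~ Bernoulli(sigmoid(T_0(h_1))), an assumed choice of pi.
    p = 1.0 / (1.0 + np.exp(-T0(h)))
    return (rng.random(dim_v) < p).astype(int)

print(sample())
```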

Stochastic Backpropagation

Gradient descent methods in latent variable models typically require computations of the form \(\nabla_{\mathbf{\theta}} \mathbb{E}_{q_{\mathbf{\theta}}}\left[f(\mathbf{\xi})\right]\), where the expectation is taken with respect to a distribution \(q_{\mathbf{\theta}}(\cdot)\) with parameters \(\mathbf{\theta}\), and $f$ is a loss function that we assume to be integrable and smooth.
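As a concrete illustration, here is a minimal sketch of one way to estimate such a gradient: the reparameterization (pathwise) approach with $q_{\mathbf{\theta}} = \mathcal{N}(\mathbf{\mu}, \operatorname{diag}(\mathbf{\sigma}^2))$. Writing $\mathbf{\xi} = \mathbf{\mu} + \mathbf{\sigma} \odot \mathbf{\epsilon}$ with $\mathbf{\epsilon} \sim \mathcal{N}(\mathbf{0}, \mathbf{I})$ moves the parameters inside $f$, so the gradient of the expectation becomes an expectation of gradients. The quadratic loss $f(\mathbf{\xi}) = \lVert\mathbf{\xi}\rVert^2$ is an assumption chosen here because its exact gradients, $2\mathbf{\mu}$ and $2\mathbf{\sigma}$, are known in closed form for comparison.

```python
import numpy as np

rng = np.random.default_rng(0)

# q_theta = N(mu, diag(sigma^2)); illustrative parameter values.
mu = np.array([1.0, -2.0])
sigma = np.array([0.5, 1.5])

S = 100_000
eps = rng.standard_normal((S, 2))      # eps ~ N(0, I)
xi = mu + sigma * eps                  # xi = mu + sigma * eps  ~  q_theta

# Pathwise gradients of f(xi) = ||xi||^2 through the reparameterization:
#   df/dmu    = 2 * xi
#   df/dsigma = 2 * xi * eps
grad_mu = (2 * xi).mean(axis=0)
grad_sigma = (2 * xi * eps).mean(axis=0)

print("MC grad wrt mu:   ", grad_mu,    " exact:", 2 * mu)
print("MC grad wrt sigma:", grad_sigma, " exact:", 2 * sigma)
```

With $10^5$ samples the Monte Carlo estimates should agree with the exact gradients to roughly two decimal places, since the per-sample variance of this estimator is modest for a quadratic $f$.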
