18 Confronting the Partition Function

18.1 The Log-Likelihood Gradient

\[\nabla_\theta \log p(x;\theta) = \nabla_\theta \log \tilde{p}(x;\theta) - \nabla_\theta \log Z(\theta).\]

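The first term is the positive phase of learning and the second, \(\nabla_\theta \log Z(\theta)\), is the negative phase. For most undirected models of interest, the negative phase is difficult: computing it exactly requires a sum or integral over every configuration of \(x\).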
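To make the decomposition concrete, the following minimal sketch evaluates the negative phase exactly for a toy model small enough to enumerate. It assumes a log-linear unnormalized density \(\log \tilde{p}(x;\theta) = \theta^\top x\) over three binary variables, which is an illustrative choice and not a model taken from the text. It relies on the standard identity \(\nabla_\theta \log Z(\theta) = \mathbb{E}_{x \sim p(x;\theta)}[\nabla_\theta \log \tilde{p}(x;\theta)]\) for discrete \(x\). All function names are hypothetical.

```python
import itertools
import numpy as np

def log_p_tilde(x, theta):
    # Unnormalized log-probability of the assumed toy model: theta . x
    return float(np.dot(theta, x))

def grad_log_p_tilde(x, theta):
    # For this log-linear model the gradient w.r.t. theta is simply x.
    return np.asarray(x, dtype=float)

def log_Z(theta, states):
    # Exact log partition function by brute-force enumeration,
    # using the log-sum-exp trick for numerical stability.
    scores = np.array([log_p_tilde(x, theta) for x in states])
    m = scores.max()
    return m + np.log(np.exp(scores - m).sum())

def negative_phase_exact(theta, states):
    # E_{x ~ p}[grad log p_tilde(x)], which equals grad log Z(theta).
    scores = np.array([log_p_tilde(x, theta) for x in states])
    probs = np.exp(scores - log_Z(theta, states))
    grads = np.array([grad_log_p_tilde(x, theta) for x in states])
    return probs @ grads

def negative_phase_fd(theta, states, eps=1e-6):
    # Central finite-difference estimate of grad log Z, used only as a check.
    g = np.zeros_like(theta)
    for i in range(len(theta)):
        d = np.zeros_like(theta)
        d[i] = eps
        g[i] = (log_Z(theta + d, states) - log_Z(theta - d, states)) / (2 * eps)
    return g

# All 2^3 configurations of three binary variables.
states = [np.array(s, dtype=float) for s in itertools.product([0, 1], repeat=3)]
theta = np.array([0.5, -1.0, 2.0])

exact = negative_phase_exact(theta, states)
approx = negative_phase_fd(theta, states)

# Log-likelihood gradient for one observed point: positive phase minus negative phase.
x_obs = np.array([1.0, 0.0, 1.0])
grad_log_lik = grad_log_p_tilde(x_obs, theta) - exact
```

Here `exact` and `approx` should agree up to finite-difference error. The enumeration in `log_Z` visits \(2^n\) states for \(n\) binary variables, which is why this direct approach stops being feasible for models of realistic size.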

18.2 Stochastic Maximum Likelihood and Contrastive Divergence

18.3 Pseudolikelihood

18.4 Score Matching and Ratio Matching

18.5 Denoising Score Matching

18.6 Noise-Contrastive Estimation

18.7 Estimating the Partition Function

18.7.1 Annealed Importance Sampling

18.7.2 Bridge Sampling