Which statement best describes the forward and reverse processes in a typical diffusion model (e.g., Denoising Diffusion Probabilistic Models)?
- A single-pass network takes random noise and produces an image in one forward pass, with no iterative steps.
- During the forward process, noise is iteratively removed from real data until it becomes pure noise, and in reverse the model adds noise step by step to create new images.
- In the forward process, a small amount of noise is added to real data at each step until it becomes nearly pure noise; the reverse process is then learned to denoise step by step. <- correct
- The diffusion model relies on adversarial training where a discriminator oversees both noising and denoising.
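Solution
DDPM defines a fixed Markov chain: the forward process q adds Gaussian noise step by step, and the model learns the reverse Markov chain, predicting at each step the noise that was added.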
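The forward process can be sketched in a few lines. The snippet below is a minimal illustration, not a reference implementation: it assumes a linear noise schedule from 1e-4 to 0.02 over 1000 steps, and the names `make_schedule` and `q_sample` are hypothetical. It uses the closed-form marginal q(x_t | x_0) = N(sqrt(alpha_bar_t) x_0, (1 - alpha_bar_t) I), which follows from composing the per-step Gaussian noising.

```python
import numpy as np

def make_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Assumed linear schedule: per-step noise variances and cumulative signal fractions."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alpha_bars = np.cumprod(1.0 - betas)  # alpha_bar_t = prod_{s<=t} (1 - beta_s)
    return betas, alpha_bars

def q_sample(x0, t, alpha_bars, rng):
    """Draw x_t from q(x_t | x_0) in one shot; also return the noise that was used."""
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps
    return xt, eps

rng = np.random.default_rng(0)
betas, alpha_bars = make_schedule()
x0 = rng.uniform(-1.0, 1.0, size=(4, 8))  # stand-in for a batch of real data

# Early step: x_t is still close to x0. Last step: almost no signal remains.
x_early, eps_early = q_sample(x0, t=10, alpha_bars=alpha_bars, rng=rng)
x_last, eps_last = q_sample(x0, t=len(betas) - 1, alpha_bars=alpha_bars, rng=rng)
```

In training, a network would receive `x_t` and `t` and be asked to predict `eps`; the reverse chain then uses that prediction to denoise one step at a time.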
Which characterizes the Kullback-Leibler divergence D(P || Q)?
- D(P || Q) is symmetric, D(P || Q) = D(Q || P).
- D(P || Q) satisfies the triangle inequality (true metric).
- D(P || Q) is always non-negative and equals zero iff P and Q are identical almost everywhere. <- correct
- D(P || Q) can be negative if P has nonzero probability where Q is zero.
Solution
By Gibbs' inequality, D(P || Q) ≥ 0, with equality if and only if P = Q almost everywhere; KL divergence is not symmetric and is not a metric.
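These properties are easy to check numerically for discrete distributions. The following is a minimal sketch; the helper name `kl_divergence` and the example distributions are illustrative assumptions. It uses the natural logarithm and the usual conventions that terms with P(x) = 0 contribute zero and that the divergence is +infinity (not negative) when P puts mass where Q has none.

```python
import numpy as np

def kl_divergence(p, q):
    """D(P || Q) in nats for two discrete distributions over the same outcomes."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    support = p > 0                      # terms with p(x) = 0 contribute 0
    if np.any(q[support] == 0):
        return float("inf")              # P has mass where Q has none
    return float(np.sum(p[support] * np.log(p[support] / q[support])))

p = [0.5, 0.4, 0.1]
q = [0.2, 0.3, 0.5]

d_pq = kl_divergence(p, q)   # approximately 0.412
d_qp = kl_divergence(q, p)   # approximately 0.535, so the divergence is not symmetric
d_pp = kl_divergence(p, p)   # 0.0, identical distributions
d_inf = kl_divergence([0.5, 0.5], [1.0, 0.0])  # inf, never negative
```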
Which activation function saturates for both large negative and large positive inputs?
- ReLU
- Tanh <- correct
- Swish (SiLU)
- Leaky ReLU
Solution
Tanh squashes its input into (-1, 1) and saturates at both ends; ReLU, Leaky ReLU, and Swish are all unbounded in the positive direction.
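A quick numerical check makes the difference visible. This is a small sketch under stated assumptions: the Leaky ReLU slope of 0.01 is a common default rather than something the question specifies, and the helper names are illustrative.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def leaky_relu(x, slope=0.01):  # slope value is an assumed default
    return np.where(x > 0, x, slope * x)

def swish(x):
    return x / (1.0 + np.exp(-x))  # x * sigmoid(x)

x = np.array([-20.0, 0.0, 20.0])

tanh_out = np.tanh(x)        # flattens toward -1 and +1 at the two extremes
relu_out = relu(x)           # grows without bound for positive inputs
leaky_out = leaky_relu(x)    # also keeps a small slope for negative inputs
swish_out = swish(x)         # approaches x for large positive inputs

# The tanh derivative 1 - tanh(x)^2 vanishes at both extremes, which is what saturation means.
tanh_grad = 1.0 - tanh_out ** 2
```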