
Score based Model and Schrodinger Bridge.

Schrodinger Bridge Series

Preliminaries: SGMs

SGMs
Figure: 1-D standard Wiener process
• SGM: when the diffusion process is continuous in time, it can be expressed as a stochastic differential equation (SDE) (Song 19).
• Forward process (data distribution → prior distribution)
\begin{equation}\mathrm{d} \boldsymbol{x}_t=\boldsymbol{f}\left(\boldsymbol{x}_t, t\right) \mathrm{d} t+g(t) \mathrm{d} \boldsymbol{w}_t, \quad \boldsymbol{x}_0 \sim p_{\text {data }}\end{equation}
• $\boldsymbol{f}$ (vector-valued function): drift coefficient. $g$ (scalar function): diffusion coefficient. $\boldsymbol{w}_t$: standard Wiener process (standard Brownian motion), a continuous-time stochastic process whose value at time $t$ follows a normal distribution with mean 0 and variance $t$.
• Reverse process (prior distribution → data distribution)
\begin{equation}\mathrm{d} \boldsymbol{x}_t=\left[\boldsymbol{f}\left(\boldsymbol{x}_t, t\right)-g^2(t) \nabla \log p_t\left(\boldsymbol{x}_t\right)\right] \mathrm{d} t+g(t) \mathrm{d} \boldsymbol{w}_t, \quad \boldsymbol{x}_T \sim p_{\text {prior }}\end{equation}
• A score function term $\nabla \log p_t\left(\boldsymbol{x}_t\right)$ is added to the drift coefficient.
• $p_t$ is the probability density that $\boldsymbol{x}_t$ follows under the forward process ($\boldsymbol{x}_t \sim p_t(\boldsymbol{x})$).
  ◦ Here, $p_t := p(\cdot\,; t)$.
• In the forward process, the evolution of the density $p_t$ that $\boldsymbol{x}_t$ follows is governed by the Fokker-Planck (FP) equation.
  ◦ It is also called Kolmogorov's forward equation.
  ◦ The FP equation corresponding to the forward process (1) is given below.
  ◦ Because $p_0(\boldsymbol{x})=p_{data}(\boldsymbol{x})$ is not known explicitly, the equation cannot in general be solved in closed form.
\begin{equation} \frac{\partial}{\partial t} p_t(\boldsymbol{x})=-\nabla \cdot\left[\boldsymbol{f}(\boldsymbol{x}, t) p_t(\boldsymbol{x})\right]+\frac{1}{2} g^2(t) \Delta p_t(\boldsymbol{x}) \end{equation}
• Objective: train a parameterized model $s_\theta(\boldsymbol{x},t)$ to approximate $\nabla \log p_t(\boldsymbol{x})$.
  ◦ During generation (the reverse process), the density $p_t(\boldsymbol{x})$ itself is not needed, only its score.
  ◦ Minimize the L2 loss against the score of the conditional density $p_t(\boldsymbol{x}_t \mid \boldsymbol{x}_0)$ (denoising score matching).
\begin{equation}\boldsymbol{\theta}^*=\underset{\boldsymbol{\theta}}{\operatorname{argmin}} \mathbb{E}_{t, \boldsymbol{x}_0, \boldsymbol{x}_t \mid \boldsymbol{x}_0} \lambda(t)\left\|\boldsymbol{s}_{\boldsymbol{\theta}}\left(\boldsymbol{x}_t, t\right)-\nabla \log p_t\left(\boldsymbol{x}_t \mid \boldsymbol{x}_0\right)\right\|^2\end{equation}
• Score matching: the loss above differs from the following objective only by a constant independent of $\boldsymbol{\theta}$, so the two optimization problems have the same minimizer.
\begin{equation}\boldsymbol{\theta}^*=\underset{\boldsymbol{\theta}}{\operatorname{argmin}} \mathbb{E}_{t, \orange{\boldsymbol{x}_t}} \lambda(t)\left\|\boldsymbol{s}_{\boldsymbol{\theta}}\left(\boldsymbol{x}_t, t\right)-\nabla \log \orange{p_t\left({\boldsymbol{x}_t}\right)}\right\|^2\end{equation}
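The equivalence of the two objectives can be checked numerically in a toy case. The sketch below is an illustration under assumptions that are not in the original note: 1-D Gaussian data $N(m, s^2)$, a single fixed noise level with kernel $p_t(x_t \mid x_0)=N(x_0, \sigma^2)$, and a linear score model $s_\theta(x)=ax+b$ fitted by least squares. The function name `fit_linear_score_dsm` is hypothetical. Regressing onto the conditional score should recover the marginal score $-(x-m)/(s^2+\sigma^2)$.

```python
import numpy as np

def fit_linear_score_dsm(m=2.0, s=1.0, sigma=1.0, n=200_000, seed=0):
    """Fit s_theta(x) = a*x + b by least squares on the conditional-score target."""
    rng = np.random.default_rng(seed)
    x0 = m + s * rng.standard_normal(n)      # x_0 ~ p_data = N(m, s^2)  (assumed toy data)
    z = rng.standard_normal(n)
    xt = x0 + sigma * z                      # x_t | x_0 ~ N(x_0, sigma^2)  (assumed kernel)
    target = -(xt - x0) / sigma**2           # grad log p_t(x_t | x_0)
    a, b = np.polyfit(xt, target, deg=1)     # least-squares minimizer over (a, b)
    return a, b

a, b = fit_linear_score_dsm()
# Marginal p_t = N(m, s^2 + sigma^2), so grad log p_t(x) = -(x - m) / (s^2 + sigma^2).
a_true, b_true = -1.0 / 2.0, 2.0 / 2.0
print(f"fitted a={a:.3f} (marginal score slope {a_true}), b={b:.3f} (intercept {b_true})")
```

The regression target never uses the marginal density, yet the fitted slope and intercept approach those of the marginal score, which is the point of the denoising objective.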
• Using the trained score model $s_{\theta^*}$, solve the reverse process to obtain $\boldsymbol{x}_0$.
• Various solvers are used, trading off accuracy against computational cost.
\begin{equation}\mathrm{d} \boldsymbol{x}_t=\left[\boldsymbol{f}\left(\boldsymbol{x}_t, t\right)-g^2(t) \boldsymbol{s}_{\boldsymbol{\theta}^*}\left(\boldsymbol{x}_t, t\right)\right] \mathrm{d} t+g(t) \mathrm{d} \boldsymbol{w}_t, \quad \boldsymbol{x}_T \sim p_{\text {prior }}\end{equation}
• The simplest solver is the Euler-Maruyama method.
  ◦ It is the SDE counterpart of Euler's method, the basic numerical scheme for ordinary differential equations (ODEs).
\begin{equation}\boldsymbol{x}_{k-1}=\boldsymbol{x}_k-\left[\boldsymbol{f}\left(\boldsymbol{x}_k, t_k\right)-g^2\left(t_k\right) \boldsymbol{s}_{\boldsymbol{\theta}^*}\left(\boldsymbol{x}_k, t_k\right)\right] \Delta t_k+g\left(t_k\right) \sqrt{\Delta t_k}\boldsymbol{z}_k, \quad \boldsymbol{z}_k \sim N(\mathbf{0}, \boldsymbol{I})\end{equation}
• This is Euler's method with an added noise term (the same form as DDPM sampling).
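The Euler-Maruyama update above can be tried on a case where the score is known exactly. The following is a minimal sketch under assumptions that are not part of the original note: a forward SDE $\mathrm{d}x=-\tfrac12\beta x\,\mathrm{d}t+\sqrt{\beta}\,\mathrm{d}w$ with constant $\beta$, 1-D Gaussian data $N(m_0, s_0^2)$, and the analytic score standing in for the trained model $s_{\theta^*}$. All function names are hypothetical.

```python
import numpy as np

def marginal(t, m0, s0, beta):
    """Mean and variance of p_t for dx = -0.5*beta*x dt + sqrt(beta) dw, x_0 ~ N(m0, s0^2)."""
    decay = np.exp(-0.5 * beta * t)
    return m0 * decay, s0**2 * decay**2 + 1.0 - decay**2

def score(x, t, m0, s0, beta):
    """Analytic grad log p_t(x); stands in for a trained score model (assumption)."""
    mean, var = marginal(t, m0, s0, beta)
    return -(x - mean) / var

def reverse_euler_maruyama(n_samples=20_000, n_steps=1000, T=1.0,
                           m0=2.0, s0=0.5, beta=8.0, seed=0):
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    g = np.sqrt(beta)
    x = rng.standard_normal(n_samples)            # x_T ~ p_prior = N(0, 1)
    for k in range(n_steps, 0, -1):               # step from t_k = k*dt down to t_{k-1}
        t = k * dt
        f = -0.5 * beta * x
        drift = f - g**2 * score(x, t, m0, s0, beta)
        z = rng.standard_normal(n_samples)
        x = x - drift * dt + g * np.sqrt(dt) * z  # x_{k-1} = x_k - [f - g^2 s] dt + g sqrt(dt) z
    return x

samples = reverse_euler_maruyama()
print(f"sample mean {samples.mean():.3f}, sample std {samples.std():.3f}")
```

With these settings the samples should have mean close to $m_0$ and standard deviation close to $s_0$, up to discretization and Monte Carlo error.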

ODE (Probability Flow ODE)

• In practice, many applications of diffusion models use an ODE instead of the SDE for generation.
• The SDE of a diffusion model comes with a Probability Flow ODE: an ODE whose solution shares the same marginal densities $p_t$.
\begin{align} \text{Reverse SDE:}\quad&\mathrm{d} \boldsymbol{x}_t=\left[\boldsymbol{f}\left(\boldsymbol{x}_t, t\right)-g^2(t) \nabla \log p_t\left(\boldsymbol{x}_t\right)\right] \mathrm{d} t+g(t) \mathrm{d} \boldsymbol{w}_t \\ \text{Probability Flow ODE:}\quad&\mathrm{d} \boldsymbol{x}_t=\left[\boldsymbol{f}\left(\boldsymbol{x}_t, t\right)-\frac{1}{2} g^2(t) \nabla \log p_t\left(\boldsymbol{x}_t\right)\right] \mathrm{d} t \end{align}
• Advantages:
  1. It can be applied to an already trained model without retraining.
  2. Generation is deterministic, so a fixed initial value always yields the same result.
  3. Well-studied ODE solvers can be used (e.g., Runge-Kutta methods).
• Derivation: SDE → Probability Flow ODE
  1. Apply $\Delta p=\nabla \cdot(\nabla p)=\nabla \cdot(p \nabla \log p)$ to the FP equation (log-derivative identity).
  2. With the substitution $\tilde{\boldsymbol{f}}:=\boldsymbol{f}-\frac{1}{2} g^2 \nabla \log p$, the FP equation takes a form with no diffusion term, and its solution $p_t$ coincides with that of the forward process.
  3. One (stochastic) differential equation corresponding to the transformed FP equation is the Probability Flow ODE.
$$\frac{\partial p}{\partial t}=-\nabla \cdot[\boldsymbol{f} p]+\frac{1}{2} g^2 \Delta p=-\nabla \cdot\left[\left(\boldsymbol{f}-\frac{1}{2} g^2 \nabla \log p\right) p\right]=-\nabla \cdot[\tilde{\boldsymbol{f}} p]$$
Figure: FP SDE and Probability Flow ODE
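To illustrate the deterministic nature of the Probability Flow ODE, here is a small sketch under assumptions not in the original: the same toy setup as a constant-$\beta$ forward SDE $\mathrm{d}x=-\tfrac12\beta x\,\mathrm{d}t+\sqrt{\beta}\,\mathrm{d}w$ with 1-D Gaussian data, an analytic score in place of a trained model, and plain Euler integration (a Runge-Kutta solver could be substituted). The names `score` and `probability_flow_ode` are hypothetical.

```python
import numpy as np

BETA, M0, S0, T = 8.0, 2.0, 0.5, 1.0   # assumed toy setup: p_data = N(M0, S0^2)

def score(x, t):
    """Analytic grad log p_t(x) for the assumed Gaussian setup."""
    decay = np.exp(-0.5 * BETA * t)
    mean = M0 * decay
    var = S0**2 * decay**2 + 1.0 - decay**2
    return -(x - mean) / var

def probability_flow_ode(x_T, n_steps=1000):
    """Integrate dx = [f - 0.5 g^2 score] dt from t = T down to t = 0 with Euler's method."""
    x = np.array(x_T, dtype=float)
    dt = T / n_steps
    for k in range(n_steps, 0, -1):
        t = k * dt
        drift = -0.5 * BETA * x - 0.5 * BETA * score(x, t)   # f = -0.5*beta*x, g^2 = beta
        x = x - drift * dt                                   # no noise term
    return x

rng = np.random.default_rng(0)
x_T = rng.standard_normal(10_000)       # x_T ~ p_prior = N(0, 1)
x0_a = probability_flow_ode(x_T)
x0_b = probability_flow_ode(x_T)        # same initial values, same outputs
print(f"identical outputs: {np.array_equal(x0_a, x0_b)}, "
      f"mean {x0_a.mean():.3f}, std {x0_a.std():.3f}")
```

The two runs agree exactly because no randomness enters after $x_T$ is drawn, while the output statistics still match the data distribution approximately.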

Schrodinger Bridge - Toward "True" Image-to-Image

• We want to extend the diffusion model framework to image-to-image tasks.
• However, the input image should be treated not as guidance but in a different way.
• Why not simply replace the prior distribution with an arbitrary distribution? → The design of the forward process becomes the problem.
  ◦ The hyperparameters $\boldsymbol{f}, g$ must be designed so that the density satisfies $p_T\approx p_{prior}$ (in the FP equation, $p_T$ is determined automatically by $p_0, \boldsymbol{f}, g$).
  ◦ In general, however, $p_{data}$ and $p_{prior}$ are only implicit, so it is difficult to design suitable $\boldsymbol{f}, g$ before training.
• The FP equation for a diffusion model is as follows.
$$\frac{\partial}{\partial t} p_t(\boldsymbol{x})=-\nabla \cdot\left[\boldsymbol{f}(\boldsymbol{x}, t) p_t(\boldsymbol{x})\right]+\frac{1}{2} g^2(t) \Delta p_t(\boldsymbol{x}), \quad \orange{p_0=p_{\text {data }}}$$
• Instead of a diffusion model, consider a slightly more abstract problem setting for generative models.
  1. A data distribution $p_{data}$ and a prior distribution $p_{prior}$ are given.
  2. The two distributions are connected through an SDE.
  3. The probability density $p_t(\boldsymbol{x})$ is modeled according to the SDE (boundary conditions: $p_0=p_{data},\ p_{T}=p_{prior}$).
• Schrodinger Bridge
• The two distributions are connected by a "bridge" of Brownian motion.
• It is distinct from the Schrödinger equation of quantum mechanics (although the two are related).
• C. Léonard (2013), Y. Chen et al. (2020)

Dynamic Schrodinger Bridge (Dynamic SB)

• SB is formulated as a KL divergence minimization problem over path measures of stochastic processes, subject to constraints on the marginals at both ends.
\begin{equation} \min _{\mathbb{P}} D_{\mathrm{KL}}(\mathbb{P} \| \mathbb{Q}) \quad \text{s.t.}\quad \mathbb{P}_0=\mu_{\text {data }}, \mathbb{P}_T=\mu_{\text {prior }} \end{equation}
• A path measure is the probability distribution obtained by treating an entire path $(\boldsymbol{x}_t)_{0 \leq t \leq T}$ as a single sample.
• Here the path measures $\mathbb{P}$ and $\mathbb{Q}$ play the roles of the approximating distribution and the true distribution, respectively; $\mathbb{Q}$ in particular is called the reference measure.
• In contrast to the static SB described below, this problem is also called the dynamic SB.

Static Schrodinger Bridge (Static SB)

$\pi$, $\alpha$, $\beta$ correspond to $\mathbb{P}_{0,T}$, $\mu_0$, $\mu_T$, respectively.
• The optimal solution of the dynamic SB can be constructed from the solution of the associated static SB.
  ◦ Under mild assumptions, the optimal solutions of the two problems are in one-to-one correspondence (see below).
• Static SB: a KL divergence minimization problem over the joint distribution of the initial and final times.
  ◦ The intermediate path is marginalized out, so only the pairing of start and end points is considered.
\begin{equation}\min _{\mathbb{P}_{0, T}} D_{\mathrm{KL}}\left(\mathbb{P}_{0, T} \| \mathbb{Q}_{0, T}\right) \quad\text{s.t.}\quad \mathbb{P}_0=\mu_0, \quad \mathbb{P}_T=\mu_T\end{equation}

Relationship between Dynamic SB and Static SB

• Denote the solution of the dynamic SB by $\mathbb{P}^*$ and the solution of the static SB by $\mathbb{P}_{0, T}^*$.
• $\mathbb{P}^*$ is obtained by mixing the reference diffusion bridge $\mathbb{Q}_{|0,T}$ over the endpoints. [Leonard 13]
  ◦ A diffusion process whose two endpoint values $(\boldsymbol{x}_0,\boldsymbol{x}_T)$ are fixed is called a (diffusion) bridge.
  ◦ A path measure built, like $\mathbb{P}^*$, by mixing bridges over their endpoints is called a mixture of bridges.
  ◦ Conversely, $\mathbb{P}^*_{0,T}$ is uniquely determined by $\mathbb{P}^*$.
\begin{equation}\mathbb{P}^*(\cdot)=\int \mathbb{Q}_{\mid 0, T}\left(\cdot \mid \boldsymbol{x}_0, \boldsymbol{x}_T\right) \mathrm{d} \mathbb{P}_{0, T}^*\left(\boldsymbol{x}_0, \boldsymbol{x}_T\right)\end{equation}
Figure: sample paths of the 1-D diffusion bridge $\mathbb{Q}_{\mid 0, T}\left(\cdot \mid \boldsymbol{x}_0=0, \boldsymbol{x}_T=0\right)$
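Sample paths of such a bridge are easy to generate when the reference process is plain Brownian motion. The sketch below assumes a reference SDE $\mathrm{d}x=\sigma\,\mathrm{d}w$ (a choice made here for illustration) and uses the standard construction $B_t = W_t - (t/T)W_T$; the function name `brownian_bridge` is hypothetical.

```python
import numpy as np

def brownian_bridge(x0, xT, T=1.0, sigma=1.0, n_steps=100, n_paths=5, seed=0):
    """Sample paths of dx = sigma dw conditioned on x_0 = x0 and x_T = xT."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, T, n_steps + 1)
    dw = rng.standard_normal((n_paths, n_steps)) * np.sqrt(T / n_steps)
    w = np.concatenate([np.zeros((n_paths, 1)), np.cumsum(dw, axis=1)], axis=1)
    # Pin the free Brownian path at both ends: B_t = W_t - (t/T) W_T
    pinned = sigma * (w - (t / T) * w[:, -1:])
    # Add the straight line between the fixed endpoints
    return t, x0 + (xT - x0) * (t / T) + pinned

t, paths = brownian_bridge(x0=0.0, xT=0.0)
print(paths.shape, paths[:, 0], paths[:, -1])
```

Drawing the endpoint pair $(x_0, x_T)$ from a coupling first and then calling such a sampler gives one sample from the corresponding mixture of bridges.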

Optimal Transport (OT)

• Optimal transport: the problem of finding the way to move one probability distribution onto another at minimal cost.
• Under certain conditions, the (static) SB is known to coincide with optimal transport. [Leonard 13]
• Viewing a density as a pile of sand, the problem is to find the assignment that minimizes the transport cost (which depends on the distance and the amount moved) of reshaping one pile into another.

Kantorovich Optimal Transport

• The modern formulation of optimal transport follows Kantorovich.
• It can be viewed as the problem of finding the coupling measure $\pi$ that minimizes the total transport cost.
  ◦ This setting allows mass to be split and merged when moving from one point to another.
  ◦ The source and target probability measures are denoted $\alpha$ and $\beta$, respectively.
  ◦ The cost of moving a unit mass from $\boldsymbol{x}\in\mathcal{X}$ to $\boldsymbol{y}\in\mathcal{Y}$ is defined as $c(\boldsymbol{x}, \boldsymbol{y})$. [Peyre 20]
\begin{equation} \min _\pi \int_{\mathcal{X} \times \mathcal{Y}} c(\boldsymbol{x}, \boldsymbol{y}) \mathrm{d} \pi(\boldsymbol{x}, \boldsymbol{y}) \quad \text { s.t. } \quad \pi(\cdot \times \mathcal{Y})=\alpha, \quad \pi(\mathcal{X} \times \cdot)=\beta \end{equation}

Entropy-Regularized OT (EROT)

• When OT is treated numerically, one often considers a relaxed problem with entropy regularization.
• The objective becomes strongly convex in $\pi$, so the optimal solution is unique and numerical computation is easier.
• The original OT problem is convex but in general not strongly convex, so its optimal solution need not be unique.
\begin{equation} \begin{gathered} \min _\pi \int_{\mathcal{X} \times \mathcal{Y}} c(\boldsymbol{x}, \boldsymbol{y}) \mathrm{d} \pi(\boldsymbol{x}, \boldsymbol{y})\orange{-\varepsilon H(\pi)} \quad\text { s.t. } \quad \pi(\cdot \times \mathcal{Y})=\alpha, \quad \pi(\mathcal{X} \times \cdot)=\beta \\ H(\pi):=-\int_{\mathcal{X} \times \mathcal{Y}} p(\boldsymbol{x}, \boldsymbol{y}) \log p(\boldsymbol{x}, \boldsymbol{y}) \mathrm{d} \boldsymbol{x} \mathrm{d} \boldsymbol{y}\end{gathered}\end{equation}
• Here $H(\pi)$ is the differential entropy.
• Equation (14) assumes that the measure $\pi$ has a density $p(\boldsymbol{x},\boldsymbol{y})$.
• More rigorously, it is preferable to define the regularizer via relative entropy [Peyre 20], in view of the correspondence with discrete OT.
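For discrete measures, the entropy-regularized problem is commonly solved with Sinkhorn iterations. The following is a minimal sketch, not a tuned implementation: the point sets, the squared-distance cost, and the two values of $\varepsilon$ are assumptions chosen for illustration, and the helper names `sinkhorn` and `entropy` are hypothetical.

```python
import numpy as np

def sinkhorn(a, b, C, eps, n_iter=2000):
    """Entropy-regularized OT between discrete measures a, b with cost matrix C."""
    K = np.exp(-C / eps)                  # Gibbs kernel
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iter):
        u = a / (K @ v)                   # rescale rows to match the source marginal
        v = b / (K.T @ u)                 # rescale columns to match the target marginal
    return u[:, None] * K * v[None, :]    # coupling pi = diag(u) K diag(v)

def entropy(pi):
    """Discrete entropy -sum pi log pi (all entries are positive here)."""
    return -np.sum(pi * np.log(pi))

x = np.linspace(0.0, 1.0, 5)              # assumed source points
y = np.linspace(0.0, 1.0, 5)              # assumed target points
C = (x[:, None] - y[None, :]) ** 2        # squared-distance cost
a = np.full(5, 0.2)                       # uniform masses
b = np.full(5, 0.2)

pi_small = sinkhorn(a, b, C, eps=0.1)
pi_large = sinkhorn(a, b, C, eps=1.0)
print(f"entropy at eps=0.1: {entropy(pi_small):.3f}, at eps=1.0: {entropy(pi_large):.3f}")
```

A larger $\varepsilon$ yields a more spread-out coupling with higher entropy, while a smaller $\varepsilon$ concentrates mass near the unregularized optimal plan.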

Relationship between SB and OT

• When the reference measure is a reversible Brownian motion, the optimal solutions of the static SB and EROT coincide. [Chen 20][Peyre 20]
  ◦ A Brownian motion with stationary marginal measures is called a reversible Brownian motion. $\left(\mathbb{Q}_t=\mathbb{Q}_s \ \forall\, 0 \leq t \leq s \leq T\right)$
  ◦ For a reversible Brownian motion, a cost (distance) function exists: $q_{0, T}\left(\boldsymbol{x}_0, \boldsymbol{x}_T\right) \propto \exp \left[-c\left(\boldsymbol{x}_0, \boldsymbol{x}_T\right)\right]$
  ◦ For example, on Euclidean space $\mathbb{R}^n$ the SDE associated with the reversible Brownian motion $\mathbb{Q}$ is $\mathrm{d} \boldsymbol{x}_t=\sigma \mathrm{d} \boldsymbol{w}_t$, and the cost function is the squared Euclidean norm: $c\left(\boldsymbol{x}_0, \boldsymbol{x}_T\right)=\left\|\boldsymbol{x}_0-\boldsymbol{x}_T\right\|^2 / 2 \sigma^2$ (taking $T=1$; for general $T$ the denominator is $2\sigma^2 T$).
• When probability densities exist, the proof is as follows, with $H$ denoting the differential entropy $H(\mathbb{P}_{0,T}) = -\mathbb{E}_{\mathbb{P}_{0,T}}[\log p_{0,T}]$.
\begin{equation}\begin{aligned}D_{\mathrm{KL}}\left(\mathbb{P}_{0, T} \| \mathbb{Q}_{0, T}\right) & =-\mathbb{E}_{\mathbb{P}_{0, T}}\left[\log q_{0, T}\right]+\mathbb{E}_{\mathbb{P}_{0, T}}\left[\log p_{0, T}\right] \\& =-\mathbb{E}_{\mathbb{P}_{0, T}}\left[-c\left(\boldsymbol{x}_0, \boldsymbol{x}_T\right)\right]-H\left(\mathbb{P}_{0, T}\right)+\text { const. } \\& =\int c\left(\boldsymbol{x}_0, \boldsymbol{x}_T\right) \mathrm{d} \mathbb{P}_{0, T}\left(\boldsymbol{x}_0, \boldsymbol{x}_T\right)-H\left(\mathbb{P}_{0, T}\right)+\text { const. }\end{aligned}\end{equation}
• Up to the constant, this is a transport cost plus an entropy regularizer, i.e. an EROT objective with $\varepsilon=1$.
Figure: mixture of bridges constructed from the solution of a discrete EROT problem
• Transport paths from the blue points to the red points (all points are assumed to have equal mass).
• As the contribution of the entropy term ($\varepsilon$) grows, the admissible bridges tend to become more diverse.
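The decomposition of the KL divergence into a transport cost and an entropy term can be checked exactly in a discrete setting. This is a small sketch under assumptions not in the original: six random points on each side, the cost $c=\|x_0-x_T\|^2/2\sigma^2$, and an arbitrary coupling `P`; here "entropy" means $-\sum p\log p$, and the constant is $\log Z$ with $Z$ the normalizer of $\exp(-c)$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
x0 = rng.standard_normal(n)               # assumed source points
xT = rng.standard_normal(n)               # assumed target points
sigma = 1.0
C = (x0[:, None] - xT[None, :]) ** 2 / (2.0 * sigma**2)   # cost c(x_0, x_T)

Q = np.exp(-C)
Z = Q.sum()
Q = Q / Z                                  # reference coupling q_{0,T} proportional to exp(-c)

P = rng.random((n, n))
P = P / P.sum()                            # an arbitrary coupling with full support

kl = np.sum(P * np.log(P / Q))             # D_KL(P || Q)
transport_cost = np.sum(P * C)             # integral of c dP
entropy = -np.sum(P * np.log(P))           # H(P)
rhs = transport_cost - entropy + np.log(Z) # cost - H + const
print(f"KL = {kl:.6f}, cost - H + log Z = {rhs:.6f}")
```

Since $\log Z$ does not depend on `P`, minimizing the KL divergence over couplings with fixed marginals is the same as minimizing the transport cost minus the entropy.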

Reformulating SB: Stochastic Optimal Control Problem

• Path measures are hard to handle directly, so we switch to a representation in terms of SDEs.
  ◦ Here we introduce the formulation of T. Chen et al. (2021) (SB-FBSDE).
• The dynamic SB is known to be equivalent to a stochastic optimal control problem. [Caluya 19]
  ◦ Here $\mathrm{d} \boldsymbol{x}=[\boldsymbol{f}+g \boldsymbol{u}] \mathrm{d} t+g \mathrm{~d} \boldsymbol{w}$ is the state equation.
  ◦ For a particle drifting in the stochastic field induced by the reference path measure $\mathbb{Q}$ (SDE: $\mathrm{d} \boldsymbol{x}=\boldsymbol{f} \mathrm{d} t+g \mathrm{~d} \boldsymbol{w}$), the problem is to control an external force $\boldsymbol{u}$ so as to steer the particle from the initial value $\boldsymbol{x}_0$ to the target $\boldsymbol{x}_T$.
  ◦ The optimal solution is the $\boldsymbol{u}$ that achieves this with minimal action ($\int\|\boldsymbol{u}\|^2 \mathrm{~d} t$).
\begin{equation}\begin{aligned}& \min _{\boldsymbol{u}} \mathbb{E}\left[\int_0^T \frac{1}{2}\left\|\boldsymbol{u}\left(\boldsymbol{x}_t, t\right)\right\|^2 \mathrm{~d} t\right] \\& \text { s.t. }\left\{\begin{aligned}& \mathrm{d} \boldsymbol{x}_t=\left[\boldsymbol{f}\left(\boldsymbol{x}_t, t\right)+g(t) \boldsymbol{u}\left(\boldsymbol{x}_t, t\right)\right] \mathrm{d} t+g(t) \mathrm{d} \boldsymbol{w}_t \\& \boldsymbol{x}_0 \sim p_{\text {data }} \\& \boldsymbol{x}_T \sim p_{\text {prior }}\end{aligned}\right.\end{aligned}\end{equation}
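To make the objective concrete, the sketch below simulates the controlled state equation for a user-supplied control and estimates the cost $\mathbb{E}\int_0^T \tfrac12\|\boldsymbol{u}\|^2\,\mathrm{d}t$ by Monte Carlo. It only evaluates the objective for a given $\boldsymbol{u}$; it does not solve for the optimal control or enforce the terminal constraint. The 1-D setting, the constant control, $f=0$, $g=1$, and the function name `simulate_controlled_sde` are all assumptions made for illustration.

```python
import numpy as np

def simulate_controlled_sde(u, f, g, x0, T=1.0, n_steps=200, seed=0):
    """Euler-Maruyama for dx = [f + g*u] dt + g dw (1-D); returns x_T and the mean control cost."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    x = np.array(x0, dtype=float)
    cost = np.zeros_like(x)
    for k in range(n_steps):
        t = k * dt
        uk = u(x, t)
        cost += 0.5 * uk**2 * dt                      # running cost 1/2 |u|^2 dt
        noise = rng.standard_normal(x.shape)
        x = x + (f(x, t) + g(t) * uk) * dt + g(t) * np.sqrt(dt) * noise
    return x, cost.mean()                             # Monte Carlo estimate of the expectation

x_T, J = simulate_controlled_sde(
    u=lambda x, t: np.full_like(x, 1.5),   # hypothetical constant control
    f=lambda x, t: np.zeros_like(x),       # assumed reference drift f = 0
    g=lambda t: 1.0,                       # assumed reference diffusion g = 1
    x0=np.zeros(50_000),
)
print(f"control cost {J:.4f}, mean of x_T {x_T.mean():.3f}")
```

With a constant control $u=1.5$ over $T=1$, the cost is $\tfrac12 \cdot 1.5^2 = 1.125$ and the particles drift to a mean of about $1.5$; an SB solver would instead search for the $\boldsymbol{u}$ that hits the target distribution at the lowest such cost.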

Reformulating SB: Schrodinger System

• The optimal control $\boldsymbol{u}^*$ is characterized by a pair of functions $\Psi$, $\widehat{\Psi}$ that satisfy a system of partial differential equations (PDEs). [Caluya 19]
  ◦ $\Psi$ and $\widehat{\Psi}$ are called the Schrödinger potentials, and the PDE system is called the Schrödinger system.
  ◦ The two PDEs correspond to Kolmogorov's backward equation (for $\Psi$) and forward equation (for $\widehat{\Psi}$), each written for a different potential.
\begin{equation}\left\{\begin{array}{l}\frac{\partial \Psi}{\partial t}=-\langle\nabla \Psi, \boldsymbol{f}\rangle-\frac{1}{2} g^2 \Delta \Psi \\ \frac{\partial \widehat{\Psi}}{\partial t}=-\nabla \cdot[\widehat{\Psi} \boldsymbol{f}]+\frac{1}{2} g^2 \Delta \widehat{\Psi}\end{array} \text { s.t. } \left\{\begin{array}{l}\Psi(\cdot, 0) \widehat{\Psi}(\cdot, 0)=p_{\text {data }} \\\Psi(\cdot, T) \widehat{\Psi}(\cdot, T)=p_{\text {prior }}\end{array}\right.\right.\end{equation}
• Using the solution $\Psi$, $\widehat{\Psi}$ of the Schrödinger system, $p_t$ and $\boldsymbol{u}^*$ are given by:
\begin{equation}\begin{gathered}p_t=\Psi(\cdot, t) \widehat{\Psi}(\cdot, t) \\\boldsymbol{u}^*=g(t) \nabla \log \Psi\end{gathered}\end{equation}
• In other words, $\Psi$ and $\widehat{\Psi}$ can be viewed as a factorization of the density $p_t$ that $\boldsymbol{x}_t$ follows at intermediate times.
• How to actually obtain $\Psi$ and $\widehat{\Psi}$ is discussed later.

Reformulating SB: Forward · Reverse SDE

• Substituting the optimal potentials $\Psi$, $\widehat{\Psi}$ for the control variable $\boldsymbol{u}^*$ in the state equation, the solution of SB can be expressed as the following forward and reverse SDEs. [19]
  ◦ The two SDEs can be converted into each other by the reverse-time formula [12].
\begin{equation}\begin{aligned} \text{Forward SDE: }\quad& \mathrm{d} \boldsymbol{x}_t=\left[\boldsymbol{f}\left(\boldsymbol{x}_t, t\right)+g^2(t) \nabla \log \Psi\left(\boldsymbol{x}_t, t\right)\right] \mathrm{d} t+g(t) \mathrm{d} \boldsymbol{w}_t, \quad \boldsymbol{x}_0 \sim p_{\text {data }} \\ \text{Reverse SDE: }\quad& \mathrm{d} \boldsymbol{x}_t=\left[\boldsymbol{f}\left(\boldsymbol{x}_t, t\right)-g^2(t) \nabla \log \widehat{\Psi}\left(\boldsymbol{x}_t, t\right)\right] \mathrm{d} t+g(t) \mathrm{d} \boldsymbol{w}_t, \quad \boldsymbol{x}_T \sim p_{\text {prior }} \end{aligned}\end{equation}
• That is, the Schrödinger bridge problem can be restated as follows.
Given probability distributions $p_{\text {data }}, p_{\text {prior }}$ and a reference measure (stochastic field) $\mathbb{Q}: \mathrm{d} \boldsymbol{x}=\boldsymbol{f} \mathrm{d} t+g \mathrm{~d} \boldsymbol{w}$, find the pair of functions $\Psi$, $\widehat{\Psi}$ that satisfies the Schrödinger system.
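As a consistency check (a worked derivation added here for illustration, not taken from the cited papers), one can verify that the product $p_t=\Psi\widehat{\Psi}$ built from the Schrödinger system solves the FP equation of the forward SDE, whose drift is $\boldsymbol{f}+g^2\nabla\log\Psi$. Arguments $(\boldsymbol{x}, t)$ are omitted, and the first two lines use the two PDEs of the Schrödinger system:

```latex
\begin{aligned}
\partial_t(\Psi\widehat{\Psi})
 &= \widehat{\Psi}\,\partial_t\Psi + \Psi\,\partial_t\widehat{\Psi} \\
 &= -\widehat{\Psi}\langle\nabla\Psi, \boldsymbol{f}\rangle
    - \tfrac{1}{2} g^2\,\widehat{\Psi}\,\Delta\Psi
    - \Psi\,\nabla\cdot[\widehat{\Psi}\boldsymbol{f}]
    + \tfrac{1}{2} g^2\,\Psi\,\Delta\widehat{\Psi} \\
 &= -\nabla\cdot[\boldsymbol{f}\,\Psi\widehat{\Psi}]
    - \tfrac{1}{2} g^2\,\widehat{\Psi}\,\Delta\Psi
    + \tfrac{1}{2} g^2\,\Psi\,\Delta\widehat{\Psi}, \\[6pt]
-\nabla\cdot\big[(\boldsymbol{f} + g^2\nabla\log\Psi)\,\Psi\widehat{\Psi}\big]
    + \tfrac{1}{2} g^2\,\Delta(\Psi\widehat{\Psi})
 &= -\nabla\cdot[\boldsymbol{f}\,\Psi\widehat{\Psi}]
    - g^2\,\nabla\cdot[\widehat{\Psi}\,\nabla\Psi]
    + \tfrac{1}{2} g^2\,\Delta(\Psi\widehat{\Psi}) \\
 &= -\nabla\cdot[\boldsymbol{f}\,\Psi\widehat{\Psi}]
    - \tfrac{1}{2} g^2\,\widehat{\Psi}\,\Delta\Psi
    + \tfrac{1}{2} g^2\,\Psi\,\Delta\widehat{\Psi}.
\end{aligned}
```

The last step expands $\nabla\cdot[\widehat{\Psi}\nabla\Psi]=\langle\nabla\widehat{\Psi},\nabla\Psi\rangle+\widehat{\Psi}\Delta\Psi$ and $\Delta(\Psi\widehat{\Psi})=\widehat{\Psi}\Delta\Psi+2\langle\nabla\Psi,\nabla\widehat{\Psi}\rangle+\Psi\Delta\widehat{\Psi}$, so the cross terms cancel. Both sides agree, which is why the boundary conditions on $\Psi\widehat{\Psi}$ at $t=0$ and $t=T$ pin down the marginals of the forward SDE.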

SB Training and Generation

• Training
  1. Prepare the two distributions: $p_{\text {data }}, p_{\text {prior }}$
  2. Design the reference measure as an SDE: $\mathrm{d} \boldsymbol{x}=\boldsymbol{f} \mathrm{d} t+g \mathrm{~d} \boldsymbol{w}$
  3. Train a parameterized model so that the Schrödinger system is satisfied (learning $\Psi$, $\widehat{\Psi}$ as intermediate quantities)
• Generation
  1. Sample the initial data $\boldsymbol{x}$ from $p_{\text {prior }}$
  2. Using the trained model, solve the reverse SDE with initial condition $\boldsymbol{x}$: $\mathrm{d} \boldsymbol{x}=\left[\boldsymbol{f}-g^2 \nabla \log \widehat{\Psi}\right] \mathrm{d} t+g \mathrm{~d} \boldsymbol{w}$

Relationship to SGM

• SB can be seen as an extension of diffusion models.
• The forward and reverse SDEs of a diffusion model are equal to an SB with the constraints $\Psi \equiv 1, p\left(\boldsymbol{x}_T \mid \boldsymbol{x}_0\right)=N(\mathbf{0}, \boldsymbol{I})$.
• In that case, $g^2 \nabla \log \Psi \equiv 0$ and $\widehat{\Psi}=\Psi \widehat{\Psi}=p$ hold.
• More precisely, the Schrödinger bridge is a diffusion model in which, in exchange for relaxing the constraint on the prior distribution, the forward process is also parameterized and learned.
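The reduction to a score-based model can be written out explicitly. The following short derivation is added for illustration and simply substitutes $\Psi\equiv 1$ into the Schrödinger system and the forward and reverse SDEs of SB:

```latex
\begin{aligned}
\Psi \equiv 1
 \;&\Rightarrow\;
 \partial_t\Psi = 0 = -\langle\nabla\Psi, \boldsymbol{f}\rangle - \tfrac{1}{2} g^2\Delta\Psi,
 \qquad \nabla\log\Psi = 0, \\
\widehat{\Psi} = \Psi\widehat{\Psi} = p_t
 \;&\Rightarrow\;
 \partial_t p_t = -\nabla\cdot[p_t\,\boldsymbol{f}] + \tfrac{1}{2} g^2\Delta p_t
 \quad\text{(the FP equation)}, \\
\text{Forward SDE:}\quad
 & \mathrm{d}\boldsymbol{x}_t = \boldsymbol{f}\,\mathrm{d}t + g\,\mathrm{d}\boldsymbol{w}_t, \\
\text{Reverse SDE:}\quad
 & \mathrm{d}\boldsymbol{x}_t = \big[\boldsymbol{f} - g^2\nabla\log p_t\big]\,\mathrm{d}t + g\,\mathrm{d}\boldsymbol{w}_t .
\end{aligned}
```

These are the forward and reverse processes of the SGM from the preliminaries, with $\nabla\log\widehat{\Psi}$ playing the role of the score $\nabla\log p_t$.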

Conclusion

• The Schrödinger Bridge (SB) is a diffusion model with the constraint on the prior distribution relaxed.
• SB can be regarded as a dynamic optimal transport problem.
• Following T. Chen et al. (2021), it can be viewed as a joint optimization of the forward and reverse process SDEs.

Further Topics

1. How to design the reference path measure $\mathbb{Q}$
  • The stochastic field (SDE) of the stochastic optimal control problem: $\mathrm{d} \boldsymbol{x}_t=\boldsymbol{f}\left(\boldsymbol{x}_t, t\right) \mathrm{d} t+g(t) \mathrm{d} \boldsymbol{w}_t$
  • How should the drift coefficient $\boldsymbol{f}\left(\boldsymbol{x}_t, t\right)$ and the diffusion coefficient $g\left(t\right)$ be designed?
2. Training algorithms for SB models
  • How to find the optimal Schrödinger potentials $\Psi$, $\widehat{\Psi}$
  • What parameterization is desirable when bringing this into a machine learning framework?
  • Iterative Proportional Fitting (IPF) [Bortoli 21, Vargas 21]
    ◦ An extension of the Sinkhorn-Knopp algorithm for optimal transport [Cuturi 13]
  • Forward-backward SDE (SB-FBSDE)
  • Iterative Markovian Fitting (IMF)
  • Computer vision applications
    ◦ Image-to-image Schrödinger Bridge (I2SB)
  • Related algorithms
    ◦ Conditional Flow Matching (CFM) (Tong 23, Kerrigan 24)
    ◦ Simulation-Free Score and Flow Matching ([SF]²M)