Selective Hourglass Mapping for Universal Image Restoration Based on Diffusion Model

CVPR 2024


Dian Zheng1    Xiao-Ming Wu1     Shuzhou Yang2     Jian Zhang2    Jian-Fang Hu1    Wei-Shi Zheng†1, 3

1 School of Computer Science and Engineering, Sun Yat-sen University, China    
2 School of Electronic and Computer Engineering, Peking University, China   
3 Key Laboratory of Machine Intelligence and Advanced Computing, Ministry of Education, China   

† Corresponding author



An illustration of existing universal image restoration methods compared with our DiffUIR. Existing methods mainly design task-specific modules to handle different distributions, forcing the generic model (tangerine module) to learn different distributions at once, termed multi-partite mapping. In contrast, the proposed DiffUIR maps the different distributions to one shared distribution (note that it is not a pure Gaussian distribution) while maintaining strong condition guidance. In this way, DiffUIR enables the generic model to learn only one shared distribution and guides the shared distribution to a task-specific distribution, termed selective hourglass mapping.


Abstract


In this work, we tackle the main challenge of universal image restoration: learning different degradation tasks in a single model simultaneously. The goal is to learn the shared information between different tasks while maintaining high image generation quality. More importantly, by learning shared information we aim to enhance generalization to the real world, which contains more complicated scenes with multiple degradation types.

Two novel considerations make our DiffUIR non-trivial:

  1. strong condition guidance;
  2. shared distribution mapping.
First, inspired by RDDM, we explicitly fuse the condition (i.e., the degraded image) into the diffusion algorithm and additionally concatenate the condition with the diffusing target. In this way, DiffUIR is equipped with strong condition guidance, similar to multi-partite mapping methods. More importantly, to achieve shared distribution mapping, we integrate a shared distribution term (SDT) into the diffusion algorithm elegantly and naturally, which gradually adjusts the weight of the condition in the algorithm. By modeling these two components, in the forward diffusion process DiffUIR gradually maps the various distributions to one shared distribution, enabling the model to capture the shared information between tasks. Notably, we map the different distributions to an impure Gaussian distribution with a slight condition residue: as noted in previous works, pure Gaussian noise contains no task information, which is not conducive to good generation quality. In the reverse process, guided by the strong condition and the SDT, DiffUIR gradually recovers from the shared distribution to the task-specific distribution.

By only modifying the mapping strategy, without bells and whistles, we achieve SOTA performance with 5 times less computational cost. Additionally, we propose several lighter versions of DiffUIR; the tiny version DiffUIR-T comprises only 0.89M parameters yet still shows outstanding performance. We further conduct zero-shot generalization experiments, in which DiffUIR also achieves SOTA results and demonstrates the ability to handle multiple degradations at once.


Selective Hourglass Mapping


Full algorithm


The goal of our method is to achieve strong condition guidance and shared distribution mapping at the same time. We adopt the condition mechanism of RDDM and integrate a shared distribution term (SDT) into the diffusion algorithm, achieving a synergistic effect between the two components. The whole algorithm consists of the Distribution Approaching Forward Process, the Distribution Diffusing Reverse Process, and the Universal Training Objective.


Distribution Approaching Forward Process

$\begin{equation} \begin{aligned} I_t &= I_{t-1}+\alpha_tI_{res}+ \beta_t\epsilon_{t-1}-\delta_tI_{in} \\ &= I_{t-2}+(\alpha_t+\alpha_{t-1})I_{res}+ \sqrt{\beta^2_t+\beta^2_{t-1}}\epsilon_{t-2} -(\delta_t+\delta_{t-1})I_{in} \\ &= I_{t-3} \dots \\ &= I_0+\overline{\alpha}_tI_{res}+\overline{\beta}_t\epsilon-\overline{\delta}_tI_{in}, \end{aligned} \end{equation}$

Our forward process is shown above, where $\delta_tI_{in}$ is the SDT and $\delta$ is the shared distribution coefficient. We set the value of $\overline{\delta}_t$ to range from 0 to 0.9, which gradually reduces the influence of the condition. When $t \rightarrow T$, $\overline{\alpha}_T=1$, and since the residual is defined as $I_{res} = I_{in} - I_0$, the formula can be rewritten as $I_T=0.1I_{in}+\overline{\beta}_T\epsilon$, which approaches an impure Gaussian distribution.
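
To make the forward process concrete, below is a minimal PyTorch sketch (not the authors' released code). The linear schedules, the total step count T, and the tensor shapes are assumptions; only the update rule and the 0-to-0.9 range of $\overline{\delta}_t$ come from the text above.

import torch

T = 1000  # total diffusion steps (assumed)

# Cumulative schedules indexed 0..T (index 0 corresponds to the clean image I_0).
# Linear spacing is an assumption; only the endpoints follow the text:
# alpha_bar rises to 1 and delta_bar rises from 0 to 0.9.
alpha_bar = torch.cat([torch.zeros(1), torch.linspace(1.0 / T, 1.0, T)])
delta_bar = torch.cat([torch.zeros(1), torch.linspace(0.9 / T, 0.9, T)])
beta_bar = torch.cat([torch.zeros(1), torch.linspace(1.0 / T, 1.0, T)]).sqrt()

def q_sample(I0, I_in, t):
    """Distribution-approaching forward process:
    I_t = I_0 + alpha_bar_t * I_res + beta_bar_t * eps - delta_bar_t * I_in,
    with the residual I_res = I_in - I_0. At t = T this reduces to
    I_T = 0.1 * I_in + beta_bar_T * eps, the shared (impure Gaussian) distribution."""
    I_res = I_in - I0
    eps = torch.randn_like(I0)
    a = alpha_bar[t].view(-1, 1, 1, 1)
    b = beta_bar[t].view(-1, 1, 1, 1)
    d = delta_bar[t].view(-1, 1, 1, 1)
    return I0 + a * I_res + b * eps - d * I_in, I_res, eps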


Distribution Diffusing Reverse Process

Following DDPM, we use $q(I_{t-1}|I_t,I_{in},I_0^{\theta},I_{res}^{\theta})$ to approximate the distribution $p_{\theta}(I_{t-1}|I_t)$; based on Bayes' theorem, it can be computed as follows.

$\begin{equation} \begin{aligned} \label{eqn:bayes} &\ p_{\theta}(I_{t-1}|I_t) \rightarrow q(I_{t-1}|I_t,I_{in},I_0^{\theta},I_{res}^{\theta}) & \\ &= q(I_t|I_{t-1},I_{in},I_{res}^{\theta})\frac{q(I_{t-1}|I_0^{\theta},I_{res}^{\theta},I_{in})}{q(I_t|I_0^{\theta},I_{res}^{\theta},I_{in})} \\ &\propto \exp\left[-\frac{1}{2}((\frac{\overline{\beta}_t^2}{\beta_t^2\overline{\beta}_{t-1}^2})I_{t-1}^2-2(\frac{I_t+\delta_t I_{in}-\alpha_tI_{res}^{\theta}}{\beta_t^2} \right.\\ &\left.+\frac{I_0^{\theta}+\overline{\alpha}_{t-1}I_{res}^{\theta}-\overline{\delta}_{t-1}I_{in}}{\overline{\beta}_{t-1}^2})I_{t-1}+C(I_t,I_0^{\theta},I_{res}^{\theta},I_{in}))\right]. \end{aligned} \end{equation}$

As the goal of this formulation is to obtain the distribution of $I_{t-1}$, we simplify and rearrange it into a form in terms of $I_{t-1}$, where $C(I_t,I_0^{\theta},I_{res}^{\theta},I_{in})$ is the term unrelated to it. We then calculate the mean $\mu_{\theta}(I_t, t)$ and variance $\sigma_{\theta}(I_t, t)$ of the distribution $p_{\theta}(I_{t-1}|I_t)$ from the equation above:

$\begin{equation} \begin{aligned} &\mu_{\theta}(I_t, t) = I_t - \alpha_t I_{res}^{\theta} + \delta_t I_{in} - \frac{\beta^2_t}{\overline{\beta}_t}\epsilon^{\theta} \\ &\sigma_{\theta}(I_t, t) = \frac{\beta^2_t \overline{\beta}^2_{t-1}}{\overline{\beta}^2_t}, \end{aligned} \end{equation}$

where $I_{res}^{\theta}$ is predicted by the model and $\epsilon^{\theta}$ is derived from $I_{res}^{\theta}$. Based on the reparameterization technique, if we use the sampling strategy of DDPM, $I_{t-1}$ can be calculated as follows:

$\begin{equation} I_{t-1} = I_t - \alpha_t I_{res}^{\theta} + \delta_t I_{in} - \frac{\beta^2_t}{\overline{\beta}_t}\epsilon^{\theta} + \frac{\beta_t \overline{\beta}_{t-1}}{\overline{\beta}_t}\epsilon_*, \end{equation}$
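
A sketch of one stochastic reverse step under these formulas, reusing the cumulative schedules from the forward sketch above; the per-step coefficients are recovered as differences of the cumulative ones, and the model signature model(I_t, I_in, t) is an assumption.

def p_sample_ddpm(model, I_t, I_in, t):
    """One DDPM-style stochastic reverse step using the mean and variance above.
    Assumes t >= 1; `model` predicts the residual I_res^theta."""
    a_t = alpha_bar[t] - alpha_bar[t - 1]            # per-step alpha_t
    d_t = delta_bar[t] - delta_bar[t - 1]            # per-step delta_t
    b2_t = beta_bar[t] ** 2 - beta_bar[t - 1] ** 2   # per-step beta_t^2
    t_batch = torch.full((I_t.shape[0],), t, dtype=torch.long)
    I_res_pred = model(I_t, I_in, t_batch)           # I_res^theta
    I0_pred = I_in - I_res_pred                      # since I_res = I_in - I_0
    # eps^theta recovered by inverting the forward equation
    eps_pred = (I_t - I0_pred - alpha_bar[t] * I_res_pred
                + delta_bar[t] * I_in) / beta_bar[t]
    mean = I_t - a_t * I_res_pred + d_t * I_in - (b2_t / beta_bar[t]) * eps_pred
    std = b2_t.sqrt() * beta_bar[t - 1] / beta_bar[t]
    return mean + std * torch.randn_like(I_t)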

where $\epsilon_*$ is random Gaussian noise. In this paper, to accelerate sampling, we use the sampling strategy of DDIM, and $I_{t-1}$ is calculated by:

$\begin{equation} I_{t-1} = I_t - \alpha_t I_{res}^{\theta} + \delta_t I_{in}. \end{equation}$

Based on the DDIM-style equation above, we can iteratively recover the sample from $I_T$ through $I_{T-k}, I_{T-2k}, \dots, I_k$ to $I_0$, where $k$ denotes the number of skipped steps, following DDIM.
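
The deterministic loop then only needs the residual prediction; summing the per-step coefficients over each skipped interval equals the difference of the cumulative coefficients. A sketch, again reusing the schedules above (the skip size k is a free parameter, not the paper's exact setting):

@torch.no_grad()
def sample_ddim(model, I_in, k=100):
    """Iterative reverse process I_{t-1} = I_t - alpha_t * I_res^theta + delta_t * I_in,
    accumulated over skip steps of size k (DDIM-style, deterministic)."""
    # Start from the shared distribution: I_T = 0.1 * I_in + beta_bar_T * eps
    I_t = 0.1 * I_in + beta_bar[T] * torch.randn_like(I_in)
    for t in range(T, 0, -k):
        s = max(t - k, 0)                            # next (earlier) timestep
        t_batch = torch.full((I_in.shape[0],), t, dtype=torch.long)
        I_res_pred = model(I_t, I_in, t_batch)
        # sum of alpha_i (resp. delta_i) over (s, t] is the cumulative difference
        I_t = I_t - (alpha_bar[t] - alpha_bar[s]) * I_res_pred \
                  + (delta_bar[t] - delta_bar[s]) * I_in
    return I_t  # approximately the restored image I_0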


Universal Training Objective

Although our algorithm differs from standard diffusion models, both map the different degraded distributions to one shared distribution, so we follow DDPM and carefully derive the training objective as follows:

$\begin{equation} \mathcal{L}(\theta) = D_{KL}(q(I_{t-1}|I_t,I_0,I_{res},I_{in}) || p_{\theta}(I_{t-1}|I_t)). \end{equation}$

As shown in the VAE literature, the Kullback-Leibler divergence between two Gaussian distributions with the same variance reduces to the difference of their means, so the objective is transformed as:

$\begin{equation} \begin{aligned} \mathcal{L}(\theta) &= \mathbb{E}_{q(I_t|I_0)}\left[\Vert \mu(I_t,I_0) - \mu_{\theta}(I_t,t) \Vert^2\right] \\ &= \mathbb{E}_{t,\epsilon,I_{res}}\left[\Vert I_t - \alpha_t I_{res} + \delta_t I_{in} - \frac{\beta^2_t}{\overline{\beta}_t}\epsilon - \right.\\ & \left.\quad (I_t - \alpha_t I_{res}^{\theta} + \delta_t I_{in} - \frac{\beta^2_t}{\overline{\beta}_t}\epsilon^{\theta}) \Vert^2\right] \\ &= \mathbb{E}_{t,\epsilon,I_{res}}\left[\Vert \alpha_t(I_{res}^{\theta} - I_{res})+\frac{\beta^2_t}{\overline{\beta}_t}(\epsilon^{\theta}-\epsilon) \Vert^2\right], \\ \end{aligned} \end{equation}$

where $I_{res}$ and $\epsilon$ denote the ground-truth values established in the forward process, and $I_{res}^{\theta}$ and $\epsilon^{\theta}$ denote the results predicted in the reverse process. As in the official DDPM code, predicting the noise or the input is essentially equivalent, so we directly use the model to predict the residual $I_{res}^{\theta}$, from which $\epsilon^{\theta}$ can be derived. According to previous works, when predicting the input, an $L_1$ loss performs better than an $L_2$ loss. Based on the experimental and theoretical grounds above, our final loss function simplifies to:

$\begin{equation} \mathcal{L}(\theta)_{simple} = \mathbb{E}_{t,I_t,I_{res}}\left[\Vert I_{res} - I^{\theta}_{res}(I_t,t) \Vert_1\right]. \end{equation}$
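
A corresponding training-step sketch, combining the forward-process sketch above with the simplified objective (the model signature is again an assumption):

import torch.nn.functional as F

def training_step(model, I0, I_in):
    """Universal training objective: L1 loss on the predicted residual."""
    t = torch.randint(1, T + 1, (I0.shape[0],))  # random timestep per sample
    I_t, I_res, _ = q_sample(I0, I_in, t)        # forward process (see above)
    I_res_pred = model(I_t, I_in, t)             # I_res^theta(I_t, t)
    return F.l1_loss(I_res_pred, I_res)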




Qualitative results


[Qualitative result figures]


Analysis of shared information learning


Visualization of the feature distribution before and after applying our SDT. It can be seen that by incorporating the SDT into the diffusion algorithm, the different degradation distributions are mapped to a shared one, enabling the model to capture the shared information between different tasks.


The attention of our feature map focuses on the regions with rain and fog, validating that the shared information is useful.




Citation



@inproceedings{zheng2024selective,
  title={Selective Hourglass Mapping for Universal Image Restoration Based on Diffusion Model},
  author={Zheng, Dian and Wu, Xiao-Ming and Yang, Shuzhou and Zhang, Jian and Hu, Jian-Fang and Zheng, Wei-Shi},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2024}
}


Contact


If you have any questions, please feel free to contact us:

  • Dian Zheng: zhengd35@mail2.sysu.edu.cn
  • Wei-Shi Zheng: wszheng@ieee.org