Woo Sung-Min 1, Lee Seong-Eui 2, Kim Jong-Ok 2
               
- (School of Electrical, Electronics and Communication Engineering, Koreatech / 1600, Chungjeol-ro, Byeongcheon-myeon, Dongnam-gu, Cheonan-si, Chungcheongnam-do, 31253, Korea   innosm@koreatech.ac.kr)
- (School of Electrical Engineering, Korea University / Engineering Building #412, College of Engineering, 145 Anam-ro, Seongbuk-gu, Seoul, 02841, Korea   {dltjddml97, jokim}@korea.ac.kr)
 
            
            
            Copyright © The Institute of Electronics and Information Engineers(IEIE)
            
            
            
            
            
               
                  
Keywords
               
                Image denoising,  DNN-based,  Texture-adaptive denoising,  Texture segmentation,  Perceptual image quality
             
            
          
         
            
                  1. Introduction
               Image denoising is an old but active research topic. It is essential because noise
                  is added to an image in various delivery paths, such as capturing, compressing, and
                  presenting a photograph [1]. In recent years, researchers have reported remarkable denoising performance
                  using deep neural networks (DNNs) [2-5]. In quantitative scores, such as PSNR or SSIM, DNN-based denoising methods outperform
                  widely used conventional methods, such as bilateral filter-based [6] or wavelet-domain denoising [7,8]. Nevertheless, DNN-based denoising methods are still limited in practical
                  use in a few respects.
               
               First, it is difficult to control the denoising strength of an image. Each DNN-based
                  method optimizes network parameters to minimize the designed loss of a particular
                  type. In training, the mean squared error or absolute error is commonly used
                  to calculate the loss [9,10]. Reducing this loss, however, does not guarantee a perceptually better image, even
                  if the resulting image contains less noise. As a result, those methods typically
                  sacrifice high-frequency details, such as edges and textures, to achieve
                  high PSNR scores [11], and produce overly smooth images, as illustrated in Fig. 1. This can be a serious problem, particularly when the methods are used by television manufacturers
                  to enhance picture quality, because sharp detail cannot be sacrificed for noise removal.
               
               Second, DNN-based methods are computationally intensive and require large datasets to generalize
                  well. The abundance of training examples and heavy computation make it
                  possible to generate images that are numerically closer to the ground truth. Nevertheless,
                  the results are still subjectively insufficient. This explains why these methods are not
                  actively applied in practice.
               
               This paper proposes a method to denoise an image using region segmentation, prioritizing
                  the subjective image quality after denoising. The proposed network partitions the
                  image into flat and texture regions and denoises them adaptively. Under the assumption
                  of additive Gaussian noise, the texture region is composed of high-frequency
                  components, so it is difficult to distinguish true noise from texture there. This
                  region is mostly over-smoothed by existing DNN methods. On the other hand, the flat
                  region is where the effect of noise removal is most easily recognized. The proposed architecture
                  learns from a texture map, prepared in advance and used in the loss function during training,
                  and performs denoising by adjusting the amount of noise removed depending on the flat and texture regions.
                  Therefore, the noise in the two regions is treated independently so that the regions do not
                  affect each other. In this way, minimal noise suppression can be applied in the texture
                  region while maximizing noise removal in the background.
               
               
                     Fig. 1. (a) Noisy image; (b) Clean image; (c) Enlarged view of the red boxes in (a); (d) DnCNN results; (e) RedNet30 results; (f) The proposed results; (g) Enlarged view of the ground truth in (b). Whereas existing DNN-based denoising methods over-smooth edges and textures, the proposed method removes noise in flat regions, as in the second row of (d)-(f), and preserves high-frequency details, as in the first row of (d)-(f).
 
             
            
                  2. Related Work
               Researchers have proposed various denoising methods for image enhancement. Traditional
                  methods include the bilateral filter [6] and wavelet image denoising [7]. A bilateral filter utilizes the intensity and distance of nearby pixels to preserve
                  the edges of an image. Wavelet image denoising takes advantage of the sparse representation
                  of an image and denoises it by localizing features in the image to different scales
                  while preserving important features. Total variation methods exploit image statistics
                  [12], and dictionary-based methods learn sparse features from clean images [13]. BM3D [14] and NLM [15] exploit similar patches within an image; BM3D groups them into 3D blocks to extract the finest
                  shared details.
               
               Various types of convolutional neural networks adopted from image classification and
                  localization have shown excellent performance in image denoising. DnCNN [16] is a milestone in denoising with performance improvements through residual learning.
                  Autoencoder [17] or UNet-based methods [18,19] are also competitive structures in denoising. As deep neural networks have proven
                  effective in solving more than one problem simultaneously [20], some researchers have applied DNNs to comprehensively improve image quality, such
                  as noise and resolution [21] or noise and dynamic range [22].
               
               This study focuses on the subjective image quality of the denoised image, which has
                  received little attention previously. The denoised images of existing
                  methods are over-smoothed, especially in high-frequency regions. The proposed architecture
                  is designed to adjust the denoising level depending on the texture-ness of an image,
                  which is also learned during training. Therefore, denoising is applied to the maximum
                  for flat regions, such as backgrounds and objects with only low-frequency components.
                  For texture regions with high-frequency content, denoising is applied selectively
                  to avoid losing the sharp details that matter for perceptual quality. The proposed scheme is
                  intended for camera and imaging software and for TV manufacturers that
                  must deliver clear and rich details.
               
             
            
                  3. The Proposed Scheme
               In the initial experiment, a texture map was applied to the loss function to observe
                  the performance of the texture segmentation. A lightweight neural network is preferred
                  assuming limited memory and GPU resources. The proposed architecture adopts the DnCNN
                  [16] as a base network and utilizes only six layers with 16 channels per layer from the
                  original DnCNN for texture segmentation, as illustrated in Fig. 2. The middle three layers have batch normalizations and ReLU activations. The features
                  from the first convolution layer are concatenated with the output of the fourth layer
                  to facilitate the back-propagation of the gradients.
               
               
                     Fig. 2. The proposed architecture for texture segmentation.
 
               
                     3.1 Texture Map Generation 
                  Using residual learning [6], which has proven to be effective, particularly in image denoising, the proposed
                     system learns whether the region is a texture in training. It automatically determines
                     the denoising area in the inference process. This was accomplished by generating a
                     texture map for each database image and multiplying the residual noise by the texture
                     map to obtain a weak-texture noise image, as illustrated in Fig. 3. The existing map-based denoising methods have a dedicated sub-branch network for
                     the weight map, whereas the proposed network does not have an additional branch for
                     the map extraction. This study utilized the weak-texture noise image in the loss function
                     so that the network learns to treat the texture areas as containing little or no noise. As
                     a result, regions of high-frequency components, such as textures and edges, were preserved,
                     while weak-texture regions were denoised in the proposed system. The proposed network estimates
                     the weak-texture noises $\hat{\boldsymbol{N}}_{\boldsymbol{T}}$ at the network output
                     by solving the following problem:
                  
                  
                  
                  where $\boldsymbol{N}_{\boldsymbol{T}}$ is the ground truth weak-texture noise image
                     calculated from the pixel-wise multiplication $(\odot )$ of the noise $\boldsymbol{N}$
                     and the weak-texture map $\boldsymbol{T}$. $\boldsymbol{X}$ and $\boldsymbol{Y}$ are
                     the noisy and clean image, respectively. 
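
As a minimal sketch of the target defined above, the snippet below forms the noise as the difference between the noisy and clean images and masks it with the weak-texture map. The function names and the choice of a mean squared error are assumptions made for illustration; the exact training objective may differ.

```python
import numpy as np

def weak_texture_noise(noisy, clean, weak_texture_map):
    """Return N_T = (X - Y) * T, element-wise, for float arrays of equal shape."""
    noise = noisy - clean                 # N = X - Y
    return noise * weak_texture_map       # pixel-wise product with T

def mse_loss(predicted, target):
    """Mean squared error (an assumed choice of training loss)."""
    return float(np.mean((predicted - target) ** 2))

# Toy example: the left half is weak texture (T = 1), the right half is texture (T = 0).
rng = np.random.default_rng(0)
clean = rng.random((4, 4))
noisy = clean + rng.normal(0.0, 10 / 255, size=clean.shape)
T = np.ones((4, 4))
T[:, 2:] = 0.0
N_T = weak_texture_noise(noisy, clean, T)
```

In this toy case the target is zero wherever the map marks texture, so a network trained against it is not asked to remove anything there.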
                  
                  The weak-texture map $\boldsymbol{T}$ indicates to the proposed network which regions
                     are to be denoised. The denoised region includes areas with flat and low-frequency
                     components, such as the background behind the building in Fig. 3. The skeleton of the building and fine details, such as the pattern on a wall, should
                     be excluded. In the present study, a 3$\times $3 Sobel edge detection was first applied
                     to the ground truth gray image. The resulting image was then thresholded, normalized, and
                     inverted to obtain the weak-texture map $\boldsymbol{T}$
                     as follows:
                  
                  
                  where $\mathcal{D}\left(\cdot \right)$ represents the bidirectional Sobel
                     operator. $\boldsymbol{Y}_{gray}$ and $k$ are the ground truth gray image and the threshold
                     for the maximum response, respectively. $\boldsymbol{T}$ depends on $k$ in Eq. (3). A higher $k$ means that the denoised region becomes smaller; $k$ = 0.2 was selected
                     empirically and used throughout this study. Fig. 4 shows example pairs of a ground truth image $\boldsymbol{Y}$ and the corresponding weak-texture
                     map $\boldsymbol{T}$ from the DIV2K dataset [24]. In $\boldsymbol{T}$ in Fig. 4, the dark areas identified as texture, such as grass and patterns of clothes, are
                     not denoised and retain their detail. In contrast, the bright areas in $\boldsymbol{T}$ of Fig. 4 are classified as flat or weak textures, and the network learns to remove noise there.
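
The map generation described above (Sobel response, thresholding, normalization, inversion) can be sketched as follows. This is an illustration under stated assumptions, not the rule of Eq. (3): the clipping form and the parameter `clip_level` are hypothetical, and `clip_level` is not claimed to behave like the threshold k of the paper.

```python
import numpy as np

def sobel_magnitude(gray):
    """Gradient magnitude from horizontal and vertical 3x3 Sobel kernels."""
    p = np.pad(gray.astype(np.float64), 1, mode="edge")
    # Horizontal derivative: right column minus left column, weights 1-2-1.
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[1:-1, :-2] - p[2:, :-2])
    # Vertical derivative: bottom row minus top row, weights 1-2-1.
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[:-2, 1:-1] - p[:-2, 2:])
    return np.hypot(gx, gy)

def weak_texture_map(gray, clip_level=0.2):
    """Bright (1) where the image is flat, dark (0) on strong edges."""
    g = sobel_magnitude(gray)
    peak = g.max()
    if peak == 0:
        return np.ones_like(g)            # no edges anywhere: all flat
    # Assumed rule: clip the normalized response, then rescale to [0, 1].
    g = np.minimum(g / peak, clip_level) / clip_level
    return 1.0 - g                        # invert so flat regions are bright

# Toy example: a vertical step edge in the middle of an 8x8 image.
gray = np.zeros((8, 8))
gray[:, 4:] = 1.0
T = weak_texture_map(gray)
```

On the step-edge example the map is 1 away from the edge and 0 on the two columns adjacent to it, which mirrors the bright/dark convention of Fig. 4.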
                  
                  
                        Fig. 3. Weak-texture noise map generation to preserve texture details of an image.
 
                  
                        Fig. 4. Examples of pairs of clean images and weak-texture maps from the DIV2K dataset.
 
                
               
                     3.2 Performance of Weak Texture Segmentation 
                  The DIV2K dataset [24] was used to compare the performance of the proposed method
                     with existing methods, both qualitatively and quantitatively. The DIV2K dataset consists of 900 high-quality
                     images divided into 800 training images and 100 validation images. Experiments with a
                     large sigma, i.e., ${\sigma}$ = 50, were excluded because such heavy noise is unrealistic
                     from the standpoint of a system maker that delivers final products. In an analysis of TV footage
                     that was also gathered, the noise level of actual broadcast video was equivalent to
                     Gaussian noise with a sigma of 8-10 at most.
                  
                  The proposed method does not explicitly perform region segmentation for noise removal
                     inside the network; instead, it learns the target denoising region through the designed loss
                     function. Therefore, the performance of the weak-texture map segmentation is difficult
                     to verify directly with a test image. It can, however, be verified indirectly by comparing
                     the output of the original DnCNN, $\hat{\boldsymbol{Y}}_{\boldsymbol{R}}$, with the proposed texture-preserved
                     denoised result, $\hat{\boldsymbol{Y}}$. The intensity of the weak-texture map $\boldsymbol{T}$ in Fig. 4 tells the network how much denoising to perform.
                     The brightest backgrounds in $\boldsymbol{T}$ of Fig. 4 are the areas that need to be denoised with the same intensity as the DnCNN. The
                     residual image between $\hat{\boldsymbol{Y}}_{\boldsymbol{R}}$ and $\hat{\boldsymbol{Y}}$ can show how well the proposed
                     method protects the textured area in test images. Fig. 5 presents the test images and the corresponding residual images generated as follows:
                  
                  
                  The proposed network correctly identifies weak-texture regions because no significant
                     pixel differences were found in the images. In the first column image of Fig. 5, the red pillars surrounded by pointed branches are flat regions that require denoising
                     and are properly segmented.
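
One form of the residual consistent with the description above, assuming a pixel-wise absolute difference and using $\boldsymbol{R}$ as a hypothetical symbol for the residual image, is:

$$\boldsymbol{R} = \left| \hat{\boldsymbol{Y}}_{\boldsymbol{R}} - \hat{\boldsymbol{Y}} \right|$$

Under this assumption, $\boldsymbol{R}$ is close to zero (dark) wherever the proposed method denoises with the same intensity as DnCNN, and it is larger where texture is deliberately left less denoised.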
                  
                  Another way to evaluate the performance of the segmentation is to gradually increase
                     the amount of noise assumed in the textured region. If the proposed method segments
                     flat and textured regions correctly, then the flat regions are unaffected by these
                     variations, and more noise is observed only in the texture regions. For this, $\boldsymbol{N}_{\boldsymbol{T}}$
                     in Eq. (1) was modified slightly to $\boldsymbol{N}_{\boldsymbol{TF}}$ as follows:
                  
                  
                  where c is the parameter that controls the noise level in the texture region between
                     0 and 1. If c = 0, it means that there is no noise in the texture region and if c
                     = 1, the noise is distributed evenly over the entire image and converges to DnCNN.
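
The two limiting cases of c stated above can be checked with a small sketch. The weighting used here, $\boldsymbol{N} \odot (\boldsymbol{T} + c(1 - \boldsymbol{T}))$, is an assumed form chosen only because it reduces to $\boldsymbol{N}_{\boldsymbol{T}}$ at c = 0 and to the full noise $\boldsymbol{N}$ at c = 1; the function name is hypothetical.

```python
import numpy as np

def texture_weighted_noise(noise, weak_texture_map, c):
    """Assumed form of N_TF: full noise in flat areas, a fraction c in texture."""
    if not 0.0 <= c <= 1.0:
        raise ValueError("c must lie in [0, 1]")
    # Weight is 1 where T = 1 (flat) and c where T = 0 (texture).
    weight = weak_texture_map + c * (1.0 - weak_texture_map)
    return noise * weight

rng = np.random.default_rng(0)
N = rng.normal(size=(4, 4))
T = rng.random((4, 4))
N_TF_c0 = texture_weighted_noise(N, T, 0.0)   # equals N * T
N_TF_c1 = texture_weighted_noise(N, T, 1.0)   # equals N
```

Intermediate values of c interpolate linearly between the two targets, which is one simple way to expose a single control for the texture region.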
                     
                  
                  Fig. 6 illustrates the denoising results with respect to c in the test images. The proposed
                     method appears to distinguish flat and texture regions correctly. In the first two
                     rows of Fig. 6, the uniformly colored backgrounds in (b)-(d) show no noticeable changes as c increases,
                     and no noise is observed regardless of c. The denoising level of the texture regions
                     in (b)-(d) varies with c. As c increases, the processed images become closer to those of DnCNN.
                     In the third and fourth row images of Fig. 6, which have no apparent flat regions, DnCNN over-smooths subtle texture, resulting in blurry
                     images. The proposed method suppresses noise in the flat region, as in the first two
                     rows of Fig. 6, while simultaneously controlling the level of denoising in the texture region, which
                     is prone to over-smoothing, as shown in the last two rows of Fig. 6.
                  
                  
                        Fig. 5. Residual images between original DnCNN and proposed method. The dark areas represent the denoised regions in the proposed method with the same intensity as DnCNN.
 
                  
                        Fig. 6. Comparisons with respect to c in Eq. (5). Noise in flat regions does not change when increasing c.
 
                
               
                     3.3 Siamese Architecture for Texture Adaptive Denoising 
                  With the texture map obtained in the previous section, a Siamese network architecture
                     is proposed to maximize the denoising performance by processing flat and texture regions
                     separately. Leaving the texture area as is, or assuming that it contains little noise,
                     is useful when the noise is low. To make the proposed solution more versatile,
                     however, denoising is performed in the YCbCr domain so that
                     the noise in the chrominance components is minimized and the noise in the luminance is
                     controllable through c, as in Eq. (5). This is because human vision is more sensitive to luminance than chrominance [25].
                  
                  Fig. 7 presents the acquisition process of the ground truth noise image in training. Noise
                     is multiplied by the weak-texture map, divided into the texture and flat components,
                     $\boldsymbol{N}_{T}$ and $\boldsymbol{N}_{F}$, and converted to the YCbCr domain to
                     obtain $\boldsymbol{N}_{Tc}$ and $\boldsymbol{N}_{Fc}$. The first channel in the YCbCr
                     image represents the luminance. The parameter c in Fig. 7 now controls how much of the noise in the luminance of the texture area $\boldsymbol{N}_{Tc}$
                     is removed, while the noise in the chrominance of all regions is eliminated. The proposed
                     Siamese network, as depicted in Fig. 8, estimates the noise in the texture and flat areas separately through the designed loss
                     function. The loss function used for training is as follows.
                  
                  
                  After the outputs of the two subnets are obtained, they are combined and converted
                     to the RGB domain to form the estimated noise $\hat{\boldsymbol{N}}$. The denoised image
                     $\hat{\boldsymbol{Y}}$ is then reconstructed by subtracting $\hat{\boldsymbol{N}}$
                     from the noisy input $\boldsymbol{X}$. The number of intermediate layers (depicted in blue) in Fig. 8 can be adjusted to the strength of the noise, as shown in the next section.
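
The target construction and the final reconstruction described in this section can be sketched as follows. Several details are assumptions of this sketch rather than statements from the paper: the BT.601 full-range conversion matrix, the omission of offsets (because noise rather than an image is converted), the split of the noise with $\boldsymbol{T}$ and $1 - \boldsymbol{T}$, and the function names.

```python
import numpy as np

# BT.601 full-range RGB -> YCbCr matrix (assumed); offsets are omitted because
# a zero-mean noise signal, not an image, is being converted.
RGB_TO_YCBCR = np.array([
    [0.299, 0.587, 0.114],
    [-0.168736, -0.331264, 0.5],
    [0.5, -0.418688, -0.081312],
])
YCBCR_TO_RGB = np.linalg.inv(RGB_TO_YCBCR)

def split_noise_targets(noise_rgb, weak_texture_map, c):
    """Return (N_Tc, N_Fc): YCbCr noise targets for the texture and flat branches."""
    t = weak_texture_map[..., None]              # broadcast H x W map over channels
    flat = (noise_rgb * t) @ RGB_TO_YCBCR.T      # flat areas: all channels kept
    texture = (noise_rgb * (1.0 - t)) @ RGB_TO_YCBCR.T
    texture[..., 0] *= c                         # scale luminance noise only
    return texture, flat

def reconstruct(noisy_rgb, est_texture, est_flat):
    """Combine both estimates, convert to RGB, and subtract from the input."""
    est_noise = (est_texture + est_flat) @ YCBCR_TO_RGB.T
    return noisy_rgb - est_noise

rng = np.random.default_rng(0)
clean = rng.random((4, 4, 3))
noise = rng.normal(0.0, 10 / 255, size=clean.shape)
T = rng.random((4, 4))
n_tc, n_fc = split_noise_targets(noise, T, c=1.0)
restored = reconstruct(clean + noise, n_tc, n_fc)   # ideal estimates, c = 1
```

With c = 1 and ideal estimates the clean image is recovered; with c below 1, part of the luminance noise is left in the texture area while the chrominance noise is still removed everywhere.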
                  
                  
                        Fig. 7. The acquisition of the ground truth noises in the texture and flat areas.
 
                  
                        Fig. 8. The proposed Siamese Architecture for texture adaptive denoising.
 
                
             
            
                  4. Performance Evaluation 
               Popular evaluation metrics for image quality assessment, such as PSNR and SSIM, are
                  not considered to reflect human visual perception correctly [26]. A more appropriate metric is required when systems are evaluated on whether
                  they produce noise-free, natural, and visually pleasing images. In the present case,
                  the fidelity of the reconstruction concerns mainly the high-frequency content of the signal.
                  The gradient distribution proposed in [26,27] can show how closely two images match in the gradient domain. Therefore,
                  the metric used for an objective comparison is defined as the squared difference between
                  the gradient distributions, expressed as follows.
               
               
               where $\boldsymbol{H}_{\boldsymbol{gt}}$ and $\boldsymbol{H}_{\boldsymbol{est}}$ are
                  the gradient histograms of the gray image $\boldsymbol{Y}_{gt}$ and the corresponding
                  estimate $\hat{\boldsymbol{Y}}_{gray}$, respectively. Fig. 9 illustrates the gradient distributions of the proposed method compared to the existing
                  methods for the first cropped image in Fig. 6 on a logarithmic scale. The number of bins in the histogram in Fig. 9 is 100. The resulting images of DnCNN and GradNet [28], shown in green and blue, respectively, differ non-negligibly from the clean image in the vicinity
                  of the high gradients. On the other hand, the proposed method
                  with c=0.5, as indicated by the red line, tracks the distribution of the clean image
                  without abrupt decay.
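
A minimal sketch of such a metric is given below: it histograms the gradient magnitudes of two gray images with 100 bins and sums the squared bin differences. The gradient operator (`numpy.gradient`), the shared histogram range for images scaled to [0, 1], and the absence of normalization are assumptions of this sketch and may differ from the settings behind Table 1.

```python
import numpy as np

def gradient_histogram(gray, bins=100, value_range=(0.0, 1.5)):
    """Histogram of gradient magnitudes; the range is an assumed shared range."""
    gy, gx = np.gradient(gray.astype(np.float64))
    magnitude = np.hypot(gx, gy)
    hist, _ = np.histogram(magnitude, bins=bins, range=value_range)
    return hist.astype(np.float64)

def gd_err(gray_gt, gray_est, bins=100, value_range=(0.0, 1.5)):
    """Sum of squared differences between the two gradient histograms."""
    h_gt = gradient_histogram(gray_gt, bins, value_range)
    h_est = gradient_histogram(gray_est, bins, value_range)
    return float(np.sum((h_gt - h_est) ** 2))

rng = np.random.default_rng(0)
clean = rng.random((32, 32))
oversmoothed = np.full_like(clean, clean.mean())   # extreme case: all detail lost
same = gd_err(clean, clean)
different = gd_err(clean, oversmoothed)
```

An estimate identical to the ground truth scores zero, whereas an over-smoothed estimate piles its gradients into the lowest bin and is penalized.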
               
               Table 1 lists the experimental results in terms of $GD_{err}$ for Gaussian noise with ${\sigma}$ = 10 and 25.
                  This study varied the number of intermediate layers; L denotes the total number of layers,
                  including a variable number of intermediate layers, and c is the parameter for controlling
                  the luminance noise in Fig. 7. The scores in parentheses in the "All" rows
                  of Table 1 are PSNR and SSIM for reference. Existing deep denoising networks perform marginally
                  better than the proposed method on PSNR and SSIM metrics. On the other hand, they
                  have large $GD_{err}$, which means that the gradient distribution of the denoised
                  images differs substantially from that of the original image. Although the variants
                  of the proposed method in Table 1 have a relatively small number of layers, they retain the gradient distribution of
                  the clean image. Therefore, the proposed outputs are perceived as more similar to
                  a clean image.
               
               To examine this further, each image was divided into texture and flat regions simply by
                  its average gradient value. A pixel whose gradient is greater than the average
                  belongs to the texture region; otherwise, it belongs to the flat region. As expected, $GD_{err}$
                  in the texture region is substantially greater than that in the flat region. When
                  compared with DnCNN and GradNet, which have equivalent backbone structures,
                  the proposed method shows similar denoising performance in the flat region but superior performance in the texture
                  region at ${\sigma}$ = 10 and 25. A large number of layers in the network helps achieve
                  good PSNR and SSIM scores when the noise is strong, but it does not
                  improve $GD_{err}$. Therefore, the setup proposed in the last column of Table 1 can be a good compromise between conventional quantitative metrics and $GD_{err}$.
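
The region split described above, thresholding each pixel's gradient at the image-wide average, can be sketched as follows. The gradient operator and the function name are assumptions for illustration.

```python
import numpy as np

def split_by_mean_gradient(gray):
    """Return boolean (texture_mask, flat_mask) using the mean gradient magnitude."""
    gy, gx = np.gradient(gray.astype(np.float64))
    magnitude = np.hypot(gx, gy)
    texture_mask = magnitude > magnitude.mean()   # above average: texture
    return texture_mask, ~texture_mask            # otherwise: flat

# Toy example: a single bright pixel on a dark background.
gray = np.zeros((8, 8))
gray[2, 2] = 1.0
texture_mask, flat_mask = split_by_mean_gradient(gray)
```

Only the pixels around the bright spot exceed the average gradient, so they form the texture region and the rest of the image is treated as flat.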
               
               Fig. 10 presents the resulting images for visual comparisons. Three popular deep neural net-based
                  methods were evaluated to assess the performance of the proposed method: DnCNN, GradNet,
                  and RedNet30 [29]. The first to third column images in Fig. 10 are the results with ${\sigma}$ = 10, and the fourth to last column images are the
                  results with ${\sigma}$ = 25. The competing deep neural network-based methods
                  remove noise well at both the large and small noise levels. On the other
                  hand, they over-smooth texture details along with the noise, without regard to textural information. By contrast,
                  the proposed method adaptively denoises texture and flat areas. The resulting denoised
                  images of the proposed method are smoothed in flat areas and consistently sharp in
                  texture areas. Therefore, the proposed outputs look more realistic and visually appealing.
                  In addition, the proposed method does not require a texture map or any other additional input
                  at inference time to decide which regions to denoise. The ability to distinguish the regions is
                  acquired through the loss function during training. The perceptual difference
                  between the proposed and competing methods is significant when ${\sigma}$ = 25. Unlike
                  the resulting images of the competing methods, which are unnatural and cartoonish,
                  the images from the proposed method appear sharper and maintain naturalness.
               
               
                     Fig. 9. Gradient distribution of the proposed and existing methods for the first image in Fig. 6.
 
               
                     Fig. 10. Image quality comparison. The first to third columns are for ${\sigma}$ = 10, and the fourth to last columns are for ${\sigma}$ = 25. It is recommended to enlarge the images for better comparison.
 
               
                     Table 1. Evaluation results by normalized $GD_{err}$ ($\times \frac{1}{10^{9}}$). Values in parentheses are PSNR/SSIM.

| ${\sigma}$ = 10 | Noisy | DnCNN [16] | GradNet [28] | RedNet30 [29] | Proposed (3L, c=0.5) | Proposed (8L, c=0.5) | Proposed (10L, c=0.5) |
| All | 9.205 (28.36/0.80) | 0.498 (37.54/0.98) | 0.472 (37.54/0.98) | 0.389 (37.04/0.98) | 0.235 (34.77/0.96) | 0.519 (35.85/0.97) | 0.386 (36.33/0.97) |
| Texture | 1.043 | 0.241 | 0.222 | 0.168 | 0.113 | 0.259 | 0.163 |
| Flat | 5.257 | 0.083 | 0.083 | 0.080 | 0.066 | 0.084 | 0.082 |

| ${\sigma}$ = 25 | Noisy | DnCNN | GradNet | RedNet30 | Proposed (12L, c=0.8) | Proposed (14L, c=0.8) | Proposed (17L, c=0.8) |
| All | 18.33 (20.7/0.53) | 1.20 (32.88/0.95) | 1.15 (32.98/0.95) | 1.01 (32.79/0.95) | 0.88 (31.96/0.93) | 0.88 (32.00/0.94) | 0.98 (32.02/0.94) |
| Texture | 2.711 | 0.818 | 0.779 | 0.660 | 0.567 | 0.578 | 0.648 |
| Flat | 9.458 | 0.084 | 0.084 | 0.083 | 0.077 | 0.074 | 0.077 |
                  
                
             
            
                  5. Conclusion
               This paper proposed a method for texture-adaptive denoising of an image. Existing
                  DNN-based denoising methods have shown excellent performance in terms of PSNR and
                  SSIM metrics. However, the perception of image quality by human beings is subjective,
                  and there are no universal metrics to evaluate perceived image quality. Overall,
                  the existing methods over-smooth and blur image textures while scoring high on average. In the existing
                  methods, it is also not easy to fine-tune the strength of the denoising
                  because only the training dataset controls the resulting image quality. In this work,
                  $GD_{err}$ was utilized to determine how faithfully a method recovers the
                  signals in the gradient space. By this measure, the proposed method better maintains
                  the gradient distribution in the texture corresponding to the high-frequency component
                  of the original signal. A single parameter c in the proposed method determines how
                  much noise is removed from the noisy luminance of texture areas, a novel feature that existing
                  DNN-based methods do not provide. No additional resources are required because the training
                  process achieves the above-mentioned benefits through the designed loss function.
               
             
          
         
            
                  ACKNOWLEDGMENTS
               
                  This paper was supported by the Education and Research promotion program of KOREATECH
                  in 2020.
               
             
            
                  
                     REFERENCES
                  
                     
                        
                        Xu J, Zhang L, Zhang D, 2018, External prior guided internal prior learning for real-world
                           noisy image denoising, IEEE Transactions on Image Processing, Vol. 27, No. 6, pp.
                           2996-3010

 
                     
                        
                        Wang Tianyang, Sun Mingxuan, Hu Kaoning, 2017, Dilated deep residual network for image
                           denoising, IEEE 29th international conference on tools with artificial intelligence
                           (ICTAI). IEEE

 
                     
                        
                        Xu Qingyang, Zhang Chengjin, Zhang Li, 2015, Denoising convolutional neural network,
                           IEEE International Conference on Information and Automation. IEEE

 
                     
                        
                        Cruz C, Foi A, Katkovnik V, Egiazarian K, 2018, Nonlocality-Reinforced Convolutional
                           Neural Networks for Image Denoising, IEEE Signal Process Letter, Vol. 25, No. 8, pp.
                           1216-1220

 
                     
                        
                        Zhang K, Zuo W, Zhang L, 2018, FFDnet: toward a fast and flexible solution for CNN-based
                           image denoising, IEEE Transactions on Image Processing, Vol. 27, No. 9, pp. 4608-4622

 
                     
                        
                        Fan Linwei, et al. , 2019, Brief review of image denoising techniques, Visual Computing
                           for Industry, Biomedicine, and Art 2.1, pp. 1-12

 
                     
                        
                        Hou J. H, 2003, Research on image denoising approach based on wavelet and its statistical
                           characteristics, Wuhan: Huazhong University of Science and Technology

 
                     
                        
                        Zhang Lei, Bao Paul, Wu Xiaolin, 2005, Multiscale LMMSE-based image denoising with
                           optimal wavelet selection, IEEE Transactions on circuits and systems for video technology,
                           Vol. 15, No. 4, pp. 469-481

 
                     
                        
                        de Ridder D, Duin R. P, Verbeek P. W, Van Vliet L, 1999, The Applicability of Neural
                           Networks to Nonlinear Image Processing, Pattern Analysis & Applications, Vol. 2, No.
                           2, pp. 111-128

 
                     
                        
                        Greenhill D, Davies E, 1994, Relative effectiveness of neural networks for image noise
                           suppression, Machine Intelligence and Pattern Recognition. Elsevier, Vol. 16, pp.
                           367-378

 
                     
                        
                        Cho T. S, Zitnick C. L, Joshi N, Kang S. B, Szeliski R, Freeman W. T, 2012,
                           Image Restoration by Matching Gradient Distributions, IEEE Transactions on Pattern
                           Analysis and Machine Intelligence, Vol. 34, No. 4, pp. 683-694

 
                     
                        
                        Beck Amir, Teboulle Marc, 2009, Fast gradient-based algorithms for constrained total
                           variation image denoising and deblurring problems, IEEE Transactions on Image Processing,
                           Vol. 18, No. 11, pp. 2419-2434

 
                     
                        
                        Takeda Hiroyuki, Farsiu Sina, Milanfar Peyman, 2007, Kernel regression for image processing
                           and reconstruction, IEEE Transactions on Image Processing, Vol. 16, No. 2, pp. 349-366

 
                     
                        
                        Dabov Kostadin, et al., 2007, Image denoising by sparse 3-D transform-domain collaborative
                           filtering, IEEE Transactions on Image Processing, Vol. 16, No. 8, pp. 2080-2095

 
                     
                        
                        Buades A, Coll B, Morel J. M, 2005, A non-local algorithm for image denoising, IEEE
                           Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05),
                           Vol. 2, pp. 60-65

 
                     
                        
                        Zhang Kai, et al., 2017, Beyond a Gaussian denoiser: Residual learning of deep CNN
                           for image denoising, IEEE Transactions on Image Processing, Vol. 26, No. 7, pp. 3142-3155

 
                     
                        
                        Vincent Pascal, et al., 2008, Extracting and composing robust features with denoising
                           autoencoders, Proceedings of the 25th International Conference on Machine Learning,
                           ACM, pp. 1096-1103

 
                     
                        
                        Jifara W, Jiang F, Rho S, Cheng M, Liu S, 2019, Medical image denoising using convolutional
                           neural network: a residual learning approach, The Journal of Supercomputing, Vol.
                           75, No. 2, pp. 704-718

 
                     
                        
                        Gurrola-Ramos J, Dalmau O, Alarcón T. E, 2021, A Residual Dense U-Net Neural Network
                           for Image Denoising, IEEE Access, Vol. 9, pp. 31742-31754

 
                     
                        
                        Ruder Sebastian, 2017, An overview of multi-task learning in deep neural networks,
                           arXiv preprint arXiv:1706.05098

 
                     
                        
                        Sharma M, Chaudhury S, Lall B, 2017, Deep learning-based frameworks for image super-resolution
                           and noise-resilient super-resolution, 2017 International Joint Conference on Neural
                           Networks (IJCNN), pp. 744-751

 
                     
                        
                        Hu Litao, Chen Huaijin, Allebach Jan P, 2022, Joint Multi-Scale Tone Mapping and Denoising
                           for HDR Image Enhancement, Proceedings of the IEEE/CVF Winter Conference on Applications
                           of Computer Vision

 
                     
                        
                        Agustsson Eirikur, Timofte Radu, 2017, NTIRE 2017 Challenge on Single Image Super-Resolution:
                           Dataset and Study, IEEE Conference on Computer Vision and Pattern Recognition
                           (CVPR) Workshops

 
                     
                        
                        Agustsson E, Timofte R, 2017, NTIRE 2017 Challenge on Single Image Super-Resolution:
                           Dataset and Study, 2017 IEEE Conference on Computer Vision and Pattern Recognition
                           Workshops (CVPRW), pp. 1122-1131

 
                     
                        
                        Thijssen Johan Marie, Vendrik A. J. H, 1971, Differential luminance sensitivity of
                           the human visual system, Perception & Psychophysics, Vol. 10, No. 1, pp. 58-64

 
                     
                        
                        Cho Taeg Sang, et al., 2011, Image restoration by matching gradient distributions,
                           IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 34, No. 4, pp.
                           683-694

 
                     
                        
                        Heeger D. J, Bergen J. R, 1995, Pyramid-Based Texture Analysis/Synthesis, Proc. ACM
                           SIGGRAPH

 
                     
                        
                        Liu Yang, et al., 2020, GradNet image denoising, Proceedings of the IEEE/CVF Conference
                           on Computer Vision and Pattern Recognition Workshops

 
                     
                        
                        Mao Xiaojiao, Shen Chunhua, Yang Yu-Bin, 2016, Image restoration using very deep convolutional
                           encoder-decoder networks with symmetric skip connections, Advances in Neural Information
                           Processing Systems, Vol. 29

 
                   
                
             
            Author
            
            
               			Sung-Min Woo received his B.S. degree in electrical engineering from Stony Brook
               University, Stony Brook, NY, USA, in 2006, M.S. degree from the Pohang University
               of Science and Technology, South Korea, in 2008, and Ph.D. degree from Korea University,
               South Korea, in 2020. From 2008 to 2020, he participated in research and development
               on mobile camera systems at the LG Electronics’ Mobile Communication Division. He
               is currently working as an Assistant Professor at the School of Electrical, Electronics,
               and Communication Engineering, Korea University of Technology and Education. His current
               research interests include color constancy, image and video processing, computer vision,
               and machine learning.
               		
            
            
            
               			Seong-Eui Lee received his B.S. degree in electrical engineering from Dongguk University,
               Seoul, South Korea, in 2021. He is currently pursuing an M.S. degree in electrical
               engineering at Korea University, Seoul. His current research interests include various
               deep learning-based image processing and computer vision algorithms.
               		
            
            
            
               			Jongok Kim received his B.S. and M.S. degrees in electronic engineering from Korea
               University, Seoul, Korea, in 1994 and 2000, respectively, and a Ph.D. degree in information
               networking from Osaka University, Osaka, Japan, in 2006. From 1995 to 1998, he served
               as an officer in the Korea Air Force. From 2000 to 2003, he was with SK Telecom R&D
               Center and Mcubeworks Inc. in Korea, where he was involved in research and development
               on mobile multimedia systems. From 2006 to 2009, he was a researcher at ATR (Advanced
               Telecommunication Research Institute International), Kyoto, Japan. He joined Korea
               University, Seoul, Korea, in 2009 and is currently a professor. His current research
               interests include image processing, computer vision, and intelligent media systems.
               Dr. Kim received a Japanese Government Scholarship during 2003-2006.