The Transactions of the Korean Institute of Electrical Engineers

  1. (Department of Electronics and Computer Engineering, Seokyeong University, Republic of Korea.)



Deep learning, Segmentation, Knowledge Distillation, Single-Model-Based KD, Self-KD, Mutual-KD

1. Introduction

Segmentation does not merely classify objects; it detects the location and shape of the objects that belong to each category in an image, which involves classifying the image pixel by pixel [1]. Representative applications include detecting people, vehicles, lanes, and the surrounding environment in autonomous driving, and detecting abnormal regions of a patient's organs or tissues in the medical field [1].

The main prior studies are as follows. DeepLab [2], an early model, uses atrous convolution and a CRF (Conditional Random Field) to capture multi-scale context effectively while preserving fine object boundaries. U-Net [3] uses an encoder-decoder structure with skip connections to recover fine local information and achieves high accuracy on medical images and similar data. SegNet [4] passes the encoder's pooling indices to the decoder, which restores spatial resolution effectively and yields efficient segmentation. PSPNet (Pyramid Scene Parsing Network) [5] introduces a pyramid pooling module that exploits global and local context at the same time, greatly improving segmentation of complex scenes. DeepLabV3 [6] uses Atrous Spatial Pyramid Pooling to extract features at multiple scales and to segment object boundaries and details precisely.

Knowledge distillation (KD) [7] transfers knowledge from a large, trained Teacher network to a small, untrained Student network by reducing the difference between the distributions of their output vectors. Its typical application is to transfer the capability of a large network to a small one, shrinking the network while preserving as much performance as possible. Knowledge distillation is also used to improve performance while keeping the network size unchanged. The Teacher-Student (hereafter T-S) configuration [7] is widely used for KD; Self-KD [9], which uses only the model itself, and Deep Mutual Learning (DML) [8], a cooperative scheme built from two equivalent networks, have also been proposed.

In this paper, we apply two single-model-based knowledge distillation methods, Self-KD and Mutual-KD, to PSPNet, a representative segmentation model, on the PASCAL-VOC 2012 dataset in order to improve its performance, and we analyze the results with several evaluation measures.

2. Segmentation

Segmentation solves a classification problem for every pixel: it decides which category each pixel belongs to. Because it must provide accurate shape information for a variety of objects, it is harder than ordinary image classification and requires more computation.

Segmentation is broadly divided into semantic segmentation and instance segmentation. Semantic segmentation labels pixels by category only, whereas instance segmentation also distinguishes individual objects within the same category. This paper deals with semantic segmentation; an example is shown in Fig. 1.

Fig. 1. Semantic segmentation example: (a) image, (b) segmentation categories

The PSPNet used in this paper extracts features with ResNet50 and then applies adaptive average pooling four times, producing outputs of different sizes whose width (or height) is 1, 2, 3, and 6, respectively. Each pooled output is combined with the final ResNet50 feature, and additional convolutions produce the segmentation result. Because the pooled outputs differ in size, they are upsampled to the largest size before being combined. The structure of PSPNet is shown in Fig. 2.
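The pooling step described above can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the function name `adaptive_avg_pool2d`, the 6x6 toy feature map, and the floor/ceil rule for bin edges are choices made here, and a single channel is used instead of a real ResNet50 feature tensor.

```python
def adaptive_avg_pool2d(feature, out_size):
    """Average-pool a 2-D list (H x W) down to out_size x out_size.

    Bin edges use floor for the start and ceil for the end (an assumed
    convention), so bins cover the whole map even when H is not a
    multiple of out_size.
    """
    h, w = len(feature), len(feature[0])
    pooled = []
    for i in range(out_size):
        r0 = (i * h) // out_size
        r1 = ((i + 1) * h + out_size - 1) // out_size  # integer ceil
        row = []
        for j in range(out_size):
            c0 = (j * w) // out_size
            c1 = ((j + 1) * w + out_size - 1) // out_size
            window = [feature[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(window) / len(window))
        pooled.append(row)
    return pooled


# Toy single-channel feature map (hypothetical values).
feature = [[float(r * 6 + c) for c in range(6)] for r in range(6)]

# One pooled map per pyramid level, using the sizes given in the text.
pyramid = {size: adaptive_avg_pool2d(feature, size) for size in (1, 2, 3, 6)}

for size, pooled in pyramid.items():
    print(size, len(pooled), len(pooled[0]))
```

In the full model each pooled map would then be upsampled back to the size of the ResNet50 feature and concatenated with it; that part is omitted here.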

Fig. 2. PSPNet architecture

3. Knowledge Distillation

3.1 Conventional Knowledge Distillation

Knowledge distillation is a technique in which a relatively small student network receives knowledge from a large, pre-trained teacher network by imitating the teacher's output distribution. Through this process the student network can learn richer information and improve its performance. The loss function for knowledge distillation is given in Eq. (1).

(1)
$L = \dfrac{1}{m}\sum_{i=1}^{m}p(z_{i},\: \theta)\log\dfrac{p(z_{i},\: \theta)}{p(\hat{z_{i}},\: \theta)}$
(2)
$p(z_{i},\: \theta)= \mathrm{softmax}(z_{i}/\tau)$

Here, m is the mini-batch size, $z_{i}$ is the teacher output, $\hat{z_{i}}$ is the student output, $\theta$ denotes the model parameters, and $\tau$ is a constant (the temperature) that adjusts the network's output distribution.
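As a rough illustration of this loss, the sketch below computes the temperature-scaled softmax and the mean KL divergence between teacher and student outputs over a mini-batch. The function names and the toy logits are assumptions made for this example, not the authors' code.

```python
import math


def softmax(logits, tau=1.0):
    """Temperature-scaled softmax; larger tau gives a softer distribution."""
    scaled = [z / tau for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kd_loss(teacher_logits, student_logits, tau=1.0):
    """Mean KL(teacher || student) over a mini-batch of logit vectors."""
    total = 0.0
    for z_t, z_s in zip(teacher_logits, student_logits):
        p = softmax(z_t, tau)
        q = softmax(z_s, tau)
        total += sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
    return total / len(teacher_logits)


# Hypothetical mini-batch of m = 2 samples with 3 classes each.
teacher = [[2.0, 1.0, 0.1], [0.5, 2.5, 0.3]]
student = [[1.5, 1.2, 0.3], [0.2, 1.8, 0.9]]

print(kd_loss(teacher, student, tau=1.0))
print(kd_loss(teacher, student, tau=4.0))
```

The loss is zero when the student reproduces the teacher's distribution exactly and positive otherwise.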

3.2 Self-KD

Self-KD re-transfers knowledge within a single model by using soft labels or intermediate representations that the model itself generates. The model refers to its own predictions to learn more refined and stable representations. This helps improve performance and mitigate overfitting while keeping model complexity low. The structure is shown in Fig. 3-(a).

3.3 Mutual-KD

In conventional Mutual-KD, several models share their predictions or intermediate representations and transfer knowledge to one another. Each model learns the strengths of its counterpart and can thereby compensate for its own limitations. This maximizes the benefit of cooperation between models and improves overall performance and generalization. The structure is shown in Fig. 3-(b).

Fig. 3. Single-model-based knowledge distillation: (a) Self-KD, (b) Mutual-KD

4. Single-Model-Based Knowledge Distillation for PSPNet

4.1 Knowledge Distillation for Segmentation Models

Conventional knowledge distillation was developed for classification, so each network outputs one vector over the classification categories and the difference between these vectors is learned. Semantic segmentation performs the same procedure for every pixel of the image. The channels at each pixel play the role of the classification categories, and for each pixel the channel with the largest value is selected as its category. The difference between the distributions of the channel values is computed and used for training. An example is shown in Fig. 4. The number of channels of the final segmentation output equals the number of target categories plus one for the background.
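The per-pixel category selection described above can be sketched as follows. The layout `logits[c][h][w]`, the function name, and the three-channel toy values are assumptions for illustration; for PASCAL-VOC the output would have 21 channels (20 categories plus background).

```python
def predict_labels(logits):
    """Turn a C x H x W score volume into an H x W label map.

    For every pixel, the channel with the largest score is chosen as
    the predicted category.
    """
    num_classes = len(logits)
    height, width = len(logits[0]), len(logits[0][0])
    labels = []
    for h in range(height):
        row = []
        for w in range(width):
            scores = [logits[c][h][w] for c in range(num_classes)]
            row.append(scores.index(max(scores)))
        labels.append(row)
    return labels


# Toy output with 3 channels on a 2 x 2 image (hypothetical values).
logits = [
    [[0.1, 2.0], [0.3, 0.2]],  # channel 0, e.g. background
    [[1.5, 0.5], [0.1, 0.9]],  # channel 1
    [[0.2, 0.1], [2.2, 0.3]],  # channel 2
]

label_map = predict_labels(logits)
print(label_map)
```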

Fig. 4. (a) Traditional knowledge distillation, (b) Knowledge distillation for segmentation

4.2 Self-KD Based on PSPNet

To apply Self-KD to PSPNet, we modify the existing FPN (Fully Feature Pyramid Networks) [10] and attach it to PSPNet as shown in Fig. 5. The FPN judges the importance of the features output by each stage of ResNet. As shown on the left of Fig. 5, the outputs of ResNet and of the FPN each pass through their own PSP module to produce a final prediction.

Fig. 5. PSPNet structure diagram using the Self-KD technique

Specifically, the feature maps from each stage are processed by convolution and element-wise addition in the direction from Stage 4 to Stage 1. The resulting feature maps are then processed once more in the opposite direction. To reduce the computational cost of the FPN, depth-wise convolution [11] is used instead of ordinary convolution. During training, backpropagation is blocked so that the FPN does not affect ResNet.

The outputs of ResNet and the FPN are trained with a mean squared error so that each improves the other's feature extraction. The mean squared error is given in Eq. (3).

(3)
$L_{2}=\dfrac{1}{H\times W}\sum_{h=1}^{H}\sum_{w=1}^{W}(F_{h,\: w}^{ResNet}- F_{h,\: w}^{FPN})^{2}$

The outputs of ResNet and the FPN are passed through the pyramid pooling structure to generate predictions for the input image. Each prediction is trained with the cross entropy (CE) against the ground truth. The loss for the difference between prediction and ground truth is given in Eq. (4), where H and W are the height and width of the image, Z is the prediction, and y is the target.

(4)
$L_{CE}=\dfrac{1}{H\times W}\sum_{h=1}^{H}\sum_{w=1}^{W}CE(\sigma(Z_{h,\: w}),\: y_{h,\: w})$
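A minimal sketch of the per-pixel cross entropy in Eq. (4) is given below, assuming that $\sigma$ is the softmax over channels and that predictions are stored as `logits[h][w]` (a list of class scores per pixel). These storage conventions and the names are illustrative assumptions.

```python
import math


def log_softmax(scores):
    """Numerically stable log-softmax of one pixel's class scores."""
    peak = max(scores)
    log_sum = peak + math.log(sum(math.exp(s - peak) for s in scores))
    return [s - log_sum for s in scores]


def pixel_cross_entropy(logits, labels):
    """Cross entropy averaged over all H x W pixels.

    logits[h][w] is a list of C class scores and labels[h][w] is the
    ground-truth class index of that pixel.
    """
    height, width = len(logits), len(logits[0])
    total = 0.0
    for h in range(height):
        for w in range(width):
            total -= log_softmax(logits[h][w])[labels[h][w]]
    return total / (height * width)


# Toy 2 x 2 image with 3 classes (hypothetical values).
logits = [
    [[2.0, 0.5, 0.1], [0.2, 1.9, 0.3]],
    [[0.1, 0.4, 2.5], [1.0, 1.0, 1.0]],
]
labels = [[0, 1], [2, 0]]

print(pixel_cross_entropy(logits, labels))
```

A pixel with uniform scores contributes log(C) to the sum, which is a convenient sanity check for an implementation.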

As shown on the right of Fig. 5, knowledge distillation is performed with Eq. (5) using the difference between the two final outputs generated by the PSP network and the FPN. Here, KL is the Kullback-Leibler divergence, $Z_{h,\: w}^{Psp}$ is the prediction generated by the PSP branch, and $Z_{h,\: w}^{FPN}$ is the prediction generated by the FPN branch. T is the hyperparameter used in knowledge distillation; this paper uses T = 1.

(5)
$D_{K L}=\dfrac{1}{H\times W}\sum_{h=1}^{H}\sum_{w=1}^{W}KL(\sigma(\dfrac{Z_{h,\: w}^{Psp}}{T})||\sigma(\dfrac{Z_{h,\: w}^{FPN}}{T}))$
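Eq. (5) can be sketched in plain Python as below, again assuming that $\sigma$ is the softmax over channels and that each prediction is stored as `z[h][w]` with one score per class. The names and toy values are assumptions made for this example.

```python
import math


def softmax(scores, T=1.0):
    """Softmax over one pixel's class scores with temperature T."""
    scaled = [s / T for s in scores]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def pixel_kl(z_a, z_b, T=1.0):
    """KL(softmax(z_a / T) || softmax(z_b / T)) averaged over H x W pixels."""
    height, width = len(z_a), len(z_a[0])
    total = 0.0
    for h in range(height):
        for w in range(width):
            p = softmax(z_a[h][w], T)
            q = softmax(z_b[h][w], T)
            total += sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
    return total / (height * width)


# Toy 1 x 2 predictions with 2 classes from two branches (hypothetical values).
z_psp = [[[2.0, 0.0], [0.3, 1.2]]]
z_fpn = [[[0.0, 0.0], [0.3, 1.2]]]

print(pixel_kl(z_psp, z_fpn, T=1.0))
print(pixel_kl(z_fpn, z_psp, T=1.0))
```

The two printed values differ because the KL divergence is not symmetric, so the order of the two arguments in the equation matters.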

The final loss function is given in Eq. (6). The weighting coefficient $\alpha$ is set to 10, and $\beta$ and $\gamma$ are set to 1.

(6)
$L_{loss}=\alpha\cdot L_{2}+\beta\cdot D_{KL}+\gamma\cdot L_{CE}$
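The weighted combination in Eq. (6) amounts to simple arithmetic, sketched below together with the mean squared error of Eq. (3). The toy feature maps and the placeholder values for the KL and cross-entropy terms are hypothetical; only the weights (10, 1, 1) come from the text.

```python
def feature_mse(f_a, f_b):
    """Mean squared error between two H x W feature maps, as in Eq. (3)."""
    height, width = len(f_a), len(f_a[0])
    squared = sum(
        (f_a[h][w] - f_b[h][w]) ** 2 for h in range(height) for w in range(width)
    )
    return squared / (height * width)


def self_kd_loss(l2, d_kl, l_ce, alpha=10.0, beta=1.0, gamma=1.0):
    """Weighted sum of the three loss terms, as in Eq. (6)."""
    return alpha * l2 + beta * d_kl + gamma * l_ce


# Toy single-channel feature maps from the two branches (hypothetical values).
f_resnet = [[1.0, 2.0], [3.0, 4.0]]
f_fpn = [[1.0, 2.5], [2.0, 4.0]]

l2 = feature_mse(f_resnet, f_fpn)
# Placeholder values standing in for the KL and cross-entropy terms.
total = self_kd_loss(l2, d_kl=0.2, l_ce=1.0)
print(l2, total)
```

With a weight of 10, the feature-matching term can dominate the total even when its raw value is smaller than the cross-entropy term, as it does in this toy example.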

4.3 Mutual-KD Based on PSPNet

To apply Mutual-KD to PSPNet, two models are arranged as a branched structure, as shown in Fig. 6.

Fig. 6. PSPNet structure diagram using the Mutual-KD technique

If the two models have identical structures, however, their predictions can become too similar, so the PSP module is modified to introduce a difference. The features extracted by each ResNet are processed by the Pyramid Pooling (PP) inside the PSP module to generate the final prediction. Mutual-KD normally uses two equivalent models, but in this paper the adaptive average pooling sizes are set to 1, 2, 3, 6 for one branch and 1, 3, 5, 7 for the other so that the two outputs differ. The two final outputs then learn from their mutual difference according to Eq. (7). This loss has the same form as Eq. (5) in Section 4.2; $Z_{h,\: w}^{Psp1}$ is the prediction generated by the first PP and $Z_{h,\: w}^{Psp2}$ is the prediction generated by the second PP.

(7)
$D_{K L}=\dfrac{1}{H\times W}\sum_{h=1}^{H}\sum_{w=1}^{W}KL(\sigma(\dfrac{Z_{h,\: w}^{Psp1}}{T})||\sigma(\dfrac{Z_{h,\: w}^{Psp2}}{T}))$

Each prediction is trained with the cross entropy against the ground truth using Eq. (4) of Section 4.2. The final loss function is given in Eq. (8), with $\alpha$ and $\gamma$ both set to 1.

(8)
$L_{loss}=\alpha\cdot D_{KL}+\gamma\cdot L_{CE}$

5. Experiments and Analysis of Results

5.1 Experimental Setup

The PASCAL-VOC 2012 dataset was built for object detection and segmentation. It has 20 categories and consists of 1,464 training images and 1,449 test images. This work was implemented in PyTorch 1.7.1 and run on an NVIDIA RTX 2080 GPU.

To evaluate the semantic segmentation networks, we measured the mIoU (mean Intersection over Union) defined in Eq. (10), where C is the number of categories in the dataset.

(9)
$\mathrm{IoU} =\dfrac{TP}{TP+FP+FN}$
(10)
$\mathrm{mIoU} =\dfrac{1}{C}\sum_{c=1}^{C}\mathrm{IoU}_{c}$
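Eqs. (9) and (10) can be sketched as follows for a single pair of label maps. The function name and toy maps are assumptions; skipping classes that appear in neither map is also an assumption of this sketch, and a benchmark evaluation would normally accumulate TP, FP, and FN over the whole test set before dividing.

```python
def mean_iou(pred, target, num_classes):
    """mIoU between two H x W label maps.

    Classes absent from both maps are skipped (an assumption here) so
    that they do not cause a division by zero.
    """
    ious = []
    for c in range(num_classes):
        tp = fp = fn = 0
        for pred_row, target_row in zip(pred, target):
            for p, t in zip(pred_row, target_row):
                if p == c and t == c:
                    tp += 1  # predicted c, truly c
                elif p == c:
                    fp += 1  # predicted c, truly something else
                elif t == c:
                    fn += 1  # truly c, predicted something else
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious)


# Toy 2 x 2 label maps with 2 classes (hypothetical values).
pred = [[0, 0], [1, 1]]
target = [[0, 1], [1, 1]]

score = mean_iou(pred, target, num_classes=2)
print(score)
```

In this example class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12.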

5.2 Quantitative Results

Table 1 reports the mIoU, the mean performance over the 20 classes of the PASCAL-VOC 2012 dataset, for PSPNet and the two proposed knowledge distillation methods. Relative to PSPNet, Self KD improves mIoU by 0.56 percentage points and Mutual KD by 1.06 percentage points. This shows that Mutual KD, in which two PSPNets distill knowledge into each other, outperforms Self KD, in which a single PSPNet distills knowledge from itself. Given that it is hard to improve performance with a single model alone, these results support the proposed single-model-based knowledge distillation and its applicability.

In addition to the mean values in Table 1, Table 2 gives the detailed results for the 20 categories. Performance varies widely by category, from a maximum of about 92% for BG (background) to a minimum of about 26% for Chair. The categories are listed in descending order of Mutual-KD performance, and the best result for each category is shown in bold.

Table 1 Performance comparison of knowledge distillation methods

| Method | mIoU (%) |
| --- | --- |
| PSPNet | 68.29 |
| Self KD | 68.85 |
| Mutual KD | 69.35 |

Table 2 Category performance comparison by knowledge distillation methods (in descending order by Mutual KD performance; IoU in %, best result per category in bold)

| Category | PSPNet | Self KD | Mutual KD |
| --- | --- | --- | --- |
| BG | 91.91 | 92.15 | **92.27** |
| Bus | 84.54 | 85.45 | **86.04** |
| Plane | 81.24 | 82.54 | **83.64** |
| Cat | 81.76 | **82.83** | 81.73 |
| Bird | 79.20 | 78.62 | **79.97** |
| Cow | **78.45** | 77.04 | 77.43 |
| Car | 78.20 | 80.00 | **81.12** |
| Train | 77.96 | **78.91** | 78.46 |
| Person | 75.49 | 76.08 | **76.43** |
| Sheep | 69.98 | 72.08 | **74.57** |
| MTB | 73.75 | 74.49 | **74.90** |
| Dog | 72.86 | **74.42** | 72.43 |
| Horse | **71.00** | 70.59 | 69.18 |
| Boat | 68.13 | 67.64 | **69.42** |
| Bottle | 66.87 | 67.94 | **70.91** |
| TV | 65.16 | 64.81 | **65.24** |
| Table | 55.97 | 56.26 | **59.29** |
| Plant | **54.42** | 50.52 | 51.81 |
| Bike | 45.08 | **46.78** | 45.85 |
| Sofa | 35.22 | **39.09** | 37.00 |
| Chair | 26.77 | 27.55 | **28.61** |
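As a consistency check, the per-category values of Table 2 can be averaged and compared with the mIoU values of Table 1. The short script below does this arithmetic; the numbers are copied from Table 2 in its column order, and the variable names are chosen here for illustration. Note that the average runs over 21 entries, since the background (BG) is included alongside the 20 object categories.

```python
# Per-category IoU (%) copied from Table 2, in its column order:
# BG, Bus, Plane, Cat, Bird, Cow, Car, Train, Person, Sheep, MTB,
# Dog, Horse, Boat, Bottle, TV, Table, Plant, Bike, Sofa, Chair.
per_category = {
    "PSPNet": [91.91, 84.54, 81.24, 81.76, 79.20, 78.45, 78.20, 77.96, 75.49,
               69.98, 73.75, 72.86, 71.00, 68.13, 66.87, 65.16, 55.97, 54.42,
               45.08, 35.22, 26.77],
    "Self KD": [92.15, 85.45, 82.54, 82.83, 78.62, 77.04, 80.00, 78.91, 76.08,
                72.08, 74.49, 74.42, 70.59, 67.64, 67.94, 64.81, 56.26, 50.52,
                46.78, 39.09, 27.55],
    "Mutual KD": [92.27, 86.04, 83.64, 81.73, 79.97, 77.43, 81.12, 78.46, 76.43,
                  74.57, 74.90, 72.43, 69.18, 69.42, 70.91, 65.24, 59.29, 51.81,
                  45.85, 37.00, 28.61],
}

# Mean over all listed categories for each method.
means = {name: sum(values) / len(values) for name, values in per_category.items()}

# Gain of each distillation method over the PSPNet baseline, in percentage points.
gains = {name: means[name] - means["PSPNet"] for name in ("Self KD", "Mutual KD")}

for name, value in means.items():
    print(f"{name}: mean IoU = {value:.2f}")
for name, value in gains.items():
    print(f"{name}: gain over PSPNet = {value:.2f}")
```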

5.3 Segmentation Result Examples

The mIoU metric is a quantitative measure that depends on the number of pixels per category in an image [12], so it is affected by the sizes of the objects and the background. Numbers alone also make it hard to analyze segmentation results qualitatively. We therefore show representative result images in Fig. 7, where (a) is the input image, (b) is the ground truth, and (c), (d), and (e) are the results of PSPNet, Self-KD, and Mutual-KD, respectively.

Fig. 7. Example of segmentation result images

From top to bottom, the rows show Bus, Bird, Car, Cow, Horse, and Person, and the segmented region of each object is drawn in the color of its category. Red dashed circles mark the main differences from the ground truth, and in these regions the methods differ clearly. For example, in the Bus row PSPNet (c) predicts part of the road as Bus, whereas Self-KD (d) and Mutual-KD (e) separate the boundary clearly and recognize only the actual vehicle region. In the Cow row, PSPNet (c) mistakes the cow's patches for Horse, whereas Self-KD (d) and Mutual-KD (e) correctly predict Cow. These results show that applying knowledge distillation improves the output qualitatively. In the Bird row, Self-KD (d), like PSPNet (c), barely detects the bird at the top of the image, whereas Mutual-KD (e) recognizes a substantial part of it.

In addition, we carried out a supplementary evaluation with Grad-CAM [13], which is used to identify the regions that most influence the final prediction. Representative Grad-CAM results are shown in Fig. 8, where (a) is the input image, (b) is the ground truth, and (c), (d), and (e) are the Grad-CAM heatmaps of PSPNet, Self-KD, and Mutual-KD, respectively.

For example, in the TV/Monitor category the Grad-CAM heatmap of PSPNet (c) is activated across the whole image, whereas the heatmaps of Self-KD (d) and Mutual-KD (e) concentrate on the object. Here too, Mutual-KD (e) focuses on the object region somewhat more tightly than Self-KD (d). These results indicate that the models trained with knowledge distillation base their predictions on more meaningful image information than PSPNet (c).

Fig. 8. Grad-CAM results for segmentation

6. Conclusion

In this paper, we evaluated two knowledge distillation methods for semantic segmentation, 1) Self KD using the output of an FPN and 2) Mutual KD using the same model, on the PASCAL-VOC 2012 dataset. The quantitative evaluation used mIoU, and the qualitative evaluation used segmentation examples and Grad-CAM heatmaps. The quantitative evaluation confirmed that the proposed single-model-based Self-KD and Mutual-KD outperform PSPNet, and the qualitative evaluation confirmed the validity of applying knowledge distillation with only a single model to the segmentation task. In future work we aim to improve the distillation structure and achieve larger performance gains.

References

[1] T. Zhou, W. Xia, F. Zhang, B. Chang, W. Wang, Y. Yuan, E. Konukoglu and D. Cremers, "Image Segmentation in Foundation Model Era: A Survey," arXiv preprint arXiv:2408.12957, 2024.
[2] L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy and A. L. Yuille, "DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 40, no. 4, pp. 834-848, 2018. DOI: 10.1109/TPAMI.2017.2699184
[3] O. Ronneberger, P. Fischer and T. Brox, "U-Net: Convolutional networks for biomedical image segmentation," Proc. Int. Conf. Medical Image Computing and Computer-Assisted Intervention (MICCAI), pp. 234-241, 2015. DOI: 10.1007/978-3-319-24574-4_28
[4] V. Badrinarayanan, A. Kendall and R. Cipolla, "SegNet: A deep convolutional encoder-decoder architecture for image segmentation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 12, pp. 2481-2495, 2017. DOI: 10.1109/TPAMI.2016.2644615
[5] H. Zhao, J. Shi, X. Qi, X. Wang and J. Jia, "Pyramid scene parsing network," in Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR), pp. 2881-2890, 2017. DOI: 10.1109/CVPR.2017.660
[6] S. C. Yurtkulu, Y. H. Şahin and G. Unal, "Semantic Segmentation with Extended DeepLabv3 Architecture," 2019 27th Signal Processing and Communications Applications Conference (SIU), Sivas, Turkey, pp. 1-4, Apr. 2019. DOI: 10.1109/SIU.2019.8806244
[7] G. Hinton, O. Vinyals and J. Dean, "Distilling the knowledge in a neural network," Neural Information Processing Systems (NIPS) Workshop, 2014.
[8] Y. Zhang, T. Xiang, T. Hospedales and H. Lu, "Deep mutual learning," in Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR), pp. 4320-4328, 2018. DOI: 10.1109/CVPR.2018.00454
[9] S. H. Lee, D. H. Kim and B. C. Song, "Self-supervised knowledge distillation using singular value decomposition," in Proc. European Conf. Computer Vision (ECCV), pp. 335-350, 2018. DOI: 10.1007/978-3-030-01246-5_21
[10] B. Cheng, A. Schwing and A. Kirillov, "Per-pixel classification is not all you need for semantic segmentation," Advances in Neural Information Processing Systems (NeurIPS), 2021.
[11] F. Chollet, "Xception: Deep learning with depthwise separable convolutions," in Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR), pp. 1251-1258, 2017. DOI: 10.1109/CVPR.2017.195
[12] Z. Wang, M. Berman, A. Rannen-Triki, P. Torr, D. Tuia, T. Tuytelaars, et al., "Revisiting evaluation metrics for semantic segmentation: Optimization and evaluation of fine-grained intersection over union," Neural Information Processing Systems (NeurIPS), 2023.
[13] R. R. Selvaraju, M. Cogswell, A. Das, R. Vedantam, D. Parikh and D. Batra, "Grad-CAM: Visual explanations from deep networks via gradient-based localization," in Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR), pp. 618-626, 2017. DOI: 10.1109/CVPR.2017.74

About the Authors

Mingyu Kim (김민규)

He is currently pursuing his BS degree in Electronics and Computer Engineering at Seokyeong University. His research interests include deep learning and computer vision.

Gyeongsu Kim (김경수)

He is currently pursuing his BS degree in Electronics and Computer Engineering at Seokyeong University. His research interests include deep learning and computer vision.

Kisung Seo (서기성)

He received the BS, MS, and Ph.D. degrees in Electrical Engineering from Yonsei University, Seoul, Korea, in 1986, 1988, and 1993, respectively. He joined the Genetic Algorithms Research and Applications Group (GARAGe), Michigan State University, from 1999 to 2002 as a Research Associate. He was also appointed Visiting Assistant Professor in Electrical & Computer Engineering, Michigan State University, from 2002 to 2003. He was a Visiting Scholar at the BEACON (Bio/computational Evolution in Action CONsortium) Center, Michigan State University, from 2011 to 2012. He is currently a Professor of Electronics Engineering at Seokyeong University. His research interests include deep learning, evolutionary computation, computer vision, and intelligent robotics.