The Transactions P of the Korean Institute of Electrical Engineers




Keywords: Attention Mechanism, Code Generation, Screen Images, Convolutional Neural Network (CNN), Domain-Specific Language (DSL)

1. Introduction

Web applications are generally composed of a client tier, an application logic tier, and a database tier. The client tier is the module that handles user input and output through a web browser, and its key elements are the components of the web page, styles such as color and shape, and the screen layout. A poorly composed screen can reduce the usability of a website and thereby degrade the quality of its service.

Web page screens are usually designed by web designers, who are not IT specialists, and web developers then write the web programs from the resulting screen images. Developing source code such as HTML from a screen image is time-consuming and error-prone, so automatic code generation could improve development productivity.

This paper presents a deep learning network model that automatically generates source code from web page screen images in order to raise the productivity of client-module development. Deep learning models are used to process both the image and the textual code so that source code matching the input image is generated automatically. A convolutional neural network (CNN), which excels at image processing, extracts features of the web page components, styles, and screen layout contained in the image, and intermediate text-based source code is generated from those features. The intermediate language is a Domain-Specific Language (DSL), a programming language designed for a specific purpose; in the image-to-code problem domain it serves as an intermediate form of the code to be generated and is more concise and intuitive than HTML. A conversion program then translates the generated DSL code into web page code such as HTML and CSS. Depending on the complexity of the image, the generated source code may be incomplete, but providing a draft version of the code should still be more productive than developing the code from the screen image from scratch. This paper introduces Pix2code-ATT, a model that adds an attention layer [2] to Pix2code [1], a representative image-to-code system. The applicability of the proposed model is also examined using a newly extended dataset.

Image-to-code conversion continues to attract research interest, and applying newly proposed deep learning techniques can be expected to yield more effective results. In particular, similar techniques can be applied not only to web page screens but also to the screens of mobile systems. Screen components, layout managers, and layout techniques differ in detail across mobile platforms, but the general principles can be shared.

The rest of this paper is organized as follows. Section 2 describes related work on image-to-code conversion, and Section 3 introduces the proposed model for automatically generating code from screen images. Section 4 presents the experimental results and evaluates the performance of the proposed model. Section 5 concludes and outlines directions for future research.

2. Related Work

Image-to-code conversion resembles the image captioning problem in deep learning: just as a captioning model generates text that describes a given image, an image-to-code model generates source code for the image. The Pix2code model published by Tony Beltramelli is a representative network model that uses deep learning to generate program code from images, and subsequent studies in the same category have used or modified the Pix2code model or its dataset. Pix2code automatically generates DSL code from a web page image, and the DSL code is then automatically converted to HTML by a conversion module. Pix2code consists of a CNN for image processing and an LSTM model for processing the DSL code, which is a text sequence.

Modified encoder-decoder models have been proposed to improve the performance of Pix2code. Y. Liu et al. [3] used a BiLSTM model in place of the LSTM layers in the Pix2code decoder module. W. Zhang et al. [4] proposed the Rapid Combined Model to shorten the training time of existing encoder-decoder image-to-code models and to increase the accuracy of the generated code.

Image-to-code models are supervised learning models that are strongly affected by the training dataset, so a high-quality dataset is one of the key factors that determine model performance. Hand-crafted datasets [5] and black-and-white sketch images [6, 7] have also been used as training and test data.

Besides encoder-decoder deep learning models, solutions to the image-to-code problem have been proposed that combine object detection algorithms such as Faster R-CNN and YOLO with layout algorithms [8, 9, 10, 11]. Components contained in a web page image, such as buttons, text, headers, and forms, are identified by the object detection algorithm, and the overall structure of the page is extracted by the layout algorithm. K. Kikuchi et al. [12] propose an optimization-based hierarchical layout model that focuses on container components that enclose other components rather than on flat structures. D. S. Baulé et al. [13] compare the strengths and weaknesses of existing machine learning and deep learning based image-to-code algorithms, models, training data types, and evaluation methods. Image-to-code models can be applied to Android and iOS as well as to web pages. Although the components that make up a screen differ across mobile platforms, the methods used for web pages should be applicable [14].

3. The Pix2code-ATT Model

This section describes Pix2code-ATT, which adds an attention module to the Pix2code model for image-to-code conversion.

3.1 Overview of Pix2code-ATT

Fig. 1 shows the image-to-code conversion model, which generates the code corresponding to an image. The given web image is simpler than a real web page image, and the generated code is the DSL code that matches the image. The DSL is a programming language designed for a specific purpose; it is an intermediate form of the code that is ultimately generated and is more concise and intuitive than HTML. Generating DSL code rather than HTML directly from the image simplifies code generation and also aids extensibility, because programming languages other than HTML can be targeted later. For example, the approach can be extended to Android or iOS code, and the DSL itself can be extended to suit the target language. The conversion of DSL code to HTML code is handled by predefined templates.
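The template-based DSL-to-HTML step can be sketched as follows. This is only a minimal illustration, not the converter used in the paper: the token names, the `TEMPLATES` table, and the HTML snippets are assumptions chosen for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical template table: DSL token -> HTML snippet.
# "{}" marks where the compiled children of a container token are placed.
TEMPLATES: Dict[str, str] = {
    "header": '<div class="header">{}</div>',
    "row": '<div class="row">{}</div>',
    "double": '<div class="col-6">{}</div>',
    "small-title": "<h4>Title</h4>",
    "text": "<p>Text</p>",
    "btn": '<button class="btn">Button</button>',
}


def tokenize(dsl: str) -> List[str]:
    """Split DSL text into tokens; braces are tokens, commas are separators."""
    for symbol in "{}":
        dsl = dsl.replace(symbol, f" {symbol} ")
    return dsl.replace(",", " ").split()


def _compile(tokens: List[str], i: int) -> Tuple[str, int]:
    """Compile sibling tokens starting at index i until a '}' or the end."""
    parts: List[str] = []
    while i < len(tokens) and tokens[i] != "}":
        name = tokens[i]
        i += 1
        if i < len(tokens) and tokens[i] == "{":
            inner, i = _compile(tokens, i + 1)
            if i >= len(tokens):
                raise ValueError(f"missing closing brace for '{name}'")
            i += 1  # skip the matching "}"
            parts.append(TEMPLATES[name].format(inner))
        else:
            parts.append(TEMPLATES[name])
    return "".join(parts), i


def compile_dsl(tokens: List[str]) -> str:
    """Render a DSL token list to HTML by recursive template substitution."""
    html, i = _compile(tokens, 0)
    if i != len(tokens):
        raise ValueError("unexpected closing brace")
    return html


print(compile_dsl(tokenize("row { double { small-title, text, btn } }")))
```

Because the templates are plain data, targeting another output format would in principle only require a different table, which is the extensibility argument made above. The `ValueError` for an unclosed container shows how a DSL sequence with a missing brace would surface at this stage.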

Fig. 1. Generating Source Code from Webpage Images

3.2 Architecture of Pix2code-ATT

Fig. 2 shows the architecture of the Pix2code-ATT model, which adds an attention module on top of the encoder and decoder of Pix2code. Image data is processed by the CNN and DSL code by an LSTM module. The extracted image features and DSL features are joined by feature concatenation and then fed to the LSTM module that acts as the decoder. From this input, the LSTM decoder generates the DSL tokens that correspond to the image components. The added attention module is intended to address the long-term dependency problem of the LSTM [15]: although the LSTM was proposed as a solution to the long-term dependency problem of RNNs, it still suffers from the same problem. The attention module assigns a weight to each token in the input sequence so that the model focuses more on important information and less on unimportant information, and it is therefore expected to handle long-term dependencies more effectively than the LSTM alone.

Fig. 2. The Overall Architecture of Pix2code-ATT

Fig. 3 shows the layers and output shapes of the CNN module that processes web page images in Pix2code-ATT. Four Conv2D layers provided by the Keras framework are used to extract features from the screen image. Max pooling is used to reduce the spatial dimensions of the input data, and dropout layers are used to prevent overfitting. The feature maps extracted by the four Conv2D layers are converted to a one-dimensional vector by a flatten layer and passed to a fully connected layer.
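How the output shapes evolve through such a stack can be traced with a few lines of arithmetic. The sketch below is illustrative only: the 256x256x3 input, the 3x3 unpadded kernels, the filter counts (32, 32, 64, 64), and the placement of a 2x2 max pooling after every second convolution are assumptions, since the text gives only the number of Conv2D layers.

```python
from typing import List, Tuple

Shape = Tuple[int, int, int]  # (height, width, channels)


def conv2d_shape(shape: Shape, filters: int, kernel: int = 3, stride: int = 1) -> Shape:
    """Output shape of an unpadded ('valid') 2-D convolution."""
    h, w, _ = shape
    return ((h - kernel) // stride + 1, (w - kernel) // stride + 1, filters)


def max_pool_shape(shape: Shape, pool: int = 2) -> Shape:
    """Output shape of non-overlapping max pooling."""
    h, w, c = shape
    return (h // pool, w // pool, c)


def trace_shapes(input_shape: Shape) -> List[Tuple[str, Tuple[int, ...]]]:
    """Return (layer name, output shape) pairs for a hypothetical 4-conv stack."""
    shape = input_shape
    trace: List[Tuple[str, Tuple[int, ...]]] = []
    for block, filters in enumerate((32, 64), start=1):
        for n in (1, 2):
            shape = conv2d_shape(shape, filters)
            trace.append((f"conv{block}_{n}", shape))
        shape = max_pool_shape(shape)
        # A dropout layer would follow here; it leaves the shape unchanged.
        trace.append((f"pool{block}", shape))
    h, w, c = shape
    trace.append(("flatten", (h * w * c,)))
    return trace


for name, shape in trace_shapes((256, 256, 3)):
    print(f"{name:10s} {shape}")
```

The flattened length is what the first fully connected layer receives, so the pooling layers are what keep the parameter count of that layer manageable.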

Fig. 3. The Architecture of the CNN in Pix2code-ATT

Fig. 4 shows a visual representation of the LSTM cell used in Pix2code-ATT together with the corresponding equations [16]. The LSTM is characterized by a gating mechanism that regulates the flow of information through the input gate (i_t), the forget gate (f_t), and the output gate (o_t). ϕ and σ are the activation functions, denoting the sigmoid and the hyperbolic tangent, respectively. The cell state (c_t) acts as a memory that is carried from the previous step to the current step; the information regulated by the input gate and the forget gate is summed to produce the new cell state.
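Since the equations of Fig. 4 are not reproduced in the text, the standard LSTM formulation with a forget gate is given here as a reference sketch. It follows the convention stated above (ϕ for the sigmoid, σ for the hyperbolic tangent); the weight and bias symbols W, U, and b, the input x_t, the hidden state h_t, and the element-wise product ⊙ are notational assumptions and may differ from the figure.

```latex
\begin{aligned}
i_t &= \phi\left(W_i x_t + U_i h_{t-1} + b_i\right) \\
f_t &= \phi\left(W_f x_t + U_f h_{t-1} + b_f\right) \\
o_t &= \phi\left(W_o x_t + U_o h_{t-1} + b_o\right) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \sigma\left(W_c x_t + U_c h_{t-1} + b_c\right) \\
h_t &= o_t \odot \sigma\left(c_t\right)
\end{aligned}
```

The fourth line is the sum described in the text: the forget gate scales the previous cell state, the input gate scales the new candidate, and the two terms are added to form the new cell state.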

Fig. 4. The Architecture of the Long Short-Term Memory (LSTM) Cell

Fig. 5. The Architecture of the Transformer Block

Fig. 5 shows the structure of the transformer block [17] used in the Pix2code-ATT model. The transformer block consists of MultiHeadAttention, LayerNormalization, Dense, and Dropout layers. The MultiHeadAttention layer implements self-attention and performs multi-head attention on the input data. The hyperparameter num_heads specifies the number of attention heads; eight heads are used in the experiments, meaning that eight attention operations run in parallel. Each head computes its own attention weights over the input and produces its attention output as a weighted sum. After all parallel attention operations are complete, the outputs of all heads are concatenated. The LayerNormalization layer normalizes its input using the mean and standard deviation of the input data. A Dense layer applies a fully connected network to its input, with ReLU as the activation function. A Dropout layer after the Dense layer randomly drops part of its input to help prevent overfitting.
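The per-head weighting and the final concatenation can be illustrated in plain Python. This is a deliberately simplified sketch, not the Keras layer: it omits the learned query, key, value, and output projections, uses the input itself as queries, keys, and values, and gives each head one slice of the feature dimension. The function names are hypothetical.

```python
import math
from typing import List

Matrix = List[List[float]]  # one row per token


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax; the returned weights sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x: Matrix) -> Matrix:
    """Scaled dot-product self-attention with queries = keys = values = x."""
    dim = len(x[0])
    scale = math.sqrt(dim)
    out: Matrix = []
    for query in x:
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in x]
        weights = softmax(scores)
        # Each output row is the attention-weighted sum of all value rows.
        out.append([sum(w * row[j] for w, row in zip(weights, x)) for j in range(dim)])
    return out


def multi_head_self_attention(x: Matrix, num_heads: int) -> Matrix:
    """Run attention independently on num_heads feature slices, then concatenate."""
    dim = len(x[0])
    if dim % num_heads != 0:
        raise ValueError("feature dimension must be divisible by num_heads")
    size = dim // num_heads
    heads = [
        self_attention([row[h * size:(h + 1) * size] for row in x])
        for h in range(num_heads)
    ]
    return [[value for head in heads for value in head[t]] for t in range(len(x))]


tokens = [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0]]
print(multi_head_self_attention(tokens, num_heads=2))
```

In the Keras `MultiHeadAttention` layer the slices are replaced by learned projections for each head, but the flow is the same: independent weighted sums per head, followed by concatenation.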

4. Performance Evaluation of the Pix2code-ATT Model

Experiments were conducted to verify the applicability of the Pix2code-ATT model to the image-to-code problem. The code generation experiments are divided into Experiment I and Experiment II according to the dataset used. Experiment I used the basic dataset provided with the Pix2code model, and Experiment II added new data to the basic dataset. The new data was synthesized by adding a new row to samples from the basic data to make them more complex. The experiments used Python 3.8 and TensorFlow 2.12 and were run on an Intel Core i9-10900 CPU at 3.7 GHz with an NVIDIA GeForce RTX 3090 GPU (24 GB).

4.1 Experiment I: Performance Evaluation on the Pix2code Dataset

Experiment I uses the basic dataset provided with Pix2code. A total of 1,219 images and their corresponding DSL code were used to train the Pix2code-ATT model, and 523 images were used to test it. The HTML code generated from the images follows Bootstrap CSS. Each image represents one web page screen and consists of a header, several rows, and a footer. Each component is laid out with an HTML div tag. The header consists of several buttons, each of which is either active or inactive. A row consists of one, two, or four columns, and each column consists of a small-title, a text, and a button. Although the screens shown in the images are simpler than real web pages, they are judged adequate for evaluating the performance of the proposed model.

Fig. 6 shows the number of row components in the training and test datasets. Depending on the DSL file, the number of rows ranges from one to three. Images with three rows account for 76% of the training images and 80% of the test images. The more row components an image contains, the more complex it is and the more likely it is that part of the image will not be recognized. Partial recognition failure leaves the generated code incomplete. With one, two, or three rows, the rows contain between 3 and 36 components in total. Recognition accuracy for an image can therefore drop when it has many rows and many components within them.

Fig. 6. The Number of Rows in the Training and Testing Datasets

Fig. 7 shows the training loss of the Pix2code and Pix2code-ATT models on the basic dataset as a function of the epoch. The training loss falls sharply from the start of training until epoch 5, and the accuracy stays at roughly the 9.5 level. After that, the training loss changes little and holds a nearly constant value up to epoch 70. In other words, both models are trained sufficiently in the early epochs, and after epoch 50 their training losses are almost the same.

Fig. 7. Training Loss as a Function of the Epoch

To evaluate the two models, the error rate between the DSL code generated by each model and the ground-truth code was measured. The error rate metric is used to assess the DSL code generation of the proposed model in terms of accuracy; it quantifies the errors in the generated DSL code relative to the ground-truth DSL code [18]. The error rate lies in [0, 1]. When the generated code matches the ground truth exactly, there are no errors and the error rate is 0. When the generated code differs in length from the ground-truth code, or when every generated DSL token is different, the maximum error rate of 1 is assigned. When the two sequences have the same length, each token of the generated code is compared with the corresponding ground-truth token and is assigned 0 if they are identical and 1 otherwise. This comparison is applied to every token.
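The error-rate rule described above can be sketched as a short function. The text specifies the per-token 0/1 scores and the [0, 1] range; averaging the per-token scores over the sequence length is an assumption made here to obtain a single value in that range, and the function name is hypothetical.

```python
from typing import Sequence


def error_rate(generated: Sequence[str], truth: Sequence[str]) -> float:
    """Token-level error rate in [0, 1] between generated and ground-truth DSL.

    A length mismatch is scored as the maximum error of 1. Otherwise each
    position scores 0 (same token) or 1 (different token), and the scores
    are averaged (the averaging step is an assumption of this sketch).
    """
    if len(generated) != len(truth):
        return 1.0
    if not truth:
        return 0.0  # two empty sequences match exactly
    mismatches = sum(1 for g, t in zip(generated, truth) if g != t)
    return mismatches / len(truth)


truth = ["row", "{", "text", "}"]
print(error_rate(["row", "{", "text", "}"], truth))  # identical sequences
print(error_rate(["row", "{", "btn", "}"], truth))   # one of four tokens differs
print(error_rate(["row", "{", "text"], truth))       # missing closing brace
```

Note how strict the length rule is: a sequence that is correct except for one missing token is scored the same as a completely wrong one.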

Table 1 lists the error rates of the code generated by the Pix2code-ATT model for different numbers of heads (2, 4, and 8) in the multi-head attention layer. Based on these results, the number of heads is set to 8, which gives the lowest error rate.

Table 1. The Error Rates (%) of Pix2code-ATT in terms of the Number of Heads in the Attention Layer over the Basic Pix2code Dataset

#Epoch    2 Heads    4 Heads    8 Heads
10        19.90      16.35      20.16
20        14.03      15.93      29.24
30        18.17      22.03      26.28
40        15.77      15.58      11.91
50        18.05      12.78       6.81

Table 2 compares the error rates of the proposed 8-head Pix2code-ATT and the original Pix2code. Each model was trained on 1,219 images and the corresponding DSL code, and the error rate was measured on the 523 test images. Pix2code reached its minimum error rate of 10.92% at epoch 60, whereas Pix2code-ATT reached 6.81% at epoch 50, showing that it outperforms Pix2code.

Table 2. Comparison of the Error Rates (%) of Pix2code and Pix2code-ATT in terms of the Epoch over the Basic Dataset

#Epoch    Pix2code    Pix2code-ATT
10        24.74       20.16
20        27.63       29.24
30        26.04       26.28
40        12.95       11.91
50        14.73        6.81
60        10.92       12.61
70        17.49       15.02
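The best epoch per model can be read off the Table 2 values with a few lines of code. The numbers below are copied from the table; the relative reduction is a derived quantity that is not reported in the paper and is computed here only for illustration.

```python
from typing import Dict, Tuple

# Error rates (%) from Table 2: epoch -> (Pix2code, Pix2code-ATT).
TABLE_2: Dict[int, Tuple[float, float]] = {
    10: (24.74, 20.16),
    20: (27.63, 29.24),
    30: (26.04, 26.28),
    40: (12.95, 11.91),
    50: (14.73, 6.81),
    60: (10.92, 12.61),
    70: (17.49, 15.02),
}


def best_epoch(column: int) -> Tuple[int, float]:
    """Return (epoch, error rate) with the lowest error in the given column."""
    epoch = min(TABLE_2, key=lambda e: TABLE_2[e][column])
    return epoch, TABLE_2[epoch][column]


base_epoch, base_err = best_epoch(0)  # Pix2code
att_epoch, att_err = best_epoch(1)    # Pix2code-ATT
relative_reduction = (base_err - att_err) / base_err

print(f"Pix2code best:     epoch {base_epoch}, {base_err:.2f}%")
print(f"Pix2code-ATT best: epoch {att_epoch}, {att_err:.2f}%")
print(f"Relative reduction of the best error rate: {relative_reduction:.1%}")
```

The comparison is between each model's best epoch; at epochs 20, 30, and 60 the table shows Pix2code with the lower error rate, so the advantage is not uniform across epochs.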

4.2 Experiment II: Performance Evaluation on a More Complex Synthetic Dataset

Experiment II uses a dataset that combines the dataset from Experiment I with additionally generated images and code. As shown in Fig. 8, a new row is appended as the last row of a sample from the existing dataset. The newly added rows consist of one, two, or four columns, and each column consists of a button and text. In Fig. 8, the original image had three rows, and a new image with a fourth row was generated. The content of the added row is chosen at random, and the figure also shows the corresponding DSL code.
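On the DSL side, the synthesis step can be sketched as appending one randomly chosen row to an existing token sequence. The token names (`single`, `double`, `quadruple`, `text`, `btn`) and the helper functions are assumptions for illustration, and rendering the matching image is outside the scope of this sketch.

```python
import random
from typing import List

# Hypothetical DSL keywords for rows with 1, 2, or 4 columns.
COLUMN_TOKENS = {1: "single", 2: "double", 4: "quadruple"}


def make_row(rng: random.Random) -> List[str]:
    """Build the tokens of one row with 1, 2, or 4 columns (text and button each)."""
    n_columns = rng.choice([1, 2, 4])
    tokens = ["row", "{"]
    for _ in range(n_columns):
        tokens += [COLUMN_TOKENS[n_columns], "{", "text", ",", "btn", "}"]
    tokens.append("}")
    return tokens


def append_row(dsl_tokens: List[str], rng: random.Random) -> List[str]:
    """Return a new token list with one random row appended; the input is untouched."""
    return list(dsl_tokens) + make_row(rng)


rng = random.Random(0)  # fixed seed so the synthetic sample is reproducible
base = ["header", "{", "btn", "}", "row", "{", "single", "{", "text", "}", "}"]
print(" ".join(append_row(base, rng)))
```

Seeding the generator keeps the extended dataset reproducible, which matters when error rates are compared across training runs.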

Fig. 8. The Added Row (Marked with a Rounded Rectangle in the Image) and Its Corresponding DSL Code

Fig. 9 shows the distribution of the number of rows in the extended dataset. The training and test data are split 8:2, giving 2,368 and 592 samples, respectively. Adding a fourth row to the basic dataset increased the data by a total of 1,218 samples. The images with a fourth row are more complex than the original images.
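The dataset sizes quoted above are mutually consistent, as a short check shows. The counts come from the text; performing the 8:2 split with integer arithmetic is an assumption about how the split was computed.

```python
# Counts reported for the basic dataset (Experiment I) and the added samples.
base_train, base_test = 1219, 523
added = 1218

total = base_train + base_test + added  # size of the extended dataset
train = total * 8 // 10                 # 8:2 split, integer arithmetic
test = total - train

print(f"total={total}, train={train}, test={test}")
```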

Fig. 9. The Number of Rows in the Extended Training and Testing Datasets

For the newly added rows, the generated code sometimes failed to capture part of the content, such as button information, or omitted the closing brace (}) of the row (see Fig. 10). In other cases a quadruple row was recognized as a double, so the generated code was shorter than the ground-truth code. The longer DSL code appears to have reduced accuracy during training and testing; fine-tuning the hyperparameters of the proposed model or adding new model layers may improve performance.

Fig. 10. The Failure Cases of DSL Code Generation

Table 3 lists the error rates of the DSL code generated by Pix2code-ATT on the extended dataset. After epoch 5, the error rate stays within a fairly stable range. This is judged to reflect the large drop in training loss that occurs by epoch 5.

Table 3. The Error Rate (%) of Pix2code-ATT in terms of the Epoch on the Extended Dataset

#Epoch    Error (%)
5         32.79
10        18.81
20        26.66
30        23.58
40        20.47
50        20.23
60        19.95
70        21.16

5. Conclusion and Future Work

The code generation model presented in this paper is based on Tony Beltramelli's Pix2code model and dataset. An attention layer was added to the existing model, which consists of a CNN and an LSTM and automatically generates DSL code from web page images. Experiments also showed that the proposed model retains meaningful performance when the dataset changes. If the proposed Pix2code-ATT model can automatically generate even part of the HTML code, it is expected to raise the productivity of website UI development. Automatic code generation is also expected to narrow the gap in programming skill between web designers, who are not professional programmers, and web page developers.

As future work, building a dataset of realistic web page images would make it possible to develop a more reliable and practical Pix2code-ATT model. The proposed approach is also expected to be applicable to mobile applications, where the graphical user interface is important.

Acknowledgements

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (No. 2021R1I1A3056172).

References

[1] T. Beltramelli, "Pix2code: Generating code from a graphical user interface screenshot," ArXiv, 2017. [Online]. Available: https://arxiv.org/abs/1705.07962
[2] Z. Zhang, Y. Ding, and C. Huang, "Automatic Front-end Code Generation from image Via Multi-Head Attention," 2023 4th International Conference on Computer Engineering and Application (ICCEA), IEEE, 2023.
[3] Y. Liu, Q. Hu, and K. Shu, "Improving pix2code based Bi-directional LSTM," In 2018 IEEE International Conference on Automation, Electronics and Electrical Engineering (AUTEEE), IEEE, 2018.
[4] W. Zhang, S. Luan, and L. Tian, "A Rapid Combined Model for Automatic Generating Web UI Codes," Wireless Communications and Mobile Computing, 2022.
[5] S. Riaz, A. Arshad, S. S. Band, and A. Mosavi, "Transforming hand drawn wireframes into front-end code with deep learning," Computers, Materials & Continua, vol. 72, no. 3, pp. 4303-4321, 2022.
[6] A. Robinson, "Sketch2code: Generating a website from a paper mockup," ArXiv, 2019. [Online]. Available: https://arxiv.org/abs/1905.13750
[7] V. Jain, P. Agrawal, S. Banga, R. Kapoor, and S. Gulyani, "Sketch2Code: Transformation of Sketches to UI in Real-time Using Deep Neural Network," ArXiv, 2019. [Online]. Available: https://arxiv.org/abs/1910.08930v1
[8] T. Bouças and A. Esteves, "Converting Web Pages Mockups to HTML using Machine Learning," In Proceedings of the 16th International Conference on Web Information Systems and Technologies, 2020.
[9] Y. Xu, L. Bo, X. Sun, B. Li, J. Jiang, and W. Zhou, "image2emmet: Automatic code generation from web user interface image," Journal of Software: Evolution and Process, 2021.
[10] J. S. Ferreira, A. Restivo, and H. S. Ferreira, "Automatically Generating Websites from Hand-drawn Mockups," In Proceedings of the 16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, 2021.
[11] X. Yao, "Automatic GUI Code Generation with Deep Learning," PhD dissertation, Manchester Metropolitan University, 2022.
[12] K. Kikuchi, M. Otani, K. Yamaguchi, and E. Simo-Serra, "Modeling Visual Containment for Web Page Layout Optimization," Computer Graphics Forum, vol. 40, no. 7, 2021.
[13] D. de Souza Baulé, C. G. von Wangenheim, A. von Wangenheim, and J. C. Hauck, "Recent Progress in Automated Code Generation from GUI Images Using Machine Learning Techniques," J. Universal Computer Science, 26(9), 2020.
[14] C. Chen, T. Su, G. Meng, Z. Xing, and Y. Liu, "From UI Design Image to GUI Skeleton: A Neural Machine Translator to Bootstrap Mobile GUI Implementation," 2018 IEEE/ACM 40th International Conference on Software Engineering (ICSE), Gothenburg, Sweden, 2018.
[15] S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural computation, 9(8):1735-1780, 1997.
[16] F. A. Gers, J. Schmidhuber, and F. Cummins, "Learning to forget: Continual prediction with lstm," Neural computation, 12(10), 2000.
[17] Text classification with Transformer, https://keras.io/examples/nlp/text_classification_with_transformer/, Accessed July 1, 2023.
[18] X. Pang, Y. Zhou, P. Li, W. Lin, W. Wu, and J. Wang, "A novel syntax-aware automatic graphics code generation with attention-based deep neural network," Journal of Network and Computer Applications, 2020.

About the Author

Dong Kwan Kim

In 1993 and 1998, respectively, he earned his B.S. and M.S. degrees in the Department of Computer Engineering at Soongsil University. He later completed his Ph.D. in Computer Science at Virginia Tech in 2009, specializing in software engineering. He currently holds the position of an associate professor in the Department of Computer Engineering at Mokpo National Maritime University. His research interests include deep learning, software evolution, run-time systems, and mobile programming. E-mail: dongkwan@mmu.ac.kr