
Category 26

Blog Update I always start writing posts full of good intentions, but most of them seem to fizzle out unfinished.. Once the semester ends, I should organize what I have studied properly on the blog so that I can quickly refresh my memory when I need to.. :) On that note.. please let my semester end soon ㅠㅠㅠㅠ 2024. 6. 14.
[Overview] Image Formation Image Formation : Projection of 3D scene onto 2D plane : we need to understand the geometric and photometric relations between the scene and the image - geometric : given a point in the scene, how that point is represented in the image - photometric : how the brightness and appearance of the scene are represented in the image Topics : (1) Pinhole and Perspective Projection - we will look at the pinhole camera, the most basic camera and one with a long history - it is of course a camera with many advantages (can produce great clarity), but when it comes to gathering light, a problem .. 2024. 4. 6.
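The geometric view in the excerpt above (where a scene point lands in the image) can be sketched with the standard ideal pinhole relation x = f·X/Z, y = f·Y/Z. This is only a minimal sketch: the function name `project_pinhole`, the focal length values, and the sample points are assumptions for illustration, not taken from the post.

```python
def project_pinhole(point, focal_length=1.0):
    """Project a 3D point (X, Y, Z), given in camera coordinates,
    onto the image plane of an ideal pinhole camera.

    Assumption: the pinhole is at the origin and the camera looks
    along the +Z axis; focal_length is a hypothetical value.
    """
    X, Y, Z = point
    if Z <= 0:
        raise ValueError("point must lie in front of the camera (Z > 0)")
    # Perspective projection: image coordinates shrink with depth Z.
    return (focal_length * X / Z, focal_length * Y / Z)


# The same (X, Y) offset at two different depths.
near = project_pinhole((1.0, 2.0, 4.0), focal_length=2.0)
far = project_pinhole((1.0, 2.0, 8.0), focal_length=2.0)
print(near, far)
```

Doubling the depth halves both image coordinates, which is the sense in which a perspective projection makes distant objects appear smaller.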
[Paper Presentation] NeRF : Representing Scenes as Neural Radiance Fields for View Synthesis I recently read the NeRF paper, so I will briefly explain what NeRF is and then present its architecture and implementation in detail. NeRF stands for Neural Radiance Field. The title says that NeRF is used to perform View Synthesis; here, view synthesis is the task of training on photos of an object taken from several views and then working out what the object looks like from a new view. In earlier research, view synthesis either performed poorly or required large datasets and therefore far too much memory. So this paper proposes "NeRF", a new approach that can solve these problems.. 2024. 3. 13.
Lecture 2. Image Classification Image Classification What is Image Classification? Example : Input : a photo of a cat The computer takes a predetermined set of labels and computes which label matches the input, producing that label as the output. Output : Cat Semantic Gap Definition : the gap between the meaning an image actually carries and the pixel values the computer sees. We can easily look at a cat and classify it as a "cat", but to a computer a single image looks like a gigantic grid of numbers, so it cannot immediately associate the image with a cat. Challenges : Viewpoin.. 2024. 2. 16.
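To make the "gigantic grid of numbers" point concrete, here is a small sketch of why the semantic gap is hard. The 4x4 grid, its pixel values, and the helper names `shift_right` and `count_changed` are all made up for illustration; a real photo would simply be a much larger grid.

```python
# A hypothetical 4x4 grayscale "photo": each number is a pixel intensity (0-255).
image = [
    [12, 40, 41, 10],
    [35, 200, 210, 38],
    [30, 205, 198, 33],
    [11, 37, 36, 14],
]


def shift_right(img, fill=0):
    """Return a copy of img with every row shifted one pixel to the right."""
    return [[fill] + row[:-1] for row in img]


def count_changed(a, b):
    """Count pixel positions whose values differ between two same-sized grids."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


# Moving the camera slightly keeps the meaning ("same object") intact,
# but the numbers the computer sees are very different.
shifted = shift_right(image)
changed = count_changed(image, shifted)
total = len(image) * len(image[0])
print(f"{changed} of {total} pixel values changed")
```

In this toy grid a one-pixel shift changes every stored value even though a person would call it the same picture, so comparing raw numbers alone says little about what the image means.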
Lecture 13. Generative Models Overview - Unsupervised Learning - Generative Models PixelRNN and PixelCNN Variational Autoencoders (VAE) Generative Adversarial Networks (GAN) Classification : Input : Image Output : Text (Label) Object Detection : Input : Image Output : Bounding Boxes of instances Semantic Segmentation (having label for every pixel) : ? Image Captioning : Input : Image Output : Caption (form of natural languag.. 2024. 2. 13.
[Paper Study] An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale (ViT) Background of the paper In natural language processing, RNNs are no longer the default; the transformer is an important model so well established that it can be called the standard for NLP. Many attempts were made to apply it to Image Classification in computer vision, but many of the resulting models still depended on CNNs, and the models that used only transformers, while efficient in theory, relied on specialized attention patterns and so had not yet been scaled effectively on modern hardware accelerators. This paper therefore introduces the ViT (Vision Transformer) model, which drops the CNN structure and performs Image Classification using only a transformer. Paper model .. 2024. 2. 9.