๋ณธ๋ฌธ ๋ฐ”๋กœ๊ฐ€๊ธฐ
AI/Stanford CS231n

Lecture 2. Image Classification

by coderSohyun 2024. 2. 16.

Image Classification

What is Image Classification?

  • Example :
    • Input : ๊ณ ์–‘์ด ์‚ฌ์ง„
    • ์ปดํ“จํ„ฐ๋Š” ์‚ฌ์ „์— ์ •ํ•ด์ง„ label๋“ค์˜ ์ง‘ํ•ฉ์„(predetermined set of labels) ๊ฐ€์ง€๊ณ , input๊ฐ’๊ณผ ์ผ์น˜ํ•˜๋Š” label๊ฐ’์„ output์œผ๋กœ ์ถœ๋ ฅํ•˜๋„๋ก ๊ณ„์‚ฐํ•œ๋‹ค. 
    • Output : Cat

 

Semantic Gap (์˜๋ฏธ์  ์ฐจ์ด)

  • ์ •์˜ : ์‹ค์ œ ์ด๋ฏธ์ง€๊ฐ€ ๊ฐ–๊ณ  ์žˆ๋Š” ์˜๋ฏธ์™€ ์ปดํ“จํ„ฐ๊ฐ€ ๋ณด๋Š” ํ”ฝ์…€๊ฐ’ ์˜๋ฏธ์˜ ์ฐจ์ด 
    • ์šฐ๋ฆฌ๋Š” ์‰ฝ๊ฒŒ ๊ณ ์–‘์ด๋ฅผ ๋ณด๊ณ  "๊ณ ์–‘์ด"์ž„์„ ๋ถ„๋ฅ˜ํ•  ์ˆ˜ ์žˆ์ง€๋งŒ, ์ปดํ“จํ„ฐ์˜ ๊ฒฝ์šฐ์—๋Š” ํ•˜๋‚˜์˜ image๊ฐ€ ๊ฑฐ๋Œ€ํ•œ ์ˆซ์ž ๊ทธ๋ฆฌ๋“œ(gigantic grid of numbers)๋กœ ๋ณด์ด๊ธฐ ๋•Œ๋ฌธ์— ๊ณ ์–‘์ด๋ฅผ ๋ฐ”๋กœ ์—ฐ์ƒํ•  ์ˆ˜ ์—†๋Š” ๊ฒƒ์ด๋‹ค. 
  • Challenges : Viewpoint variations (๋ฐ”๋กœ๋ณด๋Š” ๊ด€์ ์—์„œ์˜ ์ฐจ์ด ์ธ์‹), Illumination (๋น›์˜ ๋ฐ˜์‚ฌ์— ์˜ํ•œ ์ฐจ์ด ์ธ์‹), Deformation, Occlusion, Background and Clutter, Intraclass variations
    • ํ•˜์ง€๋งŒ ํ˜„์žฌ์˜ ์ปดํ“จํ„ฐ๋Š” ์ธ๊ฐ„ ์ˆ˜์ค€์˜ ์ •ํ™•๋„๋กœ ์ œํ•œ๋œ ์ƒํ™ฉ ์†์—์„œ ๋น ๋ฅด๊ฒŒ ์‚ฌ๋ฌผ์„ ํŒ๋‹จํ•  ์ˆ˜ ์žˆ์Œ!              → HOW?? 

๊ฐ™์€ ์‚ฌ๋ฌผ์ด์ง€๋งŒ ์‚ฌ๋ญ‡ ๋‹ค๋ฅด๊ธฐ๋งŒ ํ•ด๋„, ์ด๋ฏธ์ง€์˜ ํ”ฝ์…€๊ฐ’์ด ์™„์ „ํžˆ ๋ฐ”๋€Œ๋ฏ€๋กœ ์ปดํ“จํ„ฐ๋Š” ๋‹ค๋ฅด๋‹ค๊ณ  ์ธ์‹ํ•  ๊ฒƒ์ž„.

 

 

How to Implement Image Classification

  • Image Classification์„ ํ•˜๊ธฐ ์œ„ํ•ด์„œ ์–ด๋–ค API๋ฅผ ๋งŒ๋“ค์–ด๋ณผ ์ˆ˜ ์žˆ์„๊นŒ? ๊ณ ๋ฏผํ•˜๊ฒŒ ๋  ๊ฒƒ์ด๋‹ค.
  • ๋ญ”๊ฐ€ image๋ฅผ inputํ•˜๊ณ  ์ค‘๊ฐ„์— ๋งˆ๋ฒ• ๊ฐ™์€ ์ฝ”๋“œ๋ฅผ ๋งŒ๋“ค์–ด์„œ class_label์„ return ๋ฐ›๊ณ  ์‹ถ์„ ๊ฒƒ์ด์ง€๋งŒ..
  • ๋ณดํ†ต ์ผ์ด ์ฃผ์–ด์ ธ์„œ ์ˆœ์ฐจ์ ์œผ๋กœ ์•Œ๊ณ ๋ฆฌ์ฆ˜์„ ๋งŒ๋“ค์–ด ๋‚˜๊ฐ€๋Š” ๊ฒƒ๊ณผ ๋‹ค๋ฅด๊ฒŒ..                         ์ด๊ฑธ ์–ด๋–ป๊ฒŒ ์ฝ”๋“œ๋ฅผ ์งœ์•ผ ์ž˜ ์งฐ๋‹ค๊ณ  ์†Œ๋ฌธ์ด ๋‚ ์ง€ ๋„์ €ํžˆ ๊ฐ์ด ์žกํžˆ์ง€ ์•Š์„ ๊ฒƒ์ด๋‹ค.. 

 

 

 

  • ๊ทธ๋ž˜๋„ ์ผ๋‹จ ํ•œ ๊ฐ€์ง€์˜ ๋ฐฉ๋ฒ•์„ ๋– ์˜ฌ๋ ค๋ณด์ž๋ฉด! ๊ณ ์–‘์ด๋ฅผ ์ธ์‹ํ•  ์ˆ˜ ์žˆ๋Š” ๊ทœ์น™(rule)์„ ๋งŒ๋“ค์–ด ๋ณผ ์ˆ˜๋„ ์žˆ์„ ๊ฒƒ์ด๋‹ค.
    • ๊ณ ์–‘์ด์—๊ฒŒ๋Š” ๋ˆˆ, ๊ท€, ์ž…, ์ฝ” ๋“ฑ์ด ์žˆ์Œ
    • visual recognition์—๋Š” edge๋“ค์ด ์ค‘์š”ํ•œ ๊ฒƒ์„ ์•Ž
      • ํ˜•์ฒด์™€ edge์— ๋งž๊ฒŒ corner์™€ boundaries๊ฐ€ ์žˆ์Œ์„ ์ธ์‹ํ•˜๊ณ  ๊ทœ์น™์„ ์„ธ์›Œ๋ด„
      • ํ•˜์ง€๋งŒ ์ด ๋ฐฉ๋ฒ•์€ 1. super brittle 2. ๋ชจ๋“  object category๋งˆ๋‹ค ๊ทœ์น™์„ ๋งŒ๋“ค์–ด์ค˜์•ผํ•จ (์ข‹์ง€ ์•Š์€ ๋ฐฉ๋ฒ•..)

 

 

Data-Driven Approach

1. Collect a dataset of images and labels

  • ์ธํ„ฐ๋„ท์„ ํ†ตํ•ด์„œ, large quantity & high quality dataset์„ ๊ตฌํ•  ์ˆ˜ ์žˆ์Œ
  • dataset์„ ์ง์ ‘ ์ฐพ๋Š” ๊ณผ์ •์€ ์—„์ฒญ ์˜ค๋ž˜ ๊ฑธ๋ฆฌ์ง€๋งŒ, ๋‹คํ–‰ํžˆ ์ธํ„ฐ๋„ท์— ์ข‹์€ dataset๋“ค์ด ๋งŽ์€ ์ƒํƒœ 

2. Use Machine Learning to train a classifier 

  • ์ฃผ์–ด์ง„ ๋ฐ์ดํ„ฐ๋ฅผ ๋จธ์‹ ๋Ÿฌ๋‹์— ๋„ฃ์œผ๋ฉด ๋ฐ์ดํ„ฐ๋ฅผ ์š”์•ฝํ•˜๊ณ  ๋ชจ๋ธ์— ์ง‘์–ด๋„ฃ๊ณ  ํ•™์Šต์ด ์™„๋ฃŒ๊ฐ€ ๋œ ๋ชจ๋ธ์ด ๋‚˜์˜ฌ ๊ฒƒ์ž„

3. Evaluate the classifier on new images 

  • ์•ž์„œ ํ•™์Šต์‹œํ‚จ ๋ชจ๋ธ์— ์ƒˆ๋กœ์šด image๋ฅผ ๋„ฃ์–ด image classification์„ ์ˆ˜ํ–‰ํ•จ

์•ž์„œ์™€ ๋‹ฌ๋ฆฌ, train์„ ํ•˜๋Š” ํ•จ์ˆ˜(input: images and labels, output: model)์™€ predict๋ฅผ ํ•˜๋Š” ํ•จ์ˆ˜(input: model, test image, output: test labels) ๋‘ ๊ฐ€์ง€๋ฅผ ํ†ตํ•ด์„œ Image Classification์„ ํ•จ์„ ํŒŒ์•…ํ•  ์ˆ˜ ์žˆ์Œ.

 

 

 

So, in this example, the image is 800 by 600 pixels

And each pixel is represented by three numbers,

giving the red, green, and blue values for that pixels.

 

 

 

 

 

 

'AI > Stanford CS231n' ์นดํ…Œ๊ณ ๋ฆฌ์˜ ๋‹ค๋ฅธ ๊ธ€

Lecture 13. Generative Models  (0) 2024.02.13