7. Training Neural Networks 2

Optimization: lowering the loss on the training data

The learning rate is a hyperparameter!

Optimization was about reducing the loss on the training data. This time,

the goal is to reduce the gap between training and validation performance (overfitting)
-> Model Ensemble & Regularization

Model Ensemble: use multiple snapshots of a single model taken during training and average their results
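
As a rough illustration of the averaging step, here is a minimal sketch that averages the class probabilities predicted by several snapshots for one input. The function name `average_ensemble` and the probability values are made up for this example; they are not from the lecture.

```python
def average_ensemble(prob_lists):
    """Average class-probability vectors predicted by several model snapshots."""
    n_models = len(prob_lists)
    n_classes = len(prob_lists[0])
    return [sum(p[c] for p in prob_lists) / n_models for c in range(n_classes)]


# Hypothetical softmax outputs of three snapshots for one input (3 classes).
snapshot_probs = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.5, 0.4, 0.1],
]

avg = average_ensemble(snapshot_probs)
# The ensemble prediction is the class with the highest averaged probability.
predicted_class = max(range(len(avg)), key=lambda c: avg[c])
```

The second snapshot alone would pick class 1, but the averaged probabilities favor class 0, which is the kind of smoothing an ensemble provides.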

-> Add random noise during training and marginalize over the noise at test time to reduce overfitting.

L1, L2 regularization
On their own, L1 and L2 penalties are not relied on much in neural nets (L2 weight decay is still common alongside other methods)!
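
For reference, a minimal sketch of how an L1 or L2 penalty is added to the data loss. The weights, the strength `lam`, and the `data_loss` value are arbitrary numbers chosen for illustration.

```python
def l1_penalty(weights, lam):
    """L1 regularization: lam * sum of absolute weight values."""
    return lam * sum(abs(w) for w in weights)


def l2_penalty(weights, lam):
    """L2 regularization: lam * sum of squared weight values."""
    return lam * sum(w * w for w in weights)


# Arbitrary example values (assumptions, not from the lecture).
weights = [0.5, -1.0, 2.0]
lam = 0.1
data_loss = 1.2

total_loss_l1 = data_loss + l1_penalty(weights, lam)
total_loss_l2 = data_loss + l2_penalty(weights, lam)
```

L2 punishes large weights much harder than small ones, while L1 penalizes all weights in proportion to their magnitude and so tends to push some of them to exactly zero.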
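Regularization
Dropout
During training, add randomness so the network does not overfit; at test time, average out the randomness
- Mostly applied to fully connected layers
- At test time, multiply the activations by the dropout probability p (the probability of keeping a unit)!
Batch Normalization : like dropout, adds randomness during training (statistics come from a random minibatch) and averages it out at test time (fixed statistics are used)
Data Augmentation : transforms the images with random crops and scales, color jittering
(New) DropConnect : sets weights w to zero instead of the outputs (activations)
(New) Fractional Max Pooling : randomizes the pooling regions during training, averages them out at test time
(New) Stochastic Depth : randomly drops layers during training, uses the whole network at test time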
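
The dropout rule from the list above can be sketched as follows. This is a minimal sketch, assuming p is the probability of keeping a unit; the function names and activation values are made up. It checks empirically that scaling by p at test time matches the average of many random training-time passes.

```python
import random


def dropout_train(activations, p, rng):
    """Training: keep each unit with probability p, zero it otherwise."""
    return [a if rng.random() < p else 0.0 for a in activations]


def dropout_test(activations, p):
    """Test: no randomness; scale by p to match the training-time expectation."""
    return [a * p for a in activations]


rng = random.Random(0)  # fixed seed so the run is repeatable
p = 0.5                 # keep probability (assumed convention)
h = [1.0, 2.0, 3.0, 4.0]  # hypothetical hidden-layer activations

# Average many random training-time passes over the same activations.
n_trials = 20000
sums = [0.0] * len(h)
for _ in range(n_trials):
    for i, v in enumerate(dropout_train(h, p, rng)):
        sums[i] += v
train_mean = [s / n_trials for s in sums]

# Deterministic test-time output; should be close to train_mean.
test_out = dropout_test(h, p)
```

Many implementations instead divide by p during training ("inverted dropout"), so that the test-time forward pass needs no scaling at all.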

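Transfer learning: when you have little data, use a model that has already been trained on a large dataset similar to the problem you want to solve
Object Detection and Image Captioning both start from a CNN pretrained on ImageNet!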
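
A toy sketch of the idea: keep the pretrained layers frozen and train only a new final layer on the small dataset. Everything here is a stand-in. `FROZEN_W` plays the role of pretrained weights (in practice, an ImageNet CNN), and the four-example dataset is invented for illustration.

```python
import math

# Stand-in for pretrained weights (hypothetical values); they stay frozen.
FROZEN_W = ((0.5, -0.2), (0.1, 0.8), (-0.3, 0.4))


def features(x):
    """Frozen feature extractor: one linear layer followed by ReLU."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in FROZEN_W]


def predict(head_w, head_b, f):
    """New classifier head: logistic regression on the frozen features."""
    z = sum(w * fi for w, fi in zip(head_w, f)) + head_b
    return 1.0 / (1.0 + math.exp(-z))


def mean_loss(head_w, head_b, feats, labels):
    """Binary cross-entropy averaged over the dataset."""
    total = 0.0
    for f, label in zip(feats, labels):
        prob = predict(head_w, head_b, f)
        total -= label * math.log(prob) + (1 - label) * math.log(1 - prob)
    return total / len(labels)


# Tiny toy dataset for the new task (made up for illustration).
X = [[1.0, 0.2], [0.8, -0.5], [-0.6, 0.9], [-0.9, 0.3]]
y = [1, 1, 0, 0]
feats = [features(x) for x in X]  # computed once, since the extractor is frozen

head_w, head_b = [0.0, 0.0, 0.0], 0.0  # only these parameters are trained
initial_loss = mean_loss(head_w, head_b, feats, y)

lr = 0.5
for _ in range(200):
    grad_w, grad_b = [0.0, 0.0, 0.0], 0.0
    for f, label in zip(feats, y):
        err = predict(head_w, head_b, f) - label  # dLoss/dz for cross-entropy
        grad_w = [g + err * fi / len(y) for g, fi in zip(grad_w, f)]
        grad_b += err / len(y)
    head_w = [w - lr * g for w, g in zip(head_w, grad_w)]
    head_b -= lr * grad_b

final_loss = mean_loss(head_w, head_b, feats, y)
```

With more data for the new task, a common next step is to also fine-tune some of the pretrained layers with a small learning rate instead of keeping them all frozen.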
