[Google_Bootcamp_Day11]
Batch Normalization
- makes hyperparameter search much easier
Normalizing inputs to speed up learning
- Normalizing input features helps train w, b more efficiently
- The same process is applied to the hidden layer values z[l] (see the sketch below)
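A minimal sketch of the normalization step, assuming NumPy and the course's (n_features, m_examples) data layout; applying the same transform to z[l] is the core of Batch Norm:

```python
import numpy as np

def normalize_features(X, eps=1e-8):
    """X: (n_features, m_examples). Return zero-mean, unit-variance features."""
    mu = X.mean(axis=1, keepdims=True)    # per-feature mean over the training set
    var = X.var(axis=1, keepdims=True)    # per-feature variance
    return (X - mu) / np.sqrt(var + eps)  # eps avoids division by zero
```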
Implementing Batch Norm
Adding Batch Norm to a network
With mini-batch
Implementing gradient descent
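A minimal sketch (assumed NumPy conventions; helper names are illustrative, not the course's code) of Batch Norm applied to one layer's pre-activations on a mini-batch, with gamma[l] and beta[l] as learnable parameters updated by gradient descent just like W[l]:

```python
import numpy as np

def batch_norm_forward(Z, gamma, beta, eps=1e-8):
    """Z: (n_units, m) pre-activations of layer l for one mini-batch."""
    mu = Z.mean(axis=1, keepdims=True)
    var = Z.var(axis=1, keepdims=True)
    Z_norm = (Z - mu) / np.sqrt(var + eps)   # zero mean, unit variance
    return gamma * Z_norm + beta             # learnable scale and shift: z_tilde[l]

# Inside the mini-batch gradient descent loop, beta[l] and gamma[l] are updated
# like any other parameter (b[l] can be dropped, since subtracting the mean
# cancels it and beta[l] takes over the shift):
#   W[l]     -= learning_rate * dW[l]
#   gamma[l] -= learning_rate * dgamma[l]
#   beta[l]  -= learning_rate * dbeta[l]
```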
Why Batch Norm works
- Problem
  - Covariate shift: from the perspective of a later layer, the distribution of its inputs z[2]_1, z[2]_2, ..., z[2]_4 (the hidden unit values of the second layer) keeps changing as the earlier layers' parameters are updated
- Solution
  - Batch Norm guarantees that even if the exact values of z[2]_1, ..., z[2]_4 change, their mean and variance remain the same (governed by beta[2] and gamma[2])
  - Batch Norm reduces the amount that the distribution of hidden unit values shifts around
  - Limits the amount that updating parameters in the earlier layers can affect the distribution of values the later layer sees
- Batch Norm also acts as (slight) regularization
- Each mini-batch is scaled by the mean/variance computed on just that mini-batch
- This adds some noise to the values z[l] within that minibatch.
- So, similar to dropout, it adds some noise to each hidden layer’s activations
- Larger mini-batch size -> less noise -> smaller regularization effect (see the demo below)
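A small illustrative demo (not from the lecture): the mean/variance used to scale z are recomputed on each mini-batch, and these statistics fluctuate more for small mini-batches than for large ones, which is where the noise comes from:

```python
import numpy as np

rng = np.random.default_rng(0)
z = rng.normal(loc=2.0, scale=3.0, size=4096)  # pre-activations of one hidden unit

for batch_size in (64, 1024):
    batches = z.reshape(-1, batch_size)
    mus = batches.mean(axis=1)
    sigmas = batches.std(axis=1)
    # Spread of the per-mini-batch statistics: larger batches -> less noise
    print(batch_size, round(mus.std(), 3), round(sigmas.std(), 3))
```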
Batch Norm at test time
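- At test time there may be only a single example, so instead of computing mu and sigma^2 on a mini-batch, use running estimates kept during training (exponentially weighted averages across mini-batches)

A minimal sketch of that idea, assuming NumPy; variable names such as running_mu are illustrative:

```python
import numpy as np

def update_running_stats(running_mu, running_var, mu, var, momentum=0.9):
    """Exponentially weighted averages of the mini-batch statistics (training time)."""
    running_mu = momentum * running_mu + (1 - momentum) * mu
    running_var = momentum * running_var + (1 - momentum) * var
    return running_mu, running_var

def batch_norm_test(z, running_mu, running_var, gamma, beta, eps=1e-8):
    """At test time, normalize with the running estimates instead of
    statistics computed on the (possibly single-example) test input."""
    z_norm = (z - running_mu) / np.sqrt(running_var + eps)
    return gamma * z_norm + beta
```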
Softmax regression
- Ex. Recognizing cats (1), dogs (2), and baby chicks (3)
- Softmax regression generalizes logistic regression to C classes
Softmax layer
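A minimal sketch of the softmax activation, assuming NumPy: exponentiate the output-layer pre-activations z[L] and divide by their sum so the C outputs form a probability distribution over the classes:

```python
import numpy as np

def softmax(z):
    """z: (C, m) pre-activations of the output layer; returns (C, m) probabilities."""
    t = np.exp(z - z.max(axis=0, keepdims=True))  # subtract max for numerical stability
    return t / t.sum(axis=0, keepdims=True)

# Example with C = 3 classes (cats, dogs, baby chicks) and one input:
z_L = np.array([[2.0], [1.0], [0.1]])
print(softmax(z_L))  # roughly [[0.66], [0.24], [0.10]], columns sum to 1
```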
[Source] https://www.coursera.org/learn/deep-neural-network