[Google_Bootcamp_Day1]

Google Machine Learning Bootcamp (Coursera Deep Learning Specialization)

The GB (Google Bootcamp) series is based on the Coursera course “Deep Learning Specialization”. These posts are my own review of basic machine learning, but I hope they can be helpful for others as well.

  • If you’ve completed the Coursera course “Deep Learning Specialization”, these posts should be a good refresher.
  • If you have studied basic machine learning before, they can be a nice way to wrap up your knowledge.
  • If you haven’t studied basic machine learning before, I recommend using these posts as supplementary material alongside other well-organized resources (books, lectures, etc.).

Standard notations for Deep Learning

(Figure: standard notation guide from the course)

Binary Classification

  • The output is a discrete value (0 or 1)
  • The goal is to learn a classifier that takes an image represented by a feature vector x and predicts whether the corresponding label y is 1 or 0, i.e., whether this is a cat image or a non-cat image
  • An image is stored in the computer as three separate matrices corresponding to the red, green, and blue color channels of the image
  • To create the feature vector x, the pixel values of each color channel are “reshaped” (unrolled) into a single column vector (see the sketch after this list)
  • For a 64 x 64 image, the dimension of the input feature vector x is n_x = 64 * 64 * 3 = 12,288
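A minimal sketch of this reshape in numpy (a random array stands in for a real 64 x 64 RGB image):

```python
import numpy as np

# Placeholder for a real image: three 64x64 color channels stacked as (64, 64, 3)
image = np.random.rand(64, 64, 3)

# Unroll every pixel value into a single feature column vector x
x = image.reshape(-1, 1)

print(x.shape)  # (12288, 1), i.e. 64 * 64 * 3 features
```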

Logistic Regression

  • A learning algorithm used in supervised learning problems where the output labels y are either 0 or 1 (i.e., binary classification)
  • The goal of logistic regression is to minimize the error between its predictions and the training labels
  • w^T x + b is a linear function, but since we want a probability constrained between 0 and 1, the sigmoid function is applied: y_hat = sigma(w^T x + b)
  • If z is a large positive number, then sigma(z) is close to 1
  • If z is a large negative number, then sigma(z) is close to 0
  • If z = 0, then sigma(z) = 0.5 (see the sketch after this list)
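A small sketch of the sigmoid function and the three cases above:

```python
import numpy as np

def sigmoid(z):
    # Maps any real z to a value in (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

print(sigmoid(100))   # ~1.0 for a large positive z
print(sigmoid(-100))  # ~0.0 for a large negative z
print(sigmoid(0))     # exactly 0.5 at z = 0
```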

Logistic Regression Cost function

  1. Loss function
    • measures the discrepancy between the prediction (y_hat) and the desired output (y)
    • computes the error for a single training example

L(y_hat, y) = (1/2) * (y_hat - y)^2

  • This loss function makes the optimization problem non-convex (it has multiple local optima), so gradient descent may not find the global optimum
  • It is also sensitive to outliers
  • The cross-entropy loss below is usually used instead (see the sketch after it)

L(y_hat, y) = -(y * log(y_hat) + (1 - y) * log(1 - y_hat))
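A minimal sketch of the cross-entropy loss for a single example (the 0.9 prediction is just an illustrative value):

```python
import numpy as np

def loss(y_hat, y):
    # Cross-entropy loss for one training example; y is 0 or 1
    return -(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

print(loss(0.9, 1))  # small loss: confident and correct
print(loss(0.9, 0))  # large loss: confident but wrong
```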

  2. Cost function
    • the average of the loss function over the entire training set
    • we try to find the parameters w and b that minimize the overall cost function (see the sketch after the formula)

J(w, b) = (1/m) * sum_{i=1}^{m} L(y_hat^(i), y^(i))
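A minimal sketch of the cost function, assuming the course’s convention of stacking the m examples as columns (X has shape (n_x, m), Y has shape (1, m)):

```python
import numpy as np

def cost(w, b, X, Y):
    # Average cross-entropy loss over the m training examples
    m = X.shape[1]
    Y_hat = 1.0 / (1.0 + np.exp(-(np.dot(w.T, X) + b)))  # sigmoid(w^T x + b) per column
    return float(-np.mean(Y * np.log(Y_hat) + (1 - Y) * np.log(1 - Y_hat)))
```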


Gradient Descent

(Figure: gradient descent — repeat w := w - alpha * dJ(w, b)/dw and b := b - alpha * dJ(w, b)/db, where alpha is the learning rate)
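A minimal sketch of one update step for logistic regression (alpha = 0.01 is an arbitrary choice; the gradient formulas are derived in the “Computing derivatives” part below):

```python
import numpy as np

def gradient_descent_step(w, b, X, Y, alpha=0.01):
    # One gradient descent update for logistic regression
    # X: (n_x, m) features, Y: (1, m) labels
    m = X.shape[1]
    A = 1.0 / (1.0 + np.exp(-(np.dot(w.T, X) + b)))  # predictions y_hat, shape (1, m)
    dZ = A - Y                  # derivative of the loss w.r.t. z, per example
    dw = np.dot(X, dZ.T) / m    # dJ/dw, shape (n_x, 1)
    db = np.sum(dZ) / m         # dJ/db, scalar
    return w - alpha * dw, b - alpha * db
```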

Computation Graph

(Figure: computation graph example)
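The lecture illustrates this with J(a, b, c) = 3(a + bc), broken into the steps u = bc, v = a + u, J = 3v. A minimal sketch of that forward pass:

```python
def forward(a, b, c):
    # Forward (left-to-right) pass through the graph J = 3 * (a + b * c)
    u = b * c    # first node
    v = a + u    # second node
    J = 3 * v    # output node
    return J

print(forward(5, 3, 2))  # 33
```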

Computing derivatives

(Figure: computing derivatives on the computation graph with the chain rule)
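Derivatives are computed by walking the same graph right to left with the chain rule. A sketch for J = 3(a + bc):

```python
def backward(a, b, c):
    # Chain rule applied right-to-left on J = 3 * (a + b * c)
    dJ_dv = 3          # J = 3v
    dJ_du = dJ_dv      # v = a + u  =>  dv/du = 1
    dJ_da = dJ_dv      # dv/da = 1
    dJ_db = dJ_du * c  # u = b * c  =>  du/db = c
    dJ_dc = dJ_du * b  # du/dc = b
    return dJ_da, dJ_db, dJ_dc

print(backward(5, 3, 2))  # (3, 6, 9)
```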

[Source] https://www.coursera.org/learn/neural-networks-deep-learning
