[Google_Bootcamp_Day20]

Updated: November 23, 2020

Object Detection

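Sliding windows detection : pass many small cropped windows of the image into the ConvNet and have it classify each window position as 0 (no object) or 1 (object)
Problem : computational cost, because the ConvNet has to run separately on every crop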
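
To get a feel for the cost, the sketch below enumerates the window positions a sliding-window detector would have to classify one by one. The function name `sliding_windows`, the 14 by 14 window, and the stride of 2 are assumptions chosen for illustration, not values stated in these notes.

```python
def sliding_windows(image_h, image_w, window, stride):
    """Yield (top, left, bottom, right) for every window position."""
    for top in range(0, image_h - window + 1, stride):
        for left in range(0, image_w - window + 1, stride):
            yield (top, left, top + window, left + window)


# Hypothetical sizes: a 28 x 28 image, a 14 x 14 window, stride 2.
crops = list(sliding_windows(28, 28, window=14, stride=2))
print(len(crops))  # 64 crops, so 64 separate ConvNet forward passes
```

Even this tiny image needs 64 forward passes, and the count grows quickly with a larger image, a smaller stride, or several window sizes.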

Turning FC layers into convolutional layers
Convolutional implementation of sliding windows
- Sequential approach : crop out a region, run it through the ConvNet, and repeat crop by crop until the object is recognized
- Convolutional approach : instead of processing the crops sequentially, feed the entire image (for example 28 by 28) to the ConvNet and make all the predictions at the same time in one forward pass
Problem : the bounding box positions are not very accurate (possibly none of the window positions matches the object exactly)
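
The equivalence behind this trick can be sketched in plain Python: the weights of an FC layer that expects a k by k input are reused as k by k convolution filters, so a larger image yields a grid of predictions instead of a single one. The helper names `fc_layer` and `fc_as_conv`, the single-channel input, and the tiny sizes are assumptions made for illustration. A real framework would also share computation between overlapping windows, which this explicit loop does not do.

```python
def fc_layer(patch, weights):
    """Fully connected layer: flatten the k x k patch, one dot product per unit."""
    flat = [v for row in patch for v in row]
    return [sum(w * v for w, v in zip(unit, flat)) for unit in weights]


def fc_as_conv(image, weights, k):
    """Reuse the same FC weights as k x k filters slid over a larger image."""
    h, w = len(image), len(image[0])
    out = []
    for i in range(h - k + 1):
        row = []
        for j in range(w - k + 1):
            patch = [r[j:j + k] for r in image[i:i + k]]
            row.append(fc_layer(patch, weights))
        out.append(row)
    return out


# Hypothetical FC layer with 2 units that expects a 2 x 2 input.
weights = [[1, 0, 0, 1], [1, 1, 1, 1]]
image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]  # 3 x 3 input, larger than 2 x 2
grid = fc_as_conv(image, weights, k=2)     # 2 x 2 grid, 2 outputs per position
```

Each entry `grid[i][j]` equals what `fc_layer` would return for the crop whose top-left corner is at row `i`, column `j`.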

Non-max suppression makes sure that the algorithm detects each object only once
It cleans up the detections so that there is just one detection per object (for example, per car) rather than multiple detections of the same object

Discard all boxes with Pc <= 0.6 (threshold), where Pc is the predicted probability that the box contains an object
While there are any remaining boxes
- Pick the box with the largest Pc, then output that as a prediction
- Discard any remaining box with IoU >= 0.5 (threshold) with the box output in the previous step
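
The steps above can be sketched as follows. This is a minimal illustration, assuming boxes are given as `(x1, y1, x2, y2)` corner tuples and scores are the Pc values. The function names `iou` and `non_max_suppression` and the sample boxes are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, score_threshold=0.6, iou_threshold=0.5):
    """Return the kept (score, box) pairs, highest score first."""
    # Step 1: discard all boxes with Pc <= score_threshold.
    remaining = [(s, b) for s, b in zip(scores, boxes) if s > score_threshold]
    remaining.sort(key=lambda pair: pair[0], reverse=True)
    kept = []
    # Step 2: repeatedly output the best box and drop its heavy overlaps.
    while remaining:
        best = remaining.pop(0)
        kept.append(best)
        remaining = [(s, b) for s, b in remaining
                     if iou(best[1], b) < iou_threshold]
    return kept


# Hypothetical detections: two overlapping boxes on one object, one separate
# box, and one low-confidence box.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30), (0, 0, 5, 5)]
scores = [0.9, 0.8, 0.7, 0.5]
detections = non_max_suppression(boxes, scores)
```

With these sample values, the second box overlaps the first with an IoU of about 0.68 and is suppressed, and the last box is dropped by the Pc threshold, leaving two detections.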

Problem : each grid cell can detect only one object
Motivation : what if a grid cell needs to detect multiple objects?
Previous method (no anchor boxes)
- Each object in a training image is assigned to the grid cell that contains that object’s midpoint
- Output y : 3 * 3 * 8 (3 by 3 grid; 8 = Pc, 4 bounding box values, 3 class labels)
With two anchor boxes
- Each object in a training image is assigned to the grid cell that contains the object’s midpoint and to the anchor box for that grid cell with the highest IoU (the anchor box whose shape is most similar to the bounding box)
- Output y : 3 * 3 * 16 (8 values per anchor box * 2 anchor boxes)
- Objects are assigned to (grid cell, anchor box) pairs
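
The output sizes and the assignment rule can be sketched as below. This is an illustration under stated assumptions: coordinates are normalized to [0, 1), anchors are compared by width and height only, and 3 classes are used so that one anchor box needs 8 values. The names `output_shape`, `shape_iou`, and `assign`, and the two anchor shapes, are hypothetical.

```python
def output_shape(grid, num_anchors, num_classes):
    """Each anchor box stores Pc, 4 box values, and one score per class."""
    return (grid, grid, num_anchors * (5 + num_classes))


def shape_iou(box_wh, anchor_wh):
    """IoU of two boxes compared by shape only (same center assumed)."""
    inter = min(box_wh[0], anchor_wh[0]) * min(box_wh[1], anchor_wh[1])
    union = box_wh[0] * box_wh[1] + anchor_wh[0] * anchor_wh[1] - inter
    return inter / union


def assign(mid_x, mid_y, box_wh, anchors, grid=3):
    """Return (row, col, anchor_index) for one training object."""
    row, col = int(mid_y * grid), int(mid_x * grid)  # cell containing midpoint
    best = max(range(len(anchors)),
               key=lambda a: shape_iou(box_wh, anchors[a]))
    return (row, col, best)


anchors = [(0.2, 0.6), (0.6, 0.2)]  # hypothetical (width, height): tall, wide

print(output_shape(3, 1, 3))  # (3, 3, 8)
print(output_shape(3, 2, 3))  # (3, 3, 16)
```

A tall object such as a pedestrian would then go to the tall anchor, and a wide object such as a car to the wide anchor, even when both midpoints fall in the same grid cell.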

R-CNN
- Idea : rather than running the classifier on every sliding window position, select just a few windows and run the classifier only on those (a segmentation algorithm is applied to figure out which windows could contain objects)
- Propose regions
- Classify proposed regions one at a time
- Output label + bounding box
Fast R-CNN
- Propose regions
- Use the convolutional implementation of sliding windows to classify all the proposed regions
Faster R-CNN
- Use a convolutional network to propose regions

[source] https://www.coursera.org/learn/convolutional-neural-networks
