Computer Vision by Andrew Ng - 11 Lessons Learned

I recently completed Andrew Ng’s computer vision course on Coursera. In this article, I will discuss 11 key lessons that I learned in the course.

By Ryan Shrott, Chief Analyst at National Bank of Canada.


Created in week 4 of the course. Combined Ng’s face with the style of Rain Princess by Leonid Afremov.

I recently completed Andrew Ng’s computer vision course on Coursera. Ng does an excellent job of explaining many of the complex ideas required to optimize any computer vision task. My favourite component of the course was the neural style transfer section (see lesson 11), which allows you to create artwork that combines the style of Claude Monet with the content of whichever image you would like. This is an example of what you can do:

In this article, I will discuss 11 key lessons that I learned in the course. Note that this is the fourth course in the Deep Learning specialization released by deeplearning.ai. If you would like to learn about the previous 3 courses, I recommend you check out this blog.

Lesson 1: Why is computer vision taking off?

Big

Lesson 7: Use Transfer Learning!

Training large networks, such as Inception, from scratch can take weeks on a GPU. Instead, you should download the weights of a pretrained network and retrain only the last softmax layer (or the last few layers). This greatly reduces training time. The reason this works is that earlier layers tend to learn features common to all images, such as edges and curved lines.
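The idea can be sketched in plain numpy, without any deep-learning framework: a fixed random projection stands in for the pretrained (frozen) early layers, and only the final softmax layer's weights are updated by gradient descent. The network shapes, learning rate, and synthetic data here are all illustrative assumptions, not values from the course.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for pretrained early layers: a frozen random projection + ReLU.
# In practice these weights would come from a network like Inception.
W_frozen = rng.standard_normal((64, 16))

def features(x):
    return np.maximum(x @ W_frozen, 0.0)  # frozen pass: never updated

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Synthetic data whose labels are learnable from the frozen features.
X = rng.standard_normal((200, 64))
W_true = rng.standard_normal((16, 3))
y = (features(X) @ W_true).argmax(axis=1)
Y = np.eye(3)[y]

# Only the new softmax head is trained.
W_head = np.zeros((16, 3))
lr = 0.5
for _ in range(300):
    F = features(X)                  # frozen feature extraction
    P = softmax(F @ W_head)          # trainable softmax layer
    grad = F.T @ (P - Y) / len(X)    # cross-entropy gradient w.r.t. W_head
    W_head -= lr * grad              # update only the head

acc = (softmax(features(X) @ W_head).argmax(axis=1) == y).mean()
```

Because gradients only flow through `W_head`, each training step is far cheaper than backpropagating through the whole network, which is where the time savings come from.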

Lesson 8: How to win computer vision competitions

Ng explains that you should train several networks independently and average their outputs to get better performance.
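Averaging an ensemble's outputs is a one-liner once each model's class probabilities are available. In this sketch the per-model logits are random stand-ins (hypothetical, for a batch of 4 images over 3 classes); in a competition they would come from independently trained networks.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

rng = np.random.default_rng(1)

# Hypothetical logits from three independently trained networks.
logits = [rng.standard_normal((4, 3)) for _ in range(3)]

probs = [softmax(l) for l in logits]   # each model's class probabilities
ensemble = np.mean(probs, axis=0)      # average the outputs
pred = ensemble.argmax(axis=1)         # final ensemble prediction
```

The averaged probabilities still sum to 1 per image, so the ensemble can be used exactly like a single model's output.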

Lesson 11: How to create artwork using Neural Style Transfer

Ng explains how to generate an image by combining the content of one image with the style of another. See the examples below.

The key to Neural Style Transfer is to understand what each layer in a convolutional network is learning. It turns out that earlier layers learn simple features like edges, while later layers learn complex objects like faces, feet and cars.

To build a neural style transfer image, you simply define a cost function which is a weighted combination of the content and style costs. In particular, the cost function would be:

J(G) = alpha * J_content(C,G) + beta * J_style(S,G)


where G is the generated image, C is the content image and S is the style image. The learning algorithm simply uses gradient descent to minimize the cost function with respect to the generated image, G.

The steps are as follows:

  1. Initialize G randomly.
  2. Use gradient descent to minimize J(G), i.e. update G := G - learning_rate * dJ(G)/dG.
  3. Repeat step 2 until convergence.
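The steps above can be sketched as a plain gradient-descent loop. To keep the example self-contained, simple quadratic distances stand in for J_content and J_style (hypothetical stand-ins; the real losses come from a convolutional network's activations and Gram matrices), and small random arrays stand in for the images.

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy stand-ins for the content image C and style image S.
C = rng.standard_normal((8, 8))
S = rng.standard_normal((8, 8))
alpha, beta = 1.0, 0.5

def J(G):
    # Hypothetical quadratic stand-ins for J_content(C, G) and J_style(S, G).
    return alpha * np.sum((G - C) ** 2) + beta * np.sum((G - S) ** 2)

def dJ(G):
    # Gradient of J with respect to the generated image G.
    return 2 * alpha * (G - C) + 2 * beta * (G - S)

G = rng.standard_normal((8, 8))   # step 1: initialize G randomly
initial_cost = J(G)
lr = 0.1
for _ in range(200):              # steps 2-3: repeat the gradient step
    G = G - lr * dJ(G)
final_cost = J(G)
```

With these quadratic losses, G converges to the weighted average (alpha*C + beta*S) / (alpha + beta), which mirrors how alpha and beta trade off content fidelity against style in the real algorithm.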

Conclusion

By completing this course, you will gain an intuitive understanding of a large chunk of the computer vision literature. The homework assignments also give you practice implementing these ideas yourself. You will not become an expert in computer vision after completing this course, but it may kickstart an idea or a career in computer vision.

If you have any interesting applications of computer vision you would like to share, let me know in the comments below. I would be happy to discuss potential collaboration on new projects.

That’s all folks — if you’ve made it this far, please comment below and add me on LinkedIn.

My Github is here.

Bio: Ryan J. Shrott is Chief Analyst at National Bank of Canada Financial Markets.

Original. Reposted with permission.