CV vs. NLP – Is It a Difficult Choice?


Are you a junior Python developer looking to take on more interesting projects? A data scientist who needs to expand their expertise in Machine Learning? A software developer with many years concentrated in web and mobile app development? Or a graduate or postgraduate student in Machine Learning wondering which projects to pursue?


There is much that Machine Learning can do to solve everyday problems, and you can upgrade your skills in either Computer Vision or Natural Language Processing. This blog will tackle these two application areas of Machine Learning: Computer Vision (CV) and Natural Language Processing (NLP). I will try to dissect the recent developments in each field, point out popular projects in each that you could try, and finally give my perspective on which field I would choose over the other.


Round 1: Computer Vision

The demand for Computer Vision applications is higher than ever. Every industry, from finance, security and transportation to marketing, has plenty of repetitive tasks that can be automated using Computer Vision. In the early 2000s, a library called OpenCV was released that helped solve Computer Vision problems, though not to high accuracy. Stronger algorithms, which I will mention briefly, have since overtaken the library and solved problems that OpenCV could not solve at a high accuracy rate. From my limited experience with Computer Vision, and from reading about CV and implementing CV projects, I can at least say that most solutions involve human beings and cars.
Here are some of the problems that you can look to tackle for a start.

  • Image Classification
    Three technology giants, Microsoft, Google and Baidu, claim they have programmed computers that can beat humans at classifying and sorting images. With enough data and computational power, a computer can indeed beat a human at classifying many kinds of images. In the early 1990s, the biggest problems researchers were trying to solve were recognizing handwritten digits and classifying images of dogs and cats. Nowadays these problems are easily solved with over 95% accuracy using the powerful algorithms that have been developed. Thanks to numerous studies by Machine Learning researchers, this problem can be solved by a Convolutional Neural Network (CNN). To get started and implement your first CNN, you may look at this article that classifies dogs vs. cats.
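To make the CNN idea concrete, here is a minimal sketch of the operation at its core: sliding a small kernel over an image and summing element-wise products (frameworks actually compute cross-correlation, as here, and learn the kernel values from data; the edge kernel and toy image below are made up for illustration).

```python
import numpy as np

def conv2d(image, kernel):
    """'Valid' 2-D convolution: slide the kernel over the image and
    sum the element-wise products at every position (no padding, stride 1)."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A vertical-edge kernel: responds strongly where pixel values
# change from left to right.
edge_kernel = np.array([[1., 0., -1.],
                        [1., 0., -1.],
                        [1., 0., -1.]])

# Toy "image": dark left half, bright right half.
image = np.zeros((5, 5))
image[:, 3:] = 1.0

response = conv2d(image, edge_kernel)
print(response)  # non-zero only near the dark/bright boundary
```

A CNN stacks many such learned filters, interleaved with non-linearities and pooling, so early layers detect edges like this one and later layers detect dog ears or cat whiskers.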


  • Object detection and Segmentation
    This is currently one of the most interesting problems to tackle. It involves detecting objects in images or videos using bounding boxes, identifying the location of each object, and counting the number of instances of an object. It has been used in the security and transportation industries, where, for example, you can detect the cars on a road from CCTV camera footage. Frameworks such as TensorFlow and Keras, and models such as YOLO, have proven successful in object detection. To get started with object detection, you can try a library called ImageAI; through this article, you can achieve custom object detection in only six lines of code. Isn't that amazing?
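A piece of machinery every detector mentioned above shares is intersection-over-union (IoU), the score used to compare a predicted bounding box against a ground-truth box. A minimal sketch, with boxes given as hypothetical `(x1, y1, x2, y2)` corner tuples:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Corners of the overlap rectangle (empty if boxes don't intersect).
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # 25 / 175 ≈ 0.143
```

Detectors typically count a prediction as correct when its IoU with a ground-truth box exceeds a threshold such as 0.5, and use the same score to suppress duplicate boxes.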



  • Facial Recognition
    The first efficient face detector was the Viola-Jones algorithm, built in 2001 by Paul Viola and Michael Jones. Their demo showed faces being detected in real time from a webcam feed, a stunning demonstration of CV and its potential at the time. It required facing the webcam straight on, with the positions of the eyes, mouth, nose and so on captured by hand-crafted features. The algorithm failed when, say, you tilted your head 45 degrees sideways: it could no longer detect your facial features. Many deep learning methods that are more robust have emerged over the past five years. These methods need a large amount of data, and the more data they have, the better the results get. A recent development in this field is one-shot learning, which doesn't require much data to do correct matching. For example, if someone shows you a picture of Alice and a picture of Bob, your brain doesn't need thousands of pictures of Alice or Bob to recognize them; a picture or two is enough for you to remember them. This is the idea behind one-shot learning.
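The matching step of a one-shot system can be sketched in a few lines: a face network maps each photo to an embedding vector, and recognition is just nearest-neighbor search over one enrolled embedding per person. The 4-D embeddings below are made-up stand-ins for a real network's output:

```python
import numpy as np

def recognize(query, gallery):
    """Match a query embedding to the closest enrolled identity.
    gallery maps name -> one embedding per person; needing only one
    enrolled example is the 'one-shot' part."""
    best_name, best_dist = None, float("inf")
    for name, emb in gallery.items():
        dist = np.linalg.norm(query - emb)  # Euclidean distance
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name

# Hypothetical embeddings standing in for a face network's output.
gallery = {
    "alice": np.array([0.9, 0.1, 0.0, 0.2]),
    "bob":   np.array([0.1, 0.8, 0.3, 0.0]),
}
query = np.array([0.85, 0.15, 0.05, 0.25])  # a new photo of Alice
print(recognize(query, gallery))  # alice
```

The hard part, of course, is training the embedding network so that photos of the same person land close together; architectures such as Siamese networks are trained for exactly that.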


Round 2: Natural Language Processing

Computers are pretty good at dealing with numeric data. We humans, on the contrary, communicate with words, not numbers. NLP focuses on enabling computers to understand and communicate in human language. There are many application areas of NLP. Chatbots and question-answering systems are among the major developments; I will leave those for now, because there is too much subject matter, which I will cover in a separate article. I will briefly mention a few others for starters.


  • Automatic Summarization
    Do you ever feel flooded with information? Most people, when reading a long article, don't read the full document at first. They skim through it, searching for the information they are looking for. If they don't find it, they close the article and look for another one. If, through skimming, they do find what they are looking for, that is when they start reading the article attentively.
    Automatic summarization in NLP helps by condensing a large amount of text, or even a book, into a few lines, saving the skimmers some time off the clock.
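The simplest family of summarizers is extractive: score each sentence by how frequent its words are in the whole document and keep the top scorers. A minimal sketch (the scoring scheme and the toy document are my own illustration, not a production method):

```python
import re
from collections import Counter

def summarize(text, n_sentences=1):
    """Extractive summary: rank sentences by the average document-wide
    frequency of their words, keep the top n in original order."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence):
        tokens = re.findall(r"[a-z']+", sentence.lower())
        return sum(freq[t] for t in tokens) / (len(tokens) or 1)

    keep = set(sorted(sentences, key=score, reverse=True)[:n_sentences])
    return " ".join(s for s in sentences if s in keep)

doc = ("Computer vision studies images. Natural language processing "
       "studies text. Both fields rely on machine learning, and machine "
       "learning models improve with more data.")
print(summarize(doc, n_sentences=1))
```

Modern abstractive summarizers instead generate new sentences with sequence-to-sequence models, but frequency-based extraction like this is still a common, cheap baseline.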


  • Sentiment Analysis
    Sentiment analysis is the process of determining the opinion or feeling expressed in a piece of text. There are tons and tons of data on social media sites like Twitter, Facebook and Reddit. When a company launches a new product, people tend to post a lot of reviews on social media. With millions and millions of comments, it is impractical for the company to gauge the general mood by having a communications department read through the tweets. Thanks to developments in NLP, it is easy to build automatic systems to review them.
    I recently built a sentiment analysis system covering the 2019 Indian general election between two promising parties, analyzing over 100,000 tweets.
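The crudest version of such a system is lexicon-based: count positive and negative words. A minimal sketch (the tiny word lists and sample tweets are invented for illustration; a real system would learn weights from labeled data):

```python
# Tiny hand-made lexicon; a real system would learn these from data.
POSITIVE = {"good", "great", "love", "excellent", "happy", "win"}
NEGATIVE = {"bad", "terrible", "hate", "poor", "sad", "lose"}

def sentiment(tweet):
    """Label a tweet positive, negative, or neutral by counting lexicon hits."""
    tokens = [t.strip(".,!?'\"") for t in tweet.lower().split()]
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(sentiment("Love this manifesto, great ideas!"))  # positive
print(sentiment("Terrible turnout, sad day."))         # negative
```

Aggregating such labels over 100,000 tweets gives a rough picture of public mood, though sarcasm and negation ("not great") are exactly where lexicon counting fails and learned models earn their keep.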


  • Machine Translation
    Google is a champion in this field, having built a robust system that translates most of the languages on the planet.



It's a Tie
My verdict is a 50-50 tie. There is a lot more subject matter I could have discussed in both fields; this was just a brief overview to show you where you can start. In my opinion, both fields are interesting and full of active research. You can thrive by implementing the simple problems we have in NLP and CV, or even a combination of the two: for example, you could build a mobile application that performs object detection (CV) and then uses speech synthesis to speak the name of the recognized object aloud. This would help blind people be aware of their surrounding environment.

The most important skill for any Machine Learning developer is problem solving. As long as he/she is able to solve a problem and add value, that is what matters. That is my take. What’s yours?
