Let’s face it: aside from a few narrow use cases, you can’t solve real-world, revenue-generating industry problems with computer vision unless some machine learning is involved.
From clustering to OCR, to dimensionality reduction, to classification, to segmentation, to regression, computer vision nowadays breathes machine learning. When we get into deep nets, it gets even crazier. Computer vision and machine learning become incredibly symbiotic.
I feel like whenever I look at computer vision posts on my favorite websites (/r/computervision/, HackerNews, DataTau, or Kaggle), everything is strictly about deep learning. People are forgetting, or maybe don’t even know about, the other parts of computer vision: ‘classic’ computer vision, image processing and pre-processing, feature extraction, and so on.
Don’t get me wrong; machine learning is the most promising future of computer vision. It’s a vast research field, and there is still a lot to explore. Nevertheless, it isn’t the entirety of computer vision. There is so much more to it. We have edge detection, convolutions, transforms, kernels, color spaces, distortion, binarization, noise removal, reconstruction, moments, centroids and a lot more that can still be useful, with or without machine learning.
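To make a couple of those ‘classic’ tools concrete, here is a minimal sketch of binarization followed by a centroid computed from raw image moments. It is only an illustration under my own assumptions: the tiny grayscale `image`, the threshold of 128, and the helper names `binarize` and `centroid` are all made up for this example, and plain Python lists stand in for a real image library.

```python
def binarize(image, threshold):
    """Return a 0/1 mask: 1 where a pixel is strictly brighter than threshold."""
    return [[1 if px > threshold else 0 for px in row] for row in image]


def centroid(mask):
    """Return the (x, y) centroid of a binary mask using raw image moments.

    m00 is the foreground area; m10 and m01 are the first-order moments
    along x and y. Returns None when the mask has no foreground pixels.
    """
    m00 = m10 = m01 = 0
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            m00 += value
            m10 += x * value
            m01 += y * value
    if m00 == 0:
        return None
    return (m10 / m00, m01 / m00)


# Hypothetical 4x5 grayscale image: a bright 2x2 blob on a dark background.
image = [
    [10, 12, 11, 10, 9],
    [11, 200, 210, 12, 10],
    [10, 205, 220, 11, 10],
    [9, 11, 10, 12, 11],
]

mask = binarize(image, threshold=128)  # the threshold is an assumed value
print(centroid(mask))  # (1.5, 1.5)
```

No training data and no model are involved: a fixed threshold and three sums are enough to locate the blob. On real images the threshold would have to be chosen for the lighting conditions, which is exactly the kind of problem-specific thinking this post argues for.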
Some people seem to believe that computer vision is that simple: get a sufficient amount of images, feed them into a machine learning algorithm and, after some training, it will classify your pictures.
Following that line of thought, anyone who works with machine learning can develop computer vision, right? It is the same thing. If I know someone who does machine learning, they should be able to develop computer vision as well. Isn’t computer vision just a subset of machine learning? That is simply not true.
In my opinion, machine learning in computer vision is currently overglorified. Yes, it is that powerful. Yes, it can reach tremendous classification accuracy. However, it needs to be applied to the right type of problem and combined with the right amount of ‘old-school’ computer vision.
No single machine learning model is the “silver bullet” to solve all your problems. Let us treat machine learning models as tools in our toolbox. Just like a hammer alone isn’t enough to build an entire house, other techniques besides machine learning are necessary to create a complete computer vision solution.
We need to spend more time thinking about the problems we are trying to solve instead of just throwing a bunch of images at machine learning algorithms and seeing what sticks. Sometimes it does stick. But we need to understand not only the problem but also what it implies in the real world. We need to strive to be more than people who blindly feed images into a machine learning library.
The takeaway is: machine learning is a methodology with a rational thought process that needs to be adapted and structured to fit our problems. Machine learning alone can’t solve all computer vision problems. The data we give our models shouldn’t be raw; it needs to be processed beforehand. We shouldn’t blindly feed example images to algorithms and see what works. We need to calm down, explore the features of the problem, and examine the possible solutions. Only then can we choose the best course of action within the vast science that is computer vision.
Computer Vision, Machine Learning and Multi-Objective Optimization are some of the techniques Enacom uses to create customized solutions for your business.
What challenge is your company facing? We can help.