History of Computer Vision and Its Principles

By Nichole Peterson • Aug 26, 2019

Recognizing people walking in a busy street with Computer Vision

Software and hardware technologies are advancing to a place where you can easily equip low-power, resource-constrained edge devices with AI computer vision capabilities. Here's a brief history of computer vision (CV): how it started, what CV is currently capable of, and how you can use it to empower your existing equipment.

Ultimately, developers everywhere can build and deploy deep learning computer vision applications to make their businesses more functional, productive, and profitable.

A Brief History of Computer Vision

CV got its start in the 1950s. At the time, it bore almost no resemblance to the real-time object detection and tracking technology you see now. The high cost and low performance of sensors, data storage, and processing all limited industrial applications. The only place you saw computer image recognition was in academic studies and science fiction. The first "seeing" robots, so to speak, were tested only on their ability to recognize basic shapes in highly controlled conditions.

Robots might not be thinking for themselves yet, but you can harness artificial intelligence (AI) to let your devices see for themselves. Advances in computer optics, processor design, computer science, and big data have made CV what it is today. Here's a timeline of computer vision breakthroughs over the past seventy years:

History of Computer Vision and Its Timeline

  • Introduction of AGM-65 Maverick optical contrast seeker missiles with onboard cameras for targeting.
  • Optical Character Recognition (OCR) debuts via Kurzweil Computer Products.

  • Smart cameras, such as the Xerox optical mouse, and layered image processing.

  • The Internet and the dot-com boom.

  • Security Show Japan 2005 and the first mobile phone with working facial recognition features.

  • AI’s average object-recognition error rate becomes better than humans’: 2.5% versus 5%.
  • First ImageNet competition, in which developers competed to build the most accurate image recognition models. ImageNet now contains roughly 15 million images across 20,000 categories.

  • Google’s X Lab neural network teaches itself to recognize cats after being given access to YouTube videos.

  • Tesla releases Autopilot for its Model S electric car. Autopilot not only worked offline but also offered advanced capabilities, including self-parking and semi-autonomous driving.

  • Google introduces FaceNet for facial recognition, which can identify faces from a minimal data set.

  • Niantic Inc. releases Pokémon Go, an AR-based mobile game that relied on phone cameras to display Pokémon in augmented reality. Players worldwide could battle them by traveling to certain physical locations and using their phones to “see” them through the in-app camera feature.
  • Tesla releases Hardware Version 2 on its vehicles, a more advanced system that could automatically change lanes, apply the brakes, and manage acceleration and deceleration.
  • Vision Processing Units (VPUs), microprocessors designed for machine vision workloads, start to trend.

  • ASOS adds a visual search feature that lets customers find products by uploading photos.

  • Candy retailer Lolli & Pops deploys facial recognition in its physical stores to identify frequent customers as part of a rewards program.
  • Google announces Google Maps Live View, which uses augmented reality to overlay navigation information on a phone camera's live view of the surroundings.
  • NVIDIA announces that it will make its hyper-realistic face generator StyleGAN open source.

The Current State: What Can Computer Vision Do Now?

Computer vision recognizing a bus 

As you can see from the history of computer vision, in roughly seventy years science fiction became reality. You can place your apps on standalone edge devices, with no need for cloud computing resources, high-power hardware, constant connectivity, and so on.

With the plethora of training data and models being developed today, CV applications can identify, recognize, and track nearly any object. You can now use it to identify just about anything humans can see. You can also use it to identify things the human eye can't detect, such as minuscule defects in highly refined products or cancerous cells during a medical procedure.

An Eye Towards the Future

In general, it's clear that the world is heading towards using more computer vision-enabled devices, especially at the edge. Some estimates put the worldwide total of security cameras at around 350 million. If you add that to all the existing cameras on smartphones, smart home devices and so forth, it represents a major opportunity for businesses and enterprises to embed deep learning, decision-making capabilities into their existing equipment.

Computer vision in use on a farm

In the next five years, you will likely see greater adoption of edge CV applications across multiple industries. Security enterprises will rely on drone assistants to provide bird's-eye-view perspectives. Grocery stores will keep customers safe and avoid liability by using cameras to detect spills immediately. Retail stores will optimize the layout of products on their floors and gain valuable data about which areas of a shelf customers focus on the most. Factories will use CV to perform preventative maintenance by catching defects in manufacturing line output early and often.

Here are a few examples in various industries where deep learning CV could play a key role:

  • Security. An automated human-detecting drone flyover of a train-yard could eliminate dangerous patrols.

  • Retail. Existing cameras could detect low stock and notify managers.

  • Entertainment. Cameras and object tracking CV tools could place viewers of an entertainment program in the action they're watching.

  • Health and Wellness. Fitness tracking could be embedded into workout equipment, such as treadmills, to monitor form and suggest changes.

  • Medicine and Medical Devices. Object classification using sensors more sensitive than human vision could assist doctors during diagnosis.

  • Agriculture. Harvesting equipment with object counting technology could immediately establish yield numbers.

  • Manufacturing. Quality-control stations could use AI pattern recognition to find small circuit board defects.

  • Automotive/Transportation. CV tracking and counting tools could feed traffic management systems, enabling more effective rerouting of autonomous vehicles.

  • Smart Home/Consumer Devices. Smart devices could monitor children's breathing and heart rates.

  • Drones/Aerospace. Machine-vision measurement tools could help mechanics increase the accuracy and speed of maintenance checks for aircraft.

Looking farther down the line, you may one day have a CV-powered self-driving car. It would be as ubiquitous a safety feature as airbags and seatbelts are now. Car manufacturers are already incorporating CV applications into vehicles with driver assistance functionality, guiding parallel parking or braking a vehicle when it comes too close to an object.

Additionally, as future generations continue tech innovation, there exists a massive opportunity for implementing CV in everyday life, whether it's caring for kids, trying out a new recipe in the kitchen, or buying new clothes.

Computer Vision Key Terms

The core functions of CV all have to do with how your applications treat images and handle objects. Identifying the use case for computer vision as it applies to your business needs is the first step in using deep learning. As you enter the computer vision field and start to explore building it into your application, here are some important, high-level terms you’ll want to understand:

  • Object. When looking for “things” in your image, you’re looking for objects. An object can be almost anything in your image. You’ll set out to identify that object, understand what class it belongs to, and assign it a label with a particular level of confidence.

  • Model. Models analyze uploaded or inputted data to learn what a particular object looks like. Models require examples of both positive and negative identifications to establish their probability tables for detecting objects. In order for your model to be accurate and efficient, large datasets of quality images or videos are necessary. Therefore, the more data a model receives during training, the more reliable it will be. Deep learning CV platforms are able to provide pre-trained models or enable you to train your own models.
  • Device. Smart devices make decisions using AI tools. This includes edge and IoT devices capable of low-power CV and off-cloud applications such as object detection and tracking.

  • Segmentation. Computer vision applications split images into regions to generate object proposals in a process known as segmentation. Segmentation groups pixels according to shared characteristics, which then helps identify objects. The way a machine segments images is one of the main factors in how quickly it performs other computer vision tasks.

  • Detection. Object detection is a computer vision function that takes image data and uses a trained model to return the probability that a known object is in the frame. Objects have to be detected before they can be classified, tracked, counted or recognized.

  • Tracking. Object tracking identifies objects in live or recorded video and follows their positions as they move throughout the field of view. 

  • Counting. Counting functions return the population count of object proposals in an image, a number usually limited by a preset probability threshold. For example, you could develop an application for a drone that would count the number of damaged plants in a farm after a storm.

  • Classification. AI models are often trained to recognize specific classes of objects, such as people, vehicles or circuit patterns. For example, you might want to use image classification to scan pictures of dogs in social media posts to determine which was the most popular breed in a specific market.

  • Recognition. Object recognition uses detection to find familiar objects, gives them a probability score via classification, and locates them in the frame. Recognition is most familiar from facial recognition, where biometric security systems match a particular face to a known positive occurrence of that face in the past.
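To make a few of these terms concrete, here is a minimal, library-free sketch of how detection results feed counting with a probability threshold. The `Detection` structure and `count_objects` helper are illustrative assumptions (no particular SDK is implied), and the detections shown are mock data rather than real model output:

```python
# Illustrative sketch: counting object proposals above a confidence
# threshold. Detection and count_objects are hypothetical, not a real API.
from dataclasses import dataclass

@dataclass
class Detection:
    label: str         # class assigned by the model (classification)
    confidence: float  # probability score returned by the model
    box: tuple         # (x, y, width, height) location in the frame

def count_objects(detections, label, threshold=0.5):
    """Count proposals for `label` at or above `threshold`."""
    return sum(1 for d in detections
               if d.label == label and d.confidence >= threshold)

# Mock output for a single frame from a trained detector.
frame_detections = [
    Detection("person", 0.92, (10, 20, 40, 80)),
    Detection("person", 0.47, (60, 18, 42, 85)),  # below threshold
    Detection("bus",    0.88, (120, 5, 200, 90)),
]

print(count_objects(frame_detections, "person"))  # 1: low-confidence hit excluded
print(count_objects(frame_detections, "bus"))     # 1
```

Lowering the threshold trades precision for recall: `count_objects(frame_detections, "person", threshold=0.4)` would count both person proposals, including the uncertain one.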

Challenges Deploying CV on Embedded Devices

A computer vision camera

A company may look to drive down data transfer and storage costs by deploying deep learning capabilities directly on the device that is being manufactured or sold. Whether it is hardware limitations, connectivity issues, environmental factors or resource constraints, building deep learning applications for edge devices can be a significant challenge. Developers need to consider the size of their models, processing requirements for their app, and several other factors when working on an embedded device. It is also important to think about how a company can scale their application from prototype to hundreds (or thousands) of devices.

These challenges are being met by new entrants into the computer vision space that keep processing requirements low and file sizes down, and that leverage API platforms to plug into core CV functions without a time-intensive development process.

How Do I Get Started? 

Create your account here as a first step to understanding how easily you can implement CV into your app. You’ll get access to our API platform, SDK, and pre-trained models to simply bring deep learning to your dev board.

A Look at Our Platform

We want to give all developers across any enterprise or industry the tools needed to deploy CV in a simple and affordable way without needing to dive into the deep inner workings of deep learning. Our robust SDK, pre-trained model libraries and open API platform make integrating CV a straightforward process.

What we promise to you:

  • An intuitive, easy-to-use API platform.
  • A platform that gives you the power to apply deep learning CV to your specific business needs.
  • A system that interfaces and supports a broad range of embedded and IoT devices.
  • Pre-trained models or the ability to use your own custom training data.
  • A robust SDK to get you up-and-running quickly.
  • Simple APIs that let you use complex, highly technical frameworks.
  • The ability to implement core CV services quickly, such as object counting, tracking, classification, and detection.

Join us! We're writing the history of computer vision.

We'd love for you to test out our platform. It will work on any 32- or 64-bit ARM-based developer board running Linux. We can’t wait to see what you build... Start deploying your CV solutions as quickly and easily as possible!


