Computer Vision-based imaging has become a powerful tool because it recognizes not just a monitored object but the context of everything around it.
Why is this such a boon? Because technology like this is best used to eliminate the boring, repetitive jobs that humans don’t enjoy doing anyway. That’s why companies are using Computer Vision to automate mundane tasks like sorting green apples from red apples or separating recyclable items from the trash.
Recently, Microsoft’s Azure Cognitive Services made news when it announced a new service that lets developers automatically generate captions for images. This latest addition to the cognitive intelligence system, which leverages Computer Vision technology, can reportedly generate image captions that in many cases describe an image better, or more accurately, than captions written by humans. A caption generated by a machine for a machine can also be more effective for search, making your Bing or Google results more relevant and helping drive organic traffic to your webpage.
In addition, Computer Vision-based image captioning is a big milestone because it leverages AI systems that are now beginning to detect, understand, and describe an action or motion within the context of everything else around it. It accomplishes this by using deep learning to detect what an item is and what action it is performing, then uses Natural Language Generation (NLG) to describe it.
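The two-stage pipeline described above – detect first, then describe – can be sketched at a toy level. In this illustrative Python snippet, the detector output is simply hard-coded (a real system would get these labels from a deep learning model), and a simple template-based step stands in for full NLG:

```python
# Toy sketch of the detect-then-describe captioning pipeline.
# The "detections" dictionary stands in for a deep learning model's output;
# the keys (subject, action, context) are illustrative assumptions.

def generate_caption(detections):
    """Turn detector output (subject, action, context) into a caption."""
    subject = detections.get("subject", "an object")
    action = detections.get("action")
    context = detections.get("context")
    caption = f"A {subject}"
    if action:
        caption += f" {action}"
    if context:
        caption += f" {context}"
    return caption + "."

print(generate_caption({"subject": "dog",
                        "action": "catching a frisbee",
                        "context": "in a park"}))
# A dog catching a frisbee in a park.
```

In production, the template step would be replaced by a learned language model, but the division of labor – vision model produces labels, language model produces the sentence – is the same.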
As a subsegment of Artificial Intelligence, Computer Vision is a breakthrough because it replicates the “visual” intelligence of the human brain, which is critical for computers to gain a more robust understanding of their environment in the same way that humans do. In fact, another similarity to human behavior is that modern Computer Vision relies on deep learning algorithms, a.k.a. neural networks, to understand the objects it sees. These neural networks learn from massive amounts of visual data, finding patterns that let them arrive at a highly educated guess about what a certain object actually is. The algorithms are inspired by our understanding of how brains function, in particular the interconnections between neurons in the cerebral cortex.
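To make the idea of pattern learning concrete, here is a deliberately tiny illustration: a single artificial neuron adjusting its weights until its guesses match a set of labeled examples. Production Computer Vision networks contain millions of such neurons, but the learning loop follows the same principle.

```python
# A single artificial neuron (perceptron) learning a simple pattern
# from labeled examples by nudging its weights after each mistake.

def train_neuron(samples, epochs=20, lr=0.1):
    w = [0.0, 0.0]  # one weight per input
    b = 0.0         # bias term
    for _ in range(epochs):
        for (x1, x2), label in samples:
            guess = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            err = label - guess          # 0 when the guess was right
            w[0] += lr * err * x1        # nudge weights toward the answer
            w[1] += lr * err * x2
            b += lr * err
    return w, b

# Pattern to learn: output 1 if either input is 1 (logical OR)
samples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
w, b = train_neuron(samples)
predict = lambda x1, x2: 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
print([predict(x1, x2) for (x1, x2), _ in samples])  # [0, 1, 1, 1]
```

Real image networks stack many layers of these units and learn from pixels rather than two binary inputs, but "adjust the weights until the guesses match the data" is the core of what deep learning does.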
There are many Computer Vision applications businesses are leveraging to automate or streamline processes. In healthcare, Computer Vision systems can detect some cancers in CT scans with accuracy rivaling that of trained specialists. In highly secure environments, retinal and fingerprint scanning can uniquely identify individuals to enable or restrict access. Wind turbines can be inspected for defects by autonomous drones carrying high-definition cameras. But perhaps the most practical application of Computer Vision is found in retail – package management and billing.
A critical aspect of package management involves reading labels with unique tracking numbers or identifiers on each individual item. These labels are placed on every item, and reading them – either with handheld scanners or by having a person manually enter the label information into a database – dramatically slows the entire delivery process. Utilizing the unique capabilities of Computer Vision, however, enables systems to read and record these labels automatically, eliminating the need for handheld scanners or manual data entry by staff. LabelPack is a good example of an automated label registration process that offers one hundred percent item visibility across the entire chain while reducing human intervention and costs.
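As an illustrative sketch of the label-reading step: assume an OCR engine (such as the open-source Tesseract) has already converted a camera frame of a shipping label into raw text. A hypothetical tracking-number format – two letters followed by nine digits, an assumption for this example and not any real carrier’s spec – can then be pulled out with a regular expression and recorded:

```python
# Sketch: extract tracking identifiers from OCR output of a shipping label.
# The two-letters-plus-nine-digits format is a made-up example format.
import re

TRACKING_PATTERN = re.compile(r"\b[A-Z]{2}\d{9}\b")

def extract_tracking_numbers(ocr_text):
    """Return all tracking identifiers found in raw OCR text."""
    return TRACKING_PATTERN.findall(ocr_text)

ocr_text = """SHIP TO: 123 Main St
TRACKING: AB123456789
Handle with care"""
print(extract_tracking_numbers(ocr_text))  # ['AB123456789']
```

In a deployed system, this parsing step would sit downstream of the camera and OCR stages, feeding each extracted identifier straight into the tracking database with no scanner or typist in the loop.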
Computer Vision is also helping to automate the supply chain and redirect humans to more complicated tasks. A good example is Position Imaging, which is applying its Amoeba Computer Vision technology to help multi-family property managers automate the package handling process and redirect staff to managing residents, rather than packages. This gives residents a better experience because they no longer have to wait for, or contact, staff to pick up their packages. Couriers deliver packages directly to the Smart Package Room, where the Amoeba Computer Vision technology virtually tags and monitors the location of each package, essentially keeping eyes on it 24/7 until the owner picks it up.
Logistics companies can also use Computer Vision to audit the dimensions of packages traveling through their hubs, and senders can use it to measure package dimensions easily and accurately before shipping. When senders don’t know how to measure a package properly, their expectation of what shipping will cost can diverge sharply from what they are actually billed. By automating the manual task of measuring package dimensions, logistics companies can streamline and enhance the customer experience while reducing costs. On the data side, tools like Py-tesseract can extract the text from an image of a scanned document so it can be processed further downstream.
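As a rough sketch of the dimension-audit idea: suppose a vision system has measured a package’s bounding box in pixels, and a calibration factor (pixels per centimeter, from a reference marker in the frame) is known. The pixel measurements can be converted to centimeters and turned into a dimensional (“volumetric”) weight, a figure carriers commonly bill against. The divisor of 5000 used here is one common convention; actual values vary by carrier, and the calibration numbers are illustrative:

```python
# Sketch: convert vision-measured pixel dimensions to centimeters,
# then compute a dimensional (volumetric) weight for billing.

def package_dimensions_cm(px_dims, px_per_cm):
    """Convert (length, width, height) in pixels to centimeters."""
    return tuple(round(d / px_per_cm, 1) for d in px_dims)

def dimensional_weight_kg(dims_cm, divisor=5000):
    """Dimensional weight in kg: (L * W * H in cubic cm) / divisor."""
    length, width, height = dims_cm
    return round(length * width * height / divisor, 2)

# Example: a box measured as 600 x 400 x 300 pixels at 10 px/cm
dims = package_dimensions_cm((600, 400, 300), px_per_cm=10.0)
print(dims)                          # (60.0, 40.0, 30.0)
print(dimensional_weight_kg(dims))   # 14.4
```

Running this check automatically at the sender’s side – before the package enters the network – is what closes the gap between the quoted and the billed shipping cost.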
Computer Vision, as a branch of Artificial Intelligence, is streamlining many once-mundane human processes, and we have only scratched the surface. The use of Computer Vision technology will soon be widespread, and access to information will become that much easier. That information will be leveraged, in large part, to improve the package logistics trail and the shopper experience – streamlining shopping while helping retailers transform their own logistics and focus on customers, not fulfillment. As Forbes contributor Rob Toews recently said, “A wave of billion-dollar Computer Vision startups is coming.” And this technology has the vision to unburden us of many unwanted tasks.
Ned Hill is the founder and CEO of Position Imaging (PI), a pioneer in the field of advanced tracking technologies. Under Ned’s strategic vision and guidance, PI has developed an industry-leading tracking solution, applied computer vision and laser guidance to simplify item delivery, and created unique AI-based technologies. Together, these improve logistics efficiency and provide continuous visibility of items at any stage in the process. Ned has raised close to $20 million in funding, driven product development, and created a partner ecosystem of industry leaders in hardware (Hitachi-LG Data Storage, Intel), software (Microsoft, Salesforce), solutions (Zebra, Lozier), and service (Bell and Howell). Ned is the inventor or co-inventor of over 50 patents and patent applications, and a speaker at industry conferences including CES, Live Free and Start, and events at MIT.