Computer vision applications: The power and limits of deep learning

facial recognition mobile app
Image credit: Depositphotos

This article is part of Demystifying AI, a series of posts that (try to) disambiguate the jargon and myths surrounding AI.

Since the early days of artificial intelligence, computer scientists have been dreaming of creating machines that can see and understand the world as we do. The efforts have led to the emergence of computer vision, a vast subfield of AI and computer science that deals with processing the content of visual data.

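In recent years, computer vision has taken great leaps thanks to advances in deep learning and artificial neural networks. Deep learning is a branch of AI that is especially good at processing unstructured data such as images and videos.

These advances have paved the way for boosting the use of computer vision in existing domains and introducing it to new ones. In many cases, computer vision algorithms have become a very important component of the applications we use every day.

A few notes on the current state of computer vision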

Before becoming too excited about advances in computer vision, it’s important to understand the limits of current AI technologies. While improvements are significant, we are still very far from having computer vision algorithms that can make sense of photos and videos in the same way as humans do.

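For the moment, deep neural networks, the meat and potatoes of computer vision systems, are very good at matching patterns at the pixel level. They’re particularly efficient at classifying images and localizing objects in images. But when it comes to understanding the context of visual data and describing the relationships between different objects, they fail.

Recent work done in the field shows the limits of computer vision algorithms and the need for new evaluation methods. Nonetheless, the current applications of computer vision show how much can be accomplished with pattern matching alone. In this post, we’ll explore some of these applications, but we’ll also discuss their limits.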

Commercial applications of computer vision

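You’re using computer vision applications every day, perhaps without noticing it in some cases. Here are some of the practical and popular applications of computer vision that make life fun and convenient.

Image search

One of the areas where computer vision has made huge progress is image classification and object detection. A neural network trained on enough labeled data will be able to detect and highlight a wide range of objects with impressive accuracy.

Few companies can match Google’s vast store of user data. And the company has been using its virtually limitless (and ever-growing) repository of user data to develop some of the most efficient AI models. When you upload photos to Google Photos, it uses its computer vision algorithms to annotate them with content information about scenes, objects, and persons. You can then search your images based on this information.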

For instance, if you search for “dog,” Google will automatically return all images in your library that contain dogs.

Google Photos image search
Google uses machine learning and computer vision to search the content of your images, even if you haven’t tagged them.
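The annotate-then-search flow described above can be illustrated with a small sketch. This is not how Google Photos is implemented; the file names, the labels, and the `build_label_index` and `search` helpers are all hypothetical, and the labels are hard-coded where a real system would get them from a trained vision model.

```python
from collections import defaultdict

# Hypothetical annotator output: file name -> labels predicted for that image.
# In a real system these labels would come from a trained vision model.
annotations = {
    "IMG_001.jpg": ["dog", "park", "person"],
    "IMG_002.jpg": ["beach", "sunset"],
    "IMG_003.jpg": ["dog", "sofa"],
}


def build_label_index(annotations):
    """Map each label to the sorted list of images that carry it."""
    index = defaultdict(set)
    for image, labels in annotations.items():
        for label in labels:
            index[label.lower()].add(image)
    return {label: sorted(images) for label, images in index.items()}


def search(index, query):
    """Return the images whose labels include the query term (case-insensitive)."""
    return index.get(query.strip().lower(), [])


index = build_label_index(annotations)
print(search(index, "dog"))  # ['IMG_001.jpg', 'IMG_003.jpg']
```

The search itself is a plain lookup. All of the difficulty, and all of the potential for error, sits in the model that produces the labels.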

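Google’s image recognition isn’t perfect, however. In one incident, the computer vision algorithm mistakenly tagged a picture of two dark-skinned people as “gorillas,” causing embarrassment for the company.

Google also uses computer vision to extract text from images in your library, Drive, and Gmail attachments. For instance, when you search for a term in your inbox, Gmail will also look in the text in images. A while ago, I searched for my home address in Gmail and received an email containing an image attachment that showed an Amazon package with my address on it.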

Image editing and enhancement

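Many companies are now using machine learning to provide automatic enhancements to photos. Google’s line of Pixel phones uses on-device neural networks to make automatic enhancements such as white balance and to add effects such as blurring the background.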

Another remarkable improvement that advances in computer vision have ushered in is smart zooming. Traditional zooming features usually make images blurry because they fill the enlarged areas by interpolating between pixels. Instead of enlarging pixels, computer vision–based zooming focuses on features such as edges and patterns. This approach results in crisper images.
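To see why interpolation blurs, consider a minimal sketch that upscales a single row of pixels containing a sharp black-to-white edge. The `upscale_linear` helper is hypothetical and written for illustration. Real image editors work in two dimensions and often use more elaborate filters, but the effect is the same in kind.

```python
def upscale_linear(row, factor):
    """Upscale a 1-D row of pixel intensities by linear interpolation."""
    out = []
    last = len(row) - 1
    for i in range(last * factor + 1):
        pos = i / factor                # position in source-pixel coordinates
        left = int(pos)                 # nearest source pixel to the left
        right = min(left + 1, last)     # nearest source pixel to the right
        t = pos - left                  # fractional distance from the left pixel
        out.append((1 - t) * row[left] + t * row[right])
    return out


edge = [0, 0, 255, 255]                 # a hard edge between black and white
zoomed = upscale_linear(edge, 2)
print(zoomed)  # [0.0, 0.0, 0.0, 127.5, 255.0, 255.0, 255.0]
```

The gray value 127.5 did not exist in the original row. Interpolation invented it to fill the gap, and a band of such in-between values is what the eye perceives as a soft, blurry edge.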

Many startups and longstanding graphics companies have turned to deep learning to make enhancements to images and videos. Adobe’s Enhance Details technology, featured in Lightroom CC, uses machine learning to create sharper zoomed images.

machine learning image enhancement adobe lightroom
Adobe uses deep learning to enhance the details of zoomed images.

The image-editing tool Pixelmator Pro sports an ML Super Resolution feature, which uses a convolutional neural network to provide crisp zooming and enhancement.

Facial recognition applications

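Until not long ago, facial recognition was a clunky and expensive technology limited to police research labs. But in recent years, thanks to advances in computer vision algorithms, facial recognition has found its way into various computing devices.

iPhone X introduced FaceID, an authentication system that uses an on-device neural network to unlock the phone when it sees its owner’s face. During setup, FaceID trains its AI model on the face of the owner, and it works under different lighting conditions and with facial hair, haircuts, hats, and glasses.

In China, many stores are now using facial recognition technology to provide a smoother payment experience to customers (at the price of their privacy, though). Instead of using credit cards or mobile payment apps, customers only need to show their face to a computer vision–equipped camera.

Despite the advances, however, current facial recognition is not perfect. AI and security researchers have found numerous ways to cause facial recognition systems to make mistakes. In one case, researchers at Carnegie Mellon University showed that by wearing specially crafted glasses, they could fool facial recognition systems into mistaking them for celebrities.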

AI adversarial attack on facial recognition
Researchers at Carnegie Mellon University discovered that by donning special glasses, they could fool facial recognition algorithms into mistaking them for celebrities (Source: http://www.cs.cmu.edu)

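Data-efficient home security

With the chaotic growth of the internet of things (IoT), internet-connected home security cameras have grown in popularity. You can now easily install security cameras and monitor your home online at any time.

Each camera sends a lot of data to the cloud. But most of the footage recorded by security cameras is irrelevant, causing a large waste of network, storage, and electricity resources. Computer vision algorithms can enable home security cameras to become more efficient in their use of these resources.

Smart cameras remain idle until they detect an object or movement in their video feed, after which they can start sending data to the cloud or sending alerts to the camera’s owner. Note, however, that computer vision is still not very good at understanding context. So don’t expect it to tell the difference between benign motion (e.g., a ball rolling across the room) and things that need your attention (e.g., a thief breaking into your home).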
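A simple way to picture the "stay idle until something moves" behavior is frame differencing: compare each frame to the previous one and wake up only when enough pixels have changed. The sketch below is an assumption about how such a trigger could work, not a description of any particular camera. The function names and both thresholds are hypothetical, and frames are plain lists of grayscale values.

```python
def changed_fraction(prev_frame, frame, pixel_threshold=25):
    """Fraction of pixels whose intensity changed by more than pixel_threshold."""
    changed = 0
    total = 0
    for prev_row, row in zip(prev_frame, frame):
        for before, after in zip(prev_row, row):
            total += 1
            if abs(after - before) > pixel_threshold:
                changed += 1
    return changed / total if total else 0.0


def should_upload(prev_frame, frame, area_threshold=0.05):
    """Wake the camera only if more than area_threshold of the frame changed."""
    return changed_fraction(prev_frame, frame) > area_threshold


# 4x4 grayscale frames: an empty room, the same room with slight sensor
# noise, and the room with a bright object covering two pixels.
empty_room = [[10] * 4 for _ in range(4)]
noisy_room = [[12] * 4 for _ in range(4)]
with_object = [row[:] for row in empty_room]
with_object[1][1] = 200
with_object[1][2] = 200

print(should_upload(empty_room, noisy_room))   # False
print(should_upload(empty_room, with_object))  # True
```

The trigger only sees that pixels changed. It reacts identically to a rolling ball and a burglar, which is exactly the missing context described above.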

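Interacting with the real world

Augmented reality, the technology that overlays real-world videos and images with virtual objects, has become a growing market in the past few years. AR owes much of its expansion to advances in computer vision algorithms. AR apps use machine learning to detect and track the target locations and objects where they place their virtual objects. You can see the combination of AR and computer vision in many applications, such as Snapchat filters and Warby Parker’s virtual try-on.

Computer vision also enables you to extract information from the real world through the lens of your phone’s camera. A very remarkable example is Google Lens, which uses computer vision algorithms to perform a variety of tasks, such as reading business cards, detecting the style of furniture and clothes, translating street signs, and connecting your phone to Wi-Fi networks based on router labels.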

Advanced applications of computer vision

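Thanks to advances in deep learning, computer vision is now solving problems that were previously very hard or even impossible for computers to tackle. In some cases, well-trained computer vision algorithms can perform on par with humans who have years of experience and training.

Medical image processing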

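Before deep learning, creating computer vision algorithms that could process medical images required extensive efforts from software engineers and subject matter experts. They had to cooperate to develop code that extracted relevant features from radiology images and then examined them for diagnosis. (AI researcher Jeremy Howard has an interesting discussion on this.)

Deep learning algorithms provide end-to-end solutions that make the process much easier. Engineers create the right neural network structure and then train it on X-rays, MRI images, or CT scans annotated with their outcomes. The neural network then finds the relevant features associated with each outcome and can diagnose future images with impressive accuracy.

Computer vision has found its way into many areas of medicine, including cancer detection and prediction, radiology, and diabetic retinopathy.

Some AI researchers have gone as far as saying deep learning will soon replace radiologists. But those who have experience in the field beg to differ. There’s much more to diagnosing and treating diseases than looking at slides and images. Let’s not forget that deep learning extracts patterns from pixels; it does not replicate all the functions of a human doctor.

Playing games

Teaching computers to play games has always been a hot area of AI research. Most game-playing programs use reinforcement learning, an AI technique that develops its behavior through trial and error.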

Computer vision algorithms play an important role in helping these programs parse the content of the game’s graphics. One thing to note, however, is that in many cases, the graphics are “dumbed down” or simplified to make it easier for the neural networks to make sense of them. Also, for the moment, AI algorithms need huge amounts of data to learn games. For instance, OpenAI’s Dota-playing AI had to go through 45,000 years’ worth of gameplay to achieve champion level.
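The article does not say how the graphics are simplified for any specific system. One common form of simplification in game-playing research is to convert each frame to grayscale and shrink it before it reaches the neural network. The sketch below illustrates that idea only. The helper names are hypothetical, and the grayscale weights are the standard ITU-R BT.601 luma coefficients.

```python
def to_grayscale(frame):
    """Convert rows of (r, g, b) tuples to rows of luminance values."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row]
        for row in frame
    ]


def downsample(gray, factor):
    """Shrink an image by averaging non-overlapping factor x factor blocks."""
    height = len(gray) // factor
    width = len(gray[0]) // factor
    small = []
    for by in range(height):
        row = []
        for bx in range(width):
            block = [
                gray[by * factor + dy][bx * factor + dx]
                for dy in range(factor)
                for dx in range(factor)
            ]
            row.append(sum(block) / len(block))
        small.append(row)
    return small


# A 4x4 color frame: left half white, right half black.
white, black = (255, 255, 255), (0, 0, 0)
frame = [[white, white, black, black] for _ in range(4)]

simplified = downsample(to_grayscale(frame), 2)
print(simplified)  # [[255.0, 0.0], [255.0, 0.0]]
```

In this toy case, the 48 color values of the original frame collapse into four numbers that still capture the only structure that matters, the boundary between the light and dark halves.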

Cashier-less stores

In 2016, Amazon introduced Go, a store where you could walk in, pick up whatever you want, and walk out without getting arrested for shoplifting. Go used various artificial intelligence systems to obviate the need for cashiers.

As customers move about the store, cameras equipped with advanced computer vision algorithms monitor their behavior and keep track of the items they pick up or return to shelves. When they leave the store, their shopping cart is automatically charged to their Amazon account.

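Three years after the announcement, Amazon has opened 18 stores, and Go is still a work in progress. But there are promising signs that computer vision (with the help of other technologies) will one day make checkout lines a thing of the past.

Self-driving cars

Cars that can navigate roads without human drivers have been one of the longest-standing dreams and biggest challenges of the AI community. Today, we’re still very far from having self-driving cars that can navigate any road in all lighting and weather conditions. But we have made a lot of progress thanks to advances in deep neural networks.

One of the biggest challenges of creating self-driving cars is enabling them to make sense of their surroundings. While different companies are tackling the problem in various ways, one thing they have in common is computer vision technology.

Cameras installed around the vehicle monitor the car’s environment. Deep neural networks parse the footage and extract information about surrounding objects and people. That information is combined with data from other equipment, such as lidars, to create a map of the area and help the car navigate roads and avoid collisions.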

Creepy applications of computer vision

Like all other technologies, not everything about artificial intelligence is pleasant. Advanced computer vision algorithms can scale up malicious uses. Here are some of the applications of computer vision that have caused concern.

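Surveillance

It’s not only phone and computer makers that are interested in facial recognition technology. In fact, the biggest customers of facial recognition technology are government agencies, which have a vested interest in using the technology to automatically identify criminals in security camera footage.

But the question is, where do you draw the line between national security and citizen privacy? China shows how too much of the former and too little of the latter can lead to a surveillance state that gives too much control to the government. The widespread use of security cameras powered by facial recognition technology enables the government to closely track the movements of millions of citizens, whether they are criminal suspects or not.

In the U.S. and Europe, things are a bit more complicated. Tech companies have faced resistance from their employees and digital rights activists over providing facial recognition technology to law enforcement. Some states and cities in the U.S. have banned the public use of facial recognition.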

Autonomous weapons

Computer vision can also give eyes to weapons. Military drones can use AI algorithms to identify objects and pick out targets. In the past few years, there’s been a lot of controversy over the use of AI by the military. Google had to call off the renewal of its contract to develop computer vision technology for the Department of Defense after it faced criticism from its employees.

For the moment, there are still no autonomous weapons. Most military institutions are using AI and computer vision in systems that have a human in the loop.

But there’s fear that with advances in computer vision and greater engagement of the military sector, it’s only a matter of time before we have weapons that choose their own targets and pull the trigger without a human making the decision.

Renowned computer scientist and AI researcher Stuart Russell has founded an organization dedicated to stopping the development of autonomous weapons.
