「经济学人+音频」人工智能助力机器人更快学习新动作 | 外刊阅读


在人工智能技术的引领下,机器人正在经历一场变革。

外刊原文
Large behaviour models

Robots can learn new actions faster thanks to AI techniques
They could soon show their moves in settings from car factories to care homes

INSIDE THE robotics laboratory of the Toyota Research Institute (TRI) in Cambridge, Massachusetts, a group of robots are busy cooking. There is nothing special about that; robotic chefs have been around for a while. But these robots are more proficient than most, flipping pancakes, slicing vegetables and making pizzas with ease. The difference is that instead of being laboriously programmed to carry out their tasks, the Cambridge robots have been taught only a basic set of skills. Using the wonders of artificial intelligence (AI), they quickly improved upon those skills to become far more dexterous.
Despite their extraordinary culinary capabilities, these robots are not destined for a career in catering. “If you give a robot the confidence to work in a kitchen, it will also have the confidence to work in a factory or a person’s home,” says Gill Pratt, Toyota’s chief scientist. Cooking involves lots of complex tasks, such as picking up and placing items, pouring liquids and mixing ingredients. All this makes a kitchen an ideal training ground for experimenting with a new method of using generative AI to train robots known as “diffusion policy”.
Diffusion, already used to help AI models generate images, has been developed as a way to speed up the training of robots by TRI and roboticists at Columbia University and the Massachusetts Institute of Technology (MIT). To explain how diffusion works, Russ Tedrake, TRI’s vice-president of robotics research and a professor at MIT, uses a typical kitchen task: teaching a robot how to load a dishwasher, once its fellow machines are done with their cooking.
Traditionally, robots are programmed with reams of computer code. This can be produced manually or created by remotely moving the robot’s arms and hands to replicate the actions required. A robot in Cambridge, equipped with camera eyes and touch sensors to provide feedback, was taught in the remote-control manner to pick up dishes and stack them in the dishwasher. This involved about 100 such demonstrations, each slightly different, to deal with the various items and how they should be stacked.
Yet even 100 demonstrations are not enough to cover every eventuality, which is where diffusion comes in. The process is a bit like learning how to build a gizmo by taking it apart and trying to reassemble it. For image generation, this involves adding random “noise” to a picture until it becomes unrecognisable and then reversing the process to learn the steps involved in generating a new, realistic image.
For robot training, the AI uses the actions it has been taught to randomly generate potential new movements, which are then refined into useful actions that can deal with new environments. That could be how to pick up a plate placed at an unusual angle or an oddly shaped bowl. The robot will keep trying new actions until it succeeds in its task. By using diffusion Dr Tedrake says it was possible to train a robot in a couple of hours to load a dishwasher, whereas programming one conventionally would have taken a year or more.
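The noise-then-denoise loop described in the two paragraphs above can be sketched in a few lines. This is a toy numpy illustration of the general diffusion idea only, not TRI's actual diffusion-policy implementation: the function names and the stand-in "denoiser" (which simply steps back towards a known demonstration instead of using a trained network) are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def add_noise(action, steps=10, scale=0.1):
    """Forward process: gradually corrupt a demonstrated action
    by adding random Gaussian noise, step by step."""
    noisy = action.copy()
    for _ in range(steps):
        noisy = noisy + rng.normal(0.0, scale, size=action.shape)
    return noisy

def denoise_towards(noisy, target, steps=10, rate=0.3):
    """Reverse process: a trained model would predict the noise to
    remove at each step; here a known demonstration stands in as
    the target, so each step moves a fraction of the way back."""
    sample = noisy.copy()
    for _ in range(steps):
        sample = sample + rate * (target - sample)
    return sample

demo = np.array([0.0, 0.5, 1.0])           # a demonstrated arm trajectory
corrupted = add_noise(demo)                # trajectory after the forward pass
recovered = denoise_towards(corrupted, demo)

# The recovered action ends up far closer to the demonstration
# than the corrupted one.
print(np.linalg.norm(recovered - demo) < np.linalg.norm(corrupted - demo))
```

In the real method, the reverse step is learned from the roughly 100 demonstrations, so the model can denoise towards actions it was never shown, such as grasping a plate at an unusual angle.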
Having got diffusion to work for a variety of different tasks, the researchers are now trying to bring hundreds of such tasks together into what they call a large behaviour model (LBM). This will be analogous to a large language model (LLM), which is used to power AI services like ChatGPT. Instead of generating answers to questions based on information which an LLM has been trained on, an LBM contains sets of behaviours which can be used to generate new behaviours. In its simplest form, this means the skills involved in picking groceries from supermarket shelves (which one of the Cambridge robots has learned how to do) can also be used to select components in a factory making cars.
These new skills, once acquired, can then be transferred wirelessly from one robot to another using what is called “fleet learning”. This will also help speed up robot training. In time, even basic training could be made faster and simpler. Instead of having someone move its limbs remotely, the robot could simply watch someone demonstrate how a job is done.
To further this work the TRI, which is based in Silicon Valley, recently teamed up with Boston Dynamics. Widely seen as one of the world’s leaders in developing walking robots, Boston Dynamics is working on a lighter and smaller version of Atlas, its hulking humanoid, which can run, jump and even perform cartwheels. The new Atlas will provide an agile robot which the TRI aims to equip with an LBM.
The idea is that, initially at least, these robots will be deployed in factories, most likely making vehicles (both the TRI and Boston Dynamics are part of big carmakers: Toyota is Japan’s largest carmaker and in 2021 Hyundai, a big South Korean producer, bought a majority stake in Boston Dynamics). Factories are a relatively structured environment in which automation is already widely used, which makes the introduction of AI-powered humanoids easier. Humanoids are widely seen as the most efficient shape to use in a human-built environment, rather than wheeled or tracked robots. The same is true in homes.
Eventually the car factories could themselves mass-produce robots, which would bring prices down and allow their introduction into other areas, such as helping care for the elderly and people with disabilities. Elon Musk appears to have a similar strategy planned for Optimus, a humanoid AI-powered robot being developed by his electric-car company, Tesla. Mr Musk has not revealed any details about the form of AI which Tesla is using.
All this may seem to herald a future in which humans are no longer required in factories. But, says Dr Pratt, as manufacturing becomes more flexible, and a greater variety of products are made on the same line, factories will become ever more reliant on a human workforce to manage the changes and maintain the robots. Many hands make light work.

词汇 & 表达
1. dexterous  /ˈdekstrəs/   adj.  灵巧的;熟练的;敏捷的
《纽约时报》例句:
She was patient, keen-eyed and dexterous. 她很有耐心、目光敏锐且手巧。
《经济学人》例句:
Since then, they have got vastly more dexterous, mobile and autonomous. 后来,它们变得灵巧、机动和自主多了。
《金融时报》例句:
All today’s surgical robots work as dexterous assistants, helping human surgeons who control their movements. 现今的所有手术机器人都是充当灵巧的助手,辅助操控它们的外科医生。
《嘉莉妹妹》(Sister Carrie)例句:
He was rather dexterous in avoiding everything that would suggest that he knew anything of Carrie's past. He kept away from personalities altogether, and confined himself to those things which did not concern individuals at all.
他态度圆活,避开任何让人看出他知道嘉莉过去的话题。他的谈话完全不涉及个人,只说些和任何人无关的事情。
2. destined /ˈdestɪnd/  adj. (尤指)命中注定的 (~to);去往某地的 (~for)
词典例句:We seem destined never to meet. 我们似乎是命中注定无缘相见。
《经济学人》在一篇报道立陶宛垃圾回收计划的文章中有一个例句:
In Britain alone households generate 30% more waste, an extra 3m tonnes, in the month over Christmas. Most is destined for landfill.
仅在英国,普通家庭在圣诞节后一个月内产生的垃圾比平时多了30%(相当于额外产生300万吨垃圾)。其中大多数垃圾被运往填埋场。
3. eventuality  /ɪˌventʃuˈæləti/ n. (尤指令人不快的)可能发生的事情,可能出现的结果
for an/every eventuality 表示以防万一、未雨绸缪
词典例句:
We were prepared for every eventuality. 我们准备应付任何可能出现的情况。
The money had been saved for just such an eventuality. 钱积攒下来就是为应付这样的意外。
4. gizmo /ˈɡizməʊ/ n. 新玩意儿,小物件,小装置
5. hulking  /ˈhʌlkɪŋ/ adj.  庞大而笨重的
6. humanoid  /ˈhjuːmənɔɪd/ n. 仿真机器人,人形机器人


参考译文
大型行为模型 
人工智能助力机器人更快学习新动作 
从汽车工厂到养老院,机器人即将展示全新技能
在美国马萨诸塞州剑桥市的丰田综合研究所(Toyota Research Institute,TRI)机器人实验室内,一群机器人正忙于烹饪。这本身并不稀奇;烹饪机器人早已存在。但这些机器人比大多数同类都更熟练,能轻而易举地翻煎饼、切蔬菜、制作披萨。与众不同的是,这些机器人并非通过繁复编程获得技能,而是仅仅学习了基本技巧,随后借助人工智能( AI )迅速提升了这些技能,变得灵巧得多。
尽管烹饪本领非凡,这些机器人并非为餐饮业而生。"当一个机器人能自信地在厨房工作,它也就能自信地在工厂或家庭中工作,"丰田首席科学家 Gill Pratt 说。烹饪包含诸多复杂任务,如拿取和放置物品、倾倒液体和搅拌原料,这使得厨房成为尝试全新 AI 训练方法"扩散策略"的理想场所。
"扩散"技术最初用于 AI 图像生成,现已被 TRI 及哥伦比亚大学和麻省理工学院的机器人学家开发,用于加速机器人训练。麻省理工学院机器人研究副总裁兼教授 Russ Tedrake 用装载洗碗机这一典型厨房任务来解释"扩散"原理。
传统上,机器人要靠大量计算机代码来编程,这些代码可以手动编写,也可以通过远程操控机器人的手臂和手来复制所需动作而生成。剑桥实验室的一个机器人配备了摄像头"眼睛"和触觉传感器以提供反馈,通过这种遥控方式学会了拿取餐具并将其码放进洗碗机。这一过程共进行了约100次各有细微差异的示范,以应对各种餐具及其堆放方式。
然而,即使100次示范也无法涵盖所有可能情况,这正是"扩散"技术发挥作用的地方。这个过程有点像通过拆解并重组一个装置来学习如何制造它。在图像生成中,这意味着向图像添加随机"噪声"直至其无法识别,然后再反转这一过程,从中学习生成一张全新逼真图像的步骤。
对于机器人训练,AI 会利用已学习的动作随机生成潜在新动作,随后将其提炼为适用于新环境的有效动作。这可能意味着能够拿起放置在异常角度的盘子或形状奇特的碗。机器人将不断尝试新动作直至成功完成任务。Tedrake 博士表示,通过使用扩散技术,训练一个机器人装载洗碗机只需几小时,而传统编程可能需要一年或更长时间。
研究人员已成功将扩散技术应用于多种不同任务,目前正尝试将数百个任务整合为所谓的大型行为模型( LBM )。这类似于大型语言模型( LLM ),后者支持诸如 ChatGPT 等 AI 服务。与 LLM 根据训练信息生成答案不同,LBM 包含可用于生成新行为的行为集。简单来说,这意味着在超市货架上拣选杂货的技能(剑桥机器人已学会)同样可用于汽车工厂挑选零部件。
这些新技能一旦习得,可通过"群体学习"方式无线传输给其他机器人,从而进一步加速机器人训练。假以时日,连基础训练也可能变得更快、更简单:届时不再需要有人远程移动机器人的肢体,机器人只需观看他人示范如何完成工作即可。
为推进这项工作,总部位于硅谷的 TRI 最近与波士顿动力公司(Boston Dynamics)结成联盟。波士顿动力公司被广泛视为行走机器人开发领域的全球领军者之一,目前正在开发其庞大的人形机器人 Atlas 的更轻、更小版本,这种机器人能奔跑、跳跃,甚至能做侧手翻。TRI 计划为这款灵活的新 Atlas 配备大型行为模型。
初期,这些机器人很可能部署在工厂,尤其是汽车制造工厂(丰田综合研究所和波士顿动力公司都隶属大型汽车制造商:丰田是日本最大的汽车制造商,而韩国大型汽车制造商现代汽车于2021年收购了波士顿动力公司的多数股权)。工厂是相对结构化的环境,自动化已在其中广泛应用,这使引入 AI 驱动的人形机器人更为容易。相比轮式或履带式机器人,人形机器人被广泛认为是在人类建造的环境中最高效的形态。家庭环境亦是如此。
最终,汽车工厂可能会批量生产机器人,这将降低成本,并使其进入其他领域,如照顾老年人和残疾人。特斯拉首席执行官埃隆·马斯克似乎也计划通过其电动汽车公司开发的 Optimus 人形 AI 机器人实现类似战略。马斯克尚未透露特斯拉正在使用的 AI 形式的任何细节。
这一切看似预示着人类在工厂中将不再必要。但正如 Pratt 博士指出的,随着制造业变得更加灵活,同一生产线上生产的产品种类不断增加,工厂将愈发依赖人类员工来应对这些调整并维护机器人。众人拾柴火焰高。

END
感谢大家阅读,Enjoy!
来源:The Economist,Nov. 30-Dec. 6, 2024
非官方译文,仅供参考
往期阅读
「经济学人」马斯克变了!Elon Musk’s transformation, in his own words  | 外刊阅读
如何理解剑桥词典2024年度词汇manifest? | 外刊阅读
最新一期「时代周刊」每天改善大脑健康的5种方法 | 外刊阅读
介绍一本金句频出、提升认知的英语原版书