First author: Haibo Sun
Corresponding author: Feng Zhu
Co-authors: Yangyang Li, Pengfei Zhao, Yanzi Kong, Jianyu Wang, Yingcai Wan, Shuangfei Fu
Journal: Frontiers in Neurorobotics
Volume: 17
DOI: 10.3389/fnbot.2023.1093132
Affiliation: Faculty of Robot Science and Engineering, Northeastern University
Teaching and research section: Physical Chemistry
Journal location: Switzerland
Abstract: Active object recognition (AOR) provides a paradigm where an agent can capture additional evidence by purposefully changing its viewpoint to improve the quality of recognition. One of the most concerned problems in AOR is viewpoint planning (VP), which refers to developing a policy to determine the next viewpoints of the agent. A research trend is to solve the VP problem with reinforcement learning, namely to use the viewpoint transitions explored by the agent to train the VP policy. However, most research discards the trained transitions, which may lead to an inefficient use of the explored transitions. To solve this challenge, we present a novel VP method with transition management based on reinforcement learning, which can reuse the explored viewpoint transitions. To be specific, a learning framework of the VP policy is first established via the deterministic policy gradient theory, which provides an opportunity to reuse the explored transitions. Then, we design a scheme of viewpoint transition management that can store the explored transitions and decide which transitions are used for the policy learning. Finally, within the framework, we develop an algorithm based on twin delayed deep deterministic policy gradient and the designed scheme to train the VP policy. Experiments on the public and challenging dataset GERMS show the effectiveness of our method in comparison with several competing approaches.
Keywords: active object recognition, viewpoint planning, deterministic policy gradient, twin delayed deep deterministic policy gradient, viewpoint transition management, reinforcement learning
Paper number: WOS:000950219200001
Discipline category: Science
First-level discipline: Chemistry
Page range: 1093132
ISSN: 1662-5218
Translated work: No