孙宏滨

Personal profile

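Professor and doctoral supervisor in the Department of Chemistry, College of Sciences, Northeastern University. Research areas include electrochemical energy storage, hydrogen energy, catalytic chemistry, synthesis processes for fine chemicals, composite nanomaterials, and water treatment. Has led a National Key R&D Program project, National Natural Science Foundation of China grants, and other national- and provincial/ministerial-level research projects, as well as multiple industry-funded projects. Has published more than 130 papers in well-known domestic and international journals, including Nat. Commun., Angew. Chem., and Adv. Funct. Mater., ...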

Publications

Viewpoint planning with transition management for active object recognition

Date posted: 2024-07-18

First author: Haibo Sun

Corresponding author: Feng Zhu

Co-authors: Yangyang Li, Pengfei Zhao, Yanzi Kong, Jianyu Wang, Yingcai Wan, Shuangfei Fu

Journal: Frontiers in Neurorobotics

Volume: 17

DOI: 10.3389/fnbot.2023.1093132

Affiliation: Faculty of Robot Science and Engineering, Northeastern University

Teaching and research section: Physical Chemistry

Journal location: Switzerland

Abstract: Active object recognition (AOR) provides a paradigm where an agent can capture additional evidence by purposefully changing its viewpoint to improve the quality of recognition. One of the most concerned problems in AOR is viewpoint planning (VP), which refers to developing a policy to determine the next viewpoints of the agent. A research trend is to solve the VP problem with reinforcement learning, namely to use the viewpoint transitions explored by the agent to train the VP policy. However, most research discards the trained transitions, which may lead to an inefficient use of the explored transitions. To solve this challenge, we present a novel VP method with transition management based on reinforcement learning, which can reuse the explored viewpoint transitions. To be specific, a learning framework of the VP policy is first established via the deterministic policy gradient theory, which provides an opportunity to reuse the explored transitions. Then, we design a scheme of viewpoint transition management that can store the explored transitions and decide which transitions are used for the policy learning. Finally, within the framework, we develop an algorithm based on twin delayed deep deterministic policy gradient and the designed scheme to train the VP policy. Experiments on the public and challenging dataset GERMS show the effectiveness of our method in comparison with several competing approaches.

Keywords: active object recognition, viewpoint planning, deterministic policy gradient, twin delayed deep deterministic policy gradient, viewpoint transition management, reinforcement learning

Paper number: WOS:000950219200001

Discipline category: Science

First-level discipline: Chemistry

Page range: 1093132

ISSN: 1662-5218
