<?xml version='1.0' encoding='UTF-8'?>
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd">
  <responseDate>2026-03-09T06:14:09Z</responseDate>
  <request verb="GetRecord" metadataPrefix="oai_dc" identifier="oai:muroran-it.repo.nii.ac.jp:02000059">https://muroran-it.repo.nii.ac.jp/oai</request>
  <GetRecord>
    <record>
      <header>
        <identifier>oai:muroran-it.repo.nii.ac.jp:02000059</identifier>
        <datestamp>2023-10-05T01:01:26Z</datestamp>
        <setSpec>216:325</setSpec>
        <setSpec>46</setSpec>
      </header>
      <metadata>
        <oai_dc:dc xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
          <dc:title>Self-generation of reward by logarithmic transformation of multiple sensor evaluations</dc:title>
          <dc:creator>Ono, Yuya</dc:creator>
          <dc:creator>小野, 裕也</dc:creator>
          <dc:creator>Kurashige, Kentarou</dc:creator>
          <dc:creator>倉重, 健太郎</dc:creator>
          <dc:creator>Hakim Afiqe Anuar Bin Muhammad Nor</dc:creator>
          <dc:creator>Sakamoto, Yuma</dc:creator>
          <dc:creator>坂本, 悠真</dc:creator>
          <dc:subject>Self-Generation of Reward</dc:subject>
          <dc:subject>Reinforcement learning</dc:subject>
          <dc:subject>Danger recognition</dc:subject>
          <dc:description>Although the design of the reward function in reinforcement learning is important, it is difficult to design one that can adapt to a variety of environments and tasks. Therefore, we propose a method to autonomously generate rewards from sensor values, enabling task- and environment-independent reward design. Under this approach, environmental hazards are recognized by evaluating sensor values. The evaluation used for learning is obtained by integrating all the sensor evaluations that indicate danger. Although prior studies have employed weighted averages to integrate sensor evaluations, this approach does not reflect the increased danger arising from a greater number of sensor evaluations indicating danger. Instead, we propose integrating sensor evaluations using a logarithmic transformation. Through a path learning experiment, the proposed method was evaluated by comparing its rewards to those obtained from manual reward setting and prior approaches.</dc:description>
          <dc:description>journal article</dc:description>
          <dc:publisher>Springer Nature</dc:publisher>
          <dc:date>2023</dc:date>
          <dc:type>AM</dc:type>
          <dc:format>application/pdf</dc:format>
          <dc:identifier>Artificial Life and Robotics</dc:identifier>
          <dc:identifier>2</dc:identifier>
          <dc:identifier>28</dc:identifier>
          <dc:identifier>287</dc:identifier>
          <dc:identifier>294</dc:identifier>
          <dc:identifier>1433-5298</dc:identifier>
          <dc:identifier>https://muroran-it.repo.nii.ac.jp/record/2000059/files/camera_ready.pdf</dc:identifier>
          <dc:identifier>http://hdl.handle.net/10258/0002000059</dc:identifier>
          <dc:identifier>https://muroran-it.repo.nii.ac.jp/records/2000059</dc:identifier>
          <dc:language>eng</dc:language>
          <dc:relation>10.1007/s10015-023-00855-1</dc:relation>
          <dc:rights>© International Society of Artificial Life and Robotics (ISAROB) 2023</dc:rights>
        </oai_dc:dc>
      </metadata>
    </record>
  </GetRecord>
</OAI-PMH>
