動的な階層環境における強化学習エージェントの確率知識を用いた方策改善に関する研究 (A Study on Policy Improvement Using Probabilistic Knowledge for Reinforcement-Learning Agents in Dynamic Hierarchical Environments)

https://doi.org/10.15118/00005125
de442983-c8aa-4138-985d-3e296000a32f
Files
  • A369.pdf (3.3 MB)
  • A369_summary.pdf (473.4 kB)
Item type Thesis or Dissertation (1)
Date of issue 2015-06-11
Title
Title 動的な階層環境における強化学習エージェントの確率知識を用いた方策改善に関する研究
Language ja
Language
Language jpn
Resource type
Resource type identifier http://purl.org/coar/resource_type/c_db06
Resource type doctoral thesis
ID registration
ID registration 10.15118/00005125
ID registration type JaLC
Access rights
Access rights open access
Access rights URI http://purl.org/coar/access_right/c_abf2
Author ポッマサク, ウタイ
ja ポッマサク, ウタイ
en PHOMMASAK, UTHAI
Abstract
Description type Abstract
Description With the increasing use of rescue robots in disasters such as earthquakes and tsunamis, there is an urgent need to develop robotics software that can learn and adapt to any environment. Reinforcement learning (RL), a field of machine learning within computer science, is often used in developing such software; many RL methods have been proposed recently and applied to a variety of problems in which agents learn policies that maximize the cumulative reward determined according to specific rules. In the process whereby agents obtain rewards, data consisting of state-action pairs are generated, and the agents' policies are improved effectively by a supervised-learning mechanism that uses a sequential expression of the stored data series and rewards. Typically, RL agents must initialize their policies when placed in a new environment, and the learning process starts afresh each time. Effective adjustment to an unknown environment becomes possible with statistical methods, such as a Bayesian network model, a mixture probability, or a clustering distribution, built from observational data on the multiple environments that the agents have already learned. However, adapting to environmental change, such as an unknown environment, remains challenging. For example, setting appropriate experimental parameters, including the number of input states and output actions, becomes difficult in complicated real environments, which makes it hard for an agent to learn a policy. Furthermore, using a mixture of Bayesian network models increases the system's calculation time, and limited processing resources make it necessary to control computational complexity. The goal of this research is to create an efficient and practical RL system that adapts to unknown and complex environments, such as dynamically moving environments and multi-layer environments.
In addition, the proposed method attempts to control computational complexity while retaining system performance. In this study, a modified profit-sharing method with new parameters, such as a changing reward value, is proposed. A weight-update system and a change in the dimension of the episode data make it possible to work in dynamically moving multi-layer environments. A mixture probability, formed by integrating the observational data of environments that an agent has learned within an RL framework, is introduced; this provides the agent with initial knowledge and enables efficient adjustment to a changing environment. A clustering method that selects fewer elements has also been implemented, which reduces computational complexity significantly while retaining system performance. Through this statistical-model approach, an RL system with a utility algorithm that can adapt to unknown multi-layer environments is realized.
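The profit-sharing scheme the abstract builds on can be sketched as follows. This is a minimal illustration of standard profit-sharing with a geometric credit-assignment function, not the thesis's modified method (the changing reward value and weight-update system it proposes are not reproduced here); all names (`profit_sharing_update`, `decay`, the toy states) are hypothetical.

```python
import random
from collections import defaultdict

def profit_sharing_update(weights, episode, reward, decay=0.5):
    """Standard profit-sharing credit assignment (a sketch): every
    state-action pair on the episode is reinforced by the final reward,
    discounted geometrically so that pairs closer to the reward
    receive more credit."""
    credit = reward
    for state, action in reversed(episode):
        weights[(state, action)] += credit
        credit *= decay  # geometric credit-assignment function
    return weights

def select_action(weights, state, actions):
    """Choose an action with probability proportional to its weight
    (roulette selection over positive weights)."""
    w = [max(weights[(state, a)], 1e-6) for a in actions]
    r = random.uniform(0, sum(w))
    acc = 0.0
    for a, wa in zip(actions, w):
        acc += wa
        if r <= acc:
            return a
    return actions[-1]

# Toy usage: one episode that ended with a reward of 10.
weights = defaultdict(float)
episode = [("s0", "right"), ("s1", "right"), ("s2", "up")]
profit_sharing_update(weights, episode, reward=10.0)
print(weights[("s2", "up")])     # last pair gets the full reward: 10.0
print(weights[("s0", "right")])  # earlier pairs get decayed credit: 2.5
```

Because credit decays with distance from the reward, repeated episodes bias the roulette selection toward action sequences that reliably reach rewards, without requiring an explicit value function.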
Language ja
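The mixture probability mentioned in the abstract, combining models of previously learned environments to give an agent initial knowledge in a new one, can be illustrated roughly as below. The Bayesian re-weighting shown is a generic construction assumed for illustration, not the thesis's exact formulation; the two hand-made transition models are hypothetical.

```python
def mixture_probability(models, weights, state, action, next_state):
    """P_mix(s'|s,a) = sum_i w_i * P_i(s'|s,a): blend transition models
    learned in earlier environments, weighted by how well each model
    explains the new environment so far."""
    return sum(w * m(state, action, next_state)
               for m, w in zip(models, weights))

def update_weights(models, weights, observation):
    """Bayesian re-weighting after observing one transition (s, a, s'):
    models that predicted the observation well gain weight."""
    s, a, s2 = observation
    posterior = [w * m(s, a, s2) for m, w in zip(models, weights)]
    total = sum(posterior) or 1.0  # avoid division by zero
    return [p / total for p in posterior]

# Toy usage with two hand-made transition models of a tiny world.
model_a = lambda s, a, s2: 0.9 if s2 == "goal" else 0.1
model_b = lambda s, a, s2: 0.2 if s2 == "goal" else 0.8
weights = [0.5, 0.5]  # uniform prior over the learned models
weights = update_weights([model_a, model_b], weights, ("s0", "go", "goal"))
# model_a explained the observation better, so its weight grows:
print(round(weights[0], 3))  # prints 0.818
p = mixture_probability([model_a, model_b], weights, "s0", "go", "goal")
print(round(p, 3))  # blended prediction, now dominated by model_a
```

In this spirit, an agent entering an unknown environment can start from the blended prediction instead of a blank policy, and the weights shift toward whichever previously learned environment the new one most resembles.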
Degree-granting institution
Institution identifier scheme kakenhi
Institution identifier 10103
Institution name 室蘭工業大学
Language ja
Institution name Muroran Institute of Technology
Language en
Degree name
Degree name 博士(工学) (Doctor of Engineering)
Language ja
Degree type
Language ja
Value 課程博士 (course doctorate)
Degree conferral number
Degree conferral number 甲第369号
Report number
Language ja
Value 甲第369号
Diploma number
Language ja
Value 博甲第369号
Date of degree conferral
Date of degree conferral 2015-03-23
Nippon Decimal Classification
Subject scheme NDC
Subject 548
Author version flag
Publication type VoR
Publication type resource http://purl.org/coar/version/c_970fb48d4fbd8a85
Format
Description type Other
Description application/pdf
Versions

Ver.1 2023-06-19 11:18:26.145472

Powered by WEKO3