{"id":32,"date":"2023-12-12T21:46:56","date_gmt":"2023-12-12T21:46:56","guid":{"rendered":"https:\/\/sites.tntech.edu\/lcasl\/?page_id=32"},"modified":"2023-12-13T17:45:21","modified_gmt":"2023-12-13T17:45:21","slug":"stochastic-methods","status":"publish","type":"page","link":"https:\/\/sites.tntech.edu\/lcasl\/stochastic-methods\/","title":{"rendered":"Stochastic Methods"},"content":{"rendered":"\n<div style=\"height:var(--wp--preset--spacing--50)\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<div class=\"wp-block-group alignwide has-global-padding is-layout-constrained wp-block-group-is-layout-constrained\">\n<h3 class=\"wp-block-heading alignwide has-text-align-center has-xx-large-font-size\" style=\"line-height:1.2\">Adaptive Risk Sensitive Path Integral for Model Predictive Control via Reinforcement Learning<\/h3>\n<\/div>\n\n\n\n<div class=\"wp-block-group alignfull has-global-padding is-layout-constrained wp-container-core-group-is-layout-2388926a wp-block-group-is-layout-constrained\" style=\"padding-top:var(--wp--preset--spacing--50);padding-right:var(--wp--preset--spacing--50);padding-bottom:var(--wp--preset--spacing--50);padding-left:var(--wp--preset--spacing--50)\">\n<div class=\"wp-block-columns alignwide is-layout-flex wp-container-core-columns-is-layout-694ead9e wp-block-columns-is-layout-flex\" style=\"margin-top:0;margin-bottom:0\">\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\">\n<p class=\"wp-block-paragraph\">We propose a reinforcement learning framework where an agent uses an internal nominal model for stochastic model predictive control (MPC) while compensating for a disturbance. Our work builds on the existing risk-aware optimal control with stochastic differential equations (SDEs) that aims to deal with such disturbances. However, the risk sensitivity and the noise strength of the nominal SDE in risk-aware optimal control are often chosen heuristically. 
In the proposed framework, the risk-taking policy determines whether the MPC behaves in a risk-seeking (exploration) or risk-averse (exploitation) manner. Specifically, we employ risk-aware path integral control, which can be implemented as Monte Carlo (MC) sampling with fast parallel simulations on a GPU. MC sampling implementations of MPC have been successful in robotic applications due to their real-time computation capability. The proposed framework, which adapts the noise model and the risk sensitivity, outperforms the standard model predictive path integral control in simulation environments that have disturbances.<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"672\" height=\"288\" src=\"https:\/\/sites.tntech.edu\/lcasl\/wp-content\/uploads\/sites\/163\/2023\/12\/diagramrisksensitivepathrl_v2.png\" alt=\"\" class=\"wp-image-39\" style=\"width:1111px;height:auto\" srcset=\"https:\/\/sites.tntech.edu\/lcasl\/wp-content\/uploads\/sites\/163\/2023\/12\/diagramrisksensitivepathrl_v2.png 672w, https:\/\/sites.tntech.edu\/lcasl\/wp-content\/uploads\/sites\/163\/2023\/12\/diagramrisksensitivepathrl_v2-300x129.png 300w\" sizes=\"auto, (max-width: 672px) 100vw, 672px\" \/><\/figure>\n\n\n\n<h3 class=\"wp-block-heading has-text-align-center\">UAV Simulation with Double Integrator Kinematics<\/h3>\n\n\n\n<figure class=\"wp-block-video\"><video height=\"720\" style=\"aspect-ratio: 720 \/ 720;\" width=\"720\" controls src=\"https:\/\/sites.tntech.edu\/lcasl\/wp-content\/uploads\/sites\/163\/2023\/12\/Media1.mp4\"><\/video><\/figure>\n\n\n\n<h3 class=\"wp-block-heading has-text-align-center\">UGV Simulation with Bicycle Kinematics<\/h3>\n\n\n\n<figure class=\"wp-block-video\"><video height=\"720\" style=\"aspect-ratio: 720 \/ 720;\" width=\"720\" controls src=\"https:\/\/sites.tntech.edu\/lcasl\/wp-content\/uploads\/sites\/163\/2023\/12\/Media2.mp4\"><\/video><\/figure>\n\n\n\n<p 
class=\"wp-block-paragraph\"><strong>Relevant Papers:<\/strong><\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Hyung-Jin Yoon, Chuyuan Tao, Hunmin Kim, Naira Hovakimyan, Petros Voulgaris, \u201cSampling Complexity of Path Integral Methods for Trajectory Optimization,\u201d IEEE American Control Conference (2022 ACC)<\/li>\n\n\n\n<li>Chuyuan Tao, Hunmin Kim, Hyung-Jin Yoon, Naira Hovakimyan, Petros Voulgaris, \u201cControl Barrier Function Augmentation in Sampling-based Control Algorithm for Sample Efficiency,\u201d IEEE American Control Conference (2022 ACC)<\/li>\n\n\n\n<li>Chuyuan Tao, Hyung-Jin Yoon, Hunmin Kim, Naira Hovakimyan, Petros Voulgaris, \u201cPath Integral Methods with Stochastic Control Barrier Functions,\u201d IEEE Conference on Decision and Control (2022 CDC)<\/li>\n\n\n\n<li>Hyung-Jin Yoon et al., \u201cAdaptive Risk Sensitive Path Integral for Model Predictive Control via Reinforcement Learning,\u201d 31st Mediterranean Conference on Control and Automation (2023 MED), IEEE<\/li>\n<\/ol>\n<\/div>\n<\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Adaptive Risk Sensitive Path Integral for Model Predictive Control via Reinforcement Learning We propose a reinforcement learning framework where an agent uses an internal nominal model for stochastic model predictive control (MPC) while compensating for a disturbance. 
Our work builds on the existing risk-aware optimal control with stochastic differential equations (SDEs) that aims to deal [&hellip;]<\/p>\n","protected":false},"author":184,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-32","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/sites.tntech.edu\/lcasl\/wp-json\/wp\/v2\/pages\/32","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/sites.tntech.edu\/lcasl\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/sites.tntech.edu\/lcasl\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/sites.tntech.edu\/lcasl\/wp-json\/wp\/v2\/users\/184"}],"replies":[{"embeddable":true,"href":"https:\/\/sites.tntech.edu\/lcasl\/wp-json\/wp\/v2\/comments?post=32"}],"version-history":[{"count":8,"href":"https:\/\/sites.tntech.edu\/lcasl\/wp-json\/wp\/v2\/pages\/32\/revisions"}],"predecessor-version":[{"id":77,"href":"https:\/\/sites.tntech.edu\/lcasl\/wp-json\/wp\/v2\/pages\/32\/revisions\/77"}],"wp:attachment":[{"href":"https:\/\/sites.tntech.edu\/lcasl\/wp-json\/wp\/v2\/media?parent=32"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}