A Proposed Priority Pushing and Grasping Strategy Based on an Improved Actor-Critic Algorithm
Abstract
Pushing and grasping are among a robot's most basic skills. In cluttered scenes, pushing makes room for the arm and fingers to grasp objects. We propose a modified Actor-Critic (A-C) framework for deep reinforcement learning, Cross-entropy Softmax A-C (CSAC), and combine it with Prioritized Experience Replay (PER). Building on the theoretical foundation and main methods of deep reinforcement learning, the approach combines the advantages of algorithms based on value functions and on policy gradients. The grasping model is trained using self-supervised learning to achieve an end-to-end mapping from images to pushing and grasping actions. The algorithm framework is divided into a vision module and an action module. The prioritized experience replay is further improved to increase the sample diversity and exploration performance of the CSAC-PER algorithm during robot grasping training: the experience replay buffer is sampled dynamically using a prior beta distribution, and a dynamic sampling algorithm based on the beta distribution (CSAC-beta) is proposed on top of the CSAC algorithm. Simulation experiments show that, despite its low initial efficiency, the CSAC-beta algorithm eventually achieves good results and a higher grasping success rate (90%).
Keywords:
robotic manipulation / FCN / deep reinforcement learning / beta distribution
Source:
Electronics, 2022, 11, 13
Publisher:
- MDPI, Basel
Funding / projects:
- Provincial Natural Science Foundation [2108085ME166]
- Natural Science Research Project of Universities in Anhui Province [KJ2021A0408]
- Open Project of China International Science and Technology Cooperation Base on Intelligent Equipment Manufacturing in Special Service Environment [ISTC2021KF08
DOI: 10.3390/electronics11132065
ISSN: 2079-9292
WoS: 000825683600001
Scopus: 2-s2.0-85133135093
Collections
Institution/group
Mašinski fakultet
TY  - JOUR
AU  - You, Tianya
AU  - Wu, Hao
AU  - Xu, Xiangrong
AU  - Petrović, Petar
AU  - Rodić, Aleksandar
PY  - 2022
UR  - https://machinery.mas.bg.ac.rs/handle/123456789/3725
PB  - MDPI, Basel
T2  - Electronics
T1  - A Proposed Priority Pushing and Grasping Strategy Based on an Improved Actor-Critic Algorithm
IS  - 13
VL  - 11
DO  - 10.3390/electronics11132065
ER  -
You, T., Wu, H., Xu, X., Petrović, P., & Rodić, A. (2022). A Proposed Priority Pushing and Grasping Strategy Based on an Improved Actor-Critic Algorithm. Electronics, 11(13). MDPI, Basel. https://doi.org/10.3390/electronics11132065
You T, Wu H, Xu X, Petrović P, Rodić A. A Proposed Priority Pushing and Grasping Strategy Based on an Improved Actor-Critic Algorithm. Electronics. 2022;11(13). doi:10.3390/electronics11132065
You, Tianya, Wu, Hao, Xu, Xiangrong, Petrović, Petar, Rodić, Aleksandar. "A Proposed Priority Pushing and Grasping Strategy Based on an Improved Actor-Critic Algorithm." Electronics 11, no. 13 (2022). https://doi.org/10.3390/electronics11132065