Title Enhancing Task Planning Efficiency of Reinforcement Learning Agents through LLM-based Action Masking
Authors 조준형(Junhyung Cho) ; 정소이(Soyi Jung)
Page pp.120-130
ISSN 2287-5026
Keywords Reinforcement learning; Large language model; Action masking; Task planning; Hierarchical learning
Abstract Reinforcement learning (RL) agents explore inefficiently in complex sequential task planning because of large action spaces. This paper proposes a large language model (LLM)-based action masking technique to address this problem. The proposed method adopts a bifurcated training-validation structure: a robust policy is acquired without masking during training, and phase-based action masks, generated by an LLM that analyzes the target geometry, are applied during validation. Experimental results on autonomous excavation control demonstrate that the proposed method achieves a 16.9% improvement in success rate and a 38.6% improvement in spatial accuracy compared to baseline RL without masking, and a 10.5% improvement in success rate and a 31.2% improvement in spatial accuracy compared to rule-based masking. Notably, the advantage of the proposed method becomes more pronounced as the target area size increases, demonstrating that the LLM's adaptive geometric decomposition and dynamic action masking effectively handle increasing complexity. This study shows that integrating LLM reasoning capabilities into the RL exploration process can simultaneously enhance learning efficiency and performance in complex sequential task planning problems.
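The validation-time masking described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the action names, the phase names, and the hard-coded `PHASE_ALLOWED` table are all assumptions made for illustration (in the paper the masks are generated by an LLM from the target geometry), and `phase_mask` and `select_action` are hypothetical helpers. The sketch shows only the general idea that a policy trained without masks can be constrained at validation time by excluding disallowed actions before the greedy choice.

```python
import math

# Hypothetical discrete action set for an excavator-like agent (assumption).
ACTIONS = ["move_forward", "move_backward", "turn_left", "turn_right", "dig", "dump"]

# Hypothetical phase -> allowed-actions table (assumption). In the paper the
# masks come from an LLM analyzing target geometry; here they are hard-coded.
PHASE_ALLOWED = {
    "approach": {"move_forward", "turn_left", "turn_right"},
    "excavate": {"dig", "move_forward", "move_backward"},
    "unload": {"dump", "turn_left", "turn_right"},
}


def phase_mask(phase):
    """Return a boolean mask over ACTIONS for the given phase."""
    allowed = PHASE_ALLOWED[phase]
    return [action in allowed for action in ACTIONS]


def select_action(scores, mask=None):
    """Greedy action choice; masked-out actions are scored as -inf.

    With mask=None this is the unmasked policy (as used during training).
    """
    if mask is None:
        mask = [True] * len(scores)
    if len(mask) != len(scores):
        raise ValueError("mask and scores must have the same length")
    if not any(mask):
        raise ValueError("mask excludes every action")
    masked = [s if ok else -math.inf for s, ok in zip(scores, mask)]
    return max(range(len(masked)), key=masked.__getitem__)


# Example policy scores (made-up numbers, one per action in ACTIONS).
scores = [0.2, 0.1, 0.4, 0.3, 2.5, 1.0]

unmasked_choice = ACTIONS[select_action(scores)]
masked_choice = ACTIONS[select_action(scores, phase_mask("approach"))]
```

In this example the unmasked policy prefers `dig`, while the `approach` mask restricts the choice to movement actions, so the highest-scoring allowed action is taken instead. Replacing scores with negative infinity rather than removing entries keeps action indices aligned with the policy's output.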