Keywords |
TRPO; PPO; PID; Building energy; HVAC control |
Abstract |
In conventional HVAC systems, the PID gains (Kp, Ki, Kd) are set by experts or by classical PID tuning methods that approximate the system as linear. The aim of tuning is to find gain values that improve thermal comfort in buildings while reducing energy consumption. However, linear system estimation has limitations: control response times vary, climate conditions are diverse, and occupant behavior is difficult to predict, all of which depend on building characteristics. In this study, we propose a method for finding the PID gains (Kp, Ki, Kd) of HVAC systems using two reinforcement learning algorithms, Trust Region Policy Optimization (TRPO) and Proximal Policy Optimization (PPO). With this approach, we achieved significant energy savings compared to conventional methods, and we plan to further validate the method across various building types and climatic conditions. |
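To make the tuning problem concrete, the following is a minimal sketch of how a candidate gain triple (Kp, Ki, Kd) could be scored by a reward that trades thermal comfort against energy use. It is not the environment or reward used in this study: the first-order zone model, its constants, and the names `PID`, `evaluate_gains`, and `reward` are all hypothetical, and the TRPO/PPO agent that would propose the gains is not shown.

```python
from dataclasses import dataclass


@dataclass
class PID:
    """Discrete PID controller with gains Kp, Ki, Kd (no anti-windup)."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: float = 0.0

    def step(self, error: float, dt: float) -> float:
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def evaluate_gains(kp, ki, kd, setpoint=22.0, t_out=5.0, steps=600, dt=60.0):
    """Simulate one episode on a toy zone and return (comfort_cost, energy).

    Assumed model (illustrative only): dT/dt = (t_out - T + gain * u) / tau,
    where u in [0, 1] is the normalized heating power.
    """
    pid = PID(kp, ki, kd)
    temp = 15.0     # initial zone temperature [deg C], hypothetical
    tau = 3600.0    # thermal time constant [s], hypothetical
    gain = 30.0     # temperature rise at full power [K], hypothetical
    comfort_cost = 0.0
    energy = 0.0
    for _ in range(steps):
        error = setpoint - temp
        u = min(max(pid.step(error, dt), 0.0), 1.0)  # clip actuator to [0, 1]
        temp += dt * (t_out - temp + gain * u) / tau
        comfort_cost += abs(setpoint - temp) * dt    # integrated absolute error
        energy += u * dt                             # integrated heating power
    return comfort_cost, energy


def reward(kp, ki, kd, energy_weight=0.1):
    """Scalar score an RL agent could maximize; the weighting is an assumption."""
    comfort_cost, energy = evaluate_gains(kp, ki, kd)
    return -(comfort_cost + energy_weight * energy)
```

In a setup of this kind, the agent's action would be the gain triple (or an adjustment to it) and `reward` would be the episode return, so gains that hold the setpoint with less heating score higher than gains that leave the zone cold or overheat it.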