| Title |
Design and Implementation of an IREE Compiler based RISC-V SoC Architecture for On-device AI Inference |
| Authors |
박수환(SuHwan Park) ; 강진구(Jin-Ku Kang) ; 김용우(Yongwoo Kim) |
| DOI |
https://doi.org/10.5573/ieie.2026.63.4.12 |
| Keywords |
On-device AI inference; Interpreter; IREE compiler; SoC design; FPGA |
| Abstract |
As demand for on-device inference grows, compiler-based SoC designs capable of running diverse models on lightweight hardware are attracting increasing attention. However, prior studies based on ONNX-MLIR and NEST-C require hardware-specific code generation and tend to be optimized for particular network families, limiting generality and maintainability. In this work, we design a RISC-V-based SoC that directly interprets and executes VM bytecode generated by the IREE compiler. The proposed SoC runs a C-implemented interpreter on the Rocket Core and leverages a common VM bytecode format with ukernel invocation, thereby accommodating compiler extensions and new operators without hardware redesign. The design also minimizes on-chip RAM while using host memory for large data, satisfying edge resource constraints. Performance was evaluated on a Xilinx Zynq ZC706 FPGA with MUL, MMT, and MNIST models, focusing on how the Rocket Core data cache capacity affects inference time. Compared to the 1 KB cache configuration, we observe up to a 28.8% speedup, while the benefit of larger capacities saturates depending on data size and access patterns. These results validate the feasibility of an IREE-based compiler-hardware integration and demonstrate that memory and cache hierarchies are decisive factors for on-device inference performance.
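The compilation flow implied by the abstract can be sketched with the standard IREE toolchain: a model lowered to MLIR is compiled into a portable `.vmfb` VM bytecode module, which a host-side interpreter then loads and executes. This is a minimal, hedged sketch assuming a recent IREE release; the `vmvx` backend (IREE's portable microkernel/ukernel target) matches the paper's interpreter-plus-ukernel execution style, but the exact flags and backend the authors used are not stated in the abstract.

```shell
# Hedged sketch of an IREE compile step (flags may vary by IREE version).
# model.mlir is a placeholder input; the paper's actual models (MUL, MMT,
# MNIST) and import path are not specified in the abstract.
iree-compile model.mlir \
  --iree-hal-target-backends=vmvx \
  -o model.vmfb
# The resulting model.vmfb contains target-independent VM bytecode that a
# C interpreter (e.g. one built on IREE's VM runtime) can execute on the
# Rocket Core, dispatching compute-heavy ops to ukernels.
```

Because the bytecode format is target-independent, supporting a new operator only requires updating the compiler and interpreter, not the hardware, which is the portability argument the abstract makes against ONNX-MLIR- and NEST-C-style hardware-specific code generation.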