Title Design and Implementation of an IREE Compiler based RISC-V SoC Architecture for On-device AI Inference
Authors 박수환(SuHwan Park) ; 강진구(Jin-Ku Kang) ; 김용우(Yongwoo Kim)
DOI https://doi.org/10.5573/ieie.2026.63.4.12
Page pp.12-21
ISSN 2287-5026
Keywords On-device AI inference; Interpreter; IREE compiler; SoC design; FPGA
Abstract As demand for on-device inference grows, compiler-based SoC designs capable of running diverse models on lightweight hardware are attracting increasing attention. However, prior studies based on ONNX-MLIR and NEST-C require hardware-specific code generation and tend to be optimized for particular network families, limiting generality and maintainability. In this work, we design a RISC-V-based SoC that directly interprets and executes VM Bytecode generated by the IREE compiler. The proposed SoC runs a C-implemented interpreter on the Rocket Core and leverages a common VM Bytecode format with ukernel invocation, thereby accommodating compiler extensions and new operators without hardware redesign. The design also minimizes on-chip RAM while using host memory for large data, satisfying edge resource constraints. Performance was evaluated on a Xilinx Zynq ZC706 FPGA with MUL, MMT, and MNIST models, focusing on how the Rocket Core data cache capacity affects inference time. Compared to the 1 KB cache configuration, we observe up to a 28.8% speedup, while the benefit of larger capacities saturates depending on data size and access patterns. These results validate the feasibility of IREE-based compiler-hardware integration and demonstrate that memory and cache hierarchies are decisive factors for on-device inference performance.