Title Design and Implementation of an IREE Bytecode Interpreter on RISC-V SoCs for Efficient AI Inference
Authors 박상철(Sangcheol Park) ; 강진구(Jin-Ku Kang) ; 김용우(Yongwoo Kim)
DOI https://doi.org/10.5573/ieie.2026.63.4.3
Page pp.3-11
ISSN 2287-5026
Keywords IREE; Interpreter; RISC-V; AI inference
Abstract Machine learning inference in small embedded environments commonly relies on Ahead-of-Time (AOT) compilers such as TensorFlow Lite Micro, TVM (Tensor Virtual Machine), and XLA (Accelerated Linear Algebra) to reduce interpretation overhead. However, these approaches require full firmware recompilation for every model update and incur storage redundancy and maintenance overhead when multiple models are used. FPGA overlays such as TVM+VTA (Versatile Tensor Accelerator) improve throughput but demand additional resources and bitstream re-synthesis, limiting their suitability for lightweight devices. To overcome these limitations, we propose a lightweight interpreter that directly executes bytecode generated by the IREE (Intermediate Representation Execution Environment) compiler. The interpreter itself remains unchanged; only the model data and its initialization routine are replaced, enabling model substitution without code modification. On a 32-bit RISC-V-based Rocket-SoC with 256 KB of memory, the proposed method achieved an approximately 21.14× performance improvement on MNIST classification with μkernel optimization relative to non-optimized execution. On x86_64, the deployment size was 6.39× and 16.52× smaller than the official IREE and TVM runtimes, respectively. FPGA synthesis also showed lower LUT, FF, and memory usage than TVM+VTA, with no bitstream re-synthesis required for model changes. In conclusion, the proposed architecture addresses the limitations of static approaches, offering advantages in model-replacement flexibility, deployment efficiency, and resource savings.