Precision Matching Optimization of Optical Metrology and Inspection Equipment for Yield Enhancement in Semiconductor Manufacturing

https://doi.org/10.5573/JSTS.2025.25.6.623

(Hyoseop Shin) ; (Hojun Lee) ; (Dongkun Shin)

As semiconductor manufacturing continues to scale down, the precision and reliability of high-resolution optical inspection equipment have become critical to maintaining process yield. However, variations in hardware sensitivity and parameter settings among identical tools often lead to inconsistencies in defect detection, undermining the stability of process control. This study proposes an automated hardware matching framework that combines a structured calibration sample fabricated using a Particle Deposition System (PDS) with a residual matrix-based linear interpolation algorithm. The proposed method enables automatic alignment of sensitivity responses across tools by quantifying detection discrepancies and deriving optimal settings for laser power and time delay integration (TDI) gain. Applied to 191 optical inspection tools in a Samsung Electronics high-volume manufacturing (HVM) line, the approach achieved a sensitivity matching accuracy within ±2% and reduced preventive maintenance (PM) time by 45%. Deployed across 191 IS4100 patterned-wafer optical inspectors operating at after-develop (ADI), after-etch (AEI), and post-CMP inspection (PCI) checkpoints, the matched fleet provided consistent in-line gating of excursions, aligning with the observed 0.41-pp improvement in production yield. This framework demonstrates a significant advancement over manual calibration approaches and validates its potential as a core enabling technology for Smart Fab and Unattended Fab environments.
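
The abstract does not spell out the residual matrix-based interpolation, so the following is only a minimal one-dimensional sketch of the general idea: measure the sensitivity residual (tool response minus reference response) at a few settings, then linearly interpolate to the setting where the residual would be zero. The function name, the single-parameter simplification, and the sample numbers are assumptions for illustration, not the authors' algorithm.

```python
from typing import Sequence


def interpolate_zero_residual(settings: Sequence[float],
                              residuals: Sequence[float]) -> float:
    """Return the setting at which the piecewise-linear residual crosses zero.

    settings  -- hypothetical tool settings (e.g. laser power values)
    residuals -- measured response minus reference response at each setting
    """
    if len(settings) != len(residuals) or len(settings) < 2:
        raise ValueError("need at least two (setting, residual) pairs")
    pairs = sorted(zip(settings, residuals))
    for (x0, r0), (x1, r1) in zip(pairs, pairs[1:]):
        if r0 == 0:
            return x0
        if r0 * r1 < 0:
            # Linear interpolation between the two bracketing points.
            return x0 + (x1 - x0) * (0 - r0) / (r1 - r0)
    if pairs[-1][1] == 0:
        return pairs[-1][0]
    raise ValueError("residual does not cross zero in the measured range")


# Hypothetical example: residual is -2 at setting 1.0 and +2 at setting 3.0.
matched_setting = interpolate_zero_residual([1.0, 3.0], [-2.0, 2.0])
```

A two-parameter version (laser power and TDI gain together) would apply the same idea over a grid of residuals rather than a single row.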

An Error-aware 4-2 Compressor Design for Balanced Accuracy and Efficiency in Approximate Multipliers

https://doi.org/10.5573/JSTS.2025.25.6.633

(Dongju Kim) ; (Yongtae Kim)

This paper presents an error-aware approximate 4-2 compressor design that enables a balanced trade-off between hardware efficiency and computational accuracy in approximate multipliers. The proposed design consists of two key components: a logic-simplified approximate compressor and a selective error compensation mechanism. The first component refines the Boolean expressions of an existing approximate 4-2 compressor to reduce area and power consumption. The second component introduces a lightweight correction circuit that targets high-probability error patterns, effectively minimizing the overall error distance. When implemented in a 32-nm CMOS technology, experimental results show that the proposed designs reduce area and power by up to 15% compared to existing approximate equivalents while achieving competitive accuracy metrics. The proposed multipliers are also applied to image processing and deep learning tasks to validate their practical benefits in error-resilient computing scenarios.
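
To illustrate what "error distance" and "high-probability error patterns" mean for an approximate 4-2 compressor, the sketch below evaluates a hypothetical saturating compressor, not the design proposed in the paper. It assumes the approximate cell produces only two output bits, so an input of four ones (value 4) is reported as 3. The input-bit probability `p_one` is also an assumption: 0.5 models uniform inputs, while 0.25 models partial-product bits formed by AND-ing two uniformly random operand bits.

```python
from itertools import product


def exact_value(bits):
    """Exact count of ones among the four compressor inputs."""
    return sum(bits)


def approx_value(bits):
    """Hypothetical approximation: two output bits, so the result saturates at 3."""
    return min(sum(bits), 3)


def mean_error_distance(p_one: float = 0.5) -> float:
    """Expected |exact - approx| when each input bit is 1 with probability p_one."""
    med = 0.0
    for bits in product((0, 1), repeat=4):
        prob = 1.0
        for b in bits:
            prob *= p_one if b else (1.0 - p_one)
        med += prob * abs(exact_value(bits) - approx_value(bits))
    return med


med_uniform = mean_error_distance(0.5)    # only input 1111 is wrong
med_partial = mean_error_distance(0.25)   # partial-product-like input statistics
```

Weighting each input pattern by its probability is what lets a designer decide which error patterns are worth a correction circuit.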

Real-time TSV 3D Shape Defect Inspection System Using Deep Learning based Fast Object Detection

https://doi.org/10.5573/JSTS.2025.25.6.645

(Kyeong Beom Park) ; (Jae Yeol Lee) ; (Harim Lee)

Conventional inspection methods for Through-Silicon Via (TSV) 3D shape defect detection, such as Scanning Electron Microscopy (SEM) and X-ray inspection, have been widely used due to their precision in structural analysis. However, these methods suffer from significant limitations including high equipment cost, long inspection time, and the inability to operate in real time. Moreover, SEM is inherently a destructive technique, while X-ray imaging lacks sufficient resolution to detect nanoscale shape anomalies or polymer residues. These drawbacks hinder the implementation of fast and scalable inspection systems, which are increasingly demanded in modern semiconductor manufacturing, especially for high-density 3D integration. Therefore, a new approach is urgently required: one that ensures high detection accuracy while also being non-destructive, fast, and suitable for real-time inspection in practical production environments. In this paper, we develop a real-time TSV 3D shape defect inspection system implemented with a deep learning-based object detection method. For the real-time operation of the proposed system, YOLOv8 and YOLOv10 are utilized because the YOLO family of networks can guarantee fast inference performance as well as excellent detection performance. YOLOv8 and YOLOv10 have intrinsic differences in network architecture, such as the anchor-free detection structure, NMS-free training, and a dual-head structure, resulting in different inference and detection performance. Therefore, based on the performance comparison of the two networks, the appropriate model should be selected according to the specific need for either faster inference or higher detection accuracy. In addition, for more reliable training of object detection networks, we collect 3D point cloud data containing TSV normal and defective pattern data created on real 8-inch silicon wafers. By obtaining the datasets from real silicon wafers, we can ensure the reliability and the practical applicability of the trained network performance. For the performance comparison, we utilize several performance metrics, namely processing time, precision, F1 score, and Fβ score. Finally, extensive evaluations confirm that the YOLOv8-l model achieves the highest precision (0.99997) and F1 score (0.99989), while the YOLOv10-n model exhibits the fastest processing time (0.18601 seconds) and the highest Fβ score (0.92565).
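
The F1 and Fβ scores named above follow the standard definition in terms of precision and recall, sketched below. The abstract does not state which β was used or the recall values, so the inputs in the example are assumptions chosen only to show the arithmetic.

```python
def f_beta(precision: float, recall: float, beta: float = 1.0) -> float:
    """Standard F-beta score; beta > 1 weights recall more heavily than precision."""
    if not (0.0 <= precision <= 1.0 and 0.0 <= recall <= 1.0):
        raise ValueError("precision and recall must lie in [0, 1]")
    if beta <= 0:
        raise ValueError("beta must be positive")
    denom = beta ** 2 * precision + recall
    if denom == 0:
        return 0.0
    return (1 + beta ** 2) * precision * recall / denom


# Hypothetical detector: perfect precision, but half of the defects are missed.
f1 = f_beta(1.0, 0.5)            # beta = 1 gives the F1 score
f2 = f_beta(1.0, 0.5, beta=2.0)  # beta = 2 penalizes the missed defects more
```

With β above 1, a model that misses defects scores lower than its F1 would suggest, which is one reason a recall-weighted score can rank models differently from precision or F1.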

Coplanar Waveguide Sensor for Dimethyl Methyl Phosphonate (DMMP) Vapor Detection at Microwave Frequencies

https://doi.org/10.5573/JSTS.2025.25.6.654

(Zabdiel Brito-Brito) ; (Jorge A. I. Araujo) ; (Sung-min Sim) ; (Marcos T. de Melo) ; (Ignacio Llamas-Garro) ; (Jung-Mu Kim)

A palladium-based Coplanar Waveguide (CPW) microwave sensor for detecting dimethyl methyl phosphonate (DMMP) vapor is designed, fabricated, and tested in this paper. The sensor is fabricated using a thin palladium layer on a planar quartz substrate. Microwave transmission loss variations were used to identify the presence of DMMP vapor in a controlled environment. The presence of 400 parts per million (ppm) DMMP caused detectable changes in the transmission (S21) and admittance (Y11) parameters.

Hardware-software Co-design for Vector Similarity Search on HBM-PIM

https://doi.org/10.5573/JSTS.2025.25.6.662

(Nahyeon Kim) ; (Sujin Kim) ; (Min Jung) ; (Haechannuri Noh) ; (Ji-Hoon Kim)

Vector similarity search is a key component of Retrieval-Augmented Generation (RAG) for large language models (LLMs), requiring memory-intensive computations such as Manhattan distance, Euclidean distance, and cosine similarity. Processing-In-Memory (PIM) architectures offer a promising solution to accelerate these memory-bound operations by reducing data movement between memory and processor. This study presents a hardware-software co-design approach for optimizing distance computation on PIM. We first implemented and evaluated a vector similarity search application on a DRAM-based PIM platform using the developed computation library, achieving 44.2% and 59.0% speed improvements for Euclidean distance and cosine similarity, respectively, compared to the CPU. However, instruction set limitations led to performance bottlenecks despite software-level optimization. To address this, we utilized an HBM-based PIM simulator and proposed two new instructions, AMC and MAN, optimized for Euclidean and Manhattan distance computations. Evaluation using a simulator integrated with DRAMSim2 showed that the proposed instructions reduced the total cycle count for distance computations by up to 44% compared to the baseline, with performance gains increasing for larger input sizes. These results demonstrate that both software-level and instruction-level optimizations are essential to fully exploit the performance potential of PIM architectures for distance computation workloads.
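
For reference, the three distance and similarity measures named in the abstract can be written in plain Python as below. This is a CPU-side sketch of the mathematical definitions only; it does not model the PIM computation library or the proposed AMC and MAN instructions, and the function names are chosen here for illustration.

```python
import math
from typing import Sequence


def _check(a: Sequence[float], b: Sequence[float]) -> None:
    if len(a) != len(b) or not a:
        raise ValueError("vectors must be non-empty and of equal length")


def manhattan(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of absolute coordinate differences (L1 distance)."""
    _check(a, b)
    return sum(abs(x - y) for x, y in zip(a, b))


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Square root of the sum of squared differences (L2 distance)."""
    _check(a, b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product divided by the product of the vector norms."""
    _check(a, b)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Nearest stored vector to a query under Euclidean distance.
database = [[0.0, 0.0], [3.0, 4.0], [1.0, 1.0]]
query = [1.0, 2.0]
nearest = min(range(len(database)), key=lambda i: euclidean(query, database[i]))
```

Each measure reads every element of both vectors once and does very little arithmetic per element, which is why these operations are memory-bound.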

A 64-channel High-compliance Neural Stimulator IC in Standard CMOS with Sub-1nC Charge Balancing for Seizure Suppression

https://doi.org/10.5573/JSTS.2025.25.6.670

(Seokbeom Cheon) ; (Seungah Lee) ; (Byeongseol Kim) ; (Joonsung Bae)

We present a 64-channel implantable neural stimulator with sub-nC charge-balanced current stimulation for seizure suppression applications. The regulated cascode current driver achieves almost full VDD compliance voltage range with 98% supply voltage utilization (4.9 V from 5 V) in standard 0.18 μm CMOS technology, eliminating the need for expensive high-voltage processes. A passive charge balancing scheme using a bootstrapped switch with reduced on-resistance (19.56 Ω) maintains residual charge levels below 1 nC, well within the 15 nC safe limits, enabling reliable long-term operation. The stimulation parameters, including current pulse width (1 μs to 1023 μs), channel activation, stimulation frequency, and current amplitude (1 μA to 1.8 mA), are highly reconfigurable through a 10 MHz SPI interface, enabling real-time adaptive stimulation protocols. The hierarchical 8 + 3-bit DAC architecture provides superior current resolution compared to existing single DAC systems while maintaining compact area efficiency of 0.0125 mm² per channel. In-vivo animal experiments demonstrate effective seizure suppression, achieving seizure reduction within 14 seconds after 40 seconds of 5 Hz stimulation at 50 μA amplitude, validating the therapeutic efficacy of the proposed system. The fabricated IC in 0.18 μm standard CMOS process successfully combines high channel count, optimal compliance voltage utilization, enhanced safety margins, and in-vivo validation, making it suitable for practical implantable epilepsy treatment applications.

An Area-efficient Two-step Vernier Time-to-digital Converter with a Metastability-free Phase Detector for NAND Flash Memory Interfaces

https://doi.org/10.5573/JSTS.2025.25.6.679

(Dong-Ho Shin) ; (Jun-Ha Lee) ; (Kang Yoon Lee)

This paper presents a low-power time-to-digital converter (TDC) designed for duty-cycle correction in NAND Flash memory interfaces. The proposed 5.5-bit two-step Vernier TDC integrates coarse and fine delay stages to support a wide timing range with compact implementation. A charge elimination circuit is incorporated into the true single-phase clocked (TSPC) sampling register to mitigate hold-time metastability with minimal area overhead. In addition, a twist power-gating (TPG) technique is applied to the delay chains to reduce leakage current with minimal impact on delay and performance. These techniques are suitable for multi-die memory systems requiring low standby current and compact layout. The TDC is fabricated in a 28-nm FD-SOI process using 150-nm thick-oxide transistors to emulate NAND Flash interface conditions. Measurement results demonstrate a resolution of 3.64 ps at 100 MS/s and a power consumption of 0.9 mW. The core occupies an area of 0.0025 mm². The design achieves a balanced trade-off among resolution, power, and area, confirming its applicability to high-speed, low-power memory interfaces.
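
As background on the Vernier principle, the behavioral sketch below models a single Vernier delay-line stage pair: the START edge travels through a slow line, the STOP edge through a fast line, and the output code is the number of stages after which STOP catches up. The resolution is the difference between the two stage delays. The stage delays and stage count used here are hypothetical, and the model omits the coarse step, the TSPC sampling register, and the power gating described in the abstract.

```python
def vernier_code(t_in_ps: float, tau_slow_ps: float, tau_fast_ps: float,
                 n_stages: int) -> int:
    """Return the stage index at which the STOP edge catches the START edge.

    t_in_ps     -- time by which STOP lags START at the input
    tau_slow_ps -- per-stage delay of the START (slow) line
    tau_fast_ps -- per-stage delay of the STOP (fast) line
    n_stages    -- number of stages; the code saturates at this value
    """
    if tau_slow_ps <= tau_fast_ps:
        raise ValueError("the START line must be slower than the STOP line")
    resolution_ps = tau_slow_ps - tau_fast_ps
    for k in range(n_stages + 1):
        # After k stages the initial lag has shrunk by k * resolution.
        if t_in_ps - k * resolution_ps <= 0:
            return k
    return n_stages


# Hypothetical stage delays of 50 ps and 46 ps give a 4 ps resolution.
code = vernier_code(10.0, 50.0, 46.0, n_stages=16)
```

Because the resolution depends on a delay difference rather than on an absolute gate delay, a Vernier line can resolve intervals shorter than a single stage delay.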

Energy Efficient CMOS Stochastic Bit-based Bayesian Inference Accelerator

https://doi.org/10.5573/JSTS.2025.25.6.688

(Honggu Kim) ; (Yong Shim)

Stochastic computing-based Bayesian inference has emerged as a powerful approach for statistical computation, particularly in domains requiring high-dimensional probabilistic analysis. However, in conventional Von Neumann architectures, stochastic computing faces significant energy challenges due to the exponential growth of data volume associated with the Internet of Things (IoT). In this work, we propose a CMOS stochastic bit-based Bayesian inference accelerator designed for energy-efficient stochastic computation. The stochastic bit in our design performs dual functions as both 1) a stochastic computation unit and 2) a memory element, enabling an energy-optimized implementation of Bayesian inference. The proposed design is validated through a case study involving a 3-layer, 4-variable Bayesian network model, implemented using TSMC 65 nm GP process technology, with a total energy of 1.5 nJ.

Enhanced Methane Gas Sensors Utilizing Lithium-ion Decorated SWCNTs Networks

https://doi.org/10.5573/JSTS.2025.25.6.696

(Lae Kim) ; (Myung-Hyun Baek)

Methane (CH4) is a colorless, odorless, and highly flammable gas, posing significant health and environmental risks due to its asphyxiant properties and increasing annual emissions. Accurate CH4 monitoring is essential, especially in confined environments, yet current detection solutions remain limited by sensitivity and recovery issues. We fabricated an effective and simple methane gas sensor based on lithium-ion (Li+) decorated single-walled carbon nanotube (SWCNT) networks. By leveraging the unique interaction between Li+ ions and CH4 molecules, as well as the exceptional conductivity and surface area of SWCNTs, the proposed sensor demonstrates a 157% improvement in sensitivity and a substantial increase in recovery compared to undecorated SWCNT-based devices. The device demonstrated improved sensitivity and recovery across a range of methane concentrations and humidity levels, suggesting potential for further development toward industrial, environmental, and IoT methane detection applications.

Room Temperature Hydrogen Gas Sensor Based on Pd-SnO2 Nanomaterials with Electro-spinning

https://doi.org/10.5573/JSTS.2025.25.6.703

(Dongjun Jang) ; (Sangwan Kim) ; (Min-Woo Kwon)

Palladium (Pd) nanodot (ND)-embedded SnO2 nanowires (NWs) are synthesized to fabricate highly sensitive hydrogen (H2) gas sensors, with a focus on the effect of electrospinning time. The SnO2 NWs are synthesized using the electrospinning technique for durations of 5, 10, and 20 seconds. The Pd NDs are decorated on the surface of the SnO2 via sputtering. The Pd-SnO2 gas sensor electrospun for 10 seconds demonstrates a low detection concentration range (4-50 ppm), high responsivity (∼10), and faster response/recovery times at room temperature. The improved sensing characteristics are attributed to the increased NW density and the catalytic effects of PdHx formation, along with the Schottky barrier effect. Furthermore, the sensor demonstrates excellent stability and repeatability under ambient conditions (27 °C, 31% RH), indicating its practical potential for real-time H2 monitoring in industrial safety and environmental applications. This study highlights the influence of electrospinning duration on the sensitivity and performance of H2 gas sensors.

A 1.12-ps Resolution Flash ADC-assisted Coarse-to-fine Time-to-digital Converter with Adaptive Reference-voltage Calibration and Digital Linearity Correction

https://doi.org/10.5573/JSTS.2025.25.6.711

(Solmon Shin) ; (Hyunwoo Son) ; (Youngsik Kim) ; (Shinwoong Kim)

This paper proposes a high-resolution time-to-digital converter (TDC) featuring a coarse-to-fine architecture that integrates a 13-stage ring oscillator for coarse measurement and a flash ADC for sub-phase fine correction. To address differential nonlinearity (DNL) induced by a fixed ADC reference voltage, we introduce two calibration techniques: an adaptive reference voltage calibration loop (comprising a peak detector, comparator, delta-sigma modulator, and 1-bit DAC) that dynamically aligns the flash ADC reference to the actual input peak across ZONEs (coarse quantization regions), and a digital linearity correction that subdivides each of the 13 ZONEs into four sub-ZONEs (totaling 52) with lookup-table-based error compensation. Post-layout simulations in 28 nm CMOS demonstrate a time resolution of 1.12 ps, conversion range of 63 ns, conversion time of 5.6 ns, DNL of ±1.5 LSB, INL of ±5 LSB, and power consumption of 5.42 mW within a 0.625 mm² core area. These results confirm the suitability of the TDC for applications requiring both high speed and high accuracy.
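
The sub-ZONE lookup-table correction described above can be pictured with the sketch below. Only the 13 ZONEs and four sub-ZONEs per ZONE come from the abstract. The number of fine ADC levels, the table contents, and the way a raw code is corrected are assumptions made for illustration and may differ from the actual design.

```python
N_ZONES = 13            # coarse quantization regions (from the abstract)
SUBZONES_PER_ZONE = 4   # sub-ZONEs per ZONE (from the abstract)
FINE_LEVELS = 64        # hypothetical number of flash ADC output levels


def lut_index(zone: int, fine_code: int) -> int:
    """Map a (ZONE, fine ADC code) pair to one of the 52 lookup-table entries."""
    if not 0 <= zone < N_ZONES:
        raise ValueError("zone out of range")
    if not 0 <= fine_code < FINE_LEVELS:
        raise ValueError("fine code out of range")
    sub_zone = fine_code * SUBZONES_PER_ZONE // FINE_LEVELS
    return zone * SUBZONES_PER_ZONE + sub_zone


def correct(raw_code: int, zone: int, fine_code: int, lut: list) -> int:
    """Subtract the stored error (in LSB) for the sub-ZONE the sample falls in."""
    return raw_code - lut[lut_index(zone, fine_code)]


# Hypothetical table: no error anywhere except +2 LSB in the last sub-ZONE.
lut = [0] * (N_ZONES * SUBZONES_PER_ZONE)
lut[-1] = 2
corrected = correct(1000, zone=12, fine_code=63, lut=lut)
```

A table of this size trades a small amount of storage for a piecewise correction that is finer than one offset per ZONE.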

Deep Learning Driven Modeling of Advanced Node FinFET

https://doi.org/10.5573/JSTS.2025.25.6.721

(Sehtab Hossain)

This study presents a novel application of deep learning algorithms to enhance the modeling of 14 nm FinFETs, specifically addressing the critical aspect of material discovery. While Density Functional Theory (DFT) is essential for material characterization, its computational intensity and time-consuming nature pose significant limitations. Similarly, shallow machine learning (ML) methods, despite their utility, often struggle with extensive data preprocessing, overfitting, and inherent biases. Our approach overcomes these challenges by integrating advanced data processing with deep learning for material discovery, specifically tailored for advanced node FinFET modeling. We meticulously prepared material descriptors and demonstrated the superior performance of deep learning in this context. With minimal fine-tuning, our deep learning model achieved a Mean Absolute Error (MAE) of approximately 0.14 eV. This performance significantly surpasses that of traditional shallow learning methods, including Support Vector Regression, Random Forest, and Extreme Gradient Boosting (XGBoost), as evidenced by a higher R² score. These results underscore the exceptional proficiency of deep learning in accelerating material discovery and, consequently, improving the accuracy of advanced node FinFET modeling. This research highlights the profound efficiency of deep learning in pushing the boundaries of semiconductor device simulation.

An FVF-based Capacitorless LDO with Segmented Power Cells Achieving Fast Transient Response, Wideband High PSR, and Wide Load Current Range

https://doi.org/10.5573/JSTS.2025.25.6.730

(Doojin Jang)

This letter presents a capacitorless low-dropout (LDO) regulator with a flipped-voltage-follower (FVF) structure and segmented power cells, targeting fast transients and high power-supply-rejection (PSR). The design addresses the stability issues and bias point variations found in conventional FVF-based LDOs. Two key innovations are introduced: a cascode transistor in the fast loop to stabilize bias points, and a segmented core that extends the load current range and enhances AC performance. Additionally, this architecture enables efficient on-chip power delivery, reducing IR drop and simplifying power distribution networks. Implemented in 65-nm CMOS, the LDO operates from a 1.2-V input to a 1.0-V output, supporting a 100 mA load current. Simulations show a wideband PSR with a worst-case bound of < −14.2 dB and a transient response with undershoot and overshoot limited to 59 mV and 58 mV, respectively, for a 30 mA load step with a 10 ns edge.