Junhyeong Lee1
Misun Cha1
Min-Woo Kwon1*
(Department of Electric Engineering, Gangneung-Wonju National University, Gangneung,
25457, Korea)
Copyright © The Institute of Electronics and Information Engineers(IEIE)
Index Terms
Feedback field effect transistor (FBFET), processing in memory (PIM), reconfigurable logic, charge trap layer
I. INTRODUCTION
Conventional computing systems follow the von Neumann architecture, in which memory and
the CPU are separated, so data transfer between them is slow [1-4]. This leads to high energy consumption and to the memory wall. To address
these challenges, Processing in Memory (PIM) technology is being developed
[5-8]. PIM technology is being explored in three ways. The first is near-memory PIM technology
(Fig. 1(a)), which reduces the distance between memory and CPU [9,10]. Shorter distances enable faster data transfer, and
near-memory technology is also relatively easy to implement.
The second approach is with-memory PIM technology (Fig. 1(b)), which reduces the distance between memory and CPU further by incorporating
some CPU functions into the memory. Both approaches alleviate the memory wall problem
by reducing the distance between the CPU and memory, but neither fully removes
the bottleneck of the von Neumann architecture. Therefore, a third
technology is required, namely in-memory PIM technology (Fig. 1(c)). Since operations are performed within the memory itself, no data transfer between memory and the CPU is needed,
which significantly reduces power consumption.
Various PIM technologies have been announced (Table 1). IBM developed embedded MRAM technology to replace SRAM in the last-level cache
(LLC); a large-capacity SRAM or embedded DRAM (eDRAM) is typically used for the L4 cache of
a large microprocessor. Tsinghua University, China, announced LerGAN, a PIM-based
GAN accelerator with low data movement and zero-free computation that uses ReRAM PIM [11,12]. Samsung announced HBM-PIM technology, a structure in which High Bandwidth Memory
(HBM) and DRAM are bundled into one package with TSVs and connected to the CPU/GPU through
an interposer. Panasonic announced the MN101L 8-bit microcontrollers, the first microcontrollers
with embedded resistive RAM (ReRAM). ReRAM is a new non-volatile embedded
memory that writes five times faster than flash or EEPROM and needs no erase cycle
[13]. UPMEM proposed an 8 GB DDR4-2400 module with 128 DPUs, each DPU having 64 MB of memory
and running at 500 MHz [14,15]. Although PIM technology has been implemented in various countries, the existing implementations are closer
to near-memory technology and are not the ultimate solution. Therefore, to achieve
the ultimate PIM technology, a new memory structure needs to be designed rather than
relying on existing structures.
In this paper, we propose a new structure for implementing PIM technology:
a charge trap flash memory Positive Feedback Field Effect Transistor (CTF-FBFET).
FBFETs are devices whose structural characteristics can be exploited to implement PIM. The
CTF-FBFET we propose combines two inputs with an oxide/nitride/oxide (ONO) structure.
We have verified that the operations of AND and OR can be reconfigured in an FBFET
using two inputs. The logic operations are selected by the control gate voltage (V$_{C.G}$).
Additionally, data is stored in the charge trap layer to enable memory operations,
and logic operations are performed by reading the data values stored in each cell.
As a result, the device performs memory read and computation operations simultaneously.
The proposed structure has been validated through TCAD simulation.
Fig. 1. (a) Techniques that reduce the distance between processor and memory; (b) Techniques that assign some of the processor's functions to memory; (c) Techniques for performing memory and operations in the cell itself.
Table 1. PIM technology trends. These technologies place an operator in the memory or bring the operator and the memory closer together; the memory itself does not perform the calculation.
II. RE-CONFIGURABLE LOGIC OPERATION
The FBFET we have developed consists of a pnpn-doped body, two inputs, and a control
gate. The voltages applied to input 1 and input 2 regulate the energy barrier and
the depth of the potential well. The fabrication process of the FBFET is illustrated
in Fig. 2(a) [16]. Because the process is compatible with CMOS, the FBFET can be integrated with CMOS
circuits. First, active patterning is performed on a silicon-on-insulator (SOI)
wafer, followed by p-type body implantation using BF$_{2}$$^{+}$ ions at a dose
of 1${\times}$10$^{13}$ cm$^{-2}$ to adjust the threshold voltage. Next,
a 10 nm thick SiO$_{2}$ layer for the input is grown by dry oxidation at 950 $^{\circ}$C,
and a layer of n$^{+}$ doped poly-Si is deposited on it. After the input is patterned, n-type
implantation is carried out in the potential well region at a dose of 2${\times}$10$^{13}$ cm$^{-2}$,
and the buffer oxide is removed by HF wet etching. To isolate the inputs from the control
gate, the control gate oxide is grown to a thickness of 10 nm by dry oxidation.
Then n$^{+}$ doped polysilicon is deposited, followed by the second gate
patterning. After the n$^{+}$ source/p$^{+}$ drain implantation, rapid thermal annealing
(RTA) is performed at 900 $^{\circ}$C for 10 seconds in an oxygen atmosphere
to facilitate reoxidation. Finally, the interlayer dielectric (ILD) is deposited,
and contact hole etching and metal sputtering (Ti/TiN/Al/TiN) are carried out. The SEM
image of the device after gate patterning is shown in Fig. 2(b). The measurement results are as follows. Fig. 3 shows how the drain current varies with V$_{C.G}$. The V$_{C.G}$ adjusts
the depth of the potential well in the n$^{-}$ type doped body. Consequently, the
threshold voltage (VT) can be controlled through the V$_{C.G}$: as the V$_{C.G}$ increases,
the threshold voltage increases linearly. Fig. 4 shows that the longer it takes to raise the gate voltage, the more electrons accumulate
in the body, resulting in a smaller subthreshold swing (SS). When the voltage of input1
is below the threshold voltage, electrons accumulate in the body. Once the number of
accumulated electrons exceeds the threshold, the FBFET is turned on through the
feedback interaction between the electrons and the potential barrier, as shown in Fig. 5. The mechanism of the FBFET is as follows [17].
1. As the V$_{C.G}$ increases, the potential well below the control gate becomes
deeper.
2. Applying a positive bias to input1 lowers the potential barrier. Source electrons
then cross the potential barrier and accumulate in the potential well under the
control gate.
3. The accumulated electrons lower the valence band barrier height on the drain side,
and holes then accumulate in the p-type body below input1.
4. Likewise, the accumulated holes lower the potential barrier on the source side, so both
potential barriers become very low. Therefore, the current increases rapidly and the
FBFET turns on steeply.
In other words, the VT is controlled by the control gate bias. As the V$_{C.G}$ increases,
the potential well deepens, causing the VT to increase (Fig. 6). Thus, the VT can be precisely modulated (Fig. 7). We used this VT controllability
to perform reconfigurable logic operations, namely AND and OR, with a two-input FBFET.
When a high voltage is applied to the control gate, the potential well deepens and
more charge is required to turn the device on. Therefore, if no input bias or only one input bias
is applied, the FBFET remains in the turned-off state. The FBFET is turned on only when
a voltage is applied to both input1 and input2, which represents the AND function, as
shown in Fig. 8(a). For OR logic, a low voltage is applied to the control gate.
The potential well then becomes shallow, allowing the
FBFET to turn on even if a voltage is applied to only one input, as shown in Fig. 8(b). In this configuration, the FBFET turns on when a voltage is applied to either input 1 or input 2.
Our proposed FBFET operates as an AND gate when the control
gate is set to 2 V and as an OR gate when it is set to 0.4 V (Table 2). Thus, we have verified the reconfigurable AND/OR function within a
single device.
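The reconfigurable behavior described above can be summarized with a small behavioral sketch. It is only a toy logic-level model, not a device simulation: it assumes that each biased input contributes one unit of injected charge and that the control gate voltage sets how many units are needed to turn the device on. The names `inputs_required`, `fbfet_output`, and `truth_table`, as well as the switching point `MODE_BOUNDARY`, are hypothetical and do not come from the measurements; only the 2 V and 0.4 V operating points are taken from Table 2.

```python
# Toy behavioral model of the two-input FBFET (illustrative assumption only).
AND_VCG = 2.0   # control gate voltage used for AND operation (V), from Table 2
OR_VCG = 0.4    # control gate voltage used for OR operation (V), from Table 2

# Hypothetical boundary between the deep-well and shallow-well regimes.
# The midpoint is chosen purely for illustration; it is not a measured value.
MODE_BOUNDARY = (AND_VCG + OR_VCG) / 2


def inputs_required(v_cg: float) -> int:
    """Number of biased inputs assumed necessary to fill the potential well."""
    # Deep well (high V_CG): both inputs must inject electrons.
    # Shallow well (low V_CG): one input is enough.
    return 2 if v_cg >= MODE_BOUNDARY else 1


def fbfet_output(in1: int, in2: int, v_cg: float) -> int:
    """Return 1 if the modeled device turns on, 0 otherwise."""
    return int(in1 + in2 >= inputs_required(v_cg))


def truth_table(v_cg: float):
    """List (in1, in2, output) for all four input combinations."""
    return [(a, b, fbfet_output(a, b, v_cg)) for a in (0, 1) for b in (0, 1)]


if __name__ == "__main__":
    for label, v_cg in (("AND", AND_VCG), ("OR", OR_VCG)):
        print(label, truth_table(v_cg))
```

With these assumptions, the same function reproduces the AND truth table at 2 V and the OR truth table at 0.4 V, which mirrors the idea that a single control voltage selects the logic function.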
Fig. 2. (a) FBFET production process; (b) SEM image (C.G = Control Gate).
Fig. 3. Measured transfer curves for various control gate voltages (V$_{drain}$=0.7 V).
Fig. 4. I-V measurement results for different voltage ramp times of Input 1, at a current of 10$^{-7}$ A.
Fig. 5. Time at which the FBFET is turned on by a sub-threshold gate 1 voltage (V$_{drain}$=1 V, V$_{C.G}$=2 V).
Fig. 6. Energy band diagram of the FBFET according to control gate bias. As the control gate bias increases, the potential well below the control gate becomes deeper.
Fig. 7. Simulated transfer curves for various control gate voltages. As the control gate voltage increases, the turn-on time increases.
Fig. 8. Operation of reconfigurable logic with the FBFET: (a) AND operation, where the control gate voltage is high; (b) OR operation, where the V$_{C.G}$ is low.
Table 2. Logic simulation values. The device is set to AND when 2 V is applied to the control gate and to OR when 0.4 V is applied.
III. CTF-FBFET FOR PROCESSING IN MEMORY (PIM)
We designed the CTF-FBFET structure for implementing PIM technology. The structure
incorporates a charge trap layer and a multi-input FBFET to enable memory operations.
It performs logic operations through the control gate and stores charges in the charge
trap layer for memory functions. When a program voltage is applied to an input, charges
are stored in the charge trap layer. Each data cell stores charge according to the voltage applied to its input.
The device can store data while simultaneously performing AND or OR logic operations
when reading the data.
The proposed device structure is depicted in Fig. 9. In the proposed structure, the VT increases when a program voltage is applied to
the input (programming), because charges are injected into the charge trap layer through Fowler-Nordheim
(FN) tunneling. The operation is illustrated in Fig. 10: starting from the initial state, the VT increases after the program voltage (11 V) is applied for 2 ${\mathrm{\mu}}$s
and then decreases after the erase voltage (-8 V) is applied
for 1.8 ${\mathrm{\mu}}$s. As the program voltage increases, the VT also increases.
When a read bias is applied to the inputs of a device functioning as memory, the output
is the AND or OR of the stored data, as selected by the V$_{C.G}$. The device performs
AND logic when the control gate bias is high and OR logic when it is low.
In the initial state [1 1], when a read voltage is applied, the FBFET is turned on because
electrons are injected into the n$^{-}$ type floating body. When only one cell is programmed,
the programmed cell (0) has a high VT. Therefore, when the same read voltage is applied,
electrons are injected only through the erased cell (1). In this case,
if the potential well is deep because a high voltage is applied to the control gate, the
FBFET cannot be turned on. Similarly, if both cells are programmed, the FBFET cannot
be turned on regardless of the control gate bias. Fig. 11 presents the results after applying the read bias. The initial states of input1 and
input2 are logic [1 1]. Fig. 11(a) shows the AND operation simulation result, Fig. 11(b) shows the OR operation simulation result, and Fig. 11(c) is the logical flow time diagram.
1. In the logical [1 1] state, the AND and OR outputs are 1, indicating they are turned
on.
2. Afterwards, a program voltage is applied to input 1.
3. If a read voltage is then applied, the AND output is 0, and the OR output is 1,
resulting in a logical [1 0] state.
4. Following that, a program voltage is applied to input2. If a read voltage is applied,
both the AND and OR outputs are 0, resulting in a logical [0 0] state.
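The program-then-read sequence in the steps above can be traced with a minimal logic-level sketch. It is an illustrative assumption rather than the TCAD model: an erased cell is treated as logic 1 (low VT, injects electrons at the read voltage), a programmed cell as logic 0 (high VT, blocks injection), and the control gate voltage decides whether one or two conducting cells are needed to turn the device on. The class `CtfFbfetModel`, the helper `run_sequence`, and the constant `MODE_BOUNDARY` are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

AND_VCG = 2.0        # control gate voltage for AND (V), from Table 2
OR_VCG = 0.4         # control gate voltage for OR (V), from Table 2
MODE_BOUNDARY = 1.2  # hypothetical boundary between the two modes, not measured


@dataclass
class CtfFbfetModel:
    """Logic-level stand-in for a two-cell CTF-FBFET (assumption, not TCAD)."""
    # 1 = erased cell (low VT), 0 = programmed cell (high VT).
    cells: List[int] = field(default_factory=lambda: [1, 1])

    def program(self, index: int) -> None:
        """Model charge injection into the trap layer: the cell stores 0."""
        self.cells[index] = 0

    def erase(self, index: int) -> None:
        """Model charge removal from the trap layer: the cell stores 1."""
        self.cells[index] = 1

    def read(self, v_cg: float) -> int:
        """Apply the read bias to both inputs and return the logic output."""
        conducting = sum(self.cells)  # only erased cells inject electrons
        required = 2 if v_cg >= MODE_BOUNDARY else 1
        return int(conducting >= required)


def run_sequence() -> List[Tuple[int, int]]:
    """Return (AND output, OR output) after each step of the sequence."""
    device = CtfFbfetModel()
    results = [(device.read(AND_VCG), device.read(OR_VCG))]   # initial [1 1]
    device.program(0)                                         # program input 1
    results.append((device.read(AND_VCG), device.read(OR_VCG)))
    device.program(1)                                         # program input 2
    results.append((device.read(AND_VCG), device.read(OR_VCG)))
    return results


if __name__ == "__main__":
    print(run_sequence())  # [(1, 1), (0, 1), (0, 0)]
```

Under these assumptions the three reads give (AND, OR) outputs of (1, 1), (0, 1), and (0, 0), matching the sequence listed above; reading and computing happen in the same `read` call, which is the point of the in-memory operation.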
Fig. 12 provides a detailed description of the logical operation, illustrating the result
of the processing in memory.
Fig. 9. CTF-FBFET simulation structure. A memory operation is performed by storing charges in the charge trap layer.
Fig. 10. Program and erase operation graph of CTF-FBFET. The threshold voltage increased after the program operation, and the threshold voltage decreased again after the erase operation.
Fig. 11. Simulation results after applying the read bias: (a) equivalent to the 2-bit AND operation; (b) equivalent to the 2-bit OR operation; (c) flow chart.
Fig. 12. Flow chart with structure: (a) AND operation: 1. Data programming 2. Read bias 3. Current flow to input2 4. The FBFET is turned off because of the high threshold voltage; (b) OR operation: steps 1 to 3 are the same as in the AND operation; 4. The FBFET is turned on because of the low threshold voltage.
IV. CONCLUSIONS
We have developed a novel structure of CTF-FBFET for the implementation of PIM. The
key innovation of our design is that each gate of the device functions as a memory
cell, enabling not only data reading but also the calculation of results based on
the stored data. Furthermore, the logic operation can be selected based on the voltage
applied to the control gate. We have conducted extensive TCAD simulations to demonstrate
the ability of the CTF-FBFET to reconfigure logic and perform logic operations within
the memory. Our findings pave the way for a new direction in PIM research and implementation.
ACKNOWLEDGMENTS
This research was supported by the National R&D Program through the National Research
Foundation of Korea (NRF) funded by the Ministry of Science and ICT (NRF-2022M3I7A1078936),
by the "Regional Innovation Strategy (RIS)" through the
National Research Foundation of Korea (NRF) funded by the Ministry of Education (MOE)
(2022RIS-005), and by the National Research Foundation of Korea (NRF)
grant funded by the Korea government (MSIT) (2021R1G1A1093786).
References
[1] O. Villa et al., "Scaling the Power Wall: A Path to Exascale," SC '14: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, 2014, pp. 830-841, doi: 10.1109/SC.2014.73.
[2] Wulf, Wm A., and Sally A. McKee. "Hitting the memory wall: Implications of the obvious." ACM SIGARCH Computer Architecture News 23.1 (1995): 20-24.
[3] Machanick, Philip. "Approaches to addressing the memory wall." School of IT and Electrical Engineering, University of Queensland (2002).
[4] Wilkes, Maurice V. "The memory wall and the CMOS end-point." ACM SIGARCH Computer Architecture News 23.4 (1995): 4-6.
[5] Saulsbury, Ashley, Fong Pong, and Andreas Nowatzyk. "Missing the memory wall: The case for processor/memory integration." ACM SIGARCH Computer Architecture News 24.2 (1996): 90-101.
[6] Keckler, Stephen W., et al. "GPUs and the future of parallel computing." IEEE Micro 31.5 (2011): 7-17.
[7] Ghose, Saugata, et al. "A workload and programming ease driven perspective of processing-in-memory." arXiv preprint arXiv:1907.12947 (2019).
[8] Ahn, Junwhan, et al. "A scalable processing-in-memory accelerator for parallel graph processing." Proceedings of the 42nd Annual International Symposium on Computer Architecture. 2015.
[9] Ahn, Junwhan, et al. "PIM-enabled instructions: A low-overhead, locality-aware processing-in-memory architecture." 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA). IEEE, 2015.
[10] Chi, Ping, et al. "Prime: A novel processing-in-memory architecture for neural network computation in ReRAM-based main memory." ACM SIGARCH Computer Architecture News 44.3 (2016): 27-39.
[11] B. Yan et al., "RRAM-based Spiking Nonvolatile Computing-In-Memory Processing Engine with Precision-Configurable In Situ Nonlinear Activation," 2019 Symposium on VLSI Technology, 2019, pp. T86-T87, doi: 10.23919/VLSIT.2019.8776485.
[12] Yan, Bonan, et al. "Resistive Memory-Based In-Memory Computing: From Device and Large-Scale Integration System Perspectives." Advanced Intelligent Systems 1.7 (2019): 1900068.
[13] Edelstein, D., et al. "A 14 nm embedded STT-MRAM CMOS technology." 2020 IEEE International Electron Devices Meeting (IEDM). IEEE, 2020.
[14] Ito, Satoru, et al. "ReRAM technologies for embedded memory and further applications." 2018 IEEE International Memory Workshop (IMW). IEEE, 2018.
[15] Mochida, Reiji, et al. "A 4M synapses integrated analog ReRAM based 66.5 TOPS/W neural-network processor with cell current controlled writing and flexible network architecture." 2018 IEEE Symposium on VLSI Technology. IEEE, 2018.
[16] Min-Woo Kwon, Myung-Hyun Baek, Sungmin Hwang, "Integrate-and-fire neuron circuit using positive feedback field effect transistor for low power operation," Journal of Applied Physics 124, 152107 (2018).
[17] Lee, Junhyeong, Misun Cha, and Min-Woo Kwon. "Capacitor-Less Low-Power Neuron Circuit with Multi-Gate Feedback Field Effect Transistor." Applied Sciences 13.4 (2023): 2628.
Junhyeong Lee studied in the Department of Electronic Engineering at
Gangneung-Wonju National University (GWNU, Korea) from 2018 to 2023. His current research
interests include FBFETs based on the vertical NAND flash structure for in-memory computing,
which he pursues at the Intelligent Semiconductor Device & Circuit Design Laboratory (ISDL) under
Professor Min-Woo Kwon.
Misun Cha is currently pursuing a bachelor's degree in electronic engineering at Gangneung-Wonju
National University. Her research topic is the FBFET. She studied neuron circuits with
a structure that utilizes a multi-gate FBFET and is currently studying PIM with the CTF-FBFET
(charge trap flash structure).
Min-Woo Kwon received the B.S. and Ph.D. degrees from the Department of Electrical and
Computer Engineering, Seoul National University (SNU), in 2012 and 2019, respectively.
From 2019 to 2021, he worked at the Samsung Semiconductor Laboratories, where he
contributed to the development of 1x nm DRAM cell transistors and their characterization.
In 2021, he joined Gangneung-Wonju National University (GWNU) as an assistant professor
in the Department of Electric Engineering, where he is currently a professor. His
current research interests include the design and fabrication of neuromorphic devices
(memristor synaptic devices, neuron circuits), steep switching devices (FBFET), DRAM
cell transistors, and 2-dimensional nanomaterials.