Device Placement Optimization Based on Sequential Q-Learning Using Local Layout Effect
Surrogate Models
Kang KwonWoo 1 (Department of Semiconductor and Display Engineering, Sungkyunkwan University, Suwon, Korea, and System LSI Business, Samsung Electronics, Hwaseong, Gyeonggi, Korea)
Kim SoYoung 2 (Department of Semiconductor Systems Engineering, College of Information and Communication Engineering, Sungkyunkwan University, Suwon, Korea)
Copyright © The Institute of Electronics and Information Engineers(IEIE)
Index Terms
Analog device placement, optimization, reinforcement learning, local layout effect, LOD, DTI, ANN, surrogate model, sequential Q-learning
I. INTRODUCTION
The design of layouts, particularly in analog design, has historically been a manual
process; this is in contrast to the design of layouts in digital physical design.
Despite the existence of various ML-based methods for automating analog layout design,
these have not yet been widely adopted by the electronic design automation industry.
The design of analog integrated circuits is highly dependent on the process used,
including the specific device type and parasitic components employed. The introduction
of new processes necessitates alterations to circuit topology and device placement,
which in turn affects performance and the universal applicability of ML-based methods.
Local layout effects (LLEs), which are not well understood in new process nodes, impact
performance; understanding their influence requires extensive iterations. These effects
have been reported and studied as potentially degrading or enhancing device performance,
dating back to the use of legacy process nodes [1,2]. In the case of sufficiently researched and mature process nodes, these effects were
provided as analytical equations in the device compact model, such as BSIM, in the
Process Design Kit (PDK). This allowed for the evaluation of these effects through
post-simulation results based on the completed layout.
The advent of new device structures, such as FinFETs and nanosheet FETs, has necessitated
the development of more sophisticated technology and novel process architectures to
achieve desired performance levels [3]. Nevertheless, these advancements have introduced side effects that cause performance
variation, especially among LLEs that are either new effects or modify existing effects.
Consequently, during the process setup stage in early Design Technology Co-Optimization,
accurately accounting for variation in LLEs when designing test chips is particularly
challenging. As a result, inefficiently large margins, such as increased spacing and
the addition of dummy transistors, are often applied. This leads to degradation from
an area perspective.
This paper presents surrogate models trained with post-simulation datasets using an
advanced PDK with LLE compensation. A sequential Q-learning algorithm is employed
to integrate multiple surrogate models, which are optimized for device placement by
minimizing LLEs and maximizing area efficiency. The rewards include considerations
related to layout design, clustering by device type, tuning of input parameters, and
minimization of threshold voltage variation. The objective of this paper is to introduce
these rewards and demonstrate their effectiveness for achieving optimal LLE-aware
device placement based on a sequential multiple Q-learning algorithm.
The rest of this paper is organized as follows. Section II shows the data generation
flow for training the layout local effect surrogate model based on an artificial neural
network (ANN). Section III illustrates the proposed sequential Q-learning algorithm.
Section IV presents device placement results, and the paper is concluded in Section
V.
II. DATA GENERATION AND LOCAL LAYOUT EFFECT SURROGATE MODEL
1. Local layout effect
The LLE is a phenomenon observed in silicon semiconductor designs, where the performance
metrics of a device such as the mobility, threshold voltage, and sub-threshold swing
are influenced by the placement of surrounding devices and specific parameters, including
the type, width, length, and gate pitch. Moreover, LLE varies significantly across
different processes [4]. Consequently, each process can introduce new effects, eliminate previous ones, or
alter established trends. Therefore, the optimal device placement is process-specific
and depends on the node being utilized. We used a gate-all-around (GAA) logic process
node and nanosheet FET devices. As depicted in Fig. 1(a), when viewed from the top, the device shows a vertical red gate layer, an active
region representing the doping layer that distinguishes the device type, and a nanosheet
layer that becomes the actual channel. The common overlapping area of these three
layers is the target area for calculating LLE parameters, with the edge serving as
the reference point, as indicated by the yellow border.
Fig. 1. Local layout effect (LLE) examples with feature parameters. (a) Targeting
device channel area for calculating LLE parameters. (b) Length of diffusion (LOD)
and diffusion break illustrated as ``Active cut.'' (c) Deep trench isolation effect
(DTI). (d) Well proximity effect.
In Fig. 1, the three different LLEs to be integrated into the rewards of Q-learning are depicted.
The length of diffusion (LOD) term in Fig. 1(b) refers to the distance from the edge of the diffusion break to the channel edge of
the target device [5]. The LOD effect, which is caused by STI and has been modeled analytically in prior
research [6], exhibits a similar trend to historical findings, leading to the adoption of the
term. As illustrated in Fig. 1, for a multi-finger gate structure of a device, estimating this effect requires two
parameters as inputs: the distance on the left, labeled ``a,'' and the distance on
the right, labeled ``b.'' Deep trench isolation (DTI) effectively mitigates the gain
in parasitic NPN devices in CMOS processes by maintaining a distance between the edges
of channels [7]. While DTI proves beneficial for enhancing device performance within the context
of scaling trends, it introduces new LLEs. A simple example in Fig. 1(c) shows the placement of DTI when P-type devices are arranged. Two parameters, one
for the north and one for the south, are considered from the target channel edge.
`DTIn' denotes the parameter for the north direction. The newly developed LLE
is proportional to the size of the DTI and the stress resulting from its application
to the channel. This leads to variation in device performance. Previous research analytically
modeled the well proximity effect (WPE) [8]. The effect manifests due to scattering during ion implantation, leading to a doping
gradient in the gate channel. As illustrated in Fig. 1(d), four direction parameters, NWa, NWb, NWn, and NWs, are considered
from the target channel edge.
2. Dataset preparation using PDK
The dataset for the three effects previously described was generated using a PDK procured
from the foundry. Consequently, the PDK was presumed to represent an optimal model-hardware
correlation and proceeded through the following method. The extent to which the threshold
voltage (Vt) of each device changes with placement was obtained through SPICE simulation.
The simulation results for all possible combinations were collected by adjusting the
positions of individual devices, each of which had a single gate finger, the smallest
unit used in the design.
Vt was measured using the constant current method given in Equation (1), which is universally adopted by foundries for measurement, specification, and monitoring. The effective width and length of each device, designated as Weff and Leff, respectively, are calculated in accordance with the specific device structure. The critical current, Icrit, is set to 100 nA, and the gate-source voltage, Vgs, is incrementally increased from 0 V to VDD, with VDS held at the input voltage, VDD. Once the current flowing through the channel reaches Icrit·(Weff/Leff), the corresponding gate voltage in the saturation state is defined as the threshold voltage, Vtsat.
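As a concrete illustration, the constant-current extraction described above can be sketched in a few lines. The helper name, sweep arrays, and default values are illustrative assumptions, not part of the foundry PDK flow:

```python
import numpy as np

def vt_constant_current(vgs, i_d, w_eff, l_eff, i_crit=100e-9):
    """Constant-current Vt: the gate voltage at which Id reaches Icrit * (Weff / Leff)."""
    vgs = np.asarray(vgs, dtype=float)
    i_d = np.asarray(i_d, dtype=float)
    target = i_crit * (w_eff / l_eff)
    above = np.nonzero(i_d >= target)[0]
    if above.size == 0:
        raise ValueError("drain current never reaches the critical level")
    k = int(above[0])
    if k == 0:
        return float(vgs[0])
    # linearly interpolate between the two sweep points bracketing the crossing
    return float(np.interp(target, i_d[k - 1:k + 1], vgs[k - 1:k + 1]))
```

For a monotonically rising Id(Vgs) sweep from SPICE, the function returns Vtsat by interpolating the first crossing of the critical current level.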
3. Results of surrogate models trained using LASSO
In this paper, we employed an ANN model to create the LLE surrogate models. The model structure used in our study is briefly illustrated in Fig. 2. The primary reason for selecting an ANN is its ability to approximate arbitrary non-linear functions relating input process parameters to device performance, which is particularly valuable in the process setup stage, where LLEs are either under ongoing research or lack compact models [9]. Additionally, ANNs can be easily extended into a multi-input multi-output (MIMO) model by simply adjusting the number of nodes in the output layer [10].
In the loss function J(w) of Equation (2), the last term, λ∑|w|, represents L1 regularization, specifically the least absolute shrinkage and selection operator (LASSO). It was employed to reduce the size of the input feature vector, allowing a surrogate model to be built from only the features that actually influence the output.
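A minimal sketch of the LASSO mechanism is given below using plain iterative soft-thresholding (ISTA). The actual surrogate models apply the L1 penalty inside an ANN loss; this linear version, with an assumed function name and synthetic data, only illustrates how the penalty drives irrelevant input features to zero:

```python
import numpy as np

def lasso_ista(X, y, lam=0.05, lr=0.1, n_iter=1000):
    """LASSO via iterative soft-thresholding: a gradient step on the squared
    loss, then shrinking each weight toward zero by lr * lam (the L1 penalty)."""
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
        w = np.sign(w) * np.maximum(np.abs(w) - lr * lam, 0.0)  # soft-threshold
    return w
```

On synthetic data where only two of six features carry signal, the remaining four weights shrink to near zero, which is exactly the feature-selection behavior exploited to reduce the input vector.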
Fig. 2. LLE surrogate model structure.
However, while this approach enables the creation of reduced surrogate models for
known LLE effects, it is more practical to use ridge regression with L2 regularization
to handle unknown LLEs that may arise in advanced process nodes. Nonetheless, in this
paper, we omitted this approach as the primary objective was to verify changes in
trends for known effects.
The training results of the ANN model are presented in Table 1. The input features listed in the table, prior to feature selection, are the sum
of the parameters inherent to the device structure and the parameters derived from
the geometric positioning information resulting from device placement. Nodes represent
the number of nodes in the hidden layer, while layers indicate the total number of
hidden layers.
Accuracy*, as indicated in Table 1, employs the notion of a relative error margin to assess the accuracy of the trained ANN; the related equations are (3) and (4). Owing to the finite numerical precision of computers, predictions always deviate slightly from the true values, which is why a relative error margin was employed. The 100% accuracy observed in Table 1 is attributable to this definition.
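The relative-error-margin accuracy of Equations (3) and (4) can be paraphrased as the following check; the function name and the default margin value are assumptions for illustration:

```python
import numpy as np

def margin_accuracy(y_true, y_pred, margin=0.01):
    """Fraction of predictions whose relative error is within `margin`."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # guard against division by zero for exactly-zero true values
    rel_err = np.abs(y_pred - y_true) / np.maximum(np.abs(y_true), 1e-12)
    return float(np.mean(rel_err <= margin))
```

Under this definition, a model whose every prediction falls inside the margin reports 100% accuracy even though its raw error is non-zero.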
Fig. 3. The trained P-type device's LLE surrogate models. (a) DTI surrogate model
plot, (b) LOD surrogate model plot, and (c) WPE surrogate model plot with two parameters.
The remaining parameters are fixed.
From a physical design perspective, selectable LLE features have been identified as
lateral LOD and vertical DTI based on currently available knowledge. The criteria
for selection were determined by how sensitively the trained LLE surrogate models
respond within the given input range. Using LASSO, effects not represented in the
input feature matrix related to device placement were also eliminated. Consequently,
by comparing Tables 1(a) and 1(b), we observed that feature-reduced surrogate models
could be developed without any loss of accuracy compared to the models without feature
reduction. The results of trained LOD, DTI, and WPE surrogate models are presented
in Fig. 3 as two- and three-dimensional plots. The values of the x-, y- and z-axes were
normalized. Device placement was not considered in WPE, as its variation is minimal
in comparison to the other two effects.
Table 1. Dataset for training LLE surrogate models using artificial neural networks.
Inputs, nodes, and layers are hyperparameters of the ANN model. (a) Without L1 regularization
and (b) with L1 regularization.
Figs. 3(a) and 3(b) plot the threshold voltage of a device calculated with the DTI and LOD surrogate models as functions of the north-side distance (DTIn) and the LODa distance, respectively, with the remaining parameters held constant. For the LOD effect, placement is therefore optimized toward the low-gradient saturation region (LODa > 0.3). Conversely, effects with linear characteristics, such as DTI, are managed to minimize the area while ensuring that the effect is consistently matched across devices.
Therefore, the introduced surrogate models will act as an essential tool for automatically
identifying optimal placements that account for the LLEs. This will be further elucidated
in Section III.
III. SEQUENTIAL Q-LEARNING ALGORITHM
In contrast to supervised learning, where correct responses are established in advance,
Q-learning is effective in scenarios where a predefined dataset is lacking, making
it particularly useful in environments with scarce or inefficiently generated data,
such as analog layout design using a cutting-edge process node. In analog layout design,
the quantity of accumulated data is typically constrained, with the exception of a
few layout modifications. Furthermore, due to concerns about potential circuit information
leakage and the need to protect intellectual property (IP), it is not feasible to
create large datasets of various analog layout designs for training neural networks.
Moreover, analog circuits are highly susceptible to the process node utilized, such
that even circuits with identical functionality may necessitate alterations in topology.
Consequently, it is imperative to develop a dataset tailored to each specific process
node, even for the same circuit. Hence, the implementation of reinforcement learning
(RL) for device placement optimization is a fitting strategy.
1. Problem formation
1) Quantizing design space for states (s) and actions (a)
To implement a practically computable version of Q-learning, actions have been quantized
into a finite action space. For device placement, five possible actions--north, south,
west, east, and stationary--have been defined within this space, considering each
device as a state. Due to the requirement that each action must be selectable within
a finite space, the environment in which the states move is also quantized into a
grid-based format. The process employed was the GAA process node, utilizing the nanosheet
FET device. The available channel widths are constrained to discrete values. Although
it is possible to construct an analog block using devices with disparate channel widths,
this approach requires specific gate length values and the inclusion of spacers. This
method is inefficient from an area perspective and introduces an additional LLE, termed
tapered RX, to neighboring devices [11]. Furthermore, the contact-to-poly pitch (CPP) is fixed for each device gate length,
and a design rule exists that only allows devices with the same gate length to be
merged. Consequently, in order to minimize the size of a circuit block with a specific
function, the channel width and gate length are standardized. As a result, a grid-based
environment, as illustrated in Fig. 4(b), can be implemented for device placement. The horizontal grid is defined as one CPP,
and the vertical grid corresponds to the channel width of the device used in the design.
The blue and orange colors represent different types of devices, abstracted in the
manner shown in Fig. 4(a).
Fig. 4. (a) Abstraction of device. This will be a state in the proposed Q-learning
to be moved. (b) Quantized design space. P-type (blue) and N-type (yellow) are randomly
placed at the initial step.
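A minimal sketch of the quantized environment follows: columns are CPP units, rows are channel-width tracks, and the five actions move one device per step with boundary clamping and an overlap check. The class and method names are illustrative assumptions:

```python
# north, south, west, east, stationary -- the five quantized actions
ACTIONS = {0: (0, 1), 1: (0, -1), 2: (-1, 0), 3: (1, 0), 4: (0, 0)}

class GridPlacement:
    """Quantized placement grid: x in CPP units, y in channel-width rows."""

    def __init__(self, n_cols, n_rows, positions):
        self.shape = (n_cols, n_rows)
        self.pos = [tuple(p) for p in positions]  # one (col, row) per device

    def step(self, device, action):
        dx, dy = ACTIONS[action]
        c, r = self.pos[device]
        # clamp the move to the grid boundary
        nc = min(max(c + dx, 0), self.shape[0] - 1)
        nr = min(max(r + dy, 0), self.shape[1] - 1)
        if (nc, nr) not in self.pos:  # forbid overlapping another device
            self.pos[device] = (nc, nr)
        return self.pos[device]
```

Blocked and out-of-bounds moves simply leave the device in place, which keeps every action legal in the finite action space.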
2) Reward functions with LLE surrogate models
In analog circuits, small variations in the threshold voltage (Vth) can result
in significant output current deviation from the nominal value. This deviation increases
in proportion to the number of devices, rendering these applications highly susceptible
to threshold voltage variation. By extracting geometric parameters from the placed
devices and utilizing the trained LLE surrogate models, we can directly obtain the
estimated threshold voltage of the devices. This enables placement optimization under
LLE considerations. Therefore, reward functions are integrated with the LLE surrogate
models.
The reward of the i-th device or i-th cluster state at time step t in the x-th sequence, denoted r_i^{x,t}, encompasses a number of factors: the area, wire length, LLE parameters, type clustering, sub-circuit clustering, threshold voltage difference, and the threshold voltage estimated by the LLE surrogate models. The area and wire length rewards (r_area,t and r_wire,t) are set with the purpose of minimizing the area and length, respectively. They are applied as inverse rewards, where smaller related parameters result in larger rewards.
The LLE parameter rewards (r_ΔLOD,t and r_ΔDTI,t) consist of LOD parameter rewards and DTI parameter rewards. These rewards measure the similarity of the LLE parameters (ΔLOD, ΔDTI) between devices i and j at time step t. The differences in each of the LLE parameters serve as the criteria in Equation (6) for determining whether a penalty or an optimal reward value is applied. ν1 and ν2 are hyperparameters controlled during reward tuning.
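The pairwise similarity criterion of Equation (6) can be sketched as follows for a single LLE parameter; here `nu` plays the role of the ν1/ν2 tolerance hyperparameters, and the optimal/penalty values are placeholders:

```python
import numpy as np

def lle_similarity_reward(params, nu, r_opt=1.0, r_pen=-1.0):
    """Pairwise-similarity reward over one LLE parameter (e.g. LOD or DTI
    distances): matched pairs earn the optimal value, mismatched pairs a penalty."""
    p = np.asarray(params, dtype=float)
    diff = np.abs(p[:, None] - p[None, :])
    iu = np.triu_indices(len(p), k=1)  # count each device pair (i, j) once
    return float(np.where(diff[iu] <= nu, r_opt, r_pen).sum())
```

The reward is maximized when every device sees the same LLE environment, which is the matching condition the placement is driven toward.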
Clustering rewards (r_clustering,t) serve two purposes: clustering by type and clustering by same sub-circuit (ssc). Type refers to the categorization of devices into N- and P-types to maximize device matching. Within the same sub-circuit, such as a differential pair or current mirror, or even a multi-finger device split into multiple single-finger devices as shown in Fig. 7(a), device pairs or groups are critical for matching. Therefore, they must be placed as close as possible and in geometrically identical environments, such as symmetric and row-based arrangements. To accomplish this, clustering rewards are used together with the LLE parameter rewards in the proposed algorithm, as explained in the next section. In Equation (7), Ωt refers to an optimal reward that provides additional value when the device groups are placed close enough while respecting layout constraints at time step t. The terms d̄_type,t, d̄_x,t, and d̄_y,t in Equation (7) denote the distance between devices of the same type, the coordinate difference along the x-axis, and the coordinate difference along the y-axis, respectively, at time step t. θ, μ, σ, and ω are hyperparameters controlled during reward tuning.
The purpose of the difference-in-threshold-voltage reward (r_dvth,t) is to reduce differences in the estimated threshold voltage within the same sub-circuit topology and for the same device type. This reward is calculated through Equation (8), where p is the number of sub-circuits in the input circuit, m is the number of devices in the same sub-circuit, and k and l are the total numbers of P-type and N-type devices, respectively. Vth,LOD and Vth,DTI are calculated using the trained LLE surrogate models explained in Section II. α, β, δ, and ζ are hyperparameters controlled during reward tuning.
The objective of the r_vth_lle,t reward is to optimize the estimated threshold voltage (Vth) calculated by the LOD surrogate models. As illustrated in Fig. 3(b), in contrast to DTI, there is a saturated region with near-zero gradient at the maximum or minimum threshold voltage. Therefore, the objective is to maximize or minimize Vth,LOD(LODa, LODb) while simultaneously minimizing the parameters LODa and LODb. For P-type devices, Equation (9) is employed, whereas for N-type devices, Equation (11) is utilized. The outputs of Equations (9)-(12) are then integrated into Equations (13) and (14) as optimal (r_optimal) and penalty (r_penalty) reward threshold values. For a P-type device with threshold voltage V_i,th, the reward for that device (r_vth_lle,t^{i,P-type}) is calculated using Equation (13); conversely, the reward for an N-type device j with threshold voltage V_j,th (r_vth_lle,t^{j,N-type}) is calculated using Equation (14). Finally, as in Equation (15), the reward value r_vth_lle,t is determined by aggregating the outputs of the aforementioned equations. The two hyperparameters, w1 and w2, are calibrated during reward tuning.
2. Proposed algorithm
The Q-learning algorithm utilized in this paper employs the formulation of Watkins and Dayan [12], as given in Equation (16), where α and γ represent the learning rate and discount factor, respectively. Additionally, epsilon-greedy exploration is applied to balance the trade-off between exploitation and exploration, with probabilities 1−ϵ and ϵ, respectively. This approach allows the agent to reinforce the evaluation of known good actions while also exploring new actions [13]. In this context, x is the sequence index, t the step number, i the i-th device, and n the total number of devices. The reward value, state set, and action set of the devices in sequence x at step t are R_x,t, S_x,t, and A_x,t in Equations (17), (18), and (19), respectively.

Prior related research [14] applied deep Q-learning in which a single state represents the placement of all devices and the agent selects actions for all devices simultaneously. Here, an alternative approach is taken: an individual Q-table (q_i^{x,t}(s_i^t, a_i^t)) is maintained for each state-action pair of the i-th device, rather than a shared Q-table for all devices. We observed that selecting all actions simultaneously in a single Q-learning step failed to converge to an optimal layout and instead produced a persistent oscillatory pattern. The root cause lies in the trade-off between the LLE rewards and traditional metrics such as area and wire length, combined with the dynamic nature of LLEs, which continuously vary with the real-time relative distances among all devices. Moreover, the per-device Q-table structure represents state-action pairs for individual devices effectively, facilitating straightforward management. Consequently, as the number of devices increases, the number of Q-tables increases correspondingly. However, this approach may not fully capture interactions between devices, especially when the optimal action for one device depends on the state or action of another. To address this limitation, we apply a total reward (R_x,t) that considers all devices in the circuit, as shown in Equation (17).

Fig. 5 illustrates the structure of the proposed algorithm. To emulate the knowledge-based design flow used by experts, a sequential algorithm is implemented to identify the optimal placement and ensure efficient convergence to the optimal solution. Rather than applying all reward functions simultaneously, each sequence selectively uses them according to Equation (20) to calculate the next reward value at step t+1. The agent selects the current actions for the devices that yield the maximum Q values based on the current state array and the reward in sequence x. After the actions are taken, the next states at step t+1 include updated features for each device, such as relative distances, device type, length, coordinates, and LLE parameters. These features interact with the sequentially selected rewards. Fig. 6 depicts the overall proposed algorithm.
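The per-device Q-table scheme with a shared total reward can be sketched as below; the closure-based structure and default hyperparameter values are illustrative assumptions, not the authors' implementation:

```python
import random
from collections import defaultdict

def make_agent(n_devices, alpha=0.5, gamma=0.9, eps=0.1, n_actions=5):
    """One tabular Q-function per device, updated with a shared total reward."""
    q = [defaultdict(lambda: [0.0] * n_actions) for _ in range(n_devices)]

    def act(i, state):
        if random.random() < eps:          # explore
            return random.randrange(n_actions)
        row = q[i][state]
        return row.index(max(row))         # exploit

    def update(i, s, a, total_reward, s_next):
        # Watkins-Dayan update; the shared total reward R couples the devices
        best_next = max(q[i][s_next])
        q[i][s][a] += alpha * (total_reward + gamma * best_next - q[i][s][a])

    return q, act, update
```

Because each device keeps its own table while every update uses the circuit-wide reward, per-device action selection stays cheap yet inter-device effects still shape the learned values.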
Fig. 5. Structure of the proposed Q-learning's agent and environment.
Fig. 6. Sequential Q-learning algorithm.
IV. DEVICE PLACEMENT RESULTS
A folded cascode operational transconductance amplifier (OTA) comprising 16 transistors with single-finger gates and a StrongARM comparator comprising 12 transistors with multi-finger gates are employed to demonstrate LLE-aware automated placement in a grid-based
environment. As illustrated in Fig. 7(a), during the process of parsing multi-finger device information from the netlist,
multi-finger devices can be split into several single-gate finger devices. This method
is frequently employed during device placement optimization, whereby the potential
for merging devices on the same net, as shown in Fig. 7(b), is leveraged. As a result, the problem can be transformed into a single-finger device
placement optimization with hierarchical sub-circuit clustering, as illustrated in
Fig. 9(b). The proposed algorithm comprises five sequences, with the rewards utilized in each
sequence as follows:
1) Sequence 1: Area
2) Sequence 2: SC clustering, LOD parameters
3) Sequence 3: Type clustering, LOD parameters
4) Sequence 4: Wire length, DTI parameters, area
5) Sequence 5: ΔVth(DVTH), Vth(VTH LLE), area
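The sequence-dependent reward selection of Equation (20) can be sketched as a simple schedule; the reward names and callables below are placeholders for the reward functions described in Section III:

```python
# Schedule mirroring the five sequences listed above; each entry names the
# rewards active in that sequence.
SCHEDULE = [
    ("area",),
    ("sc_clustering", "lod"),
    ("type_clustering", "lod"),
    ("wire_length", "dti", "area"),
    ("dvth", "vth_lle", "area"),
]

def total_reward(sequence_idx, reward_fns, state):
    """Sum only the rewards active in the current sequence."""
    return sum(reward_fns[name](state) for name in SCHEDULE[sequence_idx])
```

Swapping the active subset between sequences is what lets each stage of the run focus on one design concern at a time, the way an expert would.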
Fig. 8(a) depicts the final rewards of each episode in each sequence, while sequence 1 plots
the rewards per step. The criteria for selecting the rewards for each sequence sought
to emulate the design knowledge of experts. Hereafter, each step will be explained
in detail.
In sequence 1, Q-learning is conducted using only the ``Area'' reward. This is done
to reduce the overall space of states available from the initial placement, preventing
unnecessary exploration actions and efficiently leading to sub-optimal states.
In sequence 2, ``LOD parameters'' and ``Sub-circuit (SC) clustering'' rewards are
applied simultaneously. As shown in Fig. 3(b), the LOD effect diminishes in proportion to the distance from the targeted channel
edge to the diffusion break. Therefore, when devices are merged continuously, this
effect is reduced. This is particularly important for device matching, where the merging
of adjacent device pairs in the same sub-circuit is crucial. If a device in the circuit
has a multi-finger gate feature, gate splitting is performed, and a hierarchical sub-circuit
structure is applied to the device. Subsequently, sequential clustering is performed
to gather the split single-finger devices in the manner shown in Fig. 7(b), resulting in the devices being merged with one another.
Fig. 7. (a) The method of splitting a multi-finger device, MN1 into two single-finger
devices, MN1 [1] and MN1 [2]. (b) The condition for merging devices, MN1 and MN2.
In sequence 3, ``LOD parameters'' and ``Type clustering'' rewards are applied to gather devices of the same type, i.e., the P- and N-types. This clustering guides the
agent's action selection to find states where further merging of devices of the same
type is possible, even across sub-circuits.
In sequence 4, three rewards, ``Wire length'', ``DTI parameters'', and ``Area'', are
applied. The purpose of this sequence is to optimize spacing and alignment between
rows, ensuring that wire lengths are minimized and uniform, particularly for wires
originating from devices within the same sub-circuit.
Sequence 5 of Q-learning incorporates cluster states (C_x,t), which are represented as additional dummy transistors in each row. Actions are constrained to the addition (action number: 1) or removal (action number: 0) of the outermost dummy transistors in each row, followed by the addition of a guard ring, with all other devices fixed.
In Fig. 8(b), the reward plots indicate that the difference-in-threshold-voltage (DVTH) reward and the reward for reaching the saturation range of Vth under the LLE (VTH LLE) can be enhanced while minimizing the loss in the Area reward. Consequently, the Q-learning agent in Sequence 5 is capable of identifying the optimal solution in a relatively short time. Thus, unlike Sequences 1 to 4, Fig. 8(b) depicts the incremental rewards accrued in the final episode of Sequence 5, rather than the final rewards of each episode. The final placements of the OTA and comparator obtained using the proposed algorithm are
illustrated in Fig. 9. A diverse array of colors is employed to differentiate the annotated sub-circuits
(current mirror, differential pair, bias, switch, latch) within initial input netlists.
Splitting multi-finger devices into single-finger devices is annotated as the first
level sub-circuit, and the second level sub-circuit includes the split devices, as
shown in Fig. 9(b). Gray indicates dummy transistors, while brown indicates guard rings.
Fig. 8. Rewards plot of sequential Q-learning. (a) Sequence 1 rewards per step in
one episode and Sequence 2 to Sequence 4, last rewards per episode. (b) Sequence 5
rewards per step in the last episode of OTA (top) and comparator (bottom).
Fig. 9. The final placement with abstracted devices using the proposed algorithm.
(a) OTA. (b) Comparator.
Fig. 10. LLE-compensated placement results showing the threshold voltage variation
of the core devices, indicated by red borders. The placements of the OTA and comparator
are shown in (a)-(b) and (c)-(d), respectively: (a) Vth, influenced only by LOD;
(b) Vth, influenced only by DTI; (c) Vth, influenced only by LOD; (d) Vth,
influenced only by DTI.
Fig. 10 demonstrates that the LLE-aware placement minimizes threshold voltage variation of
the core devices in the given netlist while maintaining area efficiency using only
the necessary dummies and compact row-space aligned guard rings. The threshold voltage
levels of the core devices that affect circuit performance are identical for each
device type.
Table 2. Comparison of placement results using three methods: non-compensated (non-LLE-aware),
manual with legacy design knowledge, and the proposed algorithm. All data are normalized.
(a) Folded cascode OTA results. (b) StrongARM comparator results.
In Table 2, a comparison of the three different placement methods is presented, including the
area, the standard deviation of the threshold voltage by type, and the area overhead.
To examine the threshold voltage variation of core devices across test blocks, we
calculate the standard deviation. The threshold voltage values, Vth,LOD and Vth,DTI
are estimated by the LOD and DTI surrogate models, respectively. All data presented
have been normalized to ensure confidentiality. ``Legacy manual'' refers to placing
devices manually with legacy design knowledge of a mature process node, as explained
in Section I. These layouts were provided by experienced analog layout experts, each
with over a decade of experience in the field. The proposed method, which employs
an algorithmic approach leveraging LLE surrogate models trained on process data, is
capable of automatically compensating for LLE, thereby reducing threshold voltage
variation to zero while simultaneously reducing the area overhead by a factor of three
compared to the ``Legacy manual'' method.
In summary, the automatic prediction and incorporation of various LLEs from the
initial device placement obviates the necessity for post-simulation. This approach
not only preserves the performance of analog designs but also enhances area competitiveness,
even in the initial stages of the process setup.
V. CONCLUSIONS
This paper presents an LLE-aware device placement optimization method based on RL.
In consideration of the influence of device characteristics on local layout circumstances
and process nodes, physical layout information is employed as the input for an ANN.
This information is used to analyze local layout effects and threshold voltage relationships,
as observed from post-layout simulations. Trained ANNs were implemented as surrogate
models for the LOD and DTI, which were integrated into the reward functions of the
learning agent. The Q-learning method is employed for RL. This approach is effective
for optimizing device placement by suppressing LLEs with regard to threshold voltage
variation among devices. Moreover, the proposed method can emulate the expertise of
a design expert by sequentially applying multiple Q-learning with selected reward
functions. This enables the automatic generation of an optimal device placement solution
in the early setup stage of the advanced process node, where legacy design knowledge
would not be effective. Finally, the method can make the threshold voltage variation
of devices equal to zero under local layout effects by applying dummy transistors
and guard rings while maintaining area efficiency.
References
[1] G. Scott, et al., ``NMOS drive current reduction caused by transistor layout and
trench isolation induced stress,'' Proc. of IEEE International Electron Devices
Meeting, Washington, DC, USA, pp. 827-830, Dec. 1999.

[2] Y. Luo and D. K. Nayak, ``Enhancement of CMOS Performance by Process-Induced Stress,''
IEEE Transactions on Semiconductor Manufacturing, vol. 18, no. 1, pp. 63-68, Feb.
2005.

[3] A. Veloso, et al., ``Innovations in transistor architecture and device connectivity
for advanced logic scaling,'' Proc. of 2022 International Conference on IC Design
and Technology (ICICDT), IEEE, 2022.

[4] C. Ndiaye, et al., ``Layout dependent effect: Impact on device performance and reliability
in recent CMOS nodes,'' Proc. of 2016 IEEE International Integrated Reliability Workshop
(IIRW), IEEE, 2016.

[5] A. Pal, et al., ``Self-aligned single diffusion break technology optimization through
material engineering for advanced CMOS nodes,'' Proc. of 2020 International Conference
on Simulation of Semiconductor Processes and Devices (SISPAD), IEEE, 2020.

[6] J. Xue, et al., ``A framework for layout-dependent STI stress analysis and stress-aware
circuit optimization,'' IEEE Transactions on Very Large Scale Integration (VLSI) Systems,
vol. 20, no. 3, pp. 498-511, 2011.

[7] M. Agam, et al., ``Physical and electrical characterization of deep trench isolation
in bulk silicon and SOI substrates,'' Proc. of 2021 32nd Annual SEMI Advanced Semiconductor
Manufacturing Conference (ASMC), IEEE, 2021.

[8] Y.-M. Sheu, et al., ``Modeling the well-edge proximity effect in highly scaled MOSFETs,''
IEEE Transactions on Electron Devices, vol. 53, no. 11, pp. 2792-2798, 2006.

[9] R. Butola, Y. Li, and S. R. Kola, ``A comprehensive technique based on machine learning
for device and circuit modeling of gate-all-around nanosheet transistors,'' IEEE Open
Journal of Nanotechnology, 2023.

[10] K. Ko, J. K. Lee, M. Kang, J. Jeon, and H. Shin, ``Prediction of process variation
effect for ultrascaled GAA vertical FET devices using a machine learning approach,''
IEEE Trans. Electron Devices, vol. 66, no. 10, pp. 4474-4477, Oct. 2019.

[11] J. Kim, et al., ``Local layout effect-aware static timing analysis by use of a new
sensitivity-based library,'' Proc. of 2023 IEEE/ACM International Conference on Computer
Aided Design (ICCAD), IEEE, 2023.

[12] C. J. C. H. Watkins and P. Dayan, ``Q-learning,'' Machine Learning, vol. 8, pp. 279-292,
1992.

[13] R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction, MIT Press,
2018.

[14] M. Ahmadi and L. Zhang, ``Analog layout placement for FinFET technology using reinforcement
learning,'' Proc. of 2021 IEEE International Symposium on Circuits and Systems (ISCAS),
IEEE, 2021.

KwonWoo Kang received his B.S. degree in electrical engineering from Hanyang University,
Seoul, Korea, in 2017. In 2017, he joined the Samsung Electronics Semiconductor S.LSI
Business, Hwaseong, Korea, where he was involved in the Design Implementation Group,
Design Platform Development Team. He is currently an M.S. student at Sungkyunkwan
University, Suwon, Korea. His interests include mixed signal (analog/digital) layout
optimization, automation, and EDA tool enhancement.
SoYoung Kim received her B.S. degree in electrical engineering from Seoul National
University, Seoul, Korea, in 1997 and her M.S. and Ph.D. degrees in electrical engineering
from Stanford University, Stanford, CA, in 1999 and 2004, respectively. From 2004
to 2008, she was with Intel Corporation, Santa Clara, CA, where she worked on parasitic
extraction and simulation of on-chip interconnects. From 2008 to 2009, she was with
Cadence Design Systems, San Jose, CA, where she worked on developing IC power analysis
tools. She is currently a Professor with the Department of Semiconductor Systems Engineering,
College of Information and Communication Engineering, Sungkyunkwan University, Suwon,
Korea. Her research interests include VLSI computer-aided design, signal integrity,
power integrity, and electromagnetic interference in electronic systems.