Title Area Recovery in ABC Standard Cell Mapping
Authors 김교선(Kyosun Kim)
DOI https://doi.org/10.5573/ieie.2019.56.1.21
Page pp.21-28
ISSN 2287-5026
Keywords Logic synthesis ; And-inverter graph ; ASIC ; Standard cell mapping ; Multi-output
Abstract To narrow the technology gap between academic and commercial synthesis tools, a 32-bit RISC processor, OpenRISC, is mapped to the standard cells of a library by both ABC from UC Berkeley and Design Compiler from Synopsys. The area of the circuit mapped by ABC is 22% larger than that of the circuit mapped by Design Compiler. The two mapped circuits are analyzed and compared to identify the techniques that Design Compiler exploits but ABC misses. First, three techniques together achieve a 4% area reduction: (i) postponing the dropping of one of the polarities until the last mode of Boolean matching in the dual rail mapping, (ii) removing the gate that is duplicated for each primary output with an equivalent function and sharing the output of a single gate instead, and (iii) exploiting the inverted outputs of flip-flops and latches. In addition, mapping multi-output gates such as the full adder, which gains area from the logic shared between the 3-input majority gate and the 3-input XOR gate, yields a further 8% area reduction. As a result, the average mapped circuit area of the academic tool correlates with that of the commercial tool to within a 10% error. In the process, the AIG-based standard cell mapping, which is built on lossless synthesis, supergates, and the dual rail mapping enabled by truth table hashing and N-equivalence, is combined with multi-output cell mapping without loss of its inherent efficiency. The proposed techniques include (i) collection of gate pairs, (ii) carry chain identification by a simple depth first search, and (iii) carry chain injection at an early mode of the Boolean matching step. The remaining area difference results mainly from (i) the use of a tree structure, rather than a structure with a decoder and sums of products, to implement parallel multi-bit multiplexers during hardware inference, and (ii) the missing high-level optimization for sharing arithmetic units. An error of only a few percent can be expected if those techniques are also implemented in the future.
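The gate-pair collection and the depth first search for carry chains named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation and does not use ABC's data structures. The netlist format (a dict from gate name to kind and inputs), the kind labels `XOR3` and `MAJ3`, and the function names `collect_pairs` and `find_carry_chains` are all assumptions made for illustration. The sketch assumes an acyclic netlist and omits the injection of chains into Boolean matching.

```python
from collections import defaultdict

def collect_pairs(gates):
    """Pair each XOR3 (sum) with a MAJ3 (carry) that has the same three inputs.

    gates: hypothetical netlist, {name: (kind, (in1, in2, in3))}.
    Returns a list of (xor_name, maj_name) full-adder candidates.
    """
    by_inputs = defaultdict(dict)
    for name, (kind, inputs) in gates.items():
        if kind in ("XOR3", "MAJ3") and len(set(inputs)) == 3:
            # Keep the first gate of each kind seen for this input set.
            by_inputs[frozenset(inputs)].setdefault(kind, name)
    return [(g["XOR3"], g["MAJ3"]) for g in by_inputs.values() if len(g) == 2]

def find_carry_chains(gates, pairs):
    """Link pairs whose carry (MAJ3 output) feeds another pair, via DFS."""
    maj_to_pair = {maj: (xor, maj) for xor, maj in pairs}
    succ = defaultdict(list)   # carry gate -> carry gates of the pairs it feeds
    has_pred = set()           # carry gates driven by another pair's carry
    for _, maj in pairs:
        for inp in gates[maj][1]:
            if inp in maj_to_pair:
                succ[inp].append(maj)
                has_pred.add(maj)

    chains = []

    def dfs(maj, path):
        path.append(maj_to_pair[maj])
        following = succ.get(maj, [])
        if not following:          # end of a chain: record a copy of the path
            chains.append(list(path))
        for nxt in following:
            dfs(nxt, path)
        path.pop()

    for _, maj in pairs:
        if maj not in has_pred:    # start only from chain heads
            dfs(maj, [])
    return chains

# Hypothetical 2-bit ripple-carry adder plus one unrelated gate.
gates = {
    "s0": ("XOR3", ("a0", "b0", "cin")),
    "c0": ("MAJ3", ("a0", "b0", "cin")),
    "s1": ("XOR3", ("a1", "b1", "c0")),
    "c1": ("MAJ3", ("a1", "b1", "c0")),
    "n1": ("AND2", ("a0", "b1")),
}
pairs = collect_pairs(gates)
chains = find_carry_chains(gates, pairs)
```

In this example `pairs` holds the two sum/carry pairs and `chains` holds one chain that runs from bit 0 to bit 1. A real mapper would operate on AIG nodes and match functions by truth table rather than by a kind label.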