Design Strategies of 40 nm Split-Gate NOR Flash Memory Device for Low-Power Compute-in-Memory Applications

Micromachines (Basel). 2023 Sep 7;14(9):1753. doi: 10.3390/mi14091753.

Abstract

The existing von Neumann architecture for artificial intelligence (AI) computations suffers from excessive power consumption and memory bottlenecks. As an alternative, compute-in-memory (CIM) technology has been emerging. Among various CIM device candidates, split-gate NOR flash offers advantages such as a high density and low on-state current, enabling low-power operation, and benefiting from a high level of technological maturity. To achieve high energy efficiency and high accuracy in CIM inference chips, it is necessary to optimize device design by targeting low power consumption at the device level and surpassing baseline accuracy at the system level. In split-gate NOR flash, significant factors that can cause CIM inference accuracy drop are the device conductance variation, caused by floating gate charge variation, and a low on-off current ratio. Conductance variation generally has a trade-off relationship with the on-current, which greatly affects CIM dynamic power consumption. In this paper, we propose strategies for designing optimal devices by adjusting oxide thickness and other structural parameters. As a result of setting Tox,FG to 13.4 nm, TIPO to 4.6 nm and setting other parameters to optimal points, the design achieves erase on-current below 2 μA, program on-current below 10 pA, and off-current below 1 pA, while maintaining an inference accuracy of over 92%.

Keywords: NOR flash; TCAD simulation; artificial intelligence; compute-in-memory (CIM); convolutional neural network; device optimization; split-gate NOR flash.