NAND flash density has already reached 16 Gbits per die and beyond with 2-bit/cell multilevel-cell (MLC) technology using 50- and 40-nanometer process nodes. Despite the impressive growth in bit density, however, program performance of NAND flash has remained in the 10-Mbyte/second range. As the need for digital content grows, companies are paying more attention to improving the program and read performance of NAND flash devices to meet consumers' appetite for more bits and faster performance. Coupled with a steep decline in price, the race for more bits with higher performance has become a major focus for companies wishing to remain competitive.
Papers presented at the 2008 International Solid-State Circuits Conference and an analysis of a 16-Gbit MLC NAND flash by Semiconductor Insights in 2007 indicate some new trends in NAND flash development, in the areas of architecture, performance, design (challenges for 3-bit/cell NAND development) and process technology requirements, among others.
New architecture
The changes in architecture between devices and designs announced in 2007 and those announced in 2008 are obvious. All three designs announced in 2008 use the so-called all-bit-line (ABL) architecture. ABL enhances NAND flash array performance by sensing all bit lines connected to page buffers simultaneously. Throughput is 3.4 times that of conventional architectures. This is an impressive improvement, given that the new device architecture uses the same 56-nm process technology as the conventional devices. In SLC mode, the ABL architecture can further increase programming throughput, up to 60 Mbytes/s.
The 3-bit/cell design announced this year reveals some of the challenges that must be overcome to achieve increases in the number of bits per cell. Rotated array architecture (RAA) is used to suppress array noise and improve power distribution to the memory array. This is crucial for 3-bit/cell design, since accommodating eight different states (as opposed to four in 2-bit/cell design) in one flash cell would require very tight cell threshold voltage distribution and precision sensing of cell data.
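The jump from four to eight states can be made concrete with a short calculation. The sketch below is illustrative only: the total threshold-voltage window (`ASSUMED_VT_WINDOW`) is a hypothetical value chosen for the example, not a figure from the papers, and the function names are made up here.

```python
def states_per_cell(bits_per_cell: int) -> int:
    """Number of distinct threshold-voltage states a cell must hold."""
    return 2 ** bits_per_cell


def window_per_state(vt_window_volts: float, bits_per_cell: int) -> float:
    """Average slice of the threshold-voltage window available to each state."""
    return vt_window_volts / states_per_cell(bits_per_cell)


# Hypothetical total window in volts; an assumption for illustration only.
ASSUMED_VT_WINDOW = 6.0

for bits in (1, 2, 3):
    states = states_per_cell(bits)
    slice_v = window_per_state(ASSUMED_VT_WINDOW, bits)
    print(f"{bits}-bit/cell: {states} states, {slice_v:.2f} V per state")
```

Whatever the real window is, each added bit halves the average slice per state, which is why distribution width and sensing precision dominate 3-bit/cell design.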
MLC, 56-nm, 16-Gbit NAND flash with conventional architecture.
Source: Semiconductor Insights, 2007
The 3-bit design places word-line and bit-line control signals closer to the array and senses flash cell data with the local word-line voltage referenced to local ground. This bit-line and word-line voltage bias tracking enhances sensing accuracy and reduces sensing time, improving performance by as much as 20 percent. The programming throughput of 8 Mbytes/s, which is 80 percent that of an MLC device, is quite an achievement, given the design challenges. The innovations applied to this design have resulted in the smallest chip size to date for a 56-nm 16-Gbit NAND device (142 mm²).
NAND flash devices based on 3-bit/cell design are expected to represent about half of SanDisk/Toshiba's production in 2009. Note, however, that 43-nm MLC (2-bit/cell) technology still costs less than 56-nm 3-bit/cell technology. In terms of megabits per square millimeter, 43-nm MLC technology is 18 percent more efficient.
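A rough sense of the numbers behind that comparison can be worked out from the figures quoted above. This is a sketch under two assumptions: that 16 Gbits means 16 x 1,024 Mbits, and that the 18 percent advantage is measured against the 142-mm² 3-bit/cell die. The 43-nm result is derived from the stated ratio, not from a measured die size.

```python
def density_mbit_per_mm2(capacity_gbit: float, die_area_mm2: float) -> float:
    """Bit density in Mbits per square millimeter (1 Gbit taken as 1,024 Mbits)."""
    return capacity_gbit * 1024 / die_area_mm2


# 56-nm, 3-bit/cell, 16-Gbit device at 142 mm^2 (figures quoted in the article).
density_3bit_56nm = density_mbit_per_mm2(16, 142)

# 43-nm MLC is described as 18 percent more efficient; derived, not measured.
density_mlc_43nm = density_3bit_56nm * 1.18

print(f"56-nm 3-bit/cell: {density_3bit_56nm:.1f} Mbit/mm^2")
print(f"43-nm MLC (derived): {density_mlc_43nm:.1f} Mbit/mm^2")
```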
40-nm process challenges
Gate-induced drain leakage (GIDL) becomes an important issue at the 40-nm process node, and the program disturbance it causes must be minimized. This can be achieved by adding two dummy word lines to a NAND string, one at each end of the string. To compensate for the chip area consumed by the additional word lines, a longer NAND string (64 cells vs. 32) is used to improve area efficiency. Longer NAND strings also increase string resistance, which requires word-line modulation during read and programming. Modulation ensures that the proper word-line voltage is applied for the location of a word line in the NAND string: a higher word-line voltage level is used for accessing a cell near the top of the string (the bit-line end) to compensate for the string resistance.
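The area trade-off between dummy word lines and string length can be sketched with simple counting. The model below is a simplification and rests on assumptions not stated in the papers: each string is taken to have two select transistors, and every gate in the string is assumed to occupy the same pitch (in practice select gates are larger).

```python
SELECT_GATES_PER_STRING = 2  # assumption: one select transistor at each end


def string_overhead(data_cells: int, dummy_word_lines: int) -> float:
    """Fraction of gates in a NAND string that store no data."""
    non_data = SELECT_GATES_PER_STRING + dummy_word_lines
    return non_data / (data_cells + non_data)


configs = {
    "32 cells, no dummy word lines": string_overhead(32, 0),
    "32 cells, 2 dummy word lines": string_overhead(32, 2),
    "64 cells, 2 dummy word lines": string_overhead(64, 2),
}

for name, overhead in configs.items():
    print(f"{name}: {overhead:.1%} non-data gates")
```

Under these assumptions, doubling the string length brings the non-data fraction of a string with dummy word lines back to that of a 32-cell string without them.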
Voltage scaling
Lower Vcc is desirable for small geometry and interoperability with other devices in a system. However, a lower operating voltage makes it difficult to design charge pumps, which are critical circuit blocks for NAND flash devices. The 43-nm design uses two separate supply voltages: 3.3 V (Vcc) for internal operation and 1.8 V (Vccq) for I/O operation.
Synchronous DDR interface
The NAND flash interface has traditionally been asynchronous, and it has been identified as one of the critical bottlenecks for high-performance applications. Intel and Micron have announced a NAND flash interface design that has a DDR I/O interface capable of 200 Mbytes/s. This is based on the Open NAND Flash Interface specification (ONFi).
Using quad plane (or bank) architecture and a 4n prefetch data path, which are two basic techniques of DDR2 SDRAM, the device can support both an asynchronous interface and a DDR2 synchronous interface. For higher programming and read performance, SLC technology is used. This is confirmed by the word-line level cited in a paper on the device design. For our purposes here, the Toshiba 56-nm 16-Gbit MLC device (which is equivalent to an 8-Gbit SLC device) is referenced to provide common ground for comparison.
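The effect of a 4n prefetch on internal timing can be illustrated with the 200-Mbytes/s interface figure. The sketch assumes an 8-bit (1-byte) I/O bus, which is an assumption made here for the example and is not stated in the text; the function name is also hypothetical.

```python
def ddr_prefetch_rates(io_mbytes_per_s: float, io_width_bytes: int, prefetch_n: int):
    """Return (clock MHz, core access rate MHz, internal path width in bytes).

    DDR moves two transfers per clock; an n-bit prefetch fetches n transfers'
    worth of data from the core in a single internal access.
    """
    transfers_per_us = io_mbytes_per_s / io_width_bytes  # mega-transfers/s
    clock_mhz = transfers_per_us / 2
    core_access_mhz = transfers_per_us / prefetch_n
    internal_width_bytes = io_width_bytes * prefetch_n
    return clock_mhz, core_access_mhz, internal_width_bytes


# 200 Mbytes/s from the article; 1-byte bus width is assumed.
clock, core, width = ddr_prefetch_rates(200, 1, 4)
print(f"I/O clock: {clock:.0f} MHz, core access rate: {core:.0f} MHz, "
      f"internal path: {width} bytes wide")
```

The point of the prefetch is that the internal data path can run at a quarter of the external transfer rate by being four times as wide.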
Toshiba's 16-Gbit MLC device is 7 percent larger than Intel/Micron's 8-Gbit SLC device. Given the minimum feature size difference between the processes (56 vs. 50 nm), the area overhead of the quad planes and DDR2 prefetch data path appears to be negligible. Use of a 64-cell NAND string helps reduce the overall die area and DDR2 interface overhead as well. As in the Toshiba 43-nm design, two supply voltages are used: one for internal operation and the other for I/O operation (Vcc = 3.3 V, Vccq = 1.8 V or 3.3 V).
While 100 Mbytes/s of programming throughput is an impressive level of performance, a conventional asynchronous interface can also provide up to 60 Mbytes/s using ABL architecture (SLC mode only).
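To put the quoted throughput figures side by side, the sketch below computes how long each would take to program a fixed payload. The 1-Gbyte payload is a hypothetical choice, the 10-Mbytes/s figure for conventional MLC is approximate, and sustained throughput is assumed throughout.

```python
# Programming throughput figures quoted in the article, in Mbytes/s.
THROUGHPUT_MBYTES_PER_S = {
    "3-bit/cell, 56 nm": 8,
    "conventional MLC (approx.)": 10,
    "ABL, SLC mode, async interface": 60,
    "DDR interface, SLC": 100,
}

PAYLOAD_MBYTES = 1024  # hypothetical 1-Gbyte payload


def seconds_to_program(payload_mbytes: float, throughput_mbytes_per_s: float) -> float:
    """Time to program a payload at a sustained throughput."""
    return payload_mbytes / throughput_mbytes_per_s


program_times = {
    name: seconds_to_program(PAYLOAD_MBYTES, rate)
    for name, rate in THROUGHPUT_MBYTES_PER_S.items()
}

for name, secs in program_times.items():
    print(f"{name:32s} {secs:7.1f} s")
```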
The SLC nature of the Intel/Micron device makes it a costly choice, especially in consumer applications. As of May 2008, 8-Gbit SLC NAND devices are almost 50 percent more expensive than 16-Gbit MLC NAND devices. High-end applications such as solid-state drives, game consoles and servers will likely be the initial targets of the new device.
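The per-bit implication of that price gap is worth spelling out. The calculation below treats "almost 50 percent more expensive" as exactly 1.5 times the device price, which is a rounding assumption; only the ratio matters, so the MLC device price is normalized to 1.

```python
def cost_per_gbit(device_price: float, capacity_gbit: float) -> float:
    """Device price divided by capacity."""
    return device_price / capacity_gbit


MLC_PRICE = 1.0              # normalized price of a 16-Gbit MLC device
SLC_PRICE = 1.5 * MLC_PRICE  # assumption: "almost 50 percent" rounded to 50 percent

mlc_per_gbit = cost_per_gbit(MLC_PRICE, 16)
slc_per_gbit = cost_per_gbit(SLC_PRICE, 8)
slc_premium = slc_per_gbit / mlc_per_gbit

print(f"SLC costs about {slc_premium:.1f}x as much per bit as MLC")
```

Because the SLC part holds half the bits at one and a half times the price, the cost per bit comes out at roughly three times that of MLC.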
Perhaps, for MLC support, a more segmented bank architecture with a high degree of prefetch will be required to achieve the optimum price and performance point for NAND flash devices with DDR interface.
ABL architecture, quad-bank structures with DDR interface, 64-cell NAND strings, adoption of dummy word lines in NAND strings, innovative circuit design, careful placement of critical circuit blocks and good power distribution are some of the innovations of the NAND flash designs announced in early 2008.
NAND flash devices with DDR2 interfaces (ONFi) appear to deliver higher performance but will require more architectural and circuit improvements, including further segmented architecture (more banks or planes), to support more cost-effective MLC NAND technology.
Young Choi (youngc@semiconductor.com) is senior manager at Semiconductor Insights, a TechInsights division specializing in in-depth technical investigation of ICs and electronic systems.
See related chart