May 14

Embedded NVM for IoT will lead the way with standalone memory to follow

from Lucian Shifren, Principal Engineer, ARM Inc.

While many options exist to extend the roadmap for standalone FLASH (such as vertical FLASH), embedded FLASH has a much shorter runway. Standalone FLASH can be optimized for its memory function, whereas embedded FLASH must work within the process window set by the corresponding logic process. The same argument can be made for DRAM, where 3D MemCube technology will allow continued scaling of standalone memory while cost, density and logic process compatibility will limit embedded DRAM. As logic processes become significantly more expensive and difficult, compromising on the logic process to allow for scaled embedded memories is becoming increasingly prohibitive.

Additionally, FLASH and DRAM are both high-energy memory devices. FLASH energy consumption is dominated by the write/erase function, and DRAM energy consumption is largely due to constant refreshing of the state. For FLASH, operating voltage and write/erase times will not scale dramatically going forward, so they will not allow for low-energy operation. Similarly, process scaling decreases the capacitance of the DRAM cell, which requires a higher, energy-sapping refresh rate. This will quickly limit the ability to embed DRAM in ultra-low-power products.

Many products will effectively circumvent these issues by embedding less-than-optimal memory or by using external standalone memory. The new Internet of Things (IoT) market, however, is going to change what is required from embedded memory. IoT will be defined by highly embedded, specialized systems with a low cost of ownership. These systems will require more functions embedded in the SoC (including all system and non-volatile memory), but they will also need batteries that last for years (no battery replacement over the product lifetime), which establishes an energy window defined by the installed battery.

Consider a common, larger battery that would be used in IoT systems: the CR2032 button cell. This cell holds ~2.4kJ of energy. It costs a few nJ to write/erase each bit of FLASH. While 2.4kJ would seem like ample energy, 10K logic gates can be cycled using the same amount of energy as 1 bit of FLASH (at the 90nm node). For a simple system with a small ARM M0 core, the FLASH memory could conceivably consume 70% of the power of the chip. Assuming a 10-year lifetime for a sensing IoT product, the ability to accurately log data (both locally and via radio) and to update the firmware of the remote IoT device becomes significantly limited by this small energy window. For all these reasons, I believe IoT embedded memory, and not standalone memory, will be the driving force behind the adoption of one or a few of the viable NVM replacement technologies.
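To make the size of this energy window concrete, here is a minimal back-of-the-envelope sketch. It uses the ~2.4kJ battery energy and the 10-year lifetime quoted above; the per-bit FLASH cost of 1 nJ is an assumption (the low end of "a few nJ"), and the function names are illustrative only. Battery self-discharge, the radio, and the logic itself are all ignored, so the results are upper bounds.

```python
# Back-of-the-envelope energy budget for a CR2032-powered IoT node.
SECONDS_PER_YEAR = 365 * 24 * 3600  # ignores leap days

BATTERY_ENERGY_J = 2.4e3   # ~2.4 kJ, from the text
FLASH_J_PER_BIT = 1e-9     # ASSUMPTION: low end of "a few nJ" per bit
LIFETIME_YEARS = 10        # product lifetime, from the text


def average_power_w(energy_j, years):
    """Average power the battery can sustain over the whole lifetime."""
    return energy_j / (years * SECONDS_PER_YEAR)


def max_bit_writes(energy_j, joules_per_bit, share=1.0):
    """Upper bound on bit writes if `share` of the energy goes to writes."""
    return energy_j * share / joules_per_bit


if __name__ == "__main__":
    power = average_power_w(BATTERY_ENERGY_J, LIFETIME_YEARS)
    writes = max_bit_writes(BATTERY_ENERGY_J, FLASH_J_PER_BIT)
    print(f"average power budget: {power * 1e6:.1f} uW")
    print(f"FLASH bit writes if nothing else used energy: {writes:.1e}")
```

Under these assumptions the whole system has to run on an average of roughly 7.6 µW, which is why every nJ spent on a FLASH write/erase matters.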

Let’s now consider which is the best NVM system for IoT. The vast majority of embedded memory is currently charge based (SRAM, DRAM, FLASH). We can group the new options into two categories: ReRAM (resistive) and MRAM (magnetic). Within MRAM, STT-RAM (Spin-Transfer Torque RAM) has become the leading candidate. Its storage element is a magnetic tunnel junction (MTJ). Unlike FLASH, no erase is needed, and writing a bit uses around 1pJ, about 1000x less than FLASH. As the MTJ is stable, the memory is non-volatile and requires no refreshing.

Many different flavors of ReRAM exist. The basic principle is that a non-conducting material can be made to conduct by applying an electric field across it, demonstrating so-called memristor behavior. The behavior is stable, so the device is non-volatile. The ReRAM element is accessed by a standard logic transistor in a 1T1R bit cell structure similar to that of STT-RAM, and it also requires approximately 1pJ to write a bit, with no erase required. Both MRAM and ReRAM allow for greatly reduced energy usage that neither FLASH nor DRAM can achieve, which will become essential in future IoT products.
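The per-bit figures above can be turned into a rough cost for a whole write operation, such as a firmware update. The sketch below is illustrative only: the ~1pJ figures for STT-RAM and ReRAM come from the text, the 1 nJ FLASH figure is an assumption (the low end of "a few nJ"), and the 64 KiB image size is hypothetical. Overheads such as charge pumps, sensing and peripheral circuits are not modeled.

```python
# Energy to rewrite a firmware image with different embedded NVM options.
WRITE_J_PER_BIT = {
    "FLASH": 1e-9,     # ASSUMPTION: low end of "a few nJ" (write/erase)
    "STT-RAM": 1e-12,  # ~1 pJ, from the text
    "ReRAM": 1e-12,    # ~1 pJ, from the text
}


def write_energy_j(num_bytes, joules_per_bit):
    """Energy in joules to write `num_bytes` once, bit cells only."""
    return num_bytes * 8 * joules_per_bit


def compare(num_bytes):
    """Write energy per technology for an image of `num_bytes`."""
    return {name: write_energy_j(num_bytes, e)
            for name, e in WRITE_J_PER_BIT.items()}


if __name__ == "__main__":
    image_bytes = 64 * 1024  # HYPOTHETICAL 64 KiB firmware image
    for name, energy in compare(image_bytes).items():
        print(f"{name:8s} {energy * 1e6:10.3f} uJ")
```

With these inputs the ratio between FLASH and either new memory is the same 1000x quoted above, whatever image size is chosen.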

Looking at cost, STT-RAM is the more expensive option due to the multiple layers of exotic materials needed to form the MTJ. An additional concern with STT-RAM is that the multiple-material MTJ must be etched (as opposed to ReRAM, where a via is filled), which should make it more difficult to scale. Because On/Off ratios are much lower for STT-RAM, it could also require more expensive sensing circuits than ReRAM. This gives the cost and scalability advantage to ReRAM. However, ReRAMs have poor endurance compared to STT-RAM, which limits their use.

For DRAM/L3 replacement, endurance of 10^16 or greater is required. For NAND/NOR, more than 10^6 but less than 10^10 will be required (assuming a 10 year product lifecycle and an ARM M0 core based IoT). Therefore endurances between 10^10 and 10^16 will be of limited additional value as the memories fall into a “too much or not enough” gap. For this reason, ReRAM is most likely a FLASH only replacement while STT-RAM could possibly replace DRAM/L3 and FLASH.
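The endurance bands above can be written down as a small classifier. This is a sketch under stated assumptions, not a sizing tool: the thresholds are the ones quoted in the text, the function names are illustrative, and `required_endurance` assumes every write lands on the same cell (no wear leveling), which is a worst case.

```python
# Endurance bands from the text, in write cycles per cell.
SECONDS_PER_YEAR = 365 * 24 * 3600  # ignores leap days

FLASH_LOW = 10**6    # lower edge of the NAND/NOR range
FLASH_HIGH = 10**10  # upper edge of the NAND/NOR range
DRAM_L3 = 10**16     # minimum for DRAM/L3 replacement


def required_endurance(writes_per_second, years=10):
    """Cycles one cell sees over the lifetime, assuming no wear leveling."""
    return writes_per_second * years * SECONDS_PER_YEAR


def classify(endurance_cycles):
    """Map a device's endurance onto the bands described in the text."""
    if endurance_cycles >= DRAM_L3:
        return "DRAM/L3 and FLASH replacement"
    if endurance_cycles > FLASH_HIGH:
        return "gap: more than FLASH needs, not enough for DRAM/L3"
    if endurance_cycles > FLASH_LOW:
        return "FLASH replacement"
    return "below the FLASH range"


if __name__ == "__main__":
    for cycles in (10**5, 10**8, 10**12, 10**16):
        print(f"{cycles:.0e}: {classify(cycles)}")
```

As a sanity check on the bands, a cell rewritten once per second for 10 years sees about 3.2 x 10^8 cycles, inside the FLASH range, while reaching 10^16 cycles in the same period takes about 3.2 x 10^7 writes per second to one cell.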

While it would be cost prohibitive to have both types of NVM on a single die (especially in a low-cost IoT product), STT-RAM has the possible advantage of offering more universal-memory-like functionality in an IoT SoC. STT-RAM is more expensive than ReRAM, but if it is already used as a DRAM and/or L3 cache replacement, using it as FLASH as well adds no process cost to the die: the MTJ will already exist for the DRAM/L3, and the transistors used in the 1T1R structure would be the same for both memory types. Basically, you get the FLASH use for free.

This could help improve the cost/value proposition of STT-RAM if dual use of the memory is achieved. However, due to its cost and the lack of any use for excess endurance, it is unlikely that STT-RAM would be chosen over ReRAM for a FLASH-only application if both options were available for the product. So there is probably a need for both types of memory, and both might therefore exist as foundry options in the future.

Looking at where there is the greatest need for a new NVM, I believe not only is it in embedded systems but more precisely in the new IoT market.

Lucian Shifren is a Principal Engineer at ARM Inc working in the Silicon R&D group. His background includes process and device physics and process development. After getting his Ph.D. from Arizona State University, he worked for almost a decade at Intel in Hillsboro, Oregon helping develop Intel’s leading edge technologies. He was part of the teams that introduced stress, HKMG and FinFET (and actually published the first paper on how PMOS stress works). His professional interests include new process technologies and trying to drive product development based on new and novel process technologies.


  1. admin

    In paragraph 8, endurances are quoted in powers of 10, i.e. 10^6 is a million. The HTML superscript tag is not displaying correctly.

  2. Fred Chen

    The “infinite” endurance of STT-MRAM is legendary, i.e., literally the stuff of legend. Papers such as https://www.comp.nus.edu.sg/~wongwf/papers/ISLPED11.pdf and the TSMC paper at ISSCC 2013 have indicated the endurance of STT-MRAM is quite finite, and not close at all to the 10^16 requirement.
