Resistive Back-End Memories
Emerging Non-Volatile Memories (NVMs), such as Resistive Random-Access Memories (ReRAMs) and Phase-Change Memories (PCMs), are promising candidates to replace Flash and Static Random-Access Memories (SRAMs) in many applications. In particular, emerging NVMs are expected to provide better footprint scalability (down to a few nanometers), faster programming time (of the order of a few nanoseconds), and enhanced endurance (up to 10^9 programming cycles).
Resistive Back-End Memories for Reconfigurable Architectures
The focus of this research effort is the use of NVMs in reconfigurable logic circuits, such as Field-Programmable Gate Arrays (FPGAs). It is widely recognized that in traditional FPGAs, the memory and the routing circuitry (each accounting for more than 40% of the area) are the principal bottlenecks to scaling and to performance improvement. FPGAs represent a significant segment of digital design, and their market share is growing steadily.
In my group, we use the intrinsic physical properties of ReRAMs (e.g., low Ron, low programming voltages, …) to radically improve the different functional blocks of FPGAs. Instead of using ReRAMs only as memories, we extend their use to non-volatile switches and design innovative circuits for routing and logic. Thanks to their compatibility with the Back-End-Of-Line (BEOL) manufacturing process, ReRAMs are fabricated on top of pre-processed CMOS chips with good scalability and performance, which eases eventual commercialization of this technology. Fig. 1 shows a ReRAM stack recently investigated by the group that demonstrates very promising characteristics for logic-in-memory applications, such as forming-free operation and enhanced endurance.
Fig. 1: (a) Lateral cross-section view of a Pt/TaOx/CrOy/Cr/Cu cross-point ReRAM device. (b) Reconstructed 3D AFM image of the pristine cross-point device.
In addition to device fabrication and the exploration of novel memory stacks, the outcome of this approach is a completely new architecture and design of the FPGA internal structure, in which the memory and the data path logic are merged so that the memories become an integral part of the data path. This is expected to lead to a breakthrough in the field of high-performance reconfigurable platforms, with higher density, higher performance and higher energy efficiency. Fig. 2 illustrates this with the design of a ReRAM-based multiplexer that can be used as a very-low-power, high-performance routing node.
Fig. 2: (a) 4T1R ReRAM-based 4:1 routing multiplexer architecture. The ReRAMs replace the conventional CMOS pass-gates; (b) Energy comparison between CMOS multiplexers and 4T1R-based multiplexers.
Resistive Back-End Memories for Binary Neural Networks
The aim of this research effort is to use RRAMs in Binary Neural Networks. It is well known that convolution is a costly operator, since it involves a significant amount of data movement between the computation unit and the memory. Using NVMs can greatly reduce the number of data transfers by storing the filter values as RRAM resistances and performing the computation in memory at the same time, which considerably decreases the overall power consumption of the system. In addition, recent works have shown that when binary values (-1 and +1) are used instead of floating-point precision for both the input and kernel matrices, the multiply-accumulate units can be replaced by simple binary XNOR and bitcount blocks, considerably reducing the necessary computational resources and memory transfers while keeping a high accuracy.
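The XNOR and bitcount replacement for multiply-accumulate can be sketched in a few lines of software. The snippet below is only an illustrative model, not the group's circuit: the function names (encode, binary_dot, reference_dot) and the bit encoding (+1 stored as 1, -1 stored as 0) are assumptions made for this example. It relies on the identity dot(x, w) = 2 * matches - n, where matches is the number of positions in which the two n-element vectors agree.

```python
def encode(values):
    """Pack a sequence of +1/-1 values into an integer (+1 -> bit 1, -1 -> bit 0)."""
    word = 0
    for i, v in enumerate(values):
        if v == 1:
            word |= 1 << i
        elif v != -1:
            raise ValueError("values must be +1 or -1")
    return word


def binary_dot(x_bits, w_bits, n):
    """Dot product of two n-element +1/-1 vectors using XNOR and bitcount."""
    mask = (1 << n) - 1
    # XNOR is 1 wherever the two bits agree, i.e. wherever the product is +1.
    xnor = ~(x_bits ^ w_bits) & mask
    matches = bin(xnor).count("1")  # bitcount (popcount)
    # matches products equal +1 and (n - matches) equal -1.
    return 2 * matches - n


def reference_dot(x, w):
    """Ordinary multiply-accumulate, used only to check the result."""
    return sum(a * b for a, b in zip(x, w))


x = [1, -1, -1, 1, 1, -1, 1, 1]   # hypothetical binary input vector
w = [1, 1, -1, -1, 1, -1, -1, 1]  # hypothetical binary kernel weights

print(binary_dot(encode(x), encode(w), len(x)))  # 2
print(reference_dot(x, w))                       # 2
```

Here the two vectors agree in 5 of 8 positions, so the result is 2 * 5 - 8 = 2, matching the multiply-accumulate reference without performing any multiplication.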
In this work, we combine both approaches by proposing a convolutional engine that performs digital dot products between binary input vectors and binary local weights without using single-ended XNOR sensing. Because it is fully digital, our binary dot product can tolerate a high variability in the RRAM devices. Fig. 3 shows that our binary convolutional engine can tolerate up to 25% variability in the RRAM resistance values, which is above the variability of state-of-the-art RRAM devices. Our circuit can therefore fully tolerate the variability of current RRAM devices.
Fig. 3: (a) General structure of the proposed RRAM-based XNOR convolution block engine; (b) Robustness analysis of the proposed convolutional engine.
This research effort is funded by the grant 2016016 from the United States-Israel Binational Science Foundation, the University of Utah SEED grant 10044706 and ReRouting, LLC.