The Liquid State Machine (LSM) is a promising model of recurrent spiking neural networks that provides an appealing brain-inspired computing paradigm for machine learning applications such as pattern recognition. Moreover, processing information directly on spiking events makes the LSM well suited for cost- and energy-efficient hardware implementation. In this paper, we systematically present three techniques for optimizing the energy efficiency of the proposed LSM neural processors while maintaining good performance, from both an algorithmic and a hardware-implementation point of view. First, to realize adaptive LSM neural processors and thus boost learning performance, we propose a hardware-friendly Spike-Timing-Dependent Plasticity (STDP) mechanism for on-chip tuning. Second, the LSM processor incorporates a novel runtime correlation-based neuron gating scheme to minimize the power dissipated by reservoir neurons. Furthermore, a fine-grained activity-dependent clock gating approach is presented to address the energy inefficiency caused by the memory-intensive nature of the proposed neural processors. Using two real-world benchmark tasks, speech and image recognition, we demonstrate that the proposed architecture boosts average learning performance by up to 2.0% while reducing energy dissipation by up to 29% compared to a baseline LSM design, with little extra hardware overhead on a Xilinx Virtex-6 FPGA.
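The abstract does not specify the correlation metric behind the neuron gating scheme, so the following is only a plausible software sketch of the general idea: gate a reservoir neuron whose recent activity is almost perfectly correlated with an already-kept neuron, on the assumption that near-duplicate neurons dissipate power while adding little information. All names and thresholds here are hypothetical.

```python
# Hypothetical sketch of correlation-based reservoir-neuron gating.
# The threshold (0.95) and the per-window spike-count representation
# are illustrative assumptions, not values from the paper.

def pearson(x, y):
    """Pearson correlation of two equal-length activity traces."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5

def gating_mask(activity, threshold=0.95):
    """activity: per-neuron lists of spike counts over recent time windows.
    Returns True for neurons kept powered, False for gated-off neurons."""
    keep = []
    for i, act in enumerate(activity):
        redundant = any(keep[j] and pearson(activity[j], act) > threshold
                        for j in range(i))
        keep.append(not redundant)
    return keep
```

In this sketch, a neuron whose spike-count trace merely scales another neuron's trace (correlation 1.0) is gated, while uncorrelated neurons stay on; a hardware version would presumably compute a much cheaper running estimate of correlation.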
Energy Efficient Neural Computing with Approximate Multipliers
Artificial neural network computation relies on intensive vector-matrix multiplications. Recently, emerging nonvolatile memory (NVM) crossbar arrays have shown the feasibility of implementing such operations with high energy efficiency, and many works have therefore explored using NVM crossbar arrays as analog vector-matrix multipliers. However, their nonlinear I-V characteristics constrain critical design parameters, such as the read voltage and weight range, resulting in substantial accuracy loss. In this paper, instead of optimizing hardware parameters for a given neural network, we propose a methodology for reconstructing the neural network itself so that it is optimized for resistive memory crossbar arrays. To validate the proposed method, we simulated various neural networks on the MNIST and CIFAR-10 datasets using two different Resistive Random Access Memory (RRAM) device models. Simulation results show that our proposed neural network achieves significantly higher inference accuracy than a conventional neural network when the synapse devices have nonlinear I-V characteristics.
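To make the accuracy-loss mechanism concrete, here is a minimal sketch of a crossbar vector-matrix multiply in which each cell's current deviates from the ideal ohmic `I = V * G` as the device becomes more nonlinear. The sinh-shaped I-V curve and its parameter are illustrative assumptions (a common RRAM approximation), not the two device models used in the paper.

```python
import math

# Hypothetical sketch: column currents of an NVM crossbar used as an
# analog vector-matrix multiplier. nonlinearity=0 gives the ideal ohmic
# device; larger values bend the I-V curve (sinh-like, an assumed model).

def crossbar_vmm(voltages, conductances, nonlinearity=0.0):
    """I_j = sum_i f(V_i) * G[i][j], where f models the device I-V curve."""
    def f(v):
        if nonlinearity == 0.0:
            return v                       # ideal: I = V * G
        return math.sinh(nonlinearity * v) / nonlinearity

    cols = len(conductances[0])
    return [sum(f(voltages[i]) * conductances[i][j]
                for i in range(len(voltages)))
            for j in range(cols)]
```

With `nonlinearity=0` the result is exactly the mathematical dot product; with a nonzero value the currents drift away from it, which is precisely the error a network "reconstructed" for the crossbar would need to absorb.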
Security is becoming a de facto requirement of Systems-on-Chip (SoCs), accounting for a significant share of circuit design cost. In this paper, we propose an advanced SBUS protocol (ASBUS) to improve the data-feeding efficiency of Advanced Encryption Standard (AES) encrypted circuits. As a case study, a direct memory access (DMA) engine combined with an AES engine and a memory controller is implemented as our design-under-test (DUT) on field-programmable gate arrays (FPGAs). The results show that the presented ASBUS structure outperforms the AXI-based design in cipher tests. For example, the 32-bit ASBUS design costs fewer hardware resources and achieves higher throughput ($1.30 \times$) than the 32-bit AXI implementation, and the dynamic energy consumed by the ASBUS cipher test is reduced to 71.27\% of that of the AXI test.
Spin-Transfer-Torque RAM (STTRAM) is a promising technology for high-density on-chip caches due to its low standby power and high speed. However, process variation of the Magnetic Tunnel Junction (MTJ) and the access transistor poses a serious challenge to sensing. Non-destructive sensing suffers from reference resistance variation, whereas destructive sensing suffers from failures due to unoptimized selection of data and reference currents. Furthermore, the sense speed is tightly coupled with the reference/data current requirement. In this work, we study the effect of process variation on a self-reference sensing scheme that eliminates bit-to-bit process variation in MTJ resistance. Read current modulation is proposed to overcome failures due to process variation. Simulation results reveal <0.01% failures at the cost of a 9 ns sense time and 190 µW power consumption.
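The idea behind destructive self-reference sensing can be sketched in a few lines: the reference voltage is generated from the same cell after writing it to a known state, so bit-to-bit MTJ resistance variation cancels out, and "read current modulation" amounts to choosing the data/reference current ratio so both stored states sense correctly. All resistances and currents below are illustrative assumptions, not the paper's simulated values.

```python
# Hypothetical sketch of destructive self-reference sensing for an STTRAM cell.
# Step 1: first read of the unknown state with current i_data -> V1.
# Step 2: destructively write the cell to the known low-resistance state.
# Step 3: second read with current i_ref -> V2 (reference from the SAME cell,
#         so bit-to-bit MTJ resistance variation cancels).
# The stored bit is 1 (high resistance) iff V1 > V2; correct sensing requires
# i_data * R_high > i_ref * R_low > i_data * R_low, i.e. modulated currents.

def self_reference_read(cell_resistance, r_low, i_data, i_ref):
    v1 = i_data * cell_resistance   # first read, unknown state
    v2 = i_ref * r_low              # second read, after write-to-known-state
    return v1 > v2                  # True -> stored bit was 1
```

With illustrative values R_low = 2 kΩ, R_high = 4 kΩ, i_data = 20 µA, i_ref = 30 µA, both states are distinguished with margin; shrinking that margin (poorly modulated currents) is exactly where the failures the paper targets would appear.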
Wireless Network-on-Chip (WiNoC) is a promising emerging communication technology for addressing the scalability limitations of future manycore architectures. In a WiNoC, high-latency, power-hungry long-range multi-hop communications can be replaced by performant and energy-efficient single-hop wireless communications. However, the energy contribution of such wireless communication accounts for a significant fraction of the overall communication energy budget. This paper presents a novel energy management technique for WiNoC architectures aimed at improving the energy efficiency of the main elements of the wireless infrastructure, namely, the radio-hubs. The rationale behind the proposed technique is to selectively turn off, for the appropriate number of cycles, all radio-hubs that are not involved in the current wireless communication. The proposed energy management technique is assessed on several network configurations under different traffic scenarios, both synthetic and extracted from the execution of real applications. The results show that the proposed technique saves up to 25% of total communication energy with no impact on performance and with a negligible impact on the silicon area of the radio-hub.
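The selective turn-off policy described above can be sketched directly: while one radio-hub transmits a packet, every hub that is neither the source nor the destination can sleep for the packet's duration. The function name and the idea of a per-transfer sleep schedule in cycles are illustrative assumptions; the paper's mechanism for computing the "appropriate number of cycles" is not detailed in the abstract.

```python
# Hypothetical sketch of selective radio-hub power gating in a WiNoC:
# for a single wireless transfer, uninvolved hubs sleep for the packet's
# duration (cycle counts and the scheduling interface are illustrative).

def sleep_schedule(num_hubs, src, dst, packet_cycles):
    """Return per-hub sleep durations (in cycles) for one wireless transfer.

    Hubs acting as source or destination stay awake (0 sleep cycles);
    every other hub can be turned off for the whole transfer.
    """
    return [0 if h in (src, dst) else packet_cycles
            for h in range(num_hubs)]

# Example: 4 radio-hubs, hub 0 sends a 12-cycle packet to hub 2;
# hubs 1 and 3 can be gated off for those 12 cycles.
sched = sleep_schedule(4, src=0, dst=2, packet_cycles=12)
```

Because the wireless medium is shared (one transmission at a time), uninvolved hubs lose nothing by sleeping for exactly the transfer length, which is consistent with the paper's claim of energy savings with no performance impact.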