Emze Group
Co-Led by Joel Emer and Vivienne Sze



Overview

We explore the modeling and design of efficient and flexible hardware accelerators.


PhD Students

  • Tanner Andrulis
  • Michael Gilbert
  • Fisher Zi Yu Xue
  • Yu-Hsin Chen (Alumni)
  • Yannan Nellie Wu (Alumni)


Publications

* Indicates authors contributed equally to the work

Architecture Modeling for Evaluation and Design Space Exploration

  • T. Andrulis, J. S. Emer, V. Sze, "CiMLoop: A Flexible, Accurate, and Fast Compute-In-Memory Modeling Tool," IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), May 2024. [ paper LINK | project website LINK | code github ] Best Paper Award
  • T. Andrulis, G. I. Chaudhry, V. M. Suriyakumar, J. S. Emer, V. Sze, "Architecture-Level Modeling of Photonic Deep Neural Network Accelerators," IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), May 2024.
  • T. Andrulis, R. Chen, H.-S. Lee, J. S. Emer, V. Sze, "Modeling Analog-Digital-Converter Energy and Area for Compute-In-Memory Accelerator Design," arXiv, April 2024. [ paper LINK ]
  • M. Gilbert, Y. N. Wu, A. Parashar, V. Sze, J. Emer, "LoopTree: Enabling Exploration of Fused-layer Dataflow Accelerators," IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), April 2023. [ paper LINK ]
  • Y. N. Wu, P.-A. Tsai, A. Parashar, V. Sze, J. Emer, "Sparseloop: An Analytical Approach to Sparse Tensor Accelerator Modeling," ACM/IEEE International Symposium on Microarchitecture (MICRO), October 2022. [ paper PDF | project website LINK | code github ] Distinguished Artifact Award
  • Y. N. Wu, P.-A. Tsai, A. Parashar, V. Sze, J. S. Emer, "Sparseloop: An Analytical, Energy-Focused Design Space Exploration Methodology for Sparse Tensor Accelerators," IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), March 2021. [ paper PDF | tutorial website LINK ]
  • F. Wang, Y. N. Wu, M. Woicik, J. S. Emer, V. Sze, "Architecture-Level Energy Estimation for Heterogeneous Computing Systems," IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), March 2021. [ paper PDF | code github ]
  • Y. N. Wu, V. Sze, J. S. Emer, "An Architecture-Level Energy and Area Estimator for Processing-In-Memory Accelerator Designs," IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), April 2020. [ paper PDF | code github ]
  • Y. N. Wu, J. S. Emer, V. Sze, "Accelergy: An Architecture-Level Energy Estimation Methodology for Accelerator Designs," International Conference on Computer Aided Design (ICCAD), November 2019. [ paper PDF | slides PDF | project website LINK | code github ]
  • T.-J. Yang, Y.-H. Chen, J. Emer, V. Sze, "A Method to Estimate the Energy Consumption of Deep Neural Networks," Asilomar Conference on Signals, Systems and Computers, Invited Paper, October 2017. [ paper PDF | slides PDF ]
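
A minimal sketch of the common idea behind the architecture-level energy estimators listed above (e.g., the Accelergy methodology): total energy is the sum, over hardware components, of action counts weighted by per-action energy. The component names and energy numbers below are hypothetical, chosen only to illustrate the bookkeeping, and are not taken from any of the tools above.

    # Architecture-level energy estimation sketch:
    # energy = sum over (component, action) of count x energy-per-action.

    # Hypothetical per-action energies in picojoules.
    ENERGY_PER_ACTION_PJ = {
        ("mac_unit", "compute"): 0.5,
        ("local_buffer", "read"): 1.0,
        ("local_buffer", "write"): 1.2,
        ("dram", "read"): 200.0,
        ("dram", "write"): 200.0,
    }

    # Hypothetical action counts for one DNN layer, e.g., produced by a
    # separate workload-mapping analysis step.
    action_counts = {
        ("mac_unit", "compute"): 1_000_000,
        ("local_buffer", "read"): 2_000_000,
        ("local_buffer", "write"): 500_000,
        ("dram", "read"): 10_000,
        ("dram", "write"): 5_000,
    }

    def total_energy_pj(counts, energy_table):
        """Sum action counts weighted by per-action energy."""
        return sum(n * energy_table[key] for key, n in counts.items())

    energy = total_energy_pj(action_counts, ENERGY_PER_ACTION_PJ)
    print(f"Estimated layer energy: {energy / 1e6:.2f} uJ")

In the tools above, the per-action energies come from technology-specific estimation plug-ins and the action counts from a mapping of the workload onto the architecture; this sketch hard-codes both.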

Accelerator Architectures

  • Z. Y. Xue, Y. N. Wu, J. S. Emer, V. Sze, "Tailors: Accelerating Sparse Tensor Algebra by Overbooking Buffer Occupancy," ACM/IEEE International Symposium on Microarchitecture (MICRO), October 2023. [ paper LINK | project website LINK ]
  • Y. N. Wu, P.-A. Tsai, S. Muralidharan, A. Parashar, V. Sze, J. S. Emer, "HighLight: Efficient and Flexible DNN Acceleration with Hierarchical Structured Sparsity," ACM/IEEE International Symposium on Microarchitecture (MICRO), October 2023. [ paper LINK | project website LINK ]
  • T. Andrulis, J. Emer, V. Sze, "RAELLA: Reforming the Arithmetic for Efficient, Low-Resolution, and Low-Loss Analog PIM: No Retraining Required!," International Symposium on Computer Architecture (ISCA), June 2023. [ paper LINK | code github ]
  • L. Bernstein, A. Sludds, R. Hamerly, V. Sze, J. Emer, D. Englund, "Freely scalable and reconfigurable optical hardware for deep learning," Scientific Reports, Vol. 11, No. 3144, February 2021. [ LINK ]
  • Y.-H. Chen, T.-J. Yang, J. Emer, V. Sze, "Eyeriss v2: A Flexible Accelerator for Emerging Deep Neural Networks on Mobile Devices," IEEE Journal on Emerging and Selected Topics in Circuits and Systems (JETCAS), Vol. 9, No. 2, pp. 292-308, June 2019. [ paper PDF | extended version arXiv ]
  • Y.-H. Chen, T. Krishna, J. Emer, V. Sze, "Eyeriss: An Energy-Efficient Reconfigurable Accelerator for Deep Convolutional Neural Networks," IEEE Journal of Solid-State Circuits (JSSC), ISSCC Special Issue, Vol. 52, No. 1, pp. 127-138, January 2017. [ paper PDF | project website LINK ] Top 5 most cited JSSC paper of all time
  • Y.-H. Chen, J. Emer, V. Sze, "Eyeriss: A Spatial Architecture for Energy-Efficient Dataflow for Convolutional Neural Networks," International Symposium on Computer Architecture (ISCA), pp. 367-379, June 2016. [ paper PDF | slides PDF ] Selected for IEEE Micro’s Top Picks special issue on "most significant papers in computer architecture based on novelty and long-term impact" from 2016
  • Y.-H. Chen, T. Krishna, J. Emer, V. Sze, "Eyeriss: An Energy-Efficient Reconfigurable Accelerator for Deep Convolutional Neural Networks," IEEE International Solid-State Circuits Conference (ISSCC), pp. 262-264, February 2016. [ paper PDF | slides PDF | poster PDF | demo video | project website LINK ] Highlighted in EETimes and MIT News, and Top 3 most cited ISSCC paper of all time
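
To make concrete why the choice of dataflow (a central theme of the Eyeriss work above) affects energy, here is a toy access-counting comparison for a matrix multiply under two dataflows. It is an illustrative sketch with arbitrary sizes, not a model of any accelerator listed above.

    # Toy comparison of costly-memory accesses for C = A @ B
    # (A is M x K, B is K x N). Counts only; nothing is computed.

    M, K, N = 128, 128, 128
    macs = M * K * N  # total multiply-accumulates

    # Dataflow 1: no local reuse -- every MAC fetches A and B and
    # reads/updates C directly in the large, energy-costly memory.
    no_reuse = 4 * macs  # A read + B read + C read + C write per MAC

    # Dataflow 2: output-stationary -- each C element is accumulated in a
    # local register across the K loop, so large memory is touched only
    # once per element, to write the final value (plus the A and B reads).
    output_stationary = 2 * macs + M * N

    print(f"MACs: {macs:,}")
    print(f"Costly-memory accesses, no reuse:          {no_reuse:,}")
    print(f"Costly-memory accesses, output-stationary: {output_stationary:,}")
    # A large-memory access can cost orders of magnitude more energy than a
    # MAC, so halving accesses roughly halves data-movement energy.

The Eyeriss papers above generalize this kind of accounting: the row-stationary dataflow is chosen precisely to maximize local reuse across input, weight, and partial-sum data simultaneously.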

Overview of Efficient Processing of Deep Neural Networks

  • V. Sze, Y.-H. Chen, T.-J. Yang, J. S. Emer, "How to Evaluate Deep Neural Network Processors: TOPS/W (Alone) Considered Harmful," IEEE Solid-State Circuits Magazine, Vol. 12, No. 3, pp. 28-41, Summer 2020. [ PDF ]
  • Y.-H. Chen*, T.-J. Yang*, J. Emer, V. Sze, "Understanding the Limitations of Existing Energy-Efficient Design Approaches for Deep Neural Networks," SysML Conference, February 2018. [ paper PDF | talk video ] Selected for Oral Presentation
  • V. Sze, T.-J. Yang, Y.-H. Chen, J. Emer, "Efficient Processing of Deep Neural Networks: A Tutorial and Survey," Proceedings of the IEEE, vol. 105, no. 12, pp. 2295-2329, December 2017. [ paper PDF ]
  • Y.-H. Chen, J. Emer, V. Sze, "Using Dataflow to Optimize Energy Efficiency of Deep Neural Network Accelerators," IEEE Micro's Top Picks from the Computer Architecture Conferences, May/June 2017. [ PDF ]
  • A. Suleiman*, Y.-H. Chen*, J. Emer, V. Sze, "Towards Closing the Energy Gap Between HOG and CNN Features for Embedded Vision," IEEE International Symposium on Circuits and Systems (ISCAS), Invited Paper, May 2017. [ paper PDF | slides PDF | talk video ]
  • V. Sze, Y.-H. Chen, J. Emer, A. Suleiman, Z. Zhang, "Hardware for Machine Learning: Challenges and Opportunities," IEEE Custom Integrated Circuits Conference (CICC), Invited Paper, May 2017. [ paper arXiv | slides PDF ] Outstanding Invited Paper Award


Educational Resources on Efficient Processing of Deep Neural Networks


An overview paper based on our tutorial, "Efficient Processing of Deep Neural Networks: A Tutorial and Survey," is available here.



Our book "Efficient Processing of Deep Neural Networks," which grew out of the tutorial, is available here.

This book provides a structured treatment of the key principles and techniques for enabling efficient processing of deep neural networks (DNNs). The book includes background on DNN processing; a description and taxonomy of hardware architectural approaches for designing DNN accelerators; key metrics for evaluating and comparing different designs; features of DNN processing that are amenable to hardware/algorithm co-design to improve energy efficiency and throughput; and opportunities for applying new technologies. Readers will find a structured introduction to the field, as well as a formalization and organization of key concepts from contemporary works that provides insights that may spark new ideas.
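
As a toy numeric illustration of the "key metrics" theme (see also "TOPS/W (Alone) Considered Harmful" above): two chips with identical peak TOPS/W can differ widely in latency and energy per inference once utilization is accounted for. All numbers below are hypothetical, and power is assumed constant regardless of utilization, which is itself a simplification.

    # Two hypothetical chips with the same peak efficiency (10 TOPS/W).
    ops_per_inference = 2e9  # hypothetical DNN workload, ~2 GOPs

    chips = {
        "chip_A": {"peak_tops": 10.0, "power_w": 1.0, "utilization": 0.90},
        "chip_B": {"peak_tops": 10.0, "power_w": 1.0, "utilization": 0.25},
    }

    for name, c in chips.items():
        effective_tops = c["peak_tops"] * c["utilization"]
        latency_s = ops_per_inference / (effective_tops * 1e12)
        energy_j = c["power_w"] * latency_s  # assumes constant power draw
        print(f"{name}: {effective_tops:.1f} effective TOPS, "
              f"{latency_s * 1e3:.2f} ms/inference, "
              f"{energy_j * 1e3:.2f} mJ/inference")

    # Same peak TOPS/W, but chip_B's low utilization (e.g., from an
    # inflexible dataflow) makes it 3.6x worse in both latency and energy.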

An excerpt of the book covering "Key Metrics and Design Objectives" and "Advanced Technologies" is available here.


Related Websites and Resources

  • DNN Tutorial Slides [ LINK ]
  • Eyeriss Project [ LINK ]
  • Accelergy Project [ LINK ]
  • Sparseloop Project [ LINK ]
  • 6.5930/1 Hardware Architecture for Deep Learning Course [ LINK ]


Acknowledgement

This work is funded in part by the DARPA YFA grant N66001-14-1-4039, DARPA contract HR0011-18-3-0007, MIT Center for Integrated Circuits & Systems, Ericsson, MIT Quest, MIT AI Hardware, NSF PPoSS 2029016, and gifts from ASML, Intel, Nvidia, and TSMC.