Workload Analysis with Intel® Simics® Virtual Platforms (SAMOS 2021)
Intel® Simics® is a proven, fast, functional, and scalable virtual platform framework that has been in industrial use since the late 1990s. In 2021, we made Simics available for free to anyone who wants to use it. In this tutorial, we will show some examples of how Simics can be used for workload analysis using the instrumentation and inspection capabilities of Simics 6. Simics offers processor models that execute recent Intel ISAs, embedded in an IA platform that runs open-source UEFI, Linux, Windows, and other software unmodified. In the workshop, we will try out features such as instruction mix analysis, cache simulation, analyzing driver code, and simulating a network of machines (in a single process).
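As a taste of what instruction mix analysis reports, here is a minimal, tool-agnostic sketch (plain Python, not the Simics API) that tallies a hypothetical dynamic instruction trace by mnemonic:

```python
from collections import Counter

def instruction_mix(trace):
    """Tally a dynamic instruction trace by mnemonic and return each
    mnemonic's share of all executed instructions."""
    counts = Counter(mnemonic for mnemonic, _addr in trace)
    total = sum(counts.values())
    return {m: n / total for m, n in counts.most_common()}

# Hypothetical trace of (mnemonic, address) pairs, standing in for what
# an instrumentation tool would emit.
trace = [("mov", 0x1000), ("add", 0x1003), ("mov", 0x1006), ("jne", 0x1009)]
print(instruction_mix(trace))  # {'mov': 0.5, 'add': 0.25, 'jne': 0.25}
```

A real instrumentation tool would attach such a counter to the simulated processor and classify instructions as they retire; the aggregation step is the same.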
Processing Data Where It Makes Sense in Modern Computing Systems: Enabling In-Memory Computation (SAMOS 2022)
Today’s systems are overwhelmingly designed to move data to computation. This design choice goes directly against at least three key trends in systems that cause performance, scalability, and energy bottlenecks: 1) data access from memory is already a key bottleneck as applications become more data-intensive while memory bandwidth and energy do not scale well, 2) energy consumption is a key constraint, especially in mobile and server systems, 3) data movement is very expensive in terms of bandwidth, energy, and latency, much more so than computation. These trends are felt especially severely in the data-intensive server and energy-constrained mobile systems of today. At the same time, conventional memory technology is facing many scaling challenges in terms of reliability, energy, and performance. As a result, memory system architects are open to organizing memory in different ways and making it more intelligent, at the expense of slightly higher cost. The emergence of 3D-stacked memory plus logic, the adoption of error-correcting codes inside the latest DRAM chips, and intelligent memory controllers that solve the RowHammer problem are evidence of this trend. In this talk, I will discuss some recent research that aims to practically enable computation close to data. After motivating trends in applications as well as technology, we will discuss at least two promising directions: 1) performing massively-parallel bulk operations in memory by exploiting the analog operational properties of DRAM, with low-cost changes, 2) exploiting the logic layer in 3D-stacked memory technology in various ways to accelerate important data-intensive applications. In both approaches, we will discuss relevant cross-layer research, design, and adoption challenges in devices, architecture, systems, applications, and programming models.
Our focus will be the development of in-memory processing designs that can be adopted in real computing platforms and real data-intensive applications, spanning machine learning, graph processing, data analytics, and genome analysis, at low cost. If time permits, we will also discuss and describe simulation and evaluation infrastructures that can enable exciting and forward-looking research in future memory systems, including Ramulator and SoftMC.
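The first direction above rests on a simple primitive: activating three DRAM rows simultaneously computes a bitwise majority of their contents, from which bulk AND and OR follow by fixing one row to all-zeros or all-ones. A minimal sketch of that logic (ordinary Python, abstracting away the DRAM charge-sharing mechanics):

```python
def maj3(a, b, c):
    # Bitwise three-input majority: the function that simultaneously
    # activating three DRAM rows realizes in the analog domain.
    return (a & b) | (b & c) | (a & c)

A, B = 0b1100, 0b1010
AND = maj3(A, B, 0b0000)  # a control row of all zeros yields bitwise AND
OR  = maj3(A, B, 0b1111)  # a control row of all ones yields bitwise OR
print(bin(AND), bin(OR))  # 0b1000 0b1110
```

Because an entire DRAM row (kilobytes wide) participates at once, a single activation performs the operation over thousands of bit positions in parallel at negligible data-movement cost.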
Bio: Onur Mutlu is a Professor of Computer Science at ETH Zurich. He is also a faculty member at Carnegie Mellon University, where he previously held the Strecker Early Career Professorship. His current broader research interests are in computer architecture, systems, hardware security, and bioinformatics. A variety of techniques he, along with his group and collaborators, has invented over the years have influenced industry and have been employed in commercial microprocessors and memory/storage systems. He obtained his PhD and MS in ECE from the University of Texas at Austin and BS degrees in Computer Engineering and Psychology from the University of Michigan, Ann Arbor. He started the Computer Architecture Group at Microsoft Research (2006-2009), and held various product and research positions at Intel Corporation, Advanced Micro Devices, VMware, and Google. He received the IEEE Computer Society Edward J. McCluskey Technical Achievement Award, the ACM SIGARCH Maurice Wilkes Award, the inaugural IEEE Computer Society Young Computer Architect Award, the inaugural Intel Early Career Faculty Award, the US National Science Foundation CAREER Award, the Carnegie Mellon University Ladd Research Award, partnership awards from various companies, and a healthy number of best paper or “Top Pick” paper recognitions at various computer systems, architecture, and hardware security venues. He is an ACM Fellow “for contributions to computer architecture research, especially in memory systems”, an IEEE Fellow for “contributions to computer architecture research and practice”, and an elected member of the Academy of Europe (Academia Europaea). His computer architecture and digital logic design course lectures and materials are freely available on YouTube, and his research group makes a wide variety of software and hardware artifacts freely available online. For more information, please see his webpage at https://people.inf.ethz.ch/omutlu/.
Tutorial on Machine Learning (SAMOS 2022)
In this tutorial we will go through a high-level taxonomy of the diverse field of machine learning, focusing on the latest and most widely adopted approaches in industry. Matrix factorization techniques, reinforcement learning, and deep graph networks will be highlighted. We will discuss where each type of approach is powerful and the underlying assumptions of each model. We then delve deeper into the widespread application of recommendation systems and the associated implementation/computational challenges. We will use case studies from eCommerce to illustrate key take-aways.
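As a concrete anchor for the recommendation-system discussion, here is a minimal matrix factorization sketch: plain NumPy SGD fitting the observed entries of a tiny user-item rating matrix (the matrix, rank, and hyperparameters are illustrative, not from the tutorial):

```python
import numpy as np

def factorize(R, rank=2, epochs=4000, lr=0.01, reg=0.01, seed=0):
    """SGD matrix factorization: approximate the observed entries of a
    user x item rating matrix R (0 marks "unrated") as P @ Q.T."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    P = 0.1 * rng.standard_normal((n_users, rank))
    Q = 0.1 * rng.standard_normal((n_items, rank))
    observed = [(u, i) for u in range(n_users)
                for i in range(n_items) if R[u, i] > 0]
    for _ in range(epochs):
        for u, i in observed:
            err = R[u, i] - P[u] @ Q[i]
            pu = P[u].copy()  # update both factors from the same error
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * pu - reg * Q[i])
    return P, Q

# Toy ratings: rows are users, columns are items, 0 = unrated.
R = np.array([[5, 3, 0],
              [4, 0, 1],
              [1, 1, 5]], dtype=float)
P, Q = factorize(R)
# P[u] @ Q[i] now approximates each observed rating, and the same dot
# product predicts the unrated cells.
```

The computational challenges discussed in the tutorial arise when R has millions of users and items: the factor matrices, the sampling of observed entries, and the update loop all have to be distributed.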
Tutorial on Quantum Computing (SAMOS 2019)
Quantum computers hold the promise of efficiently solving important problems in the computational sciences that are intractable today, by exploiting quantum phenomena such as superposition and entanglement. One of the most famous examples is the factorization of large numbers using Shor’s algorithm. For instance, a 2000-bit number could be factorized in a bit more than one day using a quantum computer, whereas a data center of approx. 400,000 km² built with today’s fastest supercomputer would require around 100 years. This extraordinary property of quantum computers, together with the great evolution of quantum technology in the past years, has led large companies such as Google, Lockheed Martin, Microsoft, IBM and Intel to invest substantially in quantum computing. Up to now, quantum computing has been a field mostly dominated by physicists, who are working on the design and fabrication of the basic units of any quantum system, called quantum bits or qubits. However, building a quantum computer involves more than producing ‘good’ qubits; it requires the development of an entire quantum computer architecture. This tutorial will introduce the basic notions of quantum computing, going from quantum bits, superposition and entanglement, to quantum gates and circuits, up to quantum algorithms. The tutorial will provide hands-on exercises based on our QX simulator platform (http://quantum-studio.net), allowing the participants to implement some simple quantum circuits/algorithms. We will also address the main challenges in building a large-scale quantum computer. The objective of this tutorial is to introduce the basics of quantum computing and show where the scientific challenges are.
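For a flavor of the gate-level material, the following sketch builds the canonical Bell state by applying a Hadamard and a CNOT to |00⟩, using plain NumPy matrix arithmetic rather than the QX simulator:

```python
import numpy as np

# Hadamard gate and identity for one qubit; CNOT with qubit 0 as control.
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
I = np.eye(2)
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]])

state = np.array([1.0, 0.0, 0.0, 0.0])  # |00>, basis order |00>,|01>,|10>,|11>
state = np.kron(H, I) @ state           # put qubit 0 into superposition
state = CNOT @ state                    # entangle the two qubits
# state is now (|00> + |11>)/sqrt(2): measuring one qubit fixes the other,
# which is exactly the entanglement the tutorial introduces.
```

State-vector simulators like QX perform essentially this computation, but with clever data structures, since the vector doubles in size with every added qubit.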
Tutorial on Memory Systems and Memory-Centric Computing Systems: Challenges and Opportunities (SAMOS 2019)
The memory system is a fundamental performance and energy bottleneck in almost all computing systems. Recent system design, application, and technology trends that require more capacity, bandwidth, efficiency, and predictability out of the memory system make it an even more important system bottleneck. At the same time, DRAM and flash technologies are experiencing difficult technology scaling challenges that make the maintenance and enhancement of their capacity, energy efficiency, performance, and reliability significantly more costly with conventional techniques. In fact, recent reliability issues with DRAM, such as the RowHammer problem, are already threatening system security and predictability. We are at the challenging intersection where issues in memory reliability and performance are tightly coupled with not only system cost and energy efficiency but also system security. In this lecture series, we first discuss major challenges facing modern memory systems (and the computing platforms we currently design around the memory system) in the presence of greatly increasing demand for data and its fast analysis. We then examine some promising research and design directions to overcome these challenges. We discuss at least three key topics in some detail, focusing on both open problems and potential solution directions:
- Fundamental issues in memory reliability and security and how to enable fundamentally secure, reliable, safe architectures.
- Enabling data-centric and hence fundamentally energy-efficient architectures that are capable of performing computation near data.
- Reducing both latency and energy consumption by tackling the fixed-latency/energy mindset.
If time permits, we will also discuss research challenges and opportunities in enabling emerging NVM (non-volatile memory) technologies and scaling NAND flash memory and SSDs (solid state drives) into the future.
Bio: Onur Mutlu is a Professor of Computer Science at ETH Zurich. He is also a faculty member at Carnegie Mellon University, where he previously held the Strecker Early Career Professorship. His current broader research interests are in computer architecture, systems, hardware security, and bioinformatics. A variety of techniques he, along with his group and collaborators, has invented over the years have influenced industry and have been employed in commercial microprocessors and memory/storage systems. He obtained his PhD and MS in ECE from the University of Texas at Austin and BS degrees in Computer Engineering and Psychology from the University of Michigan, Ann Arbor. He started the Computer Architecture Group at Microsoft Research (2006-2009), and held various product and research positions at Intel Corporation, Advanced Micro Devices, VMware, and Google. He received the inaugural IEEE Computer Society Young Computer Architect Award, the inaugural Intel Early Career Faculty Award, US National Science Foundation CAREER Award, Carnegie Mellon University Ladd Research Award, faculty partnership awards from various companies, and a healthy number of best paper or “Top Pick” paper recognitions at various computer systems, architecture, and hardware security venues. He is an ACM Fellow “for contributions to computer architecture research, especially in memory systems”, IEEE Fellow for “contributions to computer architecture research and practice”, and an elected member of the Academy of Europe (Academia Europaea). For more information, please see his webpage at https://people.inf.ethz.ch/omutlu/.
[Part 1: Memory Importance and Trends: PPTX, PDF]
[Part 2: RowHammer: PPTX, PDF]
[Part 3: Computation in Memory: PPTX, PDF]
[Part 4: Low-Latency Memory: PPTX, PDF]
[Part 5: Principles and Conclusion: PPTX, PDF]
Tutorial on The HSA System Architecture in Detail (SAMOS 2014)
The tutorial is intended to give an in-depth outline of the various hardware and software components and of how applications interface to an HSA platform. Slides
Tutorial on Mitigation of soft errors: from adding selective redundancy to changing the abstraction stack (SAMOS 2014)
Soft errors caused by ionizing radiation are already an issue for current technologies, and with estimates of transistors scaling to 5.9 nm by 2026, computing devices will be forced to employ some reliability mechanism to ensure proper computation at a reasonable cost. Previously a major concern only in aerospace and avionics applications, soft errors have recently been reported at ground level as well, in applications ranging from high performance computing to critical embedded systems, such as automotive. We believe that knowledge of the causes of soft errors, and of the pros and cons of different approaches to mitigate their effects, is valuable not only for those working on microprocessor reliability, but also for those concerned with the design of software systems, since some error mitigation techniques may require a redesign of the computational stack. This way one can avoid the huge cost in terms of area, performance, or energy incurred by traditional techniques. In this half-day tutorial we will focus on ionizing radiation as the source of soft errors and explain how experiments with real radiation are performed in order to evaluate the susceptibility of digital circuits to soft errors. We will then present and analyze the pros and cons of some approaches in the literature to mitigate faults defined according to a fault model. We conclude by exploring challenges in the re‐design of the computational stack when taking soft errors into account, in order to achieve high-reliability, high-performance, and low-energy designs in different application domains.
Tutorial scope and objectives
The goal of this tutorial is to present advanced techniques to cope with soft errors at several layers of the abstraction stack. We start with a characterization of the problem and its causes, developing an overview of the mechanisms involved in soft error creation by ionizing radiation and other sources. We then present some approaches that can be used to mitigate soft error effects. The occurrence of soft errors tends to increase as circuits get smaller, and their effects are magnified as advanced technologies are embedded in commonly used systems. These advances in technology, while reducing the overall reliability of the system, have also opened up the opportunity to re‐design the abstraction stack and associated programming models. We believe that this half-day tutorial can help disseminate knowledge on soft errors and how they can be taken into account when proposing new architectures and programming models.
Intended audience
Researchers and graduate students interested in fault tolerance and reliability, working at different abstraction levels, and also those interested in program transformations and the re‐design of the abstraction stack.
Topics to be covered
The tutorial is organized as follows:
- PART 1: Characterizing soft errors, their causes and effects, and traditional strategies for error mitigation and detection.
- PART 2: Analyzing current approaches for mitigating the effects of soft errors, and challenges for redesigning the computational stack taking soft error detection and correction into account.
The topics to be covered in each part are the following:
PART 1 – Characterizing soft errors, their causes and effects, and traditional strategies for error mitigation and detection:
- causes and consequences of soft errors;
- concrete examples/reports of disruptions caused by soft errors in critical applications (avionics, space applications, automotive, medicine, oil exploration, nuclear plants, etc);
- how to estimate/predict how sensitive a circuit is to soft errors; reports on experiments with real radiation;
- current approaches: triple modular redundancy, invariant checkers, block signature checking, processor watchdogs;
- pros and cons of current approaches regarding fault coverage, area, performance and energy consumption.
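The first of the current approaches listed above, triple modular redundancy, boils down to a majority vote over three redundant results. A minimal sketch with a simulated single-bit fault (the values are illustrative):

```python
def majority(a, b, c):
    """Majority vote over three redundant results: the value that at
    least two replicas agree on wins, so a single faulty replica is
    outvoted."""
    return a if a in (b, c) else b

# Three redundant computations of the same value; one copy is corrupted
# by a simulated single-event upset flipping bit 3.
replicas = [42, 42 ^ (1 << 3), 42]
print(majority(*replicas))  # 42: the flipped replica is outvoted
```

The area, performance, and energy costs discussed in the tutorial follow directly from this structure: the computation runs three times, and the voter itself becomes a single point of failure that must be hardened.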
PART 2 – Analyzing approaches for mitigating the effects of soft errors and challenges for the re‐design of the computational stack:
- fault model as an abstraction of the real phenomena and as a reference for evaluating mitigating approaches;
- new challenges that soft errors bring to the area of fault tolerance and to microprocessor resilience;
- how soft errors can be taken into account when designing a new computational stack;
- the effect of soft errors on massively parallel machines, and how to cope with them using only software-related techniques and programming guidelines.
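One concrete software-only technique in this space is algorithm-based fault tolerance (ABFT) for matrix multiplication, where checksum rows and columns expose a corrupted element of the result. A minimal NumPy sketch (illustrative of the Huang–Abraham idea, not any specific paper's implementation):

```python
import numpy as np

def abft_matmul(A, B):
    """Multiply checksum-extended matrices: A gains a column-checksum
    row, B a row-checksum column; the product carries both checksums."""
    Ac = np.vstack([A, A.sum(axis=0)])
    Br = np.hstack([B, B.sum(axis=1, keepdims=True)])
    return Ac @ Br  # (m+1) x (n+1) result with checksum row and column

def checksums_hold(Cc, tol=1e-8):
    """A single corrupted element of the inner result violates one row
    checksum and one column checksum, so it is detected and located."""
    inner = Cc[:-1, :-1]
    return (np.allclose(inner.sum(axis=0), Cc[-1, :-1], atol=tol) and
            np.allclose(inner.sum(axis=1), Cc[:-1, -1], atol=tol))

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[5.0, 6.0], [7.0, 8.0]])
Cc = abft_matmul(A, B)
assert checksums_hold(Cc)      # fault-free product passes the check
Cc[0, 0] += 1.0                # simulated soft error in one element
assert not checksums_hold(Cc)  # checksum mismatch exposes the error
```

Because the check costs only O(n²) additions on top of the O(n³) multiplication, this kind of hardening is attractive for GPUs and other massively parallel machines where hardware redundancy is too expensive.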
References
RECH, P.; AGUIAR, C.; FROST, C.; CARRO, L. An Efficient and Experimentally Tuned Software‐Based Hardening Strategy for Matrix Multiplication on GPUs. IEEE Transactions on Nuclear Science, v. 60, n. 4, p. 2797‐2804, 2013.
NAZAR, G.; RECH, P.; FROST, C.; CARRO, L. Radiation and Fault Injection Testing of a Fine‐Grained Error Detection Technique for FPGAs. IEEE Transactions on Nuclear Science, v. 60, n. 4, p. 2742‐2749, 2013.
RECH, P.; CARRO, L. Experimental Evaluation of Neutron‐Induced Effects in Graphic Processing Units. In: 9th Workshop on Silicon Errors in Logic ‐ System Effects, 2013, Stanford.
AITKEN, R.; FEY, G.; KALBARCZYK, Z.; REICHENBACH, F.; SONZA REORDA, M. Reliability analysis reloaded: how will we survive? In: DATE 2013, p. 358‐367.
CHO, H.; MIRKHANI, S.; CHER, C.; ABRAHAM, J.; MITRA, S. Quantitative evaluation of soft error injection techniques for robust system design. In: DAC 2013.
HWANG, A.; STEFANOVICI, I.; SCHROEDER, B. Cosmic rays don’t strike twice: understanding the nature of DRAM errors and the implications for system design. In: ASPLOS 2012, London, UK, p. 111‐122.
CAMPAGNA, S.; VIOLANTE, M. A hybrid architecture to detect transient faults in microprocessors: An experimental validation. In: DATE 2012, p. 1433‐1438.
HARI, S.; ADVE, S.; NAEIMI, H.; RAMACHANDRAN, P. Relyzer: exploiting application‐level fault equivalence to analyze application resiliency to transient faults. In: ASPLOS 2012, London, UK, p. 123‐134.
LISBÔA, C.; GRANDO, C.; MOREIRA, A.; CARRO, L. Invariant Checkers: an Efficient Low Cost Technique for Run‐time Transient Errors Detection. In: IOLTS 2009, v. 1, p. 35‐40.
ITTURRIET, F.; NAZAR, G.; FERREIRA, R.; MOREIRA, A.; CARRO, L. Adaptive parallelism exploitation under physical and real‐time constraints for resilient systems. In: ReCoSoC 2012, York, p. 1‐8.
ITURRIET, F.; FERREIRA, R.; GIRÃO, G.; NAZAR, G.; MOREIRA, A.; CARRO, L. Resilient Adaptive Algebraic Architecture for Parallel Detection and Correction of Soft‐Errors. In: 15th Euromicro Conference on Digital System Design, 2012, Izmir, Turkey, p. 1‐8.
FERREIRA, R.; MOREIRA, A.; CARRO, L. Matrix control‐flow algorithm‐based fault tolerance. In: IEEE International On‐Line Testing Symposium, 2011, Athens, p. 37‐42.
FERREIRA, R.; AZAMBUJA, J.; MOREIRA, A.; CARRO, L. Correction of Soft Errors in Control and Data Flow Program Segments. In: Workshop on Design for Reliability (DFR), 2011, Crete.
RHOD, E.; LISBÔA, C.; CARRO, L.; REORDA, M.; VIOLANTE, M. Hardware and Software Transparency in the Protection of Programs Against SEUs and SETs. J. Electronic Testing 24(1‐3): 45‐56 (2008).
VEMU, R.; GURUMURTHY, S.; ABRAHAM, J. A. ACCE: Automatic correction of control‐flow errors. In: IEEE International Test Conference, 2007.
ZIEGLER, J. F.; CURTIS, H.; MUHLFELD, H.; et al. IBM experiments in soft fails in computer electronics (1978–1994). IBM J. Res. Dev. 40, 1 (January 1996), 3‐18.
Biography of the Speakers
Luigi Carro received his electrical engineering degree and his M.Sc. and Ph.D. degrees in Computer Science from the Federal University of Rio Grande do Sul, Porto Alegre, Brazil. He is a full professor at the Institute of Informatics at UFRGS. He has considerable experience in computer engineering, with emphasis on hardware and software design for embedded systems, focusing on embedded electronic systems, processor architecture, dedicated test, fault tolerance, and multiplatform software development. He has advised more than 20 graduate students and has published more than 150 technical papers on those topics. He authored the book Digital Systems Design and Prototyping (2001, in Portuguese) and is the co‐author of Fault‐Tolerance Techniques for SRAM‐based FPGAs (2006, Springer), Dynamic Reconfigurable Architectures and Transparent Optimization Techniques (2010, Springer), and Adaptive Systems (2012, Springer).
Álvaro Moreira has a B.Sc. and an M.Sc. in Computer Science from the Federal University of Rio Grande do Sul, Porto Alegre, Brazil, and a PhD in Computer Science from the University of Edinburgh, Scotland. He is an associate professor at the Institute of Informatics at UFRGS. He is interested in software‐based approaches for the mitigation of soft errors, in the formal definition of fault models, and in the formal semantics of new ISAs that take soft errors into account.