MIT AI Hardware Program
Fall Research Update
Wednesday, November 12, 2025
11:00 AM – 1:30 PM ET
Virtual Event
Members of the MIT AI Hardware Program attended this annual virtual meeting, which covered updates on current research projects. Program co-leads Jesús del Alamo and Aude Oliva gave a program overview, and researchers gave in-depth presentations followed by Q&A.
Watch the recorded talks below.

Agenda
11:00 – 11:05
Year in Review & the Year Ahead
Program Co-Leads
Jesús del Alamo, Donner Professor; Professor, Electrical Engineering and Computer Science; MacVicar Faculty Fellow
Aude Oliva, Director of Strategic Industry Engagement, MIT Schwarzman College of Computing; CSAIL Senior Research Scientist
11:05 – 11:20
Fast Fusion Scheduling for Efficient AI Accelerators
Vivienne Sze, Professor, Electrical Engineering and Computer Science, and Joel S. Emer, Professor of the Practice, Electrical Engineering and Computer Science
This project will optimize tensor algebra hardware in machine learning. We aim to develop a high-speed scheduling algorithm that minimizes data movement by maximizing reuse in on-die storage. Our approach expands the fused computation scheduling space while improving search efficiency through effective pruning, overall enabling performance optimization and better hardware provisioning for new accelerator design.
11:20 – 11:35
Neuromorphic Devices and Systems Enabled by Wafer-Scale CVD Growth of 2D Transition Metal Dichalcogenides
Tomás Palacios, Clarence J. LeBel Professor in Electrical Engineering, Electrical Engineering and Computer Science; Director, Microsystems Technology Laboratories, and Jing Kong, Jerry McAfee (1940) Professor in Engineering, Electrical Engineering and Computer Science
This project aims to explore the use of two-dimensional transition metal dichalcogenides (TMDs), such as MoS2 and WSe2, as neuromorphic devices. We will leverage the extremely low leakage current of wide-bandgap TMD materials to develop floating-gate transistors (FGFETs), where changes in the charge stored at the floating gate alter the MoS2 channel conductance. Floating-gate structures will first be simulated and then experimentally demonstrated on TMDs grown by metal-organic chemical vapor deposition (MOCVD) at back-end-of-line (BEOL)-compatible temperatures and integrated on a standard silicon CMOS process. For this, we will build on the low-temperature, 200 mm wafer-scale MoS2 MOCVD growth technology recently demonstrated by our group, and we will fabricate highly scaled heterostructure-based devices to ensure reproducible neuromorphic devices with record-low power consumption. We will also explore the impact of different processing steps, such as lithography and deposition conditions, on device performance and stability.
11:35 – 11:50
Improving RL Sampling for Action Data Collection toward Generalized Parameter Sizing in Analog Integrated Circuits
Yan Xu, PhD Candidate, Electrical Engineering and Computer Science
In analog integrated circuit design, learning a generalizable parameter-sizing policy is difficult due to data scarcity and limited topology coverage. Reinforcement learning is appealing because it learns from simulator feedback rather than large labeled datasets, but naïve LLM-based sampling is inefficient, often producing low-quality actions and slow reward discovery.
We introduce a sampling-first framework that strengthens the LLM’s base behavior to make RL practical and to yield high-value action–outcome data for imitation learning. Circuits are encoded in a structure-aware text format that reflects the circuit graph; the policy proposes structured sizing actions; a simulator evaluates outcomes in the loop. By combining proposal sampling, including Best of N, lightweight constraints on device scope and units/ranges, and incremental performance and domain-specific rewards, we increase per-simulation data yield and stabilize early learning. The resulting trajectories support PPO fine-tuning and IL distillation, providing a pragmatic path toward generalizable sizing policies for analog ICs.
In collaboration with Ruonan Han, Professor, Electrical Engineering and Computer Science
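The sampling loop described in this abstract (propose several sizing actions, filter them with lightweight range constraints, score the survivors in a simulator, keep the best) can be illustrated with a minimal Python sketch. This is not the authors' framework: `propose`, `simulate`, `bounds`, and the toy reward are hypothetical stand-ins for the LLM policy, the circuit simulator, and the device constraints.

```python
import random
from typing import Callable, Dict, List, Optional, Tuple

Action = Dict[str, float]            # hypothetical: parameter name -> value
Bounds = Dict[str, Tuple[float, float]]


def in_range(action: Action, bounds: Bounds) -> bool:
    """Lightweight constraint: every parameter is known and inside its range."""
    return all(
        name in bounds and bounds[name][0] <= value <= bounds[name][1]
        for name, value in action.items()
    )


def best_of_n(
    propose: Callable[[], Action],
    simulate: Callable[[Action], float],
    bounds: Bounds,
    n: int = 8,
) -> Tuple[Optional[Tuple[Action, float]], List[Tuple[Action, float]]]:
    """Sample n actions, drop invalid ones, simulate the rest, return the best.

    Also returns every (action, reward) pair, since all simulated outcomes
    are usable as training data, not only the winner.
    """
    candidates = [propose() for _ in range(n)]
    valid = [a for a in candidates if in_range(a, bounds)]
    scored = [(a, simulate(a)) for a in valid]
    if not scored:
        return None, []
    return max(scored, key=lambda pair: pair[1]), scored


if __name__ == "__main__":
    rng = random.Random(0)
    bounds = {"w_um": (0.5, 100.0)}                       # assumed width range
    propose = lambda: {"w_um": rng.uniform(0.0, 150.0)}   # stand-in for the policy
    simulate = lambda a: -abs(a["w_um"] - 10.0)           # toy reward, not a simulator
    best, data = best_of_n(propose, simulate, bounds, n=8)
    print("best:", best, "| simulated:", len(data))
```

Filtering before simulation is the point of the constraint step in this sketch: out-of-range proposals never consume a simulator call.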
11:50 – 12:05
Mitigating Hardware Faults in Hyper-scale Computing Systems
Peter Deutsch, PhD Candidate, Electrical Engineering and Computer Science
We aim to create models for new fault modes in processors, addressing reliability challenges for large-scale data centers. Our research develops methods for designing resilient hardware, guiding cost-effective protection strategies for scalability.
In collaboration with Mengjia Yan, Associate Professor, Electrical Engineering and Computer Science, and Joel S. Emer, Professor of the Practice, Electrical Engineering and Computer Science
12:05 – 12:20
SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
Muyang Li, PhD Candidate, Electrical Engineering and Computer Science
Diffusion models can effectively generate high-quality images. However, as they scale, rising memory demands and higher latency pose substantial deployment challenges. In this work, we aim to accelerate diffusion models by quantizing their weights and activations to 4 bits. At such an aggressive level, both weights and activations are highly sensitive, where existing post-training quantization methods like smoothing become insufficient. To overcome this limitation, we propose SVDQuant, a new 4-bit quantization paradigm. Different from smoothing, which redistributes outliers between weights and activations, our approach absorbs these outliers using a low-rank branch. We first consolidate the outliers by shifting them from activations to weights. Then, we use a high-precision, low-rank branch to take in the weight outliers with Singular Value Decomposition (SVD), while a low-bit quantized branch handles the residuals. This process eases the quantization on both sides. However, naively running the low-rank branch independently incurs significant overhead due to extra data movement of activations, negating the quantization speedup. To address this, we co-design an inference engine Nunchaku that fuses the kernels of the low-rank branch into those of the low-bit branch to cut off redundant memory access. It can also seamlessly support off-the-shelf low-rank adapters (LoRAs) without re-quantization. Extensive experiments on SDXL, PixArt-Σ, and FLUX.1 validate the effectiveness of SVDQuant in preserving image quality. We reduce the memory usage for the 12B FLUX.1 models by 3.5×, achieving 3.0× speedup over the 4-bit weight-only quantization (W4A16) baseline on the 16GB laptop 4090 GPU with INT4 precision. On the latest RTX 5090 desktop with Blackwell architecture, we achieve a 3.1× speedup compared to the W4A16 model using NVFP4 precision.
In collaboration with Song Han, Associate Professor, Electrical Engineering and Computer Science
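The three steps in this abstract (shift outliers from activations to weights, absorb weight outliers in a high-precision low-rank branch via SVD, quantize the residual to 4 bits) can be sketched numerically with NumPy. This is an illustrative approximation only, not the Nunchaku engine or the authors' code: the per-tensor fake quantizer, the smoothing exponent `alpha`, and the function names are assumptions, and no kernel fusion is modeled.

```python
import numpy as np


def fake_quant_int4(x: np.ndarray) -> np.ndarray:
    """Symmetric per-tensor 4-bit quantization followed by dequantization."""
    max_abs = float(np.abs(x).max())
    scale = max_abs / 7.0 if max_abs > 0 else 1.0
    return np.clip(np.round(x / scale), -8, 7) * scale


def svdquant_linear(x, w, rank=8, alpha=0.5, eps=1e-8):
    """Approximate x @ w with a low-rank branch plus a 4-bit residual branch.

    x: activations, shape (tokens, in_features)
    w: weights, shape (in_features, out_features)
    """
    # Step 1: per-input-channel smoothing moves activation outliers into weights.
    act_max = np.maximum(np.abs(x).max(axis=0), eps)
    w_max = np.maximum(np.abs(w).max(axis=1), eps)
    s = act_max ** alpha / w_max ** (1.0 - alpha)
    x_s = x / s
    w_s = w * s[:, None]          # x_s @ w_s equals x @ w exactly

    # Step 2: the top-`rank` singular components absorb the weight outliers.
    u, sing, vt = np.linalg.svd(w_s, full_matrices=False)
    l1 = u[:, :rank] * sing[:rank]
    l2 = vt[:rank]
    residual = w_s - l1 @ l2

    # Step 3: high-precision low-rank branch + 4-bit branch on the residual.
    return (x_s @ l1) @ l2 + fake_quant_int4(x_s) @ fake_quant_int4(residual)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(64, 32))
    x[:, 3] *= 40.0               # inject an activation outlier channel
    w = rng.normal(size=(32, 16))
    ref = x @ w
    naive = fake_quant_int4(x) @ fake_quant_int4(w)
    sketch = svdquant_linear(x, w, rank=4)
    rel = lambda y: np.linalg.norm(y - ref) / np.linalg.norm(ref)
    print("naive W4A4 relative error:", rel(naive))
    print("low-rank + W4A4 relative error:", rel(sketch))
```

Running the two branches as separate matrix products, as done here, is the "naive" arrangement the abstract warns about; the speedups it reports depend on fusing them in the inference engine.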
12:20 – 12:50
Co-design of Machine Intelligence and Hardware and the Need for Verifiable AI
Dirk Englund, Professor, Electrical Engineering and Computer Science
This talk will explore calibration and optimality in programmable photonics, with a focus on error robustness and efficient phase-shifter usage in multiport interferometers. We examine the challenges in designing photonic circuits that are both efficient and robust to errors, particularly in the context of optical neural networks, boson sampling, and other advanced applications. Highlighting mesh architectures such as the Reck and Clements designs, the talk addresses how phase requirements can be minimized while maintaining system universality and low error sensitivity. We propose a novel 3-MZI structure, which reduces phase-shift demands and improves fault tolerance by offering more stable configurations. This architecture, when evaluated against information-theoretic bounds, approaches optimal phase efficiency for large-scale photonic meshes. Applications in self-configuring systems, error-aware training, and non-unitary photonic meshes will also be discussed, presenting a comprehensive approach to achieving near-optimal programmable photonic performance.
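For background on the phase-shifter budget discussed in this abstract, the Python sketch below models a standard two-phase Mach-Zehnder interferometer (MZI) and counts the components in a Reck- or Clements-style mesh. It does not implement the proposed 3-MZI structure; the beamsplitter convention and the function names are assumptions chosen for illustration.

```python
import numpy as np


def mzi(theta: float, phi: float) -> np.ndarray:
    """2x2 transfer matrix of a standard MZI.

    Convention assumed here: an external phase phi on the top input arm,
    a 50:50 beamsplitter, an internal phase theta on the top arm, and a
    second 50:50 beamsplitter.
    """
    bs = np.array([[1, 1j], [1j, 1]]) / np.sqrt(2)
    internal = np.diag([np.exp(1j * theta), 1.0])
    external = np.diag([np.exp(1j * phi), 1.0])
    return bs @ internal @ bs @ external


def mesh_counts(n: int) -> dict:
    """Component counts for a universal n-mode Reck or Clements mesh.

    Both layouts use n(n-1)/2 MZIs with two phase shifters each, plus n
    output phases, for n^2 phases in total. That matches the n^2 real
    parameters of an n x n unitary matrix.
    """
    mzis = n * (n - 1) // 2
    return {"mzis": mzis, "phase_shifters": 2 * mzis + n}


if __name__ == "__main__":
    t = mzi(theta=0.7, phi=1.3)
    print("unitary:", np.allclose(t.conj().T @ t, np.eye(2)))
    for n in (4, 8, 64):
        print(n, mesh_counts(n))
```

Because the standard layout already uses exactly as many phases as a unitary has free parameters, any reduction in phase-shift demand has to come from how large each phase must be, not from how many there are.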
12:50 – 1:05
OpenTouch: Understanding Human Hands Contact-rich Manipulation
Wojciech Matusik, Joan and Irwin M. (1957) Jacobs Professor, Electrical Engineering and Computer Science
Tactile sensors present a powerful means of capturing, analyzing, and augmenting physical information and interactions. However, modern tactile sensing methods face several key challenges, including scaling data collection to train large-scale generative AI systems, coping with noisy and imperfect sensors, and conforming to various robotic form factors. Our proposed work seeks to tackle these challenges by developing personalized tactile sensing hardware, building scalable data collection toolkits, and training state-of-the-art multimodal generative AI systems that combine tactile, EIT, and IMU information for robotic sensing.
1:05 – 1:20
A 23-µJ-per-Frame Fully-Integrated U-Net-Based TinyML Processor for Real-Time and Autonomous Medical Image Segmentation
Zoey Song, PhD Candidate, Electrical Engineering and Computer Science
We present a fully integrated 28 nm processor for autonomous medical image segmentation in wearable ultrasound devices. The chip features mixed-precision datapaths, optimized dataflows with interleaved memory, and skip-connection fusion and compression that eliminate external memory requirements. Occupying 0.81 mm², it achieves 23 μJ per frame for bladder segmentation. This enables privacy-preserving, real-time tissue monitoring in wearable ultrasound patches for applications including urinary retention and fetal biometry.
In collaboration with Anantha Chandrakasan, MIT’s Provost, Vannevar Bush Professor of Electrical Engineering and Computer Science
1:20 – 1:25
Closing Remarks
Program Co-Leads
Jesús del Alamo, Donner Professor; Professor, Electrical Engineering and Computer Science; MacVicar Faculty Fellow
Aude Oliva, Director of Strategic Industry Engagement, MIT Schwarzman College of Computing; CSAIL Senior Research Scientist
Researchers
Aude Oliva
Senior Research Scientist at MIT Schwarzman College of Computing
Energy-efficient deep learning and computer vision