This comprehensive tutorial explores the growing importance of low-rank approximation techniques in large language models (LLMs). The session covers theoretical foundations, empirical observations, and practical applications of low-rank structures for improving the efficiency, interpretability, and robustness of LLMs.
Tutorial Sections
The tutorial is organized into the following sections, based on the slide content, covering low-rank techniques from theoretical foundations to cutting-edge applications.
1. Introduction & Foundations
Pursuing low-dimensional structures in high-dimensional data: a universal quest spanning centuries, from sparsity to low rank (a minimal SVD sketch follows the topic list below).
Key Topics: Universal Quest, Sparsity to Low Rank, SVD & Rank Theory, Curse & Blessing of Dimensionality, Low-Rank Simplicity Bias
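To make the "SVD & Rank Theory" topic concrete, here is a minimal NumPy sketch of truncated-SVD low-rank approximation (the Eckart-Young construction). The matrix size, rank, and noise level are illustrative assumptions, not values from the tutorial slides.

```python
# A minimal sketch: the best rank-r approximation of a matrix via truncated SVD
# (Eckart-Young). Matrix size, rank, and noise level are arbitrary choices.
import numpy as np

def low_rank_approx(W: np.ndarray, r: int) -> np.ndarray:
    """Return the best rank-r approximation of W in the Frobenius norm."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    return U[:, :r] @ np.diag(S[:r]) @ Vt[:r, :]

rng = np.random.default_rng(0)
# A 256x256 matrix that is approximately rank 16, plus small noise.
W = rng.standard_normal((256, 16)) @ rng.standard_normal((16, 256)) \
    + 0.01 * rng.standard_normal((256, 256))
W16 = low_rank_approx(W, r=16)
print("relative error:", np.linalg.norm(W - W16) / np.linalg.norm(W))
```

The small relative error illustrates the point of this section: data (and, later, gradients and weights) that look high-dimensional are often well captured by a few singular directions.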
2. Low Rank Gradients: LoRA, GaLore, and More
Low-rank gradients (or weight updates) in foundation models, with a focus on memory-efficient training techniques (a GaLore-style projection sketch follows the topic list below).
Key Topics: Memory Requirements, System vs. Algorithm Level, GaLore Projection, Q-GaLore, APOLLO Optimizer
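The "GaLore Projection" idea can be illustrated with a short sketch: project each gradient onto its top-r left singular subspace, keep optimizer state in that subspace, and project updates back to full size. This is a simplified NumPy illustration under assumed names (LowRankProjector, rank, update_gap), not the released GaLore code.

```python
# Simplified sketch of GaLore-style gradient low-rank projection; names and
# hyperparameters are illustrative assumptions, not the library's API.
import numpy as np

class LowRankProjector:
    def __init__(self, rank: int, update_gap: int = 200):
        self.rank, self.update_gap, self.step, self.P = rank, update_gap, 0, None

    def project(self, grad: np.ndarray) -> np.ndarray:
        # Periodically refresh the projection from the gradient's top-r left
        # singular vectors, then map the full gradient into the r-dim subspace.
        if self.P is None or self.step % self.update_gap == 0:
            U, _, _ = np.linalg.svd(grad, full_matrices=False)
            self.P = U[:, : self.rank]                 # (m, r)
        self.step += 1
        return self.P.T @ grad                         # (r, n) low-rank gradient

    def project_back(self, low_rank_update: np.ndarray) -> np.ndarray:
        # Map the optimizer update computed in the subspace back to full size.
        return self.P @ low_rank_update                # (m, n)

# Usage: optimizer states (e.g., Adam moments) live in the (r, n) space,
# shrinking their memory from m*n to r*n per weight matrix.
proj = LowRankProjector(rank=8)
G = np.random.default_rng(0).standard_normal((1024, 256))   # a full gradient
g_low = proj.project(G)                                      # optimizer step runs on this
W_update = proj.project_back(-1e-3 * g_low)                  # e.g., a plain SGD step
print(g_low.shape, W_update.shape)
```

Q-GaLore and APOLLO, also covered in this section, build on this basic projection idea; the sketch does not attempt to reproduce them.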
3. Low Rank Weights: Novel Model Compression & PEFT
How low-rank gradients lead to low-rank weights, enabling novel model compression and parameter-efficient fine-tuning (a LoRA-style adapter sketch follows the topic list below).
Key Topics: Gradient-Weight Connection, Model Compression, PEFT Applications, Weight Structure Analysis
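As a companion to the "PEFT Applications" topic, below is a minimal PyTorch sketch of a LoRA-style adapter: the pretrained weight is frozen and a trainable low-rank update scaled by alpha/r is added. Layer sizes, rank, and the LoRALinear name are illustrative assumptions rather than code from the tutorial.

```python
# A minimal LoRA-style adapter sketch: freeze the pretrained linear layer and
# learn only a rank-r update (alpha / r) * B @ A on top of it.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():                  # freeze the pretrained layer
            p.requires_grad = False
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: no change at step 0
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # y = base(x) + scale * x A^T B^T  (B @ A is never materialized)
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(nn.Linear(768, 768), r=8)
y = layer(torch.randn(4, 768))                            # (4, 768)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print("trainable parameters:", trainable)                 # 2 * 8 * 768 = 12288
```

With rank 8, the adapter trains roughly 12K parameters per layer instead of the ~590K in the frozen 768x768 weight, which is the core of the compression and PEFT story in this section.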
4. Low Rank Attention: Efficiency and Safety Applications
Low-rank attention and neuron subspaces, with applications to efficiency and safety in large language models (a Linformer-style attention sketch follows the topic list below).
Key Topics: Attention Bottleneck, Approximate Low-Rank, Efficiency Applications, Safety Implications
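To ground the "Attention Bottleneck" and "Approximate Low-Rank" topics, here is a minimal single-head sketch in the spirit of Linformer: keys and values are projected along the sequence dimension from length n down to k, so the attention map costs O(nk) rather than O(n^2). Shapes, names, and the random projections are assumptions for illustration only.

```python
# Minimal sketch of Linformer-style low-rank attention for a single head.
import torch
import torch.nn.functional as F

def low_rank_attention(Q, K, V, E, Fproj):
    # Q, K, V: (batch, n, d); E, Fproj: (k, n) sequence-length projections.
    K_low = E @ K                                  # (batch, k, d)
    V_low = Fproj @ V                              # (batch, k, d)
    scores = Q @ K_low.transpose(-2, -1) / K.shape[-1] ** 0.5   # (batch, n, k)
    return F.softmax(scores, dim=-1) @ V_low                    # (batch, n, d)

batch, n, d, k = 2, 1024, 64, 128
Q, K, V = (torch.randn(batch, n, d) for _ in range(3))
E, Fproj = torch.randn(k, n) / n ** 0.5, torch.randn(k, n) / n ** 0.5
out = low_rank_attention(Q, K, V, E, Fproj)
print(out.shape)                                   # torch.Size([2, 1024, 64])
```

The same observation, that attention and neuron activity concentrate in low-dimensional subspaces, underlies the safety results discussed in this section.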
5. Going Beyond Low Rank: From Vector Space to Measure Space
Moving beyond traditional low-rank approaches: from vector space to measure space, and toward understanding reasoning.
Key Topics: Beyond Low-Rank, Vector to Measure Space, Reasoning Understanding, Advanced Applications
Key Techniques Covered
Key references from the tutorial slides, covering both foundational texts and cutting-edge research.
Wright & Ma. High-Dimensional Data Analysis with Low-Dimensional Models: Principles, Computation, and Applications. Cambridge University Press, 2022.
Balzano et al. An Overview of Low-Rank Structures in the Training and Adaptation of Large Models. Submitted to IEEE Signal Processing Magazine, 2025.
Huh et al. The Low-Rank Simplicity Bias in Deep Networks. TMLR, 2023.
Hu et al. LoRA: Low-Rank Adaptation of Large Language Models. ICLR, 2022.
Zhao et al. GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection. ICML, 2024.
Zhu et al. APOLLO: SGD-like Memory, AdamW-level Performance. MLSys, 2025.
Jaiswal et al. From Low Rank Gradient Subspace Stabilization to Low-Rank Weights: Observations, Theories, and Applications. ICML, 2025.
Wang et al. Linformer: Self-Attention with Linear Complexity. arXiv, 2020.
Wei et al. Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications. ICML, 2024.
Perin et al. LoX: Low-Rank Extrapolation Robustifies LLM Safety Against Fine-tuning. COLM, 2025.
Wang & Wang. Why Neural Network Can Discover Symbolic Structures with Gradient-based Training: An Algebraic and Geometric Foundation for Neurosymbolic Reasoning. NeuS, 2025.
Acknowledgements
Special thanks to the following researchers for sharing slides and materials reused in this tutorial:
Gabriel J. Perin
University of São Paulo