Most deep-learning reading lists focus on seminal papers or recommend reading entire books just to get started. Neither is the fastest or simplest way into a new field: many people struggle with dense scientific papers and find books too slow.
Below, I’ve* composed a reading list that focuses on reducing friction in learning.
The core focus is on excellent blog posts that have stood the test of time, are well written, and ideally use visualizations. A few book chapters and one paper still made it onto the list, but I wouldn’t recommend reading any book cover to cover. This list is non-exhaustive and isn’t meant to make you an expert in the fields below. Instead, it gives you a solid overview of deep learning and a strong starting point for finding a field you like and researching it further.
I. Foundational Concepts
What is Deep Learning?
A Beginner’s Guide to Deep Learning - Educative offers a comprehensive course suitable for those familiar with Python.
Neural Networks:
Neural Networks and Deep Learning - A free online book providing a gentle introduction to neural networks and deep learning.
Deep Learning with Python: Neural Networks (complete tutorial) - A code-first introduction to building and training neural networks using Python.
Making Deep Learning Go Brrrr From First Principles - An insightful blog post focusing on the practical performance aspects of deep learning.
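To complement the code-first tutorial above, here is a minimal sketch of what a forward pass through a tiny fully connected network looks like in plain Python. The layer sizes, weights, and biases are arbitrary illustrative values, not taken from any of the linked resources.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dense(inputs, weights, biases):
    # Fully connected layer: each output unit is a weighted sum of the
    # inputs plus a bias, passed through a non-linearity.
    columns = zip(*weights)  # weights[i][j] connects input i to output j
    return [
        sigmoid(sum(x * w for x, w in zip(inputs, col)) + b)
        for col, b in zip(columns, biases)
    ]

def forward(x, layers):
    # A network is just dense layers applied in sequence.
    for weights, biases in layers:
        x = dense(x, weights, biases)
    return x

# Toy network: 2 inputs -> 3 hidden units -> 1 output
hidden = ([[0.1, -0.2, 0.3], [0.4, 0.2, -0.1]], [0.0, 0.0, 0.0])
output = ([[0.5], [-0.3], [0.2]], [0.1])
y = forward([1.0, 0.5], [hidden, output])
```

Real frameworks vectorize this with matrix multiplies and add the backward pass, but the underlying structure is the same.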
II. Core Architectures & Concepts
Convolutional Neural Networks (CNNs):
Computing Receptive Fields of CNNs - A Distill publication that visually explains receptive fields in CNNs.
Understanding Convolutions on Graphs - A Distill article that breaks down convolutions in the context of graph neural networks.
Recurrent Neural Networks (RNNs):
Illustrated Guide to Recurrent Neural Networks - A visual guide to RNNs from Towards Data Science.
Visualizing memorization in RNNs - A Distill article with interactive visualizations of how RNNs memorize.
Understanding LSTMs - Christopher Olah’s classic post breaks down the inner workings of LSTMs, a popular RNN variant.
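As a companion to Olah’s gate diagram, here is a toy scalar LSTM cell in plain Python. The weight layout (one `(weight_x, weight_h, bias)` triple per gate) is my own simplification for illustration, not the standard vectorized formulation, and the weight values below are arbitrary.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_cell(x, h_prev, c_prev, W):
    # W maps a gate name -> (weight_x, weight_h, bias); scalar toy version.
    def gate(name, squash):
        wx, wh, b = W[name]
        return squash(wx * x + wh * h_prev + b)

    f = gate("forget", sigmoid)   # how much of the old cell state to keep
    i = gate("input", sigmoid)    # how much of the candidate to write
    o = gate("output", sigmoid)   # how much of the cell state to expose
    g = gate("cand", math.tanh)   # candidate cell state
    c = f * c_prev + i * g        # new cell state mixes old state and candidate
    h = o * math.tanh(c)          # hidden state is a gated view of the cell
    return h, c

# Step the cell over a short input sequence with illustrative weights.
W = {name: (1.0, 0.5, 0.0) for name in ("forget", "input", "output", "cand")}
h, c = 0.0, 0.0
for x in [1.0, -0.5, 0.2]:
    h, c = lstm_cell(x, h, c, W)
```

The separate cell state `c` is what lets gradients flow over long sequences; the gates only modulate it multiplicatively.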
Transformer Networks:
Illustrated Transformer - A highly visual and accessible explanation of the Transformer architecture.
Transformer: A Novel Neural Network Architecture for Language Understanding - The Google AI blog post introducing the Transformer architecture and its significance.
Transformers for Image Recognition at Scale - A Google AI blog post exploring the application of Transformers to image recognition.
Generative Adversarial Networks (GANs):
From GAN to WGAN - Lilian Weng’s blog provides a clear explanation of GANs and their evolution to WGANs.
Understanding Generative Adversarial Networks (GANs) - Towards Data Science offers a comprehensive explanation of GANs and their applications.
Diffusion Models:
What are Diffusion Models? - Lilian Weng provides an in-depth explanation of diffusion models, a powerful class of generative models.
Attention Mechanism:
Attention? Attention! - A comprehensive blog post explaining various attention mechanisms and their applications in different deep learning models.
Attention and Augmented Recurrent Neural Networks - A Distill publication explaining attention mechanisms in the context of RNNs, a concept crucial to many deep learning models.
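The posts above have far better visuals, but the core computation they describe (scaled dot-product attention) fits in a few lines: similarity scores between a query and each key are softmax-normalized and used to weight the values. The toy 2-dimensional query, keys, and values below are arbitrary illustrative numbers.

```python
import math

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    d = len(query)
    # Scaled dot-product score between the query and each key
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    weights = softmax(scores)
    # Output is the attention-weighted average of the value vectors
    dim = len(values[0])
    return [sum(w * v[j] for w, v in zip(weights, values))
            for j in range(dim)]

out = attention(query=[1.0, 0.0],
                keys=[[1.0, 0.0], [0.0, 1.0]],
                values=[[1.0], [0.0]])
```

Because the query matches the first key more closely, the output leans toward the first value vector.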
III. Key Deep Learning Techniques
Activation Functions:
A Comprehensive Guide to Activation Functions in Deep Learning - A detailed exploration of activation functions, their types, significance, and selection for different neural networks.
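For quick reference, the most common activation functions covered in guides like the one above are one-liners in plain Python (the leaky-ReLU slope of 0.01 is a common default, not a fixed rule):

```python
import math

def relu(x):
    # Zeroes out negatives; cheap and the usual default for hidden layers
    return max(0.0, x)

def sigmoid(x):
    # Squashes to (0, 1); common for binary-output units
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    # Squashes to (-1, 1); zero-centered, unlike the sigmoid
    return math.tanh(x)

def leaky_relu(x, alpha=0.01):
    # Like ReLU but keeps a small gradient for negative inputs
    return x if x > 0 else alpha * x
```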
Backpropagation & Gradient Descent:
Backpropagation and Stochastic Gradient Descent - Explains the core algorithms behind training neural networks.
Gradient Descent - Illustrated guide to gradient descent, covering epochs, batch sizes, and iterations.
Why Momentum Really Works - Distill offers a deep dive into the momentum optimization technique, its benefits, and mathematical properties.
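To see the ideas from the gradient-descent and momentum posts in action, here is a sketch that minimizes the one-dimensional quadratic f(x) = (x - 3)², whose minimum is at x = 3. The learning rate, momentum coefficient, and step count are arbitrary illustrative choices.

```python
def grad(x):
    # Derivative of f(x) = (x - 3)^2
    return 2.0 * (x - 3.0)

def momentum_descent(x0, lr=0.1, beta=0.9, steps=200):
    x, velocity = x0, 0.0
    for _ in range(steps):
        # Momentum keeps an exponentially decaying average of past
        # gradients, smoothing oscillations and speeding up progress
        # along consistent directions.
        velocity = beta * velocity - lr * grad(x)
        x += velocity
    return x

x_min = momentum_descent(0.0)
```

With beta=0.0 this reduces to vanilla gradient descent; raising beta lets the iterate "coast" through shallow regions, which is exactly the behavior the Distill post visualizes.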
Optimizers:
Adam: A Method for Stochastic Optimization - A seminal paper introducing the Adam optimizer.
An Overview of Gradient Descent Optimization Algorithms - A comprehensive overview of gradient descent and its variants.
Deep Learning Optimization Algorithms - Discusses various optimization algorithms, including SGD, AdaGrad, RMSProp, and Adam.
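The update rule at the heart of the Adam paper can be sketched in a few lines. This toy version minimizes the same f(x) = (x - 3)² quadratic; the learning rate and step count are chosen only for illustration.

```python
import math

def adam(grad_fn, x0, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad_fn(x)
        m = beta1 * m + (1 - beta1) * g        # 1st moment: running mean of gradients
        v = beta2 * v + (1 - beta2) * g * g    # 2nd moment: running mean of squared gradients
        m_hat = m / (1 - beta1 ** t)           # bias correction for zero-initialized moments
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x

# Minimize f(x) = (x - 3)^2 starting from x = 0
x_min = adam(lambda x: 2.0 * (x - 3.0), 0.0)
```

Dividing by the second-moment estimate gives each parameter its own effective step size, which is why Adam needs far less learning-rate tuning than plain SGD.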
Regularization:
Dropout: A Simple Way to Prevent Neural Networks from Overfitting - The original paper on Dropout.
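The mechanism the paper proposes is simple enough to sketch directly. This is the "inverted dropout" variant commonly used in practice (scaling at training time rather than test time), with a drop probability of 0.5, the paper's suggestion for hidden units.

```python
import random

def dropout(activations, p=0.5, training=True):
    # Inverted dropout: during training, zero each activation with
    # probability p and scale survivors by 1/(1-p) so the expected
    # value is unchanged; at inference time, pass everything through.
    if not training:
        return list(activations)
    return [0.0 if random.random() < p else a / (1 - p)
            for a in activations]

random.seed(0)
out = dropout([1.0] * 1000, p=0.5)  # survivors are scaled to 2.0
```

By randomly removing units, the network cannot rely on any single co-adaptation of features, which is the regularizing effect the paper demonstrates.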
Transfer Learning:
A Comprehensive Hands-on Guide to Transfer Learning with Real-World Applications in Deep Learning - Explains transfer learning and its practical applications in deep learning, with code examples.
IV. Advanced Topics and Trends
Meta-Learning:
Meta-Learning: Learning to Learn Fast - Lilian Weng’s blog post explores meta-learning approaches.
Graph Neural Networks (GNNs):
A Gentle Introduction to Graph Neural Networks - A beginner-friendly introduction to GNNs and their applications.
A Comprehensive Introduction to Graph Neural Networks (GNNs) - A tutorial covering GNN fundamentals and applications.
Reinforcement Learning (RL):
Implementing Deep Reinforcement Learning Models - Lilian Weng’s repository provides practical implementations of classic deep reinforcement learning models using TensorFlow and OpenAI Gym, offering a hands-on way to understand these concepts.
Transformers and Large Language Models (LLMs):
LLM Powered Autonomous Agents - Lilian Weng’s blog post explores LLMs as the core controllers of autonomous agents, showcasing their potential beyond traditional language tasks.
Adversarial Attacks and Robustness:
Exploring the Landscape of Adversarial Attacks on Large Language Models - This blog post examines the vulnerabilities of LLMs to adversarial attacks, highlighting the importance of robustness when deploying these powerful models.
Future of Deep Learning Research:
Future of Deep Learning according to top AI Experts of 2024 - This article summarizes insights from leading AI experts on the future trajectory of deep learning, discussing emerging trends and challenges.
Four trends that changed AI in 2023 - A concise overview of the major trends that shaped AI in 2023, offering context for the current state and possible future directions of deep learning.
*This reading list was compiled and written by Gemini-Flash using an advanced sampling method. It autonomously navigated the web, reading 1000 websites and papers to provide the best resources.