
a massive step by step hands-on guide to literally everything in ML


A-Ravioli/ml-from-scratch


ml-from-scratch

overview

a hands-on guide to machine learning, designed to take you from mathematical foundations to cutting-edge research, and from middle school to PhD level. each topic includes:

  • theoretical foundations with rigorous mathematical treatment
  • hands-on implementation from scratch
  • research connections to current literature
  • socratic questions to deepen understanding
  • curated resources from top institutions

curriculum structure

year 1: mathematical foundations & classical ML

build rock-solid mathematical foundations and master classical machine learning.

quarter 1-2: mathematical foundations

  • real & functional analysis
  • linear algebra & matrix theory
  • probability theory & stochastic processes
  • optimization theory

quarter 3-4: statistical learning & classical ML

  • PAC learning & VC theory
  • kernel methods
  • tree-based methods
  • bayesian inference

year 2: deep learning & advanced optimization

master neural networks from theory to practice.

quarter 1: neural network theory

  • approximation theory
  • backpropagation calculus
  • initialization & normalization

quarter 2: architectures

  • CNNs, RNNs, transformers
  • graph neural networks
  • attention mechanisms

quarter 3: optimization

  • first & second-order methods
  • variance reduction
  • distributed optimization

quarter 4: research project

  • reproduce 3 seminal papers
  • original contribution

year 3: specialized areas

deep dive into advanced topics.

quarter 1: generative models

  • VAEs, GANs, normalizing flows
  • diffusion models & score matching
  • energy-based models
  • flow matching

quarter 2: reinforcement learning

  • tabular methods to deep RL
  • model-based RL
  • multi-agent systems
  • offline RL

quarter 3: specialization track

  • choose your research focus area

quarter 4: research

  • conference paper submission

year 4: frontier research

push the boundaries of ML.

quarter 1-2: advanced topics

  • large language models
  • multimodal learning
  • AI alignment & safety
  • mechanistic interpretability

quarter 3-4: dissertation

  • original research contributions
  • multiple paper submissions

learning philosophy

  1. first principles: derive everything from mathematical foundations
  2. implementation first: code before using libraries
  3. theory ↔ practice: connect mathematical insights to empirical results
  4. research mindset: question assumptions, propose extensions

how to use this curriculum

  1. follow the order: topics build on each other systematically
  2. complete exercises: implementation is crucial for understanding
  3. read papers: each lesson connects to research literature
  4. test understanding: answer socratic questions before moving on
  5. build projects: apply knowledge to increasingly complex problems

prerequisites

  • basic python programming
  • high school mathematics
  • dedication to deep understanding

getting started

begin with 00-mathematical-foundations/01-analysis/01-real-analysis/ and work through systematically. each directory contains:

  • lesson.md: theory and concepts
  • exercise.py: implementation template
  • test_implementation.py: verify your solution
  • solutions/: reference implementations (try solo first!)
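to give a feel for the exercise-then-test workflow, here is a minimal sketch of what an exercise and its check might look like. everything in it is hypothetical: the function name `numerical_derivative`, the test name, and the tolerance are illustrative assumptions, not the actual contents of any exercise.py or test_implementation.py in this repository.

```python
# Hypothetical sketch of an exercise / test pair, combined in one file.
# None of these names come from the repository; they are illustrative only.
import math


def numerical_derivative(f, x, h=1e-5):
    """Central-difference approximation of f'(x) with step size h."""
    return (f(x + h) - f(x - h)) / (2 * h)


def test_numerical_derivative():
    # d/dx sin(x) = cos(x); the central difference has O(h^2) truncation error,
    # so a loose tolerance of 1e-8 is enough for h = 1e-5.
    for x in (0.0, 0.5, 1.0):
        assert abs(numerical_derivative(math.sin, x) - math.cos(x)) < 1e-8


if __name__ == "__main__":
    test_numerical_derivative()
    print("all checks passed")
```

the idea is the same as in the directory layout above: write the implementation yourself first, then run the checks, and only open solutions/ once the checks pass or you are truly stuck.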

topics covered

core areas

  • statistical learning theory
  • optimization (convex & non-convex)
  • deep learning architectures
  • generative models (VAEs, GANs, diffusion, flows)
  • reinforcement learning
  • natural language processing
  • computer vision

advanced topics

  • neural ODEs/SDEs
  • geometric deep learning
  • meta-learning
  • continual learning
  • causal representation learning
  • quantum machine learning
  • energy-based models
  • flow matching
  • mechanistic interpretability

theoretical foundations

  • approximation theory
  • optimization landscape analysis
  • generalization theory
  • information geometry
  • optimal transport

resources

each lesson includes:

  • original papers
  • lecture notes from stanford, MIT, berkeley
  • video lectures
  • implementation tutorials
  • advanced reading for deeper dives

contributing

this curriculum is a living document. contributions welcome for:

  • additional exercises
  • clearer explanations
  • new research connections
  • bug fixes in implementations

license

MIT license - learn freely, build amazing things!
