Intro - What is ML

Machine learning is the idea of calibrating a mathematical function against known input-output pairs so that it is correct on them. We can then feed it new inputs and produce correct outputs that we did not previously have.
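To make this concrete, here is a minimal sketch of that idea: we "calibrate" a one-parameter function f(x) = w·x against a handful of known input-output pairs using gradient descent, then apply it to a new input. The data, learning rate, and iteration count are illustrative assumptions, not anything specific to a particular framework.

```python
# Known input-output pairs; here the hidden rule is y = 2x (an assumption
# for illustration).
pairs = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

w = 0.0    # initial guess for the parameter
lr = 0.01  # learning rate (hand-picked for this toy problem)

for _ in range(500):
    # Gradient of the mean squared error (w*x - y)^2 with respect to w
    grad = sum(2 * (w * x - y) * x for x, y in pairs) / len(pairs)
    w -= lr * grad  # step the parameter against the gradient

print(round(w, 2))       # w converges to ~2.0
print(round(w * 10, 1))  # a new input, 10, maps to an output near 20.0
```

Every ML framework discussed below is, at heart, industrial-strength machinery for this loop: compute an error, compute its gradient, nudge the parameters, repeat.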

Infrastructure exists to support this kind of “training.” These ML frameworks include, but are not limited to, PyTorch, TensorFlow, JAX, Keras, and Scikit-learn, with NumPy as the numerical foundation many of them build on. Here’s a quick link describing the differences between them: LINK

PyTorch is currently the most popular open-source ML framework, with a 63% adoption rate, according to the Linux Foundation’s Shaping the Future of Generative AI 2024 report. Online PyTorch courses and resources are overwhelmingly focused on teaching the library’s interface. Despite the code being open source, PyTorch can feel like a black box if you lack the prerequisite knowledge to navigate its codebase on GitHub.

We implemented PyTorch from scratch and wanted to create a resource that walks through the full pipeline, presenting the math and the code together. We go into depth in each section, directly addressing PyTorch’s internals rather than just its interface, so a reader can learn by watching us build rather than by reading theory alone (the resource we would have wanted ourselves).

The north star for this entire endeavour is to give the reader a working mental model of the entire ML pipeline. Our combination of abstract understanding, mathematical toy examples, and code is what we believe sets this blog apart.

Expect to learn:

  1. How a model is trained
  2. The optimizations that ML frameworks perform for you
  3. The different types of models that can be trained, with particular attention to building a strong mental model of Transformer internals

Let’s go.