Large Language Models
May 5, 2026

Workshop on Building a Language Model from Scratch Using nanoGPT

AI Summary

A workshop is being offered to teach participants how to build a language model from scratch using the nanoGPT framework. The session focuses on building a simplified version of GPT-2, with attendees training a text-generating model on a personal computer in under an hour.

The workshop guides participants through writing a complete GPT training pipeline, emphasizing hands-on experience with each component.

Using the nanoGPT framework, the workshop aims to reproduce a smaller version of GPT-2 with approximately 10 million parameters, which can be trained on a laptop in less than an hour.
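The parameter count can be estimated with back-of-the-envelope arithmetic. The configuration values below are assumptions modeled on nanoGPT's character-level Shakespeare example (6 layers, embedding width 384, a ~65-character vocabulary, context length 256); bias terms and layer norms are ignored for simplicity:

```python
# Rough parameter count for a small GPT (assumed config, see lead-in).
n_layer, n_embd, vocab_size, block_size = 6, 384, 65, 256

# Per transformer block: attention projections (Q, K, V, output)
# contribute 4 * n_embd^2 weights; the MLP with a 4x hidden width
# contributes 8 * n_embd^2, for roughly 12 * n_embd^2 per block.
per_block = 12 * n_embd ** 2
embeddings = vocab_size * n_embd + block_size * n_embd

total = n_layer * per_block + embeddings
print(f"{total / 1e6:.1f}M parameters")  # roughly 10.7M
```

With these assumed values the transformer blocks dominate; the embeddings add only about 0.1M parameters on top.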

Training can run on Apple Silicon GPUs, NVIDIA GPUs, or CPUs, and participants can also train in Google Colab.
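The backend fallback described above is typically automated. The sketch below shows one common pattern in PyTorch training scripts; `pick_device` is a hypothetical helper for illustration, not a function from nanoGPT:

```python
def pick_device() -> str:
    """Pick the fastest available backend, falling back to CPU.

    A minimal sketch of device auto-selection (assumed pattern,
    not nanoGPT's exact code).
    """
    try:
        import torch
        if torch.cuda.is_available():      # NVIDIA GPU
            return "cuda"
        mps = getattr(torch.backends, "mps", None)
        if mps is not None and mps.is_available():  # Apple Silicon GPU
            return "mps"
    except ImportError:
        pass  # torch not installed: plain CPU it is
    return "cpu"

print(pick_device())
```

The returned string can be passed straight to `torch.device(...)` or to tensor `.to(...)` calls, so the rest of the training loop stays device-agnostic.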

The workshop covers character-level tokenization, which is suitable for smaller datasets like Shakespeare's texts, and introduces the option to switch to BPE tokenization for larger datasets in later sections.
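Character-level tokenization is simple enough to sketch in a few lines: every distinct character in the corpus becomes one token id. The snippet below is a minimal illustration (the sample string is an assumption, not workshop material):

```python
# Character-level tokenization: one token id per distinct character.
text = "To be, or not to be"

chars = sorted(set(text))                      # the vocabulary
stoi = {ch: i for i, ch in enumerate(chars)}   # char -> id
itos = {i: ch for ch, i in stoi.items()}       # id -> char

def encode(s: str) -> list[int]:
    return [stoi[c] for c in s]

def decode(ids: list[int]) -> str:
    return "".join(itos[i] for i in ids)

ids = encode(text)
assert decode(ids) == text  # round-trips losslessly
print(len(chars), "characters in the vocabulary")
```

BPE tokenizers instead merge frequent character pairs into multi-character tokens, trading a much larger vocabulary for shorter sequences, which is why they suit larger datasets.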

Participants will create essential scripts including model.py, train.py, and generate.py, gaining a comprehensive understanding of the training process.
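The three-script pipeline (define a model, train it, generate from it) can be illustrated end to end with a toy stand-in: a bigram count model instead of a transformer. This is an assumed teaching sketch, not code from the workshop:

```python
import random
from collections import Counter, defaultdict

# Toy stand-in for the model.py / train.py / generate.py pipeline:
# same three stages, but a bigram count model replaces the transformer.
text = "to be or not to be that is the question "

# "Training": count how often each character follows another.
counts = defaultdict(Counter)
for prev, nxt in zip(text, text[1:]):
    counts[prev][nxt] += 1

def generate(start: str, n: int, seed: int = 0) -> str:
    """Sample n characters, each conditioned only on the previous one."""
    rng = random.Random(seed)
    out = start
    for _ in range(n):
        followers = counts[out[-1]]
        chars, weights = zip(*followers.items())
        out += rng.choices(chars, weights=weights)[0]
    return out

print(generate("t", 20))
```

A real GPT replaces the bigram table with a transformer that conditions on the whole context window, but the shape of the pipeline, fit on text then sample token by token, is the same.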

llm, training, machine learning, open source, github