Large Language Models
May 5, 2026

Workshop on Building a Language Model from Scratch Using nanoGPT

AI Summary

A workshop is being offered to teach participants how to build a language model from scratch using the nanoGPT framework. The session focuses on building a simplified version of GPT-2, with attendees training a text-generating model on a personal computer in under an hour.

The workshop guides participants through writing a complete GPT training pipeline, emphasizing hands-on experience with each component.

Using the nanoGPT framework, the workshop aims to reproduce a smaller version of GPT-2 with approximately 10 million parameters, which can be trained on a laptop in less than an hour.
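The parameter count can be estimated with back-of-the-envelope arithmetic. The configuration values below are assumptions modeled on nanoGPT's character-level Shakespeare example (6 layers, embedding width 384, a ~65-character vocabulary, context length 256); bias terms and layer norms are ignored for simplicity:

```python
# Rough parameter count for a small GPT (assumed config, see lead-in).
n_layer, n_embd, vocab_size, block_size = 6, 384, 65, 256

# Per transformer block: attention projections (Q, K, V, output)
# contribute 4 * n_embd^2 weights; the MLP with a 4x hidden width
# contributes 8 * n_embd^2, for roughly 12 * n_embd^2 per block.
per_block = 12 * n_embd ** 2
embeddings = vocab_size * n_embd + block_size * n_embd

total = n_layer * per_block + embeddings
print(f"{total / 1e6:.1f}M parameters")  # roughly 10.7M
```

With these assumed values the transformer blocks dominate; the embeddings add only about 0.1M parameters on top.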

Training can run on Apple Silicon GPUs, NVIDIA GPUs, or CPUs, and participants can also train in Google Colab.
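The backend fallback described above is typically automated. The sketch below shows one common pattern in PyTorch training scripts; `pick_device` is a hypothetical helper for illustration, not a function from nanoGPT:

```python
def pick_device() -> str:
    """Pick the fastest available backend, falling back to CPU.

    A minimal sketch of device auto-selection (assumed pattern,
    not nanoGPT's exact code).
    """
    try:
        import torch
        if torch.cuda.is_available():      # NVIDIA GPU
            return "cuda"
        mps = getattr(torch.backends, "mps", None)
        if mps is not None and mps.is_available():  # Apple Silicon GPU
            return "mps"
    except ImportError:
        pass  # torch not installed: plain CPU it is
    return "cpu"

print(pick_device())
```

The returned string can be passed straight to `torch.device(...)` or to tensor `.to(...)` calls, so the rest of the training loop stays device-agnostic.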

The workshop covers character-level tokenization, which is suitable for smaller datasets like Shakespeare's texts, and introduces the option to switch to BPE tokenization for larger datasets in later sections.
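Character-level tokenization is simple enough to sketch in a few lines: every distinct character in the corpus becomes one token id. The snippet below is a minimal illustration (the sample string is an assumption, not workshop material):

```python
# Character-level tokenization: one token id per distinct character.
text = "To be, or not to be"

chars = sorted(set(text))                      # the vocabulary
stoi = {ch: i for i, ch in enumerate(chars)}   # char -> id
itos = {i: ch for ch, i in stoi.items()}       # id -> char

def encode(s: str) -> list[int]:
    return [stoi[c] for c in s]

def decode(ids: list[int]) -> str:
    return "".join(itos[i] for i in ids)

ids = encode(text)
assert decode(ids) == text  # round-trips losslessly
print(len(chars), "characters in the vocabulary")
```

BPE tokenizers instead merge frequent character pairs into multi-character tokens, trading a much larger vocabulary for shorter sequences, which is why they suit larger datasets.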

Participants will create essential scripts including model.py, train.py, and generate.py, gaining a comprehensive understanding of the training process.
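The three-script pipeline (define a model, train it, generate from it) can be illustrated end to end with a toy stand-in: a bigram count model instead of a transformer. This is an assumed teaching sketch, not code from the workshop:

```python
import random
from collections import Counter, defaultdict

# Toy stand-in for the model.py / train.py / generate.py pipeline:
# same three stages, but a bigram count model replaces the transformer.
text = "to be or not to be that is the question "

# "Training": count how often each character follows another.
counts = defaultdict(Counter)
for prev, nxt in zip(text, text[1:]):
    counts[prev][nxt] += 1

def generate(start: str, n: int, seed: int = 0) -> str:
    """Sample n characters, each conditioned only on the previous one."""
    rng = random.Random(seed)
    out = start
    for _ in range(n):
        followers = counts[out[-1]]
        chars, weights = zip(*followers.items())
        out += rng.choices(chars, weights=weights)[0]
    return out

print(generate("t", 20))
```

A real GPT replaces the bigram table with a transformer that conditions on the whole context window, but the shape of the pipeline, fit on text then sample token by token, is the same.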

llm, training, machine learning, open source, github