
nanoGPT: GPT training and finetuning codebase

Minimal PyTorch implementation for training GPT models.

Live rankings (updated 06:51 AM, trend steady)
Overall rank: #49
AI & ML rank: #28
Stars: 51.8K
Forks: 8.7K
Downloads: 1
Stars (last 7 days): +86
Forks (last 7 days): +16

Learn more about nanoGPT

nanoGPT is a Python-based training framework for GPT-scale language models built on PyTorch. The training loop (train.py) and the model definition (model.py) are each roughly 300 lines, and pretrained GPT-2 weights from OpenAI can be loaded as a starting point. The codebase handles data preprocessing, distributed training on multi-GPU setups, and checkpoint management, with optional Weights & Biases logging. It is used for everything from training character-level models on small datasets to reproducing GPT-2 (124M parameters) on large text corpora such as OpenWebText.
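To make the data-preprocessing step concrete, the sketch below shows the pattern the repo's per-dataset prepare.py scripts follow: encode raw text with the GPT-2 byte-pair encoding (via tiktoken) and write the token ids as a flat uint16 binary that the training loop can memory-map later. The file names input.txt, train.bin, and val.bin and the 90/10 split are assumptions for this example, not guaranteed repo defaults.

import numpy as np
import tiktoken

# GPT-2 byte-pair encoding; its 50,257-token vocabulary fits in uint16
enc = tiktoken.get_encoding("gpt2")

with open("input.txt", "r", encoding="utf-8") as f:
    text = f.read()

ids = enc.encode_ordinary(text)          # plain BPE ids, no special tokens
split = int(0.9 * len(ids))              # simple 90/10 train/val split
np.array(ids[:split], dtype=np.uint16).tofile("train.bin")
np.array(ids[split:], dtype=np.uint16).tofile("val.bin")
print(f"wrote {split} train tokens and {len(ids) - split} val tokens")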

nanoGPT

1. Minimal codebase
The core training and model logic is contained in two files of roughly 300 lines each, making the implementation straightforward to understand and modify without abstraction layers.

2. Pretrained weight loading
Can load official GPT-2 weights from OpenAI and finetune them on custom datasets, with checkpoints up to the 1.5B-parameter gpt2-xl available as a starting point (see the sketch after this list).

3. Multi-GPU training
Supports distributed training across multiple GPUs, with configuration files for different hardware setups ranging from CPU-only machines to multi-A100 nodes.
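
As a hedged illustration of feature 2: inside the repo this step is driven by the training configuration (an init_from setting that names a GPT-2 checkpoint), but the same starting point can be reproduced standalone with the transformers and tiktoken packages from the install command below. The prompt and the choice of the 124M "gpt2" checkpoint here are assumptions for the example, not the repo's code.

import torch
import tiktoken
from transformers import GPT2LMHeadModel

# Pull OpenAI's released GPT-2 weights; "gpt2" is the 124M model, and
# "gpt2-medium", "gpt2-large", "gpt2-xl" (1.5B) are the larger checkpoints.
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

enc = tiktoken.get_encoding("gpt2")      # same BPE vocabulary GPT-2 was trained with
ids = torch.tensor([enc.encode("nanoGPT is a minimal codebase for")])

with torch.no_grad():
    out = model(input_ids=ids, labels=ids)   # next-token loss on the prompt
print(f"pretrained LM loss: {out.loss.item():.3f}")

From here, finetuning is the usual loop of forward pass, loss, backward pass, and optimizer step over a custom dataset. To run the repo itself, clone it and install its dependencies: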


git clone https://github.com/karpathy/nanoGPT.git
cd nanoGPT
pip install torch numpy transformers datasets tiktoken wandb tqdm
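
Once train.bin and val.bin exist (see the preprocessing sketch above), the roughly 300-line training loop samples batches straight from those files. Below is a hedged sketch of that memmap-based batch-loading pattern; block_size, batch_size, and the file path are arbitrary illustrative choices, not the repo's defaults.

import numpy as np
import torch

block_size = 256    # context length for this example
batch_size = 8

def get_batch(bin_path: str, device: str = "cpu"):
    # Memory-map the flat uint16 token array so even a multi-gigabyte
    # corpus (e.g. OpenWebText) never has to fit in RAM.
    data = np.memmap(bin_path, dtype=np.uint16, mode="r")
    ix = torch.randint(len(data) - block_size, (batch_size,))
    x = torch.stack([torch.from_numpy(data[i:i + block_size].astype(np.int64)) for i in ix])
    y = torch.stack([torch.from_numpy(data[i + 1:i + 1 + block_size].astype(np.int64)) for i in ix])
    return x.to(device), y.to(device)   # inputs and one-token-shifted targets

xb, yb = get_batch("train.bin")
print(xb.shape, yb.shape)   # (batch_size, block_size) each

The multi-GPU path (feature 3 above) wraps this same loop in PyTorch's DistributedDataParallel and is typically launched with torchrun, one process per GPU.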

