LLaMA-Factory: Unified fine-tuning for 100+ LLMs
Parameter-efficient fine-tuning framework for 100+ LLMs.
LLaMA-Factory is a fine-tuning framework that supports a wide range of large language model (LLM) and vision-language model (VLM) architectures. It implements parameter-efficient fine-tuning methods such as LoRA and QLoRA alongside full fine-tuning, with quantization support to reduce memory requirements. The framework exposes both a CLI and a web UI, letting users configure and run training across different hardware setups. Common deployment contexts include local machine training, cloud platforms like Google Colab and Alibaba PAI, and integration with production systems at organizations like Amazon and NVIDIA.
Multi-model support
Handles over 100 different LLM and VLM architectures including LLaMA, Qwen, Gemma, and DeepSeek through a unified configuration system, reducing the need for model-specific implementations.
Parameter-efficient methods
Implements multiple fine-tuning approaches including LoRA, QLoRA, and full fine-tuning with quantization support, enabling training on hardware with limited VRAM through adapter-based techniques.
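To illustrate why adapter-based methods fit on limited VRAM, here is a minimal sketch of the LoRA idea in plain NumPy. This is not LLaMA-Factory's internal API; the function names and dimensions are illustrative. A frozen weight `W` (shape `d_out × d_in`) is augmented with a low-rank update `B @ A`, and only `A` and `B` are trained:

```python
import numpy as np

def lora_trainable_params(d_out: int, d_in: int, r: int) -> int:
    """Trainable parameters for one LoRA-adapted linear layer (A and B only)."""
    return r * d_in + d_out * r

def lora_forward(x, W, A, B, alpha: float, r: int):
    """y = x @ W.T + (alpha / r) * x @ A.T @ B.T  -- W stays frozen."""
    return x @ W.T + (alpha / r) * (x @ A.T) @ B.T

# A 4096x4096 projection, typical of a 7B model's attention layers:
full = 4096 * 4096                                # params if fully fine-tuned
lora = lora_trainable_params(4096, 4096, r=8)     # params with rank-8 LoRA
print(f"LoRA trains {lora / full:.4%} of the layer's parameters")
```

With rank 8, the adapter trains well under 1% of the layer's weights, which is what makes single-GPU fine-tuning of 7B+ models practical.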
Dual interface design
Provides both command-line and web UI (Gradio-based) access points, allowing users to choose between programmatic control and interactive configuration for training workflows.
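The two entry points can be sketched as follows. The command names come from the project's documentation; the YAML path is a placeholder in the style of the configs shipped in the repository's examples directory:

```shell
# Programmatic control: run a training job from a YAML config
llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml

# Interactive configuration: launch the Gradio-based web UI
llamafactory-cli webui
```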
git clone https://github.com/hiyouga/LLaMA-Factory.git
cd LLaMA-Factory
pip install -e .[metrics]

v0.9.4: Goodbye 2025
- Repository name updated: LLaMA-Factory → LlamaFactory
- Python 3.9–3.10 have been deprecated; LlamaFactory now requires Python 3.11–3.13
- Migrated from pip to uv; use `uv pip install llamafactory`
- The official LlamaFactory blog is now live: https://blog.llamafactory.net/en/
- 🔥 Support Orthogonal Fine-Tuning (OFT) by @zqiu24 in #8623
v0.9.3: Llama4, Gemma3, Qwen3, InternVL3, Qwen2.5-Omni
- Event info: https://aws.amazon.com/cn/events/summits/shanghai/
- 🔥 InternVL2.5/InternVL3 model by @Kuangdd01 in #7258
- 🔥 Qwen2.5-Omni model by @Kuangdd01 in #7537
- 🔥 Llama 4 and Gemma 3 multimodal model by @hiyouga in #7273 and #7611
- 🔥 Official GPU docker image by @yzoaim in #8181
v0.9.2: MiniCPM-o, SwanLab, APOLLO
- Event info: https://mp.weixin.qq.com/s/viPRDlhnzS3qO9-96fMeeA
- 🔥 APOLLO optimizer by @zhuhanqing in #6617
- 🔥 SwanLab experiment tracker by @Zeyi-Lin in #6401
- 🔥 Ray Trainer by @erictang000 in #6542
- Batch inference with vLLM TP by @JieShenAI in #6190
Related Repositories
Discover similar tools and frameworks used by developers
Whisper
Speech recognition system supporting multilingual transcription, translation, and language ID.
Mask2Former
Unified transformer architecture for multi-task image segmentation.
ControlNet
Dual-branch architecture for conditional diffusion model control.
gym
Standard API for reinforcement learning environment interfaces.
open_clip
PyTorch library for contrastive language-image pretraining.