LLaMA-Factory: Unified fine-tuning for 100+ LLMs
Parameter-efficient fine-tuning framework for 100+ LLMs.
LLaMA-Factory is a fine-tuning framework designed to work with a wide range of large language model (LLM) and vision-language model (VLM) architectures. It supports full fine-tuning as well as parameter-efficient methods such as LoRA and QLoRA, and integrates quantization techniques to reduce memory requirements. The framework provides both a CLI and a web UI, allowing users to configure and execute training runs across different hardware setups. Common deployment contexts include local machine training, cloud platforms such as Google Colab and Alibaba PAI, and integration with production systems at organizations such as Amazon and NVIDIA.
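As a rough illustration of that workflow, the commands below fine-tune, chat with, and merge a LoRA adapter using the example configs bundled with the repository. This is a sketch that assumes the package is installed as shown further down and that the example paths (examples/train_lora/llama3_lora_sft.yaml and friends) match your checkout; names can vary between releases.
# Fine-tune Llama 3 with LoRA using a bundled example config.
llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml
# Chat with the trained adapter, then merge it into a standalone model.
llamafactory-cli chat examples/inference/llama3_lora_sft.yaml
llamafactory-cli export examples/merge_lora/llama3_lora_sft.yaml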
Multi-model support
Handles over 100 different LLM and VLM architectures including LLaMA, Qwen, Gemma, and DeepSeek through a unified configuration system, reducing the need for model-specific implementations.
Parameter-efficient methods
Implements multiple fine-tuning approaches, from full fine-tuning to adapter-based methods such as LoRA and QLoRA with quantization support, enabling training on hardware with limited VRAM.
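A QLoRA run is driven by a single YAML file. The sketch below assumes the standard configuration keys (model_name_or_path, finetuning_type, quantization_bit, and so on) and the bundled alpaca_en_demo demo dataset; key names, dataset names, and sensible hyperparameters may differ in your version.
# Write a minimal QLoRA SFT config (values are illustrative, not tuned).
cat > qwen_qlora_sft.yaml <<'EOF'
model_name_or_path: Qwen/Qwen2.5-7B-Instruct
stage: sft
do_train: true
finetuning_type: lora
lora_target: all
quantization_bit: 4            # load the base model in 4-bit for QLoRA
dataset: alpaca_en_demo
template: qwen
cutoff_len: 1024
output_dir: saves/qwen2.5-7b-qlora-sft
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 1.0e-4
num_train_epochs: 3.0
bf16: true
EOF
llamafactory-cli train qwen_qlora_sft.yaml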
Dual interface design
Provides both a command-line interface and a Gradio-based web UI, allowing users to choose between programmatic control and interactive configuration of training workflows.
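The interactive board is started with one command; the port override via GRADIO_SERVER_PORT is an assumption based on Gradio's standard environment variables.
# Launch the Gradio-based web UI for interactive configuration.
llamafactory-cli webui
# Optionally serve it on a different port (Gradio convention, assumed).
GRADIO_SERVER_PORT=7861 llamafactory-cli webui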
git clone https://github.com/hiyouga/LLaMA-Factory.git
cd LLaMA-Factory
pip install -e .[metrics]
Recent releases
Adds 30+ new models (Llama 4, Gemma 3, Qwen3, InternVL3), SGLang inference, and an official Docker image; switches the quantization backend from AutoGPTQ to GPTQModel (breaking change).
- Switch the quantization library from AutoGPTQ to GPTQModel; update dependencies if using GPTQ models.
- Use the official GPU Docker image at hiyouga/llamafactory, or enable SGLang inference for faster serving (see the sketch after this list).
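A minimal sketch of running the web UI from the official image; the image tag, exposed port, mount path, and the availability of llamafactory-cli as a command inside the container are assumptions.
# Run the GPU image and expose the Gradio web UI on the host (assumed defaults).
docker run -it --rm --gpus all \
  -p 7860:7860 \
  -v "$(pwd)/saves:/app/saves" \
  hiyouga/llamafactory:latest \
  llamafactory-cli webui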
Major refactor of VLM, template, and data pipelines; upgrades transformers to 4.49 and vLLM to 0.7.2. Adds APOLLO optimizer, Ray Trainer, and 30+ new model variants including DeepSeek R1/V3.
- Upgrade transformers to 4.49 and vLLM to 0.7.2; pin the tokenizers version to avoid compatibility issues.
- The refactored VLM registry, template system, and data pipeline may require config adjustments; the git history was cleaned with BFG (a backup repo is available).
Adds Llama-3.2-Vision, LLaVA-NeXT, Pixtral, and 40+ new model variants; fixes CVE-2024-52803 and abnormal loss in transformers 4.46.
- Pin transformers to 4.46.0–4.46.1 to pick up the gradient accumulation fix and avoid loss calculation bugs (see the install sketch after this list).
- Use the new multi-image inference and effective-token calculation for SFT/DPO; Qwen2-VL now supports Liger-Kernel.
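The pinned install could look like the following; the exact patch versions on PyPI are assumed to be available.
# Pin transformers to the 4.46.0–4.46.1 range recommended for this release.
pip install "transformers>=4.46.0,<=4.46.1"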
Related Repositories
Discover similar tools and frameworks used by developers
yolov5
Real-time object detection with cross-platform deployment support.
stable-diffusion
CLIP-conditioned latent diffusion model for text-to-image synthesis.
CodeFormer
Transformer-based face restoration using vector-quantized codebook lookup.
ByteTrack
Multi-object tracker associating low-confidence detections across frames.
opencv
Cross-platform C++ library for real-time computer vision algorithms.