LlamaFactory Blog

Issues Related to the Qwen3-VL Model

This blog post focuses on several practical issues encountered with the Qwen3-VL model, along with an analysis of their root causes and the corresponding solutions.

1. Slow Training and Inference Speed of Qwen3-VL

Problem: Several posts and GitHub issues report that under torch=2.9, the training and inference speed of Qwen3-VL models that use Conv3D degrades significantly compared to torch=2.8. See the related discussion at: https://github.com/pytorch/pytorch/issues/166122

1.1 Comparing CUDA Kernel Invocations

We first compared the CUDA kernels invoked by Conv3D under torch=2.8 and torch=2.9. The test code is shown below: ...
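The post's actual test code is truncated in this excerpt. As a rough sketch of how such a kernel comparison could be done (not the author's code), the snippet below profiles the CUDA kernels dispatched by a single Conv3D forward pass using torch.profiler; the layer and input shapes are illustrative, not the exact ones from the post.

```python
# Hedged sketch: profile the CUDA kernels dispatched by one Conv3D forward
# pass, to compare the output under torch==2.8 and torch==2.9.
import torch
from torch.profiler import profile, ProfilerActivity

conv = torch.nn.Conv3d(3, 64, kernel_size=3, padding=1).cuda().half()
x = torch.randn(1, 3, 16, 224, 224, device="cuda", dtype=torch.half)

# Warm up so cuDNN autotuning does not pollute the measurement.
for _ in range(3):
    conv(x)
torch.cuda.synchronize()

with profile(activities=[ProfilerActivity.CUDA]) as prof:
    conv(x)
    torch.cuda.synchronize()

# Per-kernel CUDA times; run under both torch versions and diff the tables.
print(prof.key_averages().table(sort_by="cuda_time_total", row_limit=10))
```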

January 5, 2026 · 3 min · 490 words · hiyouga

RL-DPO Training with KTransformers and LLaMA-Factory

This tutorial demonstrates how to fine-tune a language model with the LLaMA-Factory framework using Direct Preference Optimization (DPO). DPO is a preference-based training method that aligns model outputs more closely with human expectations, making them more user-centric.

1 Environment Setup

Software and hardware requirements: the CPU must support AMX, the system glibc version must be ≥ 2.32, and a GPU with at least 32 GB of VRAM is recommended. ...
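For context on what DPO optimizes, here is a minimal sketch of the standard DPO loss (Rafailov et al., 2023). It is not LLaMA-Factory's or KTransformers' implementation, and it assumes per-sequence log-probabilities have already been computed elsewhere.

```python
# Minimal sketch of the standard DPO objective, not LLaMA-Factory's code.
# Inputs are summed per-sequence log-probs of the chosen/rejected responses
# under the trainable policy and the frozen reference model.
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    # Implicit reward: log-ratio of policy vs. reference for each response.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Maximize the chosen-vs-rejected margin via a logistic loss.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

# Toy usage with random log-probs for a batch of 4 preference pairs.
lp = lambda: torch.randn(4)
print(dpo_loss(lp(), lp(), lp(), lp()))
```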

December 23, 2025 · 4 min · 811 words · hiyouga

Add New Special Tokens for Model Training

1 Introduction

This post uses the Ministral-3-3B-Instruct-2512 model, taking an image classification task fine-tuned via SFT as an example, to illustrate how to add new special tokens. The experimental command is as follows:

```
# install newest transformers
pip install git+https://github.com/huggingface/transformers
DISABLE_VERSION_CHECK=1 CUDA_VISIBLE_DEVICES=7 python src/train.py examples/train_lora/ministral3_lora_sft.yaml
```

The file ministral3_lora_sft.yaml must be configured in advance.

2 Dataset Loading and Preprocessing

In the file LLaMA-Factory/src/llamafactory/data/loader.py, the get_dataset function is responsible for loading the dataset and preprocessing the data with the tokenizer. ...
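For orientation, the snippet below shows the standard Hugging Face pattern for registering new special tokens and resizing the embedding matrix to match. The token strings and checkpoint name are placeholders for illustration, not the exact ones used in the post.

```python
# Hedged sketch of the usual transformers pattern for adding special tokens;
# the checkpoint and token strings below are illustrative placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "mistralai/Ministral-8B-Instruct-2410"  # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Register the new special tokens with the tokenizer.
num_added = tokenizer.add_special_tokens(
    {"additional_special_tokens": ["<image_cls>", "<image_sep>"]}
)

# Grow the embedding matrix so the new token ids have rows to look up.
if num_added > 0:
    model.resize_token_embeddings(len(tokenizer))
```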

December 17, 2025 · 9 min · 1732 words · hiyouga

Adapting a New Model in LLaMA-Factory

1 Overview of Model Adaptation

LLaMA-Factory offers a complete framework for model pre-training, fine-tuning, and inference. To adapt a new model, only a small amount of code needs to be modified to integrate it into LLaMA-Factory.

First, the file LLaMA-Factory/src/llamafactory/extras/constants.py defines the supported model groups and their corresponding templates. A template is a “format specifier” used when constructing the input prompt for the large model: it defines the dialogue format, field structure, role order, and the format for tool calls. For example: ...
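To make the “format specifier” idea concrete, here is a toy illustration of how a template turns a role-tagged conversation into a prompt string. This dataclass is hypothetical and is not LLaMA-Factory's actual Template class or registration API.

```python
# Toy illustration of a chat template as a "format specifier"; hypothetical,
# not LLaMA-Factory's real Template implementation.
from dataclasses import dataclass

@dataclass
class ToyTemplate:
    system_format: str = "<|system|>\n{content}\n"
    user_format: str = "<|user|>\n{content}\n"
    assistant_format: str = "<|assistant|>\n{content}\n"

    def render(self, messages):
        # Each message is a (role, content) pair; pick the per-role format.
        parts = []
        for role, content in messages:
            fmt = getattr(self, f"{role}_format")
            parts.append(fmt.format(content=content))
        return "".join(parts)

template = ToyTemplate()
print(template.render([
    ("system", "You are a helpful assistant."),
    ("user", "Hello!"),
]))
```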

December 12, 2025 · 8 min · 1664 words · hiyouga

Code Guide for the LLaMA-Factory Project

1 Introduction to the LLaMA-Factory Project

LLaMA-Factory is an efficient training and fine-tuning framework designed for large language models (LLMs). It aims to simplify the training workflow of the LLaMA family as well as a wide range of other open-source large models. Built around the core philosophy of being “out-of-the-box, flexible, and efficient,” it provides an end-to-end solution covering data preparation, parameter-efficient fine-tuning (PEFT), training configuration management, and model deployment. LLaMA-Factory supports multiple mainstream model architectures (such as LLaMA, Qwen, Gemma, and Mistral) and integrates lightweight training techniques including LoRA, QLoRA, AdaLoRA, and Prompt Tuning. These capabilities enable developers to fine-tune high-quality models at extremely low cost, in both single-GPU and multi-GPU environments. ...
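As a hedged illustration of the LoRA technique mentioned above (not LLaMA-Factory's own code path), the sketch below wraps a base model with a peft LoraConfig so that only the low-rank adapter weights are trained; the checkpoint name and hyperparameters are illustrative.

```python
# Hedged sketch of LoRA via the peft library; checkpoint and hyperparameters
# are illustrative, and this is not LLaMA-Factory's internal implementation.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B")
lora_config = LoraConfig(
    r=8,                                  # rank of the low-rank update
    lora_alpha=16,                        # scaling factor for the update
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```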

December 5, 2025 · 11 min · 2277 words · hiyouga