Fine-Tuning the Latest Qwen3.5 Model to Identify Humanoid Robot Models Using LlamaFactory

At the beginning of 2026, from the Consumer Electronics Show (CES) in Las Vegas to the China Central Television (CCTV) Spring Festival Gala, China's self-developed humanoid robots have repeatedly broken into the mainstream. Products and applications from multiple Chinese companies have not only sparked discussion across the overseas industry but have also swept global social media and international media coverage. Embodied intelligence, widely regarded as the next stage of artificial intelligence, centers on deeply coupling an intelligent "brain" with a physical "body," directly transforming data, algorithms, and computing power into the ability to act on and reshape the physical world. With their human-like appearance and capabilities, humanoid robots are considered the most advanced form and the optimal carrier of embodied intelligence, poised to become the next-generation super terminal after smartphones and new energy vehicles. ...

March 3, 2026 · 7 min · 1414 words · hiyouga

Issues Related to the Qwen3-VL Model

This blog post examines several practical issues with the Qwen3-VL model, along with an analysis of their root causes and corresponding solutions. 1. Slow Training and Inference Speed of Qwen3-VL Problem: Several posts and GitHub issues report that with torch==2.9, Conv3D-based training and inference in Qwen3-VL is significantly slower than with torch==2.8. See the related discussion at: https://github.com/pytorch/pytorch/issues/166122 1.1 Comparing CUDA Kernel Invocations We first compared the CUDA kernels that Conv3D dispatches to under torch==2.8 and torch==2.9. The test code is shown below: ...
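The post's actual test code is truncated above, but a kernel-comparison harness along these lines can be sketched with `torch.profiler`: run the same `Conv3d` forward pass under each torch version and diff the kernel tables. The layer sizes and tensor shapes below are illustrative assumptions, not the post's original test code.

```python
# Minimal sketch: list the kernels a Conv3d forward pass dispatches to under the
# installed torch version, so the output can be diffed between torch 2.8 and 2.9.
# Falls back to CPU profiling when no GPU is available; shapes are illustrative.
import torch
from torch.profiler import profile, ProfilerActivity

device = "cuda" if torch.cuda.is_available() else "cpu"
conv = torch.nn.Conv3d(in_channels=3, out_channels=8, kernel_size=3).to(device)
x = torch.randn(1, 3, 8, 32, 32, device=device)  # (N, C, D, H, W)

activities = [ProfilerActivity.CPU]
if device == "cuda":
    activities.append(ProfilerActivity.CUDA)

with profile(activities=activities) as prof:
    with torch.no_grad():
        y = conv(x)

# Print torch version and the invoked kernels sorted by self time; running this
# script under each torch install and comparing the tables reveals which kernel
# selection changed between versions.
print(torch.__version__)
print(prof.key_averages().table(sort_by="self_cpu_time_total", row_limit=10))
```

Running the same script in a torch 2.8 environment and a torch 2.9 environment and comparing the two tables is enough to spot a kernel-selection regression without instrumenting the full Qwen3-VL model.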

January 5, 2026 · 3 min · 490 words · hiyouga

RL-DPO Training with KTransformers and LLaMA-Factory

This tutorial demonstrates how to fine-tune a language model with the LLaMA-Factory framework using Direct Preference Optimization (DPO). DPO is a preference-based training method that aligns model outputs more closely with human expectations, making them more user-centric. 1 Environment Setup Software and hardware requirements: the CPU must support AMX, the system glibc version must be ≥ 2.32, and a GPU with at least 32 GB of VRAM is recommended. ...

December 23, 2025 · 4 min · 811 words · hiyouga

Add New Special Tokens for Model Training

1 Introduction This post uses the Ministral-3-3B-Instruct-2512 model and an image classification task fine-tuned via SFT as an example to illustrate how to add new special tokens. The experimental commands are as follows: # install newest transformers pip install git+https://github.com/huggingface/transformers DISABLE_VERSION_CHECK=1 CUDA_VISIBLE_DEVICES=7 python src/train.py examples/train_lora/ministral3_lora_sft.yaml The file ministral3_lora_sft.yaml must be configured in advance. 2 Dataset Loading and Preprocessing In the file LLaMA-Factory/src/llamafactory/data/loader.py, the get_dataset function is responsible for loading the dataset and preprocessing the data with the tokenizer. ...

December 17, 2025 · 9 min · 1732 words · hiyouga

Adapting a New Model in LLaMA-Factory

1 Overview of Model Adaptation LLaMA-Factory offers a complete framework for model pre-training, fine-tuning, and inference. To adapt a new model, only a small amount of code needs to be modified to integrate it into LLaMA-Factory. First, the file LLaMA-Factory/src/llamafactory/extras/constants.py defines the supported model groups and their corresponding templates. A template is a "format specifier" used when constructing the input prompt for the large model: it defines the dialogue format, field structure, role order, and the format for tool calls. For example: ...

December 12, 2025 · 8 min · 1664 words · hiyouga