Fine-Tuning the Latest Qwen3.5 Model to Identify Humanoid Robot Models Using LlamaFactory

At the beginning of 2026, from the Consumer Electronics Show (CES) in Las Vegas to the China Central Television (CCTV) Spring Festival Gala, China's self-developed humanoid robots have repeatedly broken into the mainstream. Products and applications from multiple Chinese companies have not only sparked discussion across the overseas industry but have also swept global social media and international media coverage. Embodied intelligence, widely regarded as the next stage of artificial intelligence, centers on deeply coupling an intelligent "brain" with a physical "body," directly transforming data, algorithms, and computing power into the ability to act on and reshape the physical world. With their human-like appearance and capabilities, humanoid robots are considered the most advanced form and the optimal carrier of embodied intelligence, poised to become the next-generation super terminal after smartphones and new energy vehicles. ...

March 3, 2026 · 7 min · 1414 words · hiyouga

Issues Related to the Qwen3-VL Model

This blog post examines several practical issues with the Qwen3-VL model, along with an analysis of their root causes and corresponding solutions. 1. Slow Training and Inference Speed of Qwen3-VL Problem: Several posts and GitHub issues report that with torch==2.9, Conv3D-based training and inference in Qwen3-VL is significantly slower than with torch==2.8. See the related discussion at: https://github.com/pytorch/pytorch/issues/166122 1.1 Comparing CUDA Kernel Invocations We first compared the CUDA kernels that Conv3D dispatches to under torch==2.8 and torch==2.9. The test code is shown below: ...
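The post's actual test code is truncated above, but a kernel-comparison harness along these lines can be sketched with `torch.profiler`: run the same `Conv3d` forward pass under each torch version and diff the kernel tables. The layer sizes and tensor shapes below are illustrative assumptions, not the post's original test code.

```python
# Minimal sketch: list the kernels a Conv3d forward pass dispatches to under the
# installed torch version, so the output can be diffed between torch 2.8 and 2.9.
# Falls back to CPU profiling when no GPU is available; shapes are illustrative.
import torch
from torch.profiler import profile, ProfilerActivity

device = "cuda" if torch.cuda.is_available() else "cpu"
conv = torch.nn.Conv3d(in_channels=3, out_channels=8, kernel_size=3).to(device)
x = torch.randn(1, 3, 8, 32, 32, device=device)  # (N, C, D, H, W)

activities = [ProfilerActivity.CPU]
if device == "cuda":
    activities.append(ProfilerActivity.CUDA)

with profile(activities=activities) as prof:
    with torch.no_grad():
        y = conv(x)

# Print torch version and the invoked kernels sorted by self time; running this
# script under each torch install and comparing the tables reveals which kernel
# selection changed between versions.
print(torch.__version__)
print(prof.key_averages().table(sort_by="self_cpu_time_total", row_limit=10))
```

Running the same script in a torch 2.8 environment and a torch 2.9 environment and comparing the two tables is enough to spot a kernel-selection regression without instrumenting the full Qwen3-VL model.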

January 5, 2026 · 3 min · 490 words · hiyouga

RL-DPO Training with KTransformers and LLaMA-Factory

This tutorial demonstrates how to fine-tune a language model with the LLaMA-Factory framework using Direct Preference Optimization (DPO). DPO is a preference-based training method that aligns model outputs more closely with human expectations, making them more user-centric. 1 Environment Setup Software and hardware requirements: the CPU must support AMX, the system glibc version must be ≥ 2.32, and a GPU with at least 32 GB of VRAM is recommended. ...

December 23, 2025 · 4 min · 811 words · hiyouga

Add New Special Tokens for Model Training

1 Introduction This post uses the Ministral-3-3B-Instruct-2512 model and an image classification task fine-tuned via SFT as an example to illustrate how to add new special tokens. The experimental commands are as follows: # install newest transformers pip install git+https://github.com/huggingface/transformers DISABLE_VERSION_CHECK=1 CUDA_VISIBLE_DEVICES=7 python src/train.py examples/train_lora/ministral3_lora_sft.yaml The file ministral3_lora_sft.yaml must be configured in advance. 2 Dataset Loading and Preprocessing In the file LLaMA-Factory/src/llamafactory/data/loader.py, the get_dataset function is responsible for loading the dataset and preprocessing the data with the tokenizer. ...

December 17, 2025 · 9 min · 1732 words · hiyouga

Adapting a New Model in LLaMA-Factory

1 Overview of Model Adaptation LLaMA-Factory offers a complete framework for model pre-training, fine-tuning, and inference. To adapt a new model, only a small amount of code needs to be modified to integrate it into LLaMA-Factory. First, the file LLaMA-Factory/src/llamafactory/extras/constants.py defines the supported model groups and their corresponding templates. A template is a "format specifier" used when constructing the input prompt for the large model: it defines the dialogue format, field structure, role order, and the format for tool calls. For example: ...

December 12, 2025 · 8 min · 1664 words · hiyouga