Training a custom LoRA (Low-Rank Adaptation) model is one of the most effective ways to personalize image generation. It allows you to teach an AI model specific styles, characters, or aesthetics without retraining the entire model from scratch. With Qwen-Image, a powerful image generation foundation model, you now have a strong base to build on. By combining it with ai-toolkit, an open-source tool that simplifies the training process, you can create a LoRA that fine-tunes Qwen-Image to your exact needs. In this article, we’ll walk through how to set up, train, and evaluate a Qwen-Image LoRA using ai-toolkit, along with practical tips for getting the best results.
AI-Toolkit Installation
Requirements:
- Python 3.10 or newer
- Nvidia GPU: you need at least 24 GB of VRAM to train a Qwen-Image LoRA, so an RTX 3090, 4090, or 5090 is required.
- python venv
- git
Windows installation:
git clone https://github.com/ostris/ai-toolkit.git
cd ai-toolkit
python -m venv venv
.\venv\Scripts\activate
pip install torch==2.8.0 torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128
pip install -r requirements.txt
Note that this is different from the instructions on the original repo, because I modified the steps for use with an RTX 50 series card.
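Before moving on, it is worth confirming that the CUDA build of PyTorch is actually active inside the venv. Here is a quick check, a minimal sketch; the 24 GB figure is just the requirement quoted above:

```python
# check_gpu.py - confirm the CUDA build of PyTorch sees your GPU
import torch

print("PyTorch version:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print("GPU:", props.name)
    # Qwen-Image LoRA training needs roughly 24 GB of VRAM
    print(f"VRAM: {props.total_memory / 1024**3:.1f} GB")
```

If CUDA available prints False, the CPU-only wheel was installed; re-run the torch install command with the cu128 index URL.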
Preparing the Dataset for LoRA Training
A well-prepared dataset is the foundation of a good LoRA. Since LoRAs are designed to adapt a base model (like Qwen-Image) to specific characters, styles, or subjects, your dataset needs to be both clean and representative. Here’s how to prepare it:
1. Collect Images
- Aim for 20–200 images depending on the complexity of the subject.
- Use a variety of poses, angles, and lighting conditions to help the LoRA generalize.
- Avoid near-duplicates; too many similar images can cause overfitting.
2. Clean the Images
- Make sure subjects are centered and visible.
- Remove watermarks, logos, or cluttered backgrounds if possible.
3. Caption the Dataset
- Each image needs a caption file (.txt) with a short, accurate description.
- Include key features (e.g., “woman with long black hair wearing a red dress, standing in a garden”).
- Be consistent in your naming and phrasing; this helps the model learn associations.
- Tools like BLIP, Caption Anything, or Hugging Face captioning models can automate first drafts, which you can then refine manually (see the captioning sketch after this list). If you have Kohya SS, you can also use it to help you caption the images.
4. Organize the Folder Structure
- Place all images and captions in a single folder.
- ai-toolkit expects paired image and caption files in that folder (e.g., 001.png + 001.txt).
- Double-check that filenames match exactly; the validation sketch below can automate this check.
5. Balance the Dataset
- If training for a specific character, make sure at least 70–80% of the images are of that character.
- Add 20–30% varied “negative” or contextual images (different outfits, settings) to avoid overfitting.
6. Test the Captions
- Run a few sample generations with your captions as prompts to check whether they describe the images accurately.
- Refine wording before training begins; it is much harder to fix afterward.
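For the first-draft captions mentioned in step 3, a short script around BLIP from Hugging Face transformers works well. This is a minimal sketch, not part of ai-toolkit: the folder path assumes the same dataset folder used later in the config, Salesforce/blip-image-captioning-base is one publicly available checkpoint, and every generated .txt should still be reviewed by hand:

```python
# caption_drafts.py - write a draft BLIP caption next to each training image
# pip install transformers pillow torch
from pathlib import Path

import torch
from PIL import Image
from transformers import BlipForConditionalGeneration, BlipProcessor

IMAGE_DIR = Path(r"D:\ai\Training\character_name\img")  # same folder as the config

device = "cuda" if torch.cuda.is_available() else "cpu"
processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
model = BlipForConditionalGeneration.from_pretrained(
    "Salesforce/blip-image-captioning-base"
).to(device)

for image_path in sorted(IMAGE_DIR.glob("*.png")):
    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt").to(device)
    output_ids = model.generate(**inputs, max_new_tokens=50)
    caption = processor.decode(output_ids[0], skip_special_tokens=True)
    # 001.png -> 001.txt, the pairing ai-toolkit expects
    image_path.with_suffix(".txt").write_text(caption, encoding="utf-8")
    print(image_path.name, "->", caption)
```

Edit each draft so it names the features you want the LoRA to associate with your subject, using consistent phrasing.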
Key Tip: Quality beats quantity. A dataset of 30 clean, well-captioned images will usually outperform 150 messy or inconsistent ones.
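It is also cheap to verify the folder mechanically before a multi-hour run. The helper below is hypothetical (not an ai-toolkit command): it reports images without a matching .txt and flags likely near-duplicates with a small average-hash comparison. The distance threshold is a guess you should tune, and a library such as imagehash does this more robustly:

```python
# check_dataset.py - find missing captions and likely near-duplicate images
from itertools import combinations
from pathlib import Path

from PIL import Image

IMAGE_DIR = Path(r"D:\ai\Training\character_name\img")
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".webp"}

def average_hash(path, size=8):
    """Tiny perceptual hash: downscale to grayscale 8x8, threshold at the mean."""
    img = Image.open(path).convert("L").resize((size, size))
    pixels = list(img.getdata())
    mean = sum(pixels) / len(pixels)
    return tuple(p > mean for p in pixels)

images = sorted(p for p in IMAGE_DIR.iterdir() if p.suffix.lower() in IMAGE_EXTS)

# every image needs a caption file with exactly the same stem
for img in images:
    if not img.with_suffix(".txt").exists():
        print("Missing caption:", img.name)

# hashes differing in only a few bits usually mean near-duplicate images
hashes = {img: average_hash(img) for img in images}
for a, b in combinations(images, 2):
    distance = sum(x != y for x, y in zip(hashes[a], hashes[b]))
    if distance <= 5:  # threshold is a guess; tune it for your data
        print(f"Possible near-duplicate: {a.name} vs {b.name} (distance {distance})")
```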
Qwen-Image LoRA Training Configuration File
An example Qwen-Image LoRA training configuration file is in the config\examples\ folder. The file name is train_lora_qwen_image_24gb.yaml. Make a copy of the file and move it to the config\ folder, then modify it according to your needs. I am listing some lines that I changed.
name: "character_qwen_image_lora_v1"
The name of the LoRA. ai-toolkit uses it to name the output directory and the saved LoRA files.
# trigger_word: "character_name"
I didn’t uncomment this line because a trigger word does not work with caching text embeddings, which is necessary for the 24 GB VRAM setup.
- folder_path: "D:\\ai\\Training\\character_name\\img"
This is the folder where you store the training images and caption files.
steps: 2000
The total number of steps to train. The example file suggests 500–4000 as a good range. For my datasets, I only got good results at around 5250 steps.
name_or_path: "D:\\Qwen-Image"
I downloaded the Qwen-Image model and set its path here. If you don’t want to download the model manually, you can leave it as “Qwen/Qwen-Image”; the model will then be downloaded automatically and saved under your cache directory.
Under the sample section, you can input some prompts used to generate sample images during training. Mine look like this:
- "character_name, an Asian woman in a red cheongsam dress sitting on a bench and posing for the camera" - "character_name, an Asian woman in a white dress posing at a beach"
Each prompt will be used to generate one sample image.
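Put together, the parts of the config I touched look roughly like this. Treat it as an abridged sketch: the surrounding fields and the exact nesting follow the shipped train_lora_qwen_image_24gb.yaml and may differ between ai-toolkit versions:

```yaml
config:
  name: "character_lora_qwen_image_v1"
  process:
    - # ...keep the other fields from the example file...
      # trigger_word: "character_name"  # left commented out; see note above
      datasets:
        - folder_path: "D:\\ai\\Training\\character_name\\img"
      train:
        steps: 2000
      model:
        name_or_path: "D:\\Qwen-Image"  # or "Qwen/Qwen-Image" to auto-download
      sample:
        prompts:
          - "character_name, an Asian woman in a red cheongsam dress sitting on a bench and posing for the camera"
          - "character_name, an Asian woman in a white dress posing at a beach"
```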
Training Command
After you are done with the config file, save it and use this command to start the training
python run.py config\character_lora_qwen_image_v1.yaml
It takes about 2 hours for 2000 steps on my RTX 5090. A directory will be created under output\character_lora_qwen_image_v1, and the sample images are saved under output\character_lora_qwen_image_v1\samples. Look at the sample images to determine which step gives the best results. If you are not satisfied, increase steps in the config file and run the command again; ai-toolkit will continue training from the highest-step LoRA in the directory.
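Once you have picked the best checkpoint, you can sanity-check it outside ai-toolkit. Below is a minimal sketch using Hugging Face diffusers, assuming your installed diffusers version ships Qwen-Image pipeline and LoRA support, and that the checkpoint filename matches what ai-toolkit saved in your output folder (both assumptions; adjust to your setup):

```python
# generate_with_lora.py - quick test of the trained LoRA with diffusers
# pip install diffusers transformers accelerate safetensors
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "Qwen/Qwen-Image", torch_dtype=torch.bfloat16
).to("cuda")

# the directory and filename pattern are assumptions; check your output folder
pipe.load_lora_weights(
    r"output\character_lora_qwen_image_v1",
    weight_name="character_lora_qwen_image_v1.safetensors",
)

image = pipe(
    prompt="character_name, an Asian woman in a red cheongsam dress sitting on a bench",
    num_inference_steps=30,
).images[0]
image.save("lora_test.png")
```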
Qwen-Image LoRA Example
One of the training dataset images:

One image generated using the Qwen-Image LoRA:
Conclusion
Training a LoRA with Qwen-Image and ai-toolkit gives you a practical way to achieve high-quality, customized outputs without needing huge hardware or complicated workflows. With the right dataset and parameters, you can create a model that captures your desired look while fitting seamlessly into your existing image generation setup. Whether you want consistent character likeness, a unique art style, or more control over your outputs, this workflow makes it both achievable and efficient. With these tools, tailoring generative AI to your creative vision has never been easier.