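This is a language model instruction-tuned from Qwen/Qwen3-0.6B-Base. It focuses on processing and generating natural language and tag data related to anime image tagging systems.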
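Model details
- Base model: Qwen/Qwen3-0.6B-Base
- Fine-tuning method: instruction fine-tuning (Instruction SFT)
- Fine-tuning framework: LLaMA-Factory
- Training dataset: 42,268,080 samples across five instructions, averaging 287 tokens each, for a total of about 12.1 billion tokens.
- Training progress: the model has so far been trained on 4,396,032,000 tokens of this dataset.
- Hardware: 3 x NVIDIA GeForce RTX 4090
- Sponsor: Myself
- Context length (cutoff_len): set to 768, which covers 99.5% of the samples in the training set. Because inputs are wrapped in XML, training samples longer than this were discarded outright to avoid breaking the structure.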
- Evaluation loss (Eval Loss):
| Task | Eval Loss |
|---|---|
| eval_nltotag_loss | 0.8972 |
| eval_shorttolong_loss | 1.2120 |
| eval_tagdetail_loss | 0.9317 |
| eval_tagtonl_loss | 1.2363 |
| eval_tagtotag_loss | 0.7396 |
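As a quick sanity check on the figures above, the sketch below recomputes the dataset's total token count and the share trained so far, and converts each eval loss into a perplexity. It assumes the reported losses are mean per-token cross-entropy in nats, which the card does not state explicitly; the variable names are illustrative.

```python
import math

# Figures quoted in the model details above.
NUM_SAMPLES = 42_268_080
AVG_TOKENS = 287                 # rounded average, so the total is approximate
TRAINED_TOKENS = 4_396_032_000

total_tokens = NUM_SAMPLES * AVG_TOKENS
progress = TRAINED_TOKENS / total_tokens

EVAL_LOSS = {
    "eval_nltotag_loss": 0.8972,
    "eval_shorttolong_loss": 1.2120,
    "eval_tagdetail_loss": 0.9317,
    "eval_tagtonl_loss": 1.2363,
    "eval_tagtotag_loss": 0.7396,
}

# Assumption: each loss is a mean per-token cross-entropy in nats,
# so perplexity is exp(loss).
perplexity = {task: math.exp(loss) for task, loss in EVAL_LOSS.items()}

print(f"total tokens: about {total_tokens / 1e9:.1f}B")
print(f"trained so far: about {progress:.1%}")
for task, ppl in sorted(perplexity.items(), key=lambda kv: kv[1]):
    print(f"{task}: perplexity {ppl:.2f}")
```

Under that assumption, the recomputed total agrees with the roughly 12.1 billion tokens stated above, and training has covered a little over a third of it.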
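Co-design with Neta-Lumina
This model is a text-processing engine designed specifically for the Neta-Lumina model.
Because this language model and the Neta-Lumina image model were trained on high-quality natural-language/tag data from the same source, the two are naturally consistent in how they interpret that data. This means:
- Closely matched understanding: the tags and natural-language descriptions (captions) generated by this model closely match Neta-Lumina's "preferences" in style, structure, and detail.
- Unlocking the T2I model's potential: precise prompts generated by this model can guide Neta-Lumina more effectively toward high-quality images that match expectations.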
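Use with other models (such as noobai-XL)
For models that rely on tags, this model can efficiently generate, complete, and refine tag sets.
- Usage:
  - Call the <NLTOTAG>, <TAGTOTAG>, or <TAGDETAIL> instruction.
  - Write a simple script that extracts the tag text of each category under the <tags> element of the output XML.
  - Join the extracted tags with ", " to form a prompt suited to the target model.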
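The extraction-and-join steps above can be sketched as follows. This is a minimal sketch, assuming the model output is well-formed XML with a <tags> root as in this card's examples; the function name tags_xml_to_prompt and the optional categories filter are illustrative, not part of the model's tooling.

```python
import xml.etree.ElementTree as ET


def tags_xml_to_prompt(xml_text, categories=None):
    """Flatten a <tags>...</tags> output into a ", "-joined prompt.

    categories: optional collection of child element names to keep
    (for example {"special", "general"}); None keeps every category.
    """
    # Assumption: the output parses as XML. Outputs containing a bare
    # "&" or "<" inside a tag would need cleaning before this call.
    root = ET.fromstring(xml_text.strip())
    if root.tag != "tags":
        raise ValueError(f"expected a <tags> root, got <{root.tag}>")

    collected = []
    for child in root:
        if categories is not None and child.tag not in categories:
            continue
        # Each category holds comma-separated tags; skip empty entries.
        for tag in (child.text or "").split(","):
            tag = tag.strip()
            if tag:
                collected.append(tag)
    return ", ".join(collected)


example = (
    "<tags><special>1girl</special><artists></artists>"
    "<characters>hatsune_miku</characters><copyrights>vocaloid</copyrights>"
    "<general>solo, long_hair</general><rating>safe</rating></tags>"
)
print(tags_xml_to_prompt(example))
print(tags_xml_to_prompt(example, categories={"special", "general"}))
```

Whether categories such as rating or artists belong in the final prompt depends on the target model, so the categories argument leaves that choice to the caller.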
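Features and tasks
The model supports the following five instruction tasks. All inputs and outputs must be wrapped in the specified XML format:
- Natural-language description → tags (<NLTOTAG>)
  - Function: converts a natural-language image description (caption) into a set of tags.
- Tags → natural-language description (<TAGTONL>)
  - Function: converts a set of tags into a detailed, coherent natural-language description.
- Tag completion and refinement (<TAGTOTAG>)
  - Function: completes and refines an incomplete set of tags. During training, incomplete inputs were simulated by randomly dropping tags from complete tag sets at high, medium, and low intensity.
- Tag expansion (<TAGDETAIL>)
  - Function: expands a sparse set of core tags (such as 1girl or a character name; fewer than 10) into a complete, richly detailed tag set (30 or more).
- Short description → long description (<SHORTTOLONG>)
  - Function: expands a short image description into a longer, more detailed one.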
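Tag-based inputs use a fixed set of child elements inside <tags>. The sketch below assembles such an input from per-category tag lists; the element order is taken from the examples in this card, while the helper name build_tags_xml is hypothetical. Values are inserted verbatim, since the card does not say whether characters such as & were escaped in the training data.

```python
# Element order as it appears in this card's examples.
CATEGORY_ORDER = (
    "special", "artists", "characters", "copyrights", "general", "rating",
)


def build_tags_xml(**categories):
    """Build a <tags> input string from per-category tags.

    Each keyword is a category name; its value is a list of tags or a
    single string. Missing categories are emitted as empty elements.
    """
    unknown = set(categories) - set(CATEGORY_ORDER)
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")

    parts = []
    for name in CATEGORY_ORDER:
        values = categories.get(name, [])
        if isinstance(values, str):
            values = [values]  # treat a bare string as one tag
        # Assumption: tags are written as-is, without XML escaping.
        parts.append(f"<{name}>{', '.join(values)}</{name}>")
    return "<tags>" + "".join(parts) + "</tags>"


print(build_tags_xml(
    special=["1girl"],
    characters=["hatsune_miku"],
    copyrights=["vocaloid"],
    rating="safe",
))
```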
How to use
Use the specific instruction for each task to steer the model.
1. Natural-language description → tags (<NLTOTAG>)
- Instruction: <NLTOTAG>
- Input:
<caption>This illustration features a young girl standing with a smile, looking directly at the viewer. She's wearing a white beret, and her long, light-colored hair is styled with pigtails, adorned with ribbons. She sports round glasses and a white collared shirt with long sleeves, layered with a light orange sweater vest. A pleated skirt that matches the ribbons, and thigh-highs complete her outfit. Her hands are clasped over her chest, possibly holding a book or other object. The background is plain white, emphasizing the character.</caption>
- Output:
<tags><special>1girl</special><artists></artists><characters></characters><copyrights>original</copyrights><general>skirt, thighhighs, pleated_skirt, smile, white_background, hands_on_own_chest, looking_at_viewer, long_hair, glasses, solo, simple_background, long_sleeves, sketch</general><rating>safe</rating></tags>
2. Tags → natural-language description (<TAGTONL>)
- Instruction: <TAGTONL>
- Input:
<tags><special>2girls</special><artists></artists><characters></characters><copyrights></copyrights><general>bracelet, casual, purple_hair, oekaki, outdoors, white_flower, blouse, jewelry, pendant, drill_hair, dandelion, closed_eyes, smile, short_sleeves, frills</general><rating>safe</rating></tags>
- Output:
<caption>The image is a vibrant, anime-style illustration of two young women, likely teens, lying together on a bed of green grass dotted with dandelions. The composition features a close-up, overhead perspective. The first girl has brown drill hair styled in two high twintails, secured with pink bows featuring black polka dots, and she's dressed in a polka-dotted dress with frilly details and short sleeves, with a pink heart pendant necklace. She's smiling widely, eyes closed, showing a blush on her cheeks, and is looking at the other girl. The second girl, also wearing a similar style dress with a pink bow, and has pink hair, looks back at her with a smile, hugging her. Both girls have bracelets. The overall atmosphere is bright and cheerful, suggesting a moment of friendship and intimacy in a sunny outdoor setting.</caption>
3. Tag completion and refinement (<TAGTOTAG>)
- Instruction: <TAGTOTAG>
- Input (partial tags):
<tags><special>2girls</special><artists></artists><characters></characters><copyrights></copyrights><general>looking_at_another, hug, camisole, on_back, blush, oekaki, field, jewelry, dandelion_clock, on_grass, pendant, blouse, short_sleeves, dandelion, frills, bow, casual, smile, sleeveless, outdoors, brown_hair, pink_bow, hair_ribbon, polka_dot, shirt, short_hair, yellow_flower, lying, flower, closed_eyes, bracelet, drill_hair, sparkle, grass, on_side, purple_hair, ribbon, on_ground, white_flower</general><rating>safe</rating></tags>
- Output (completed tags):
<tags><special>2girls</special><artists></artists><characters></characters><copyrights></copyrights><general>closed_eyes, hair_ribbon, oekaki, sleeveless, sparkle, hug, pink_bow, white_flower, short_hair, looking_at_another, dandelion_clock, ribbon, pendant, flower, lying, purple_hair, bracelet, smile, bow, brown_hair, frills, blush, jewelry, short_sleeves, on_grass, casual, grass, outdoors, shirt, blouse, field, yellow_flower, camisole, on_back, twintails, polka_dot, on_ground, on_side, dandelion</general><rating>safe</rating></tags>
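The <TAGTOTAG> task description says its training inputs were produced by randomly dropping tags from complete tag sets at high, medium, and low intensity. A minimal sketch of that idea follows; the drop rates, the function name drop_tags, and the keep-at-least-one rule are all hypothetical, since the card does not give the actual procedure.

```python
import random

# Hypothetical drop probabilities; the card does not state the real values.
DROP_RATES = {"low": 0.2, "medium": 0.5, "high": 0.8}


def drop_tags(tags, intensity, rng):
    """Return a copy of tags with each tag independently dropped.

    tags: list of tag strings from a complete tag set.
    intensity: one of "low", "medium", "high".
    rng: a random.Random instance, passed in for reproducibility.
    """
    if not tags:
        return []
    rate = DROP_RATES[intensity]
    kept = [tag for tag in tags if rng.random() >= rate]
    # Assumption: an input should never be empty, so keep one tag
    # if everything happened to be dropped.
    return kept or [rng.choice(tags)]


full = ["hug", "blush", "smile", "outdoors", "grass", "closed_eyes"]
rng = random.Random(0)
for level in ("low", "medium", "high"):
    print(level, drop_tags(full, level, rng))
```

The partial tag list would then be the model input, with the complete tag set as the training target.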
4. Tag expansion (<TAGDETAIL>)
- Instruction: <TAGDETAIL>
- Input (core tags):
<tags><special>1girl</special><artists></artists><characters>hatsune_miku</characters><copyrights>vocaloid</copyrights><general></general><rating>safe</rating></tags>
- Output (expanded tags):
<tags><special>1girl</special><artists></artists><characters>hatsune_miku</characters><copyrights>vocaloid</copyrights><general>solo, long_hair, twintails, blue_hair, looking_at_viewer, smile, aqua_hair, hair_ornament, aqua_eyes, shirt, sleeveless, collar, necktie, official_alternate_costume, bare_shoulders, pleated_skirt, black_skirt, thighhighs, detached_sleeves, headphones, microphone</general><rating>safe</rating></tags>
5. Short description → long description (<SHORTTOLONG>)
- Instruction: <SHORTTOLONG>
- Input (short description):
<caption>A girl with blue pigtails.</caption>
- Output (long description):
<caption>This illustration portrays a young woman, identified as Hatsune Miku from the Vocaloid series, characterized by her signature long, aqua-colored pigtails. She is depicted looking directly at the viewer with a friendly smile. Her outfit consists of a sleeveless grey top with a teal collar and tie, complemented by a black pleated skirt and thigh-high boots, which is her iconic attire. The simple background ensures that the focus remains entirely on the character.</caption>
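Across the five examples above, each instruction pairs one input wrapper with one output wrapper. The sketch below records that mapping and checks that a payload uses the expected root element; the names TASK_IO and check_task_io are illustrative. It only inspects the outer wrapper and does not validate the XML inside.

```python
# Instruction -> (input root element, output root element),
# as shown in the five examples above.
TASK_IO = {
    "<NLTOTAG>": ("caption", "tags"),
    "<TAGTONL>": ("tags", "caption"),
    "<TAGTOTAG>": ("tags", "tags"),
    "<TAGDETAIL>": ("tags", "tags"),
    "<SHORTTOLONG>": ("caption", "caption"),
}


def is_wrapped(text, element):
    """True if text starts with <element> and ends with </element>."""
    text = text.strip()
    return text.startswith(f"<{element}>") and text.endswith(f"</{element}>")


def check_task_io(instruction, model_input, model_output=None):
    """Check that input (and optionally output) use the expected wrapper."""
    if instruction not in TASK_IO:
        raise KeyError(f"unknown instruction: {instruction}")
    input_element, output_element = TASK_IO[instruction]
    ok = is_wrapped(model_input, input_element)
    if model_output is not None:
        ok = ok and is_wrapped(model_output, output_element)
    return ok


print(check_task_io("<SHORTTOLONG>", "<caption>A girl with blue pigtails.</caption>"))
print(check_task_io("<NLTOTAG>", "<tags><general>solo</general></tags>"))
```

A check like this can catch a truncated generation, for instance an output that hit the token limit before its closing element.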
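Known issues
- To bring in enough knowledge, samples with too few tags were not effectively filtered out of the training data. The output length and quality of the <TAGTOTAG> and <TAGDETAIL> instructions may need to be improved later with DPO.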