Improve model card: add library_name, citation and sample usage
Hi! I'm Niels from the community science team at Hugging Face.
I've opened this PR to enhance the model card for OneReward-ComfyUI. Specifically, I have:
- Added `library_name: diffusers` to the metadata to enable better integration and discoverability.
- Included a sample usage code snippet from the official GitHub repository (note that this requires the custom pipeline from their source code).
- Added the BibTeX citation from the paper.
Please let me know if you have any questions!
README.md CHANGED
---
base_model:
- black-forest-labs/FLUX.1-Fill-dev
- bytedance-research/OneReward
language:
- en
license: cc-by-nc-4.0
pipeline_tag: image-to-image
library_name: diffusers
---

# OneReward - ComfyUI

[Paper](https://arxiv.org/abs/2508.21066) [Code](https://github.com/bytedance/OneReward) [Project Page](https://one-reward.github.io/)
<br>

This repo contains the checkpoint from [OneReward](https://huggingface.co/bytedance-research/OneReward) processed into a single model suitable for ComfyUI use.

**OneReward** is a novel RLHF methodology for the visual domain that employs Qwen2.5-VL as a generative reward model to enhance multi-task reinforcement learning, significantly improving the policy model's generation ability across multiple subtasks. Building on OneReward, **FLUX.1-Fill-dev-OneReward**, which is based on FLUX Fill [dev], outperforms the closed-source FLUX Fill [Pro] on inpainting and outpainting tasks, serving as a powerful new baseline for future research in unified image editing.

For more details and examples, see the original model repo: [**OneReward**](https://huggingface.co/bytedance-research/OneReward)

## Sample Usage

The following code snippet illustrates how to use the model with the `diffusers` library. Note that this requires the custom `FluxFillCFGPipeline` defined in the [official source code](https://github.com/bytedance/OneReward/blob/main/src/pipeline_flux_fill_with_cfg.py).
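
If you are not running the snippet from a clone of the official repository, one way to fetch just that module is sketched below. This is a convenience sketch, assuming the file still lives at `src/pipeline_flux_fill_with_cfg.py` on the `main` branch; after downloading, import `FluxFillCFGPipeline` from the local file instead of from `src.`.

```python
import urllib.request

# Assumption: the custom pipeline module is still at this path on main.
urllib.request.urlretrieve(
    "https://raw.githubusercontent.com/bytedance/OneReward/main/src/pipeline_flux_fill_with_cfg.py",
    "pipeline_flux_fill_with_cfg.py",
)
```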

```python
import torch
from diffusers.utils import load_image
from diffusers import FluxTransformer2DModel

# Note: pipeline_flux_fill_with_cfg.py must be available in your local environment
from src.pipeline_flux_fill_with_cfg import FluxFillCFGPipeline

# Load the OneReward-tuned Fill transformer weights
transformer_onereward = FluxTransformer2DModel.from_pretrained(
    "bytedance-research/OneReward",
    subfolder="flux.1-fill-dev-OneReward-transformer",
    torch_dtype=torch.bfloat16
)

# Build the FLUX.1 Fill pipeline around the swapped-in transformer
pipe = FluxFillCFGPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-Fill-dev",
    transformer=transformer_onereward,
    torch_dtype=torch.bfloat16
).to("cuda")

# Example: Image Fill
image = load_image('assets/image.png')
mask = load_image('assets/mask_fill.png')
image = pipe(
    prompt='the words "ByteDance", and in the next line "OneReward"',
    negative_prompt="nsfw",
    image=image,
    mask_image=mask,
    height=image.height,
    width=image.width,
    guidance_scale=1.0,   # embedded (distilled) guidance; real CFG is set via true_cfg
    true_cfg=4.0,
    num_inference_steps=50,
    generator=torch.Generator("cpu").manual_seed(0)
).images[0]
image.save("image_fill.jpg")
```
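
Since the card highlights outpainting as well as inpainting, here is a rough illustration of the same call adapted to outpainting. The mask asset name and prompt are hypothetical placeholders; the pipeline and call signature are simply reused from the fill example above.

```python
# Hypothetical outpainting example, reusing `pipe` from the snippet above.
# The mask is assumed to be white over the border region to be generated.
image = load_image('assets/image.png')
mask = load_image('assets/mask_outpaint.png')  # placeholder asset name
result = pipe(
    prompt='a natural, coherent extension of the scene',
    negative_prompt="nsfw",
    image=image,
    mask_image=mask,
    height=image.height,
    width=image.width,
    guidance_scale=1.0,
    true_cfg=4.0,
    num_inference_steps=50,
    generator=torch.Generator("cpu").manual_seed(0)
).images[0]
result.save("image_outpaint.jpg")
```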

## Citation

```bibtex
@article{gong2025onereward,
  title={OneReward: Unified Mask-Guided Image Generation via Multi-Task Human Preference Learning},
  author={Gong, Yuan and Wang, Xionghui and Wu, Jie and Wang, Shiyin and Wang, Yitong and Wu, Xinglong},
  journal={arXiv preprint arXiv:2508.21066},
  year={2025}
}
```