Upload folder using huggingface_hub
- .gitattributes +1 -0
- README.md +85 -3
- assets/teaser.webp +3 -0
- assets/uso.webp +0 -0
- config.json +4 -0
- uso_flux_v1.0/dit_lora.safetensors +3 -0
- uso_flux_v1.0/projector.safetensors +3 -0
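The commit message above corresponds to `huggingface_hub`'s folder-upload API. A minimal sketch of how such an upload can be performed (not the authors' actual script; the repo id is taken from the model badge in the README below, and the local folder path is a hypothetical placeholder):

```python
# Hypothetical re-creation of a commit like this one via upload_folder.
# "local_uso_release/" is a placeholder path, not part of this repository.
from huggingface_hub import HfApi

api = HfApi()  # requires a write token, e.g. via `huggingface-cli login`
api.upload_folder(
    repo_id="bytedance-research/USO",
    repo_type="model",
    folder_path="local_uso_release/",
    commit_message="Upload folder using huggingface_hub",
)
```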
.gitattributes
CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+assets/teaser.webp filter=lfs diff=lfs merge=lfs -text
README.md
CHANGED
@@ -1,3 +1,85 @@
----
-license: apache-2.0
----
---
license: apache-2.0
language:
- en
base_model:
- black-forest-labs/FLUX.1-dev
library_name: transformers
pipeline_tag: image-to-image
tags:
- image-generation
- subject-personalization
- style-transfer
- Diffusion-Transformer
---

<h3 align="center">
<img src="assets/uso.webp" alt="Logo" style="vertical-align: middle; width: 95px; height: auto;">
</br>
Unified Style and Subject-Driven Generation via Disentangled and Reward Learning
</h3>

<p align="center">
<a href="https://github.com/bytedance/USO"><img alt="Build" src="https://img.shields.io/github/stars/bytedance/USO"></a>
<a href="https://bytedance.github.io/USO/"><img alt="Build" src="https://img.shields.io/badge/Project%20Page-USO-blue"></a>
<a href="https://arxiv.org/abs/2508.18966"><img alt="Build" src="https://img.shields.io/badge/Tech%20Report-USO-b31b1b.svg"></a>
<a href="https://huggingface.co/bytedance-research/USO"><img src="https://img.shields.io/static/v1?label=%F0%9F%A4%97%20Hugging%20Face&message=Model&color=green"></a>
</p>
![teaser](assets/teaser.webp)
## 📖 Introduction
Existing literature typically treats style-driven and subject-driven generation as two disjoint tasks: the former prioritizes stylistic similarity, whereas the latter insists on subject consistency, resulting in an apparent antagonism. We argue that both objectives can be unified under a single framework because they ultimately concern the disentanglement and re-composition of “content” and “style”, a long-standing theme in style-driven research. To this end, we present USO, a Unified framework for Style-driven and subject-driven GeneratiOn. First, we construct a large-scale triplet dataset consisting of content images, style images, and their corresponding stylized content images. Second, we introduce a disentangled learning scheme that simultaneously aligns style features and disentangles content from style through two complementary objectives: style-alignment training and content–style disentanglement training. Third, we incorporate a style reward-learning paradigm to further enhance the model’s performance.

## ⚡️ Quick Start

### 🔧 Requirements and Installation
Clone our [GitHub repo](https://github.com/bytedance/USO).

Then install the requirements:
```bash
# Create a virtual environment with Python >= 3.10 and <= 3.12, e.g.:
# python -m venv uso_env
# source uso_env/bin/activate
# Then install the dependencies:
pip install -r requirements.txt
```
Then download the checkpoints in one of the following three ways:
1. Directly run the inference scripts; the checkpoints will be downloaded automatically by the `hf_hub_download` function in the code to your `$HF_HOME` (the default is `~/.cache/huggingface`).
2. Use `huggingface-cli download <repo name>` to download `black-forest-labs/FLUX.1-dev`, `xlabs-ai/xflux_text_encoders`, `openai/clip-vit-large-patch14`, and this model repo (`bytedance-research/USO`), then run the inference scripts.
3. Use `huggingface-cli download <repo name> --local-dir <LOCAL_DIR>` to download all the checkpoints mentioned in 2. to the directories you want. Then set the environment variable `TODO`. Finally, run the inference scripts. A programmatic sketch of pre-downloading these repos is shown after this list.
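As referenced in option 3 above, here is a minimal pre-download sketch using `huggingface_hub`'s `snapshot_download`. It is only an illustration (not necessarily what the inference scripts call internally) and assumes you are authenticated and have accepted the FLUX.1-dev license on the Hub:

```python
# Pre-fetch the repos listed above into the default $HF_HOME cache.
from huggingface_hub import snapshot_download

REPOS = [
    "black-forest-labs/FLUX.1-dev",
    "xlabs-ai/xflux_text_encoders",
    "openai/clip-vit-large-patch14",
    "bytedance-research/USO",
]

for repo_id in REPOS:
    # Pass local_dir=... to mirror option 3 and download to a directory of your choice.
    snapshot_download(repo_id=repo_id)
```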
### 🌟 Gradio Demo

```bash
python app.py
```
## 📄 Disclaimer
<p>
We open-source this project for academic research. The vast majority of images used in this project are either generated or from open-source datasets. If you have any concerns, please contact us, and we will promptly remove any inappropriate content.
Our project is released under the Apache 2.0 License. If you apply it to other base models, please ensure that you comply with the original licensing terms.
<br><br>This research aims to advance the field of generative AI. Users are free to create images using this tool, provided they comply with local laws and exercise responsible usage. The developers are not liable for any misuse of the tool by users.</p>
## Citation
We would also appreciate it if you could give a star ⭐ to our [GitHub repository](https://github.com/bytedance/USO). Thanks a lot!

If you find this project useful for your research, please consider citing our paper:
```bibtex
@article{wu2025uso,
  title={USO: Unified Style and Subject-Driven Generation via Disentangled and Reward Learning},
  author={Shaojin Wu and Mengqi Huang and Yufeng Cheng and Wenxu Wu and Jiahe Tian and Yiming Luo and Fei Ding and Qian He},
  year={2025},
  eprint={2508.18966},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
}
```
assets/teaser.webp
ADDED
Stored with Git LFS.
assets/uso.webp
ADDED
config.json
ADDED
@@ -0,0 +1,4 @@
{
  "_diffusers_version": "0.30.1",
  "_uso_flux_version": "1.0"
}
uso_flux_v1.0/dit_lora.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a03fa8430997f1c371c2471b133bdc03433a50564e0a29c096217077b0309e41
size 478187816
uso_flux_v1.0/projector.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9a0dfcd6644e3acaf6995625562ab0af1f9cf048bf739c7e5822ee106fb44311
size 21548200
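The files added above (`config.json` plus the two LFS-stored safetensors checkpoints) can be fetched and inspected directly from this repo. A minimal sketch, assuming the `huggingface_hub` and `safetensors` packages are installed; how the USO inference code actually consumes these weights is defined in the GitHub repo:

```python
# Download config.json and the two checkpoint files from this model repo,
# then print the config and the number of tensors in each state dict.
import json
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file

REPO_ID = "bytedance-research/USO"

with open(hf_hub_download(REPO_ID, "config.json")) as f:
    print(json.load(f))  # {'_diffusers_version': '0.30.1', '_uso_flux_version': '1.0'}

dit_lora = load_file(hf_hub_download(REPO_ID, "uso_flux_v1.0/dit_lora.safetensors"))
projector = load_file(hf_hub_download(REPO_ID, "uso_flux_v1.0/projector.safetensors"))
print(len(dit_lora), len(projector))  # tensor counts in each checkpoint
```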